Machine Learning Algorithms to Predict Tree-Related Microhabitats using Airborne Laser Scanning

Abstract: In the last few years, the occurrence and abundance of tree-related microhabitats and habitat trees have gained great attention across Europe as indicators of forest biodiversity. Nevertheless, observing microhabitats in the field requires time and well-trained staff. For this reason, new efficient semiautomatic systems for their identification and mapping on a large scale are necessary. This study aims at predicting microhabitats in a mixed and multi-layered Mediterranean forest using Airborne Laser Scanning data through the implementation of a Machine Learning algorithm. The study focuses on the identification of LiDAR metrics useful for detecting microhabitats according to the recent hierarchical classification system for Tree-related Microhabitats, from single microhabitats to the habitat trees. The results demonstrate that Airborne Laser Scanning point clouds support the prediction of microhabitat abundance. Better prediction capabilities were obtained at a higher hierarchical level and for some of the single microhabitats, such as epiphytic bryophytes, root buttress cavities, and branch holes. Metrics concerned with tree height distribution and crown density are the most important predictors of microhabitats in a multi-layered forest.


Introduction
In forestry, the occurrence and abundance of habitat trees, defined as "standing live or dead trees providing ecological niches (Tree-related Microhabitats, hereafter TreMs) such as cavities, bark pockets, large dead branches, epiphytes, cracks, sap runs, or trunk rot" [1], contribute strongly to biodiversity conservation, as they provide nutrition and protection for numerous living organisms and thereby sustain their lifecycles. Retaining habitat trees during harvesting and ensuring that they are evenly distributed within the forest are crucial for fostering biodiversity conservation. For this reason, several studies have focused on investigating the relationships between forest structure characteristics and the occurrence and abundance of habitat trees and TreMs [2,3].
However, monitoring and assessing the occurrence and the spatial distribution of habitat trees in forests is a challenging task. Several studies confirm that the frequency of habitat trees tends to increase with larger diameter classes [4][5][6], and it is also expected to increase with forest age [7].
However, the frequency of habitat trees differs among European forests [8,9], and several other factors may affect their occurrence in forests [9,10] as well as their ecological heterogeneity, depending on the TreMs abundance and diversity. In turn, TreMs occurrence, abundance, and diversity are affected by many aspects of forestry and forest management, such as forest stand structure, management systems [7,11], forest ownership [12,13], tree species, and vitality [6,10,14].
Though habitat trees and TreMs are often associated with old-growth forests, in the last few decades, the number of studies evaluating TreMs in managed forests has increased, to support the integration of biodiversity conservation with timber production and the provision of other forest ecosystem services. In this context, silviculture could help not only to enrich habitat tree abundance in forests but also to maintain their spatial and temporal continuity, which is of critical importance for the lifecycles of living organisms (Grove, 2001). Managing forests with the aim of enriching the presence of habitat trees and TreMs is challenging and requires an evaluation of ecosystem service trade-offs [5,15]. For this purpose, an inventory of habitat trees is crucial for obtaining a clear evaluation of their abundance and spatial distribution, as well as for planning forestry activities that integrate biodiversity conservation with timber production. In fact, large trees represent a potential conflict between timber production and habitat tree maintenance. On the other hand, inventory activities are time-consuming and expensive, especially if all trees must be inspected to check for TreMs occurrence.
For this reason, new approaches to detecting and predicting TreMs abundance on a large scale, with field surveys only for calibration and validation, are necessary. Remote sensing techniques offer strong support for monitoring forest resources [16] and could help forest managers to assess forest biodiversity and define forest planning activities. In the last few decades, the potential of these techniques has improved markedly in terms of versatility (the derived information can be used for multiple purposes) and data quality (the precision of the information has increased significantly). Therefore, remote sensing is widely adopted for applications in forest inventory, land use mapping, and forest structure evaluation, and for assessing other Sustainable Forest Management indicators [17,18]. For instance, the use of LiDAR in forest inventory has strongly increased in the last few decades [19], providing new methods and better data for assessing forest structure and biodiversity [20], as well as forest productivity [21,22]. Though LiDAR is widely adopted to assess forest stand characteristics, the integration of different aerospace tools, such as unmanned aerial vehicles [23,24] and satellite imagery [25][26][27], or the integration of terrestrial and airborne laser scanning [28], has proved to be very useful for detecting forest inventory attributes in Mediterranean and temperate forest ecosystems.
Nevertheless, attempts to use remote sensing techniques, such as satellite or airborne imagery, to detect habitat trees are rather rare (but see [4,29]) and have focused mostly on beech, fir, and spruce forests. In this scenario, we propose an investigation aimed at detecting habitat trees and TreMs through Airborne Laser Scanning (ALS) data in a multi-layered Mediterranean forest. To date, this is the first study that uses ALS data to detect TreMs abundance and habitat trees in a mixed and multi-layered Mediterranean forest. TreMs on all standing trees were observed and registered in the field according to the recent hierarchical classification [9], consisting of four levels. Subsequently, Machine Learning (ML) algorithms were applied to evaluate the weights of the ALS metrics useful for detecting TreMs. ML has exhibited excellent predictive abilities in several studies addressing different topics (see, for example, Zhou et al. [30]) and has recently been used to identify TreMs in Terrestrial Laser Scanning point clouds [29]. Specifically, the aim of this work is twofold: to demonstrate that ALS metrics, combined with ML, can predict habitat trees and TreMs, and, in more depth, to determine to what extent ALS supports the prediction of TreMs sorted into four hierarchical categories. Finally, we highlight and discuss the ALS metrics according to their contributions to the prediction of TreMs.

Study Area
The study was carried out in Bosco Pennataro in Central Apennine, in Molise Region (Italy). The forest is located at a mean elevation of 930 m a.s.l., with a mean precipitation of about 1000 mm year⁻¹ and mean annual temperature of 8.3 °C. Bosco Pennataro is a mixed and multi-layered forest, almost 200 ha large (Figure 1). Sixteen tree species were counted in the whole study area, with plots that present at least 3 and at most 11 tree species (see Appendix A, Figure A1, and Table A1), where Turkey oak (Quercus cerris L.) is the most abundant. Further information about the characteristics of the forest structure can be found in Appendix A.

Figure 1. Study area. Geographical distribution of field plots within which forest inventory variables were collected. The sampling scheme was developed under the project FRESh LIFE using the one-per-stratum stratified sampling scheme [31].
Historically, Bosco Pennataro played an important ecological and cultural role in the Molise region, representing one of the five forests with regional ownership. In addition, it is part of the Natura 2000 network (code IT7212124), and it has been recognized within the Man and Biosphere program as a core area of the Collemeluccio-Montedimezzo Alto Molise Man and Biosphere reserve. Although Bosco Pennataro was largely exploited in the past for its high productive capacity, it is currently managed for conservation objectives, as high forest with continuous canopy cover and unevenly aged trees [32].

Field Data
Field data were collected in 2016 within 35 square plots, 0.05 ha large. The field observations of interest were all the trees inside the plot with a Diameter at Breast Height (DBH) ≥2.5 cm. For each of them, data on the DBH, tree height, crown length, tree vitality, and tree position were collected through a precision silvicultural approach using Field-Map technology (http://www.fieldmap.cz). In addition, we derived the count of TreMs on each standing tree within the sampling plots. The observed TreMs were then grouped according to the recent hierarchical classification system developed by [9,33], accounting for 64 different microhabitats (Table 1). Specifically, we grouped TreMs variables into four levels of hierarchical classification: Single TreMs (level 1), TreMs CAT-1 (level 2), TreMs CAT-2 (level 3), and TreMs SUM (level 4). This last category considers the sum of all the TreMs encountered for each standing tree. In addition, at level 4, we also counted trees where at least one TreMs was encountered (considered as habitat trees). In this study, we did not consider fallen deadwood.
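As an illustration of the four-level grouping, the following minimal Python sketch rolls per-tree counts up the hierarchy. It assumes that the group codes can be read from the prefix of the single-TreM code (e.g., CV31 belongs to CV3 at level 2 and to CV at level 3), which is consistent with the codes used in this paper but is not taken from the authors' workflow; the function name and the toy records are hypothetical.

```python
from collections import Counter


def aggregate_trems(tree_records):
    """Roll per-tree TreM counts up to plot-level counts for the four levels.

    tree_records: one dict per standing tree, mapping a single-TreM code
    (e.g. "CV31") to the number of such TreMs observed on that tree.
    """
    single, cat1, cat2 = Counter(), Counter(), Counter()
    total_trems = 0
    habitat_trees = 0
    for tree in tree_records:
        tree_total = 0
        for code, n in tree.items():
            single[code] += n       # level 1: single TreMs, e.g. "CV31"
            cat1[code[:3]] += n     # level 2: TreMs CAT-1, e.g. "CV3" (assumed prefix rule)
            cat2[code[:2]] += n     # level 3: TreMs CAT-2, e.g. "CV" (assumed prefix rule)
            tree_total += n
        total_trems += tree_total   # level 4: sum of all TreMs
        if tree_total > 0:
            habitat_trees += 1      # level 4: trees bearing at least one TreM
    return {
        "single": dict(single),
        "cat1": dict(cat1),
        "cat2": dict(cat2),
        "sum": total_trems,
        "habitat_trees": habitat_trees,
    }


# Hypothetical plot with three standing trees (the second bears no TreMs).
plot = [{"CV31": 2, "GR22": 1}, {}, {"CV32": 1, "EP31": 1}]
result = aggregate_trems(plot)
print(result["cat1"], result["sum"], result["habitat_trees"])
```

In this toy plot, the two branch-hole codes merge into a single CV3 count at level 2, which is the mechanism that lets rare single TreMs contribute to a more frequent, and therefore more predictable, group.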

ALS Data Acquisition and Processing
The ALS dataset was collected using a YellowScan LiDAR sensor, mounted on a light conventional helicopter. Data were taken on a single flight, in March 2016, by Oben srl (http://www.oben.it) within the project FRESh LIFE (https://freshlifeproject.net/). The sensor provides up to three echoes per shot, allowing the retrieval of orographic information under vegetation cover, such as the terrain elevation model, slope, and aspect. The sensor was set with a maximum scan angle of ±50° and a pulse frequency of 20 kHz, resulting in an average density of 30 pulses/m², with an average point cloud density equal to 60 points/m² (min = 20; max = 105) and accuracy equal to ±15 cm (Figure 2). The ALS raw data were analyzed using the LAStools software (https://rapidlasso.com). From the original point clouds, 37 canopy metrics were extracted in correspondence to the plots, which were subsequently used to predict plot-related forest biodiversity variables such as TreMs and habitat trees. The ALS metrics included the point height (min, max), coefficient of variation of height (avg, qav, std, ske, kur), cover density (cov_gap, dns.gap), point height percentiles (p1, p5, p10, p25, p50, p75, p90, p95, p99) and bicentiles (b10, b20, b30, b40, b50, b60, b70, b80, b90). See Appendix B for further details about the ALS metrics.
To avoid including poorly informative variables, TreMs categories and habitat trees were inspected to identify possible "near-zero-variance" variables. Specifically, we dropped all the TreMs and habitat trees variables that showed the following characteristics: (i) the frequency of the most prevalent value over the second most frequent value was above the ratio 95%/5%, and (ii) the percentage of unique values over the total number of samples was below 10%.
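The two near-zero-variance criteria can be sketched as a small filter. This is a minimal Python illustration of the rule as worded above, not the code used in the study; the function name is hypothetical, and flagging a constant variable (which has no second most frequent value) is an assumption.

```python
from collections import Counter


def is_near_zero_variance(values, freq_cut=95 / 5, unique_cut=10.0):
    """Return True when a response variable meets both criteria:
    (i) count of the most common value / count of the second most common
        value is above freq_cut (95/5 = 19), and
    (ii) the percentage of unique values is below unique_cut (10%).
    """
    counts = Counter(values).most_common()
    if len(counts) == 1:
        return True  # constant variable: flagged by assumption
    freq_ratio = counts[0][1] / counts[1][1]
    percent_unique = 100.0 * len(counts) / len(values)
    return freq_ratio > freq_cut and percent_unique < unique_cut


# Hypothetical plot-level counts over 35 plots.
rare_trem = [0] * 34 + [1]        # present in a single plot only
common_trem = [0] * 30 + [1] * 5  # present in five plots
print(is_near_zero_variance(rare_trem), is_near_zero_variance(common_trem))
```

With 35 plots, a TreM recorded in only one plot gives a frequency ratio of 34 and about 5.7% unique values, so it would be dropped, whereas one recorded in five plots gives a ratio of 6 and would be kept.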

Machine Learning Algorithm Implementation
ML is typically free from making specific assumptions about the response variable for linear prediction and provides a better fit for complex non-linear relationships [34]. In this study, we used a widely adopted ML algorithm, random forest [35,36], to quantify the abilities of the ALS metrics in predicting the four levels of TreMs and habitat tree variables. Random forest was successfully implemented for forest inventory analysis, and it is recognized to provide accurate information for forest stand inventory attributes [37].
In our modeling setup, for each microhabitat type and category, we considered the count of microhabitats detected within each sampling plot as the response variable. Models were calibrated within the R package "caret" [38] and evaluated by implementing a five-times-repeated, nested cross-validation scheme, which proved able to yield robust and unbiased performance estimates regardless of sample size [39]. In such a procedure, we initially split the data into seven folds, each containing five samples. Subsequently, we used six out of seven folds (i.e., 30 samples) to develop a separate Random forest, including variable selection and parameter tuning steps, and then used the left-out fold to evaluate the model. This phase was repeated for all seven folds. The variable selection procedure applied to each of the seven Random forest models was designed to reduce overfitting as much as possible. Since each cross-validation round included 30 samples against 11 covariates, we tested all the possible combinations of two, three, and four covariates at most (according to the empirical rule determined by [40]; also, see [41] and [42] for similar approaches), choosing the set that yielded the highest predictive accuracy. In addition, parameter tuning was carried out by selecting the most appropriate "mtry" value (i.e., the number of variables randomly sampled as candidates at each split in Random forest). To avoid further reducing the 30 samples used in each cross-validation round, the covariate combination and the "mtry" value that reported the best accuracy were identified through a "leave-one-out" scheme (29 samples against one). The predictive performance within the internal "leave-one-out" validation was assessed through the Root Mean Square Error (RMSE).
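The structure of this resampling scheme can be sketched in plain Python. The sketch below is only an outline under stated assumptions: the study used Random forest within the R package "caret", whereas here a nearest-neighbour predictor stands in as a placeholder model, "mtry" tuning is omitted, and all function names are hypothetical. With 11 covariates, the candidate sets of two, three, and four covariates number 55 + 165 + 330 = 550.

```python
import random
from itertools import combinations


def outer_folds(n_samples=35, n_folds=7, seed=0):
    """Shuffle sample indices and split them into equally sized folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def candidate_subsets(covariates, sizes=(2, 3, 4)):
    """All combinations of two, three, and four covariates."""
    return [c for k in sizes for c in combinations(covariates, k)]


def nearest_neighbour(X_train, y_train, x_new):
    """Placeholder model (NOT Random forest): value of the closest sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x_new)) for row in X_train]
    return y_train[dists.index(min(dists))]


def loo_rmse(model, X, y, rows, cols):
    """Inner leave-one-out RMSE using only `rows` and covariates `cols`."""
    sq_err = 0.0
    for held in rows:
        train = [r for r in rows if r != held]
        X_train = [[X[r][c] for c in cols] for r in train]
        y_train = [y[r] for r in train]
        pred = model(X_train, y_train, [X[held][c] for c in cols])
        sq_err += (pred - y[held]) ** 2
    return (sq_err / len(rows)) ** 0.5


def nested_cv(model, X, y, n_folds=7, seed=0):
    """Return (actual, predicted) pairs from the outer held-out folds."""
    folds = outer_folds(len(y), n_folds, seed)
    subsets = candidate_subsets(range(len(X[0])))
    pairs = []
    for k, test_rows in enumerate(folds):
        train_rows = [r for j, f in enumerate(folds) if j != k for r in f]
        # Variable selection uses the outer training samples only.
        best = min(subsets, key=lambda cols: loo_rmse(model, X, y, train_rows, cols))
        X_train = [[X[r][c] for c in best] for r in train_rows]
        y_train = [y[r] for r in train_rows]
        for r in test_rows:
            pairs.append((y[r], model(X_train, y_train, [X[r][c] for c in best])))
    return pairs


# Synthetic example: 35 plots, 4 covariates (kept small so it runs quickly).
rng = random.Random(1)
X = [[rng.random() for _ in range(4)] for _ in range(35)]
y = [round(10 * row[0]) for row in X]
pairs = nested_cv(nearest_neighbour, X, y)
print(len(candidate_subsets(range(11))), len(pairs))
```

The point of the nesting is that the held-out fold never takes part in covariate selection, so the outer error is not inflated by the search over candidate sets. Repeating the whole procedure five times would, under the assumption made here, amount to calling `nested_cv` with five different seeds.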
Moreover, the accuracy within the external seven-fold cross-validation was evaluated through the Normalized Mean Absolute Error (NMAE, normalized by the difference between the maximum and minimum actual data: MAE/(max − min); [43]).
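For reference, both error measures can be written in a few lines. This is a minimal Python sketch of the standard definitions, with NMAE following the MAE/(max − min) normalization given above; the function names and the example values are hypothetical.

```python
def rmse(actual, predicted):
    """Root Mean Square Error."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5


def nmae(actual, predicted):
    """Mean Absolute Error divided by the range of the actual values."""
    span = max(actual) - min(actual)
    if span == 0:
        raise ValueError("NMAE is undefined when all actual values are equal")
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return mae / span


# Hypothetical plot-level TreM counts and predictions.
actual = [14, 40, 101, 60]
predicted = [20, 35, 90, 70]
print(round(nmae(actual, predicted), 3), round(rmse(actual, predicted), 3))
```

Here the absolute errors are 6, 5, 11, and 10 (MAE = 8) and the observed range is 101 − 14 = 87, so NMAE = 8/87 ≈ 0.092. Because of the normalization, NMAE values are comparable across TreMs whose counts differ widely in magnitude.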

TreMs Occurrence and Abundance
A total of 40 out of 64 TreMs types were observed during the field survey, out of which seven represent 73% of the total amount (Table 2). The TreMs frequencies within the plots range between 14 and 101, while those for habitat trees fall between 8 and 51. Lianas and crown microsoil are, in absolute terms, the most frequent TreMs encountered in the study area, representing 36% of the TreMs. The results highlight that cavities, such as rot-holes originating from branch breakage, and growth forms, such as root buttresses and water sprouts, are also rather frequent, representing 9% and 16%, respectively. Among deadwood, dead branches of small diameter are more frequent than dead branches with large diameters, and those that are not sun-exposed are more frequent than those that are sun-exposed.

Predicting TreMs and Habitat Trees Using ALS Metrics
The predictive abilities of the Random forest algorithm in the identification of TreMs differ slightly among the levels of the hierarchical classification, with NMAE values that range between 0.249 and 0.429 (Table 3). Overall, similar average values of the NMAE were observed for all the hierarchical levels. However, slightly better performance was reached for TreMs CAT-2 (0.270), while Single TreMs was the worst-predicted category (0.294), and the categories TreMs CAT-1 and TreMs SUM showed intermediate values equal to 0.287.
At level 1, although the pattern is not clearly marked, the predictive performance increases with the frequency of TreMs: abundant TreMs such as DE13, GR22, and CV31 are more detectable than less frequent TreMs such as EP31, GR12, and DE11. More precisely, the best results were observed for DE13 and GR22, with NMAE values of 0.249 (SE = 0.067) and 0.259 (SE = 0.102), respectively. On the contrary, EP31 and GR12 showed NMAE values of 0.429 (SE = 0.268) and 0.355 (SE = 0.111), respectively (Table 3). The remaining TreMs presented NMAE values that ranged between 0.265 and 0.289.

Ranking Metrics Contribution in Predicting Microhabitat Abundance
In absolute terms, cov_gap, max, and c03 are the most recurrent ALS metrics, contributing to the prediction of most of the microhabitat types and categories (Figure 3). However, the combination of ALS metrics is not fixed: the selected metrics change with each microhabitat type and category.
Remote Sens. 2020, 12, x FOR PEER REVIEW 8 of 21
For example, at level 1, the most recurrent metrics are cov_gap (for CV31, CV32, DE11, GR11, GR12, and GR22), c03 (for CV31, CV32, DE11, GR11, and OT21), d02 (for DE11, EP33, GR22, and OT21), and kur (for CV31, EP31, EP33, and GR12), while the less recurrent ones are c01, p01, and b90, which were selected only once for the prediction of OT21, GR22, and GR11, respectively. Nevertheless, looking at EP31 (i.e., the best predicted TreMs), the model selected b70, d00, kur, and c04 as contributing ALS metrics in the prediction. Particularly, b70 (the percentage of ALS returns whose heights are below 70% of the maximum tree height, after the subtraction of the height cut-off value) is the most important predictor, contributing to about 60% of the prediction of EP31 (Figure 4). Moreover, a positive relationship between the predictor and EP31 was observed, even if the sigmoid curve indicates that there is a peak that marks the end of the positive trend for higher values of b70.
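To make the definition of b70 concrete, the following minimal Python sketch computes a bicentile from a list of return heights. It reads "after the subtraction of the height cut-off value" as measuring the 70% threshold within the interval between the cut-off and the maximum return height, which is an interpretation rather than a documented formula; the 1.30 m cut-off is the value reported in this paper, and the function name and example heights are hypothetical.

```python
def bicentile(heights, fraction=0.70, cutoff=1.30):
    """Percentage of returns above `cutoff` whose height lies below
    `fraction` of the interval between the cut-off and the maximum height."""
    above = [h for h in heights if h > cutoff]
    if not above:
        return 0.0
    threshold = cutoff + fraction * (max(above) - cutoff)
    below = sum(1 for h in above if h < threshold)
    return 100.0 * below / len(above)


# Hypothetical normalized return heights (m) for one plot.
heights = [0.5, 2.0, 5.0, 10.0, 15.0, 21.3]
b70 = bicentile(heights)
print(b70)
```

In this example the threshold is 1.3 + 0.7 × 20 = 15.3 m, and four of the five returns above the cut-off fall below it, giving b70 = 80%. High values therefore indicate that most returns come from the lower and intermediate canopy layers rather than from the top of the crowns.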
The most recurrent ALS metrics for TreMs CAT-1 prediction are cov_gap (CV3, EP3, GR1, and GR2) and c03 (CV3, CV4, and GR1), while d00 and b70 are never selected. The results show that four predictors (c03, d02, c04, and c01) allow the prediction of CV4. Moreover, a positive trend between CV4 and c03, which is the most important predictor, was observed, even if there is a slightly negative trend with higher values of c03.
The most recurrent ALS metrics for TreMs CAT-2 are max (CV, DE, EP, and GR), c03 (CV, GR, and OT), and cov_gap (CV, DE, and GR), while c04 is the only metric that is never selected.
The results show that GR prediction requires the c03, cov_gap, max, and p01 predictors, of which c03 is the most important predictor.
At level 4, the most recurrent ALS metrics are max and c04, which contributed to predicting both Count TreMs and habitat trees. Conversely, d02, c03, d00, kur, and c01 are those ALS metrics that are not selected by the model for the prediction of TreMs SUM categories. Particularly, max is the most important ALS metric, representing over 70% variable importance for the prediction of habitat trees, showing a negative relationship.

Discussion
In this study, we presented a new approach to predicting the abundance of microhabitats and habitat trees within forests using ALS data. Metrics derived from ALS data have been used to evaluate many forest variables, such as the tree height, canopy density, and canopy stratification [20]. Our approach consisted of relating predictors derived from ALS point clouds to forestry attributes such as the abundance of TreMs using an ML algorithm (i.e., Random forest). The results showed that the Random forest algorithm exhibits good performance in predicting microhabitats belonging to the hierarchical classification. Nevertheless, some challenges are still unsolved due to the high variability among microhabitats, the unknown role played by tree species composition, and, more generally, the role played by forest structure, which is even more marked in mixed and stratified forests. Overall, the results can be considered satisfactory given the difficulties and the effort involved in collecting forest inventory data in multi-layered forests, within which field surveys are also very challenging. The small sample size is the main factor hindering a better calibration of the algorithm. However, the approach can be easily applied at a larger scale, offering great support for monitoring forest biodiversity. ALS data allow the identification of TreMs and habitat trees, fostering the development of predictive maps, which are very useful for assessing forest biodiversity at a large scale. For this reason, the use of remote sensing, particularly ALS data, could represent a cost-effective and powerful tool for the development of suitable ecological indicators based on the abundance of microhabitats.

TreMs Prediction and Forest Structure
As was expected, TreMs are more frequent on dominant trees (see Appendix A), as previously stated by many authors [4][5][6]. Their abundance is affected by forest structure [11,14] and can be balanced through silvicultural interventions retaining or removing large trees in the forests.
Moreover, if, on one hand, forestry can regulate the abundance of TreMs, on the other hand, the lack of silvicultural interventions for a long period and the consequent aging of trees, as in this study area [32,44], fostered the abundance of such TreMs that reflect the naturalness of the forest (e.g., lianas, crown microsoil, branch holes, and dead branches).
Regarding the forest structure characteristics, beyond the tree diameter, which is recognized to be useful for assessing the TreMs abundance [4,5], this study highlights that the tree height and the density of tree crowns also affect the assessment of TreMs abundance, especially with ALS devices. The results highlight that cov_gap, max, and c03 are the most important ALS metrics for predicting TreMs abundance. These metrics describe the inverse of the canopy cover (i.e., the canopy gaps or density), the maximum height of all returns above the height cut-off (i.e., 1.30 m), and the number of returns between 20 and 30 m, respectively (see Appendix B). A well-defined relationship between ALS metrics and the abundance of TreMs is nevertheless missing, probably because of the small sample size. Even so, the study reveals that a few ALS metrics allow the prediction of some individual microhabitats such as dead branches or lianas.
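The three metrics described above can be illustrated with a short Python sketch operating on normalized return heights. It is a simplification under stated assumptions: canopy cover is computed here from all returns above the cut-off, whereas point cloud software may restrict this calculation to first returns, and the half-open 20 to 30 m interval is an assumed convention; the function name and example heights are hypothetical.

```python
def canopy_metrics(heights, cutoff=1.30, band=(20.0, 30.0)):
    """Simplified cov_gap, max, and c03 from normalized return heights (m)."""
    if not heights:
        raise ValueError("no returns in plot")
    above = [h for h in heights if h > cutoff]
    cover = 100.0 * len(above) / len(heights)  # share of returns above the cut-off
    return {
        "cov_gap": 100.0 - cover,              # inverse of the canopy cover
        "max": max(above) if above else 0.0,   # highest return above the cut-off
        "c03": sum(1 for h in heights if band[0] <= h < band[1]),  # returns in 20-30 m
    }


# Hypothetical return heights (m) for one plot.
heights = [0.2, 0.8, 3.0, 12.5, 21.0, 24.7, 29.9, 31.2]
metrics = canopy_metrics(heights)
print(metrics)
```

In this example, six of the eight returns lie above 1.30 m, so cov_gap is 25%; the highest return is at 31.2 m; and three returns fall in the 20 to 30 m layer. Together the three values summarize canopy openness, stand height, and the point density of the upper-intermediate layer.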
An overall negative relationship between tree height and the abundance of both TreMs and habitat trees was observed, even if it became positive for very tall trees. A plausible explanation is that more microhabitats are to be expected on trees with large branches and crown area than on taller trees with a small crown. The ratio between the DBH and crown area of trees could therefore represent an important index for assessing the abundance of TreMs.
The heterogeneity of tree heights (i.e., the vertical distribution of trees and tree crowns), especially in mixed forests, is not an optimal condition for using ALS data to assess TreMs abundance. Usually, ALS allows the production of significant canopy height models, but for our purposes, it is important to have a high-resolution point cloud not only in the top layer but also in the intermediate canopy layers. Low-density point clouds in multi-layered forests might represent a crucial constraint to the detection of trees [45] and other forest inventory variables.
We observed that the predictive accuracy increases for forest stands with low-density crowns, that is, with few leaves in the top layers (i.e., trees with large branches); as a consequence, a denser point cloud is available in the intermediate layers, as shown by the relative variable importance of ALS metrics for predicting CV4 or GR12, which often occur on trees with large diameters and large branches.
The results could be improved significantly by using high-density point clouds, like those produced by Terrestrial Laser Scanners, especially for microhabitats such as bark structures, injuries, woods, cracks, scars, and fungi [29], but this entails higher costs for monitoring activities.

TreMs Hierarchical Prediction
Observing and counting TreMs in the field is very challenging, and the effort is often sustainable only for research purposes rather than for practical management. The outcomes of this study demonstrate that ALS data might support the prediction of TreMs and habitat tree abundance, even if considerable effort is still necessary to improve the accuracy of prediction. Looking at the hierarchical levels, the predictive accuracy differs among levels as well as among the TreMs of the same level. Overall, better results were observed for TreMs CAT-2 (level 3) and for some TreMs at the single level.
At the single level, better predictive accuracy was obtained for TreMs that occur on large trees and often occupy the top and higher canopy layers, such as DE13 (dead branches ø 10-20 cm, ≥50 cm, not sun-exposed), CV31 (branch hole >5 cm), and OT21 (fork formed by branches with formation of microsoil). The predictive performance decreases for less frequent microhabitats, such as EP31 (epiphytic bryophytes, such as mosses, covering more than 25% of the trunk), GR12 (root buttress cavities with entrance >10 cm large), and DE11 (dead branches and limbs ø 10-20 cm, sun-exposed). The TreMs frequency affects the prediction performance, and the algorithm failed in the prediction of TreMs with frequencies lower than 50. For this reason, the prediction of TreMs such as "dendrotelms and water-filled holes" was possible only at level 2, where the single TreMs were grouped at the higher hierarchical level (CV4). In this case too, the high point density of the intermediate-top layers, described by metrics such as c03 (number of returns between 20 and 30 m), impacts the prediction accuracy. The prediction performance increased when moving from the single level to the upper levels of the hierarchical classification, up to TreMs CAT-2. By contrast, the higher variability among the TreMs at level 4 slightly hinders the prediction of Count TreMs and habitat trees. Though this aspect could be partially tackled through a more powerful LiDAR sensor, the variability among microhabitats remains very large and could represent a bottleneck in the use of ALS for microhabitat prediction. However, in this study, the categories that were better detected at level 3 are somewhat linked to veteran trees, such as tree cavities. Furthermore, some characteristics that could be defined as "species-specific" are highlighted, e.g., GR for beech and for European hop-hornbeam, or EP for oak and maples.
Nevertheless, further investigations are necessary to verify the correlation between TreMs occurrence and tree species [46]. Though the performance obtained at level 4, for both habitat trees and Count TreMs, was slightly worse, ALS data allow their detection. The large variability of the TreMs types at this level, due to the cumulative effect, probably affects the accuracy, even if the results can be considered satisfactory. This is an important finding because it demonstrates that ALS data support forest management through the identification and mapping of TreMs abundance, fostering the identification of biodiversity hotspots.

TreMs as Biodiversity Indicators
The increased interest in the assessment of microhabitat abundance at the European level highlights how TreMs provide significant support to forest management, with the final aim of integrating biodiversity conservation with the provision of other ecosystem services. With the recognition of the importance of TreMs and habitat trees for assessing forest biodiversity, the pan-European set of Criteria and Indicators for sustainable forest management can be significantly enriched and improved [47][48][49]. TreMs abundance can be considered a useful biodiversity indicator because TreMs provide refuges for numerous living organisms [1], reflecting the habitat value of the trees and the naturalness (e.g., the lack of silvicultural interventions over time) of forests [50].
The presence of TreMs such as epiphytes, dendrotelms, and water-filled holes indicates the long-term absence of silvicultural interventions, and they can be considered indicators of the aging of forest stands, as is the case for deadwood [51,52]. The abundance of TreMs can be considered a useful variable for the identification of forest functional destination units (e.g., [53]), such as biodiversity hotspots, and for defining forestry strategies that balance biodiversity conservation with the provision of other forest ecosystem services [5,54]. Assessing TreMs abundance can be beneficial not only in old-growth forests but also for supporting forest practitioners in the selection of standards in coppice forests, which are very common in the Apennine mountains and represent about 40% of the Italian forest area [55], as well as for supporting forest management and planning [56,57].
A more user-friendly method for predicting the occurrence and abundance of these ecological structures, based on forest inventory data, is needed to extend their assessment to the country level. In light of this, remote sensing will play an increasingly crucial role in the forest sector.

Conclusions
The detection of microhabitats in forests is an emerging but challenging task for assessing forest biodiversity and for integrating biodiversity conservation into forest management. Remote sensing techniques strongly support the detection of these particular ecological niches, and the topic will continue to attract research interest.
In this study, we demonstrate that assessing the density and the stratification of ALS point clouds allows the prediction of microhabitat abundance. Dense and stratified point clouds, especially in the intermediate and top layers, are associated with the abundance of TreMs.
The frequency and diversity of microhabitat types impact the remote sensing predictive performance. Abundant microhabitats are easier to detect than less frequent microhabitats. The models used in this study highlight that abundance is predicted at all levels of hierarchical classification. However, the best performance was obtained at the Single TreMs level, for some microhabitats, and at the TreMs CAT-2 level.
Aspects of the forest structure, particularly tree height and crown density, are important predictors of TreMs and Habitat Tree abundance from ALS data. The multi-layered nature of uneven-aged forests hinders the detection performance. TreMs that occur in the top layers of the canopy are more detectable than TreMs occurring at the bases of tree trunks, for which a dense point cloud is required.
Further investigations are necessary to distinguish between the role played by forest structure, which can be controlled through forestry, and the natural role played by tree species in the occurrence of TreMs.
In conclusion, ALS allows the prediction of the abundance of TreMs, but assessing their diversity remains challenging. Mapping TreMs abundance supports forest managers in planning activities, such as identifying forest patches or stands with high habitat value. Assessing the habitat value of forests could support the development of biodiversity indicators, such as forest naturalness, an indicator that could affect the social perception of the importance of forest functions and the ecosystem services that are provided.

Acknowledgments: This research was supported by the LIFE program and the Italian Academy of Forest Sciences, in the framework of the project "FRESh LIFE-Demonstrating Remote Sensing integration in sustainable forest management" (LIFE14 ENV/IT/000414). We thank the Carabinieri Forestali of Montedimezzo for the logistic support provided during the field campaign.

Forest Structure and TreMs Frequency According to Field Data
A total of 2380 trees were measured and observed within the 35 plots, with a DBH ranging from 2.5 cm to 104 cm and tree height ranging from 1.4 m to 39.6 m (Figure A1). Except for some outliers, Turkey oak and beech are the species with the largest trees and the highest variability in both tree DBH and tree height, with trees over 30 m in height (Q. cerris, F. sylvatica, F. excelsior, and C. betulus) representing most of the top canopy layer. Notably, Abies alba trees are present only in the understory layer, probably due to the natural expansion of regeneration from the mature pure and mixed stands in the surrounding forest area [58]. The growing stock of the forests ranges between 183 and 647 m³ ha⁻¹, with the number of trees per hectare varying from 454 to 3705. The forest is characterized by a total of 16 tree species, ranging from 3 to 11 species among plots.
Regarding the frequency of TreMs, a total of 2484 TreMs were counted in the study area, on 851 Habitat Trees. The abundance of TreMs is significantly correlated with the dimensions of the trees [5]. TreMs frequency and variability increase with increments in both tree DBH and tree height.

Table A2. ALS metrics description. Metrics for which a variable importance was calculated are in bold and italics and indicated with *.

Metrics | Description
… | Percentile 75 of height distribution
p90 | Percentile 90 of height distribution
P95 | Percentile 95 of height distribution
p99 | Percentile 99 of height distribution
b10 | The percentage of ALS returns whose heights are below 10% of the maximum tree height, after the subtraction of the height cut-off value
b20 | The percentage of ALS returns whose heights are below 20% of the maximum tree height, after the subtraction of the height cut-off value
b30 | The percentage of ALS returns whose heights are below 30% of the maximum tree height, after the subtraction of the height cut-off value
b40 | The percentage of ALS returns whose heights are below 40% of the maximum tree height, after the subtraction of the height cut-off value
b50 | The percentage of ALS returns whose heights are below 50% of the maximum tree height, after the subtraction of the height cut-off value
b60 | The percentage of ALS returns whose heights are below 60% of the maximum tree height, after the subtraction of the height cut-off value
b70* | The percentage of ALS returns whose heights are below 70% of the maximum tree height, after the subtraction of the height cut-off value
b80 | The percentage of ALS returns whose heights are below 80% of the maximum tree height, after the subtraction of the height cut-off value
b90* | The percentage of ALS returns whose heights are below 90% of the maximum tree height, after the subtraction of the height cut-off value
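The height percentiles and the b10 to b90 metrics described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the processing chain used in the study: the cut-off value of 2 m is hypothetical, the cut-off is assumed to be subtracted from every retained return height before the metrics are computed, percentiles use linear interpolation, and the function names and dictionary keys (including `p75`) are chosen here for convenience.

```python
from typing import Dict, List


def percentile(sorted_vals: List[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    if not sorted_vals:
        raise ValueError("percentile() requires at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def als_metrics(heights: List[float], cutoff: float = 2.0) -> Dict[str, float]:
    """Compute height percentiles and b10..b90 for one plot.

    Assumptions: `heights` are normalized return heights in metres;
    returns at or below `cutoff` (hypothetical default 2 m) are dropped
    and the cut-off is subtracted from the remaining heights.
    """
    veg = sorted(h - cutoff for h in heights if h > cutoff)
    if not veg:
        raise ValueError("no returns above the height cut-off")
    h_max = veg[-1]

    metrics = {f"p{q}": percentile(veg, q) for q in (75, 90, 95, 99)}
    for d in range(10, 100, 10):
        threshold = h_max * d / 100.0
        below = sum(1 for h in veg if h < threshold)
        # Percentage of returns lying below d% of the maximum height.
        metrics[f"b{d}"] = 100.0 * below / len(veg)
    return metrics


if __name__ == "__main__":
    # Hypothetical return heights (m) for a single plot.
    example = [0.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
    for name, value in als_metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Under this reading, high values of b70 or b90 indicate that most returns come from below the upper canopy, so the two metrics summarize how the point cloud is distributed along the vertical profile.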