Multispectral LiDAR-Based Estimation of Surface Fuel Load in a Dense Coniferous Forest

: Surface fuel load (SFL) constitutes one of the most signiﬁcant fuel components and is used as an input variable in most ﬁre behavior prediction systems. The aim of the present study was to investigate the potential of discrete-return multispectral Light Detection and Ranging (LiDAR) data to reliably predict SFL in a coniferous forest characterized by dense overstory and complex terrain. In particular, a linear regression analysis workﬂow was employed with the separate and combined use of LiDAR-derived structural and pulse intensity information for the load estimation of the total surface fuels and individual surface fuel types. Following a leave-one-out cross-validation (LOOCV) approach, the models developed from the di ﬀ erent sets of predictor variables were compared in terms of their estimation accuracy. LOOCV indicated that the predictive models produced by the combined use of structural and intensity metrics signiﬁcantly outperformed the models constructed with the individual sets of metrics, exhibiting an explained variance (R 2 ) between 0.59 and 0.71 (relative Root Mean Square Error (RMSE) 19.3–37.6%). Overall, the results of this research showcase that both structural and intensity variables provided by multispectral LiDAR data are signiﬁcant for surface fuel load estimation and can successfully contribute to e ﬀ ective pre-ﬁre management, including ﬁre risk assessment and behavior prediction in case of a ﬁre event.


Introduction
Forest surface fuels are considered the most complex fuel types in fire management and their detailed physical description is required by the majority of fire behavior prediction systems, such as FARSITE, FlamMap, and BEHAVE [1][2][3]. Surface fuels incorporate all live and dead biomass between the ground and ladder fuels, being responsible for surface fire initiation and spread (i.e., litter, herbaceous vegetation, shrubs and trees less than 1.8 m, branches, twigs, and logs) [4][5][6][7][8][9]. One of the most significant surface fuel components is surface fuel load (SFL), which is defined as the dry weight of the surface fuel layer per unit area (kg/m 2 ) [6]. SFL is used as an input variable in almost all fire-related applications, since it significantly influences many aspects of fire behavior, such as fireline intensity and rate of spread [5,6,10,11]. SFL quantification can also contribute to wildland fire prevention, e.g., through treatment of hazardous fuels, maintenance of low-risk forest understory, fuel connectivity estimation, and smoke emissions assessment [6,12,13]. The load of individual surface fuel types (SFTs) is also critical because each type has a generally different effect on fire behavior; therefore, SFTs are incorporated in different surface fuel models. Such models require a numerical As evidenced by the abovementioned studies, LiDAR data do not always provide accurate SFL evaluations despite their three-dimensional properties. They present limitations, such as attenuation by dense foliage and possible point positioning errors, which can hinder near-surface load assessment [53][54][55][56]. Moreover, different sensor characteristics, forest composition, and topography conditions can lead to variations in the laser penetration depth and, hence, the results in terms of the most suitable SFL prediction method and derived accuracy [24,34,57]. The use of full-waveform or discrete return LiDAR data can also define the most appropriate analysis approach to be adopted and its predictive performance. For instance, although full-waveform LiDAR data provide more information compared to the discrete return systems, they are rarely available due to higher cost and/or storage capacity requirements [58,59]. The same applies to the ability of using pulse intensity information, which improves the SFL estimation accuracy but requires its prior normalization, especially in topographically complex terrains [47,60,61]. Nevertheless, this is not always possible since the normalization of LiDAR intensity values requires plane trajectory information, which may be unavailable to the final user [47].
The supplementary capability of multispectral LiDAR in providing intensity information over multiple wavelengths has been successfully employed in a number of forest studies (e.g., Reference [62][63][64][65][66]). Given that multispectral LiDAR is still an emerging technology, current literature mainly deals with land cover and species mapping [62,64,65,[67][68][69][70][71][72] but-to the best of our knowledge-has not yet been used for SFL estimation. Multispectral LiDAR data were used in Reference [66], in order to derive an active normalized burn ratio in a boreal peatland ecosystem. The results suggested that multispectral intensity information is useful in the examination of forest understory. The estimation of canopy fuel parameters was addressed in Reference [73], where a comparative analysis of unispectral and multispectral LiDAR data was performed for the prediction of canopy fuel weight, canopy base height, vegetation height and biomass through regression analysis. The results of this research highlighted the usefulness of multispectral LiDAR compared to the unispectral LiDAR data in the examination of forest fuel parameters. Lopes Queiroz et al. [74] focused on the volume estimation of CWD applying generalized additive models (GAMs) and multispectral LiDAR information yielding high accuracy of R 2 = 0.72.
With respect to the SFL estimation methods, most relevant studies follow a multiple regression analysis approach. Although Jakubowski et al. have stated that complex fuel components, such as fuel load, are best predicted through more sophisticated analysis (e.g., SVM and RF modeling), such machine learning methods require a substantially larger sample size, which is extremely time-consuming and labor-intensive to acquire [75][76][77]. Additionally, the simplicity and higher interpretability of linear models renders multiple regression analysis a fundamental and the most widely used tool for the prediction of forest parameters-including SFL-using LiDAR-derived variables [77,78].
In order to address the literature gap in the use of multispectral LiDAR and SFL estimation, the aim of the present study was to investigate the potential of high-density, discrete return multispectral LiDAR data in estimating the SFL of an uneven-aged structured hybrid fir forest (Abies borisii regis), characterized by dense canopy and topographically complex terrain. We employed multiple linear regression analysis to examine the contributions of structural and intensity variables obtained from multispectral LiDAR to the reliable prediction of total surface fuel loads and individual SFTs.

Study Area
The area of interest is the Pertouli University Forest, which has been managed by the Aristotle University of Thessaloniki since 1934 for research and educational purposes. It is located in the Trikala Prefecture of Central Greece (latitude 39 • 32 -39 •  climate is characterized as transitional, namely Mediterranean-Mid-European, with cold, rainy winters and warm, dry summers. Average annual precipitation reaches 1472.3 mm, 307.7 mm of which fall from May to September. The geological ground of the area is mostly flysch and limestones with Cambisol soil type. The entire area covers 3296.59 ha, 2427.62 ha of which is forest or partly forested, predominantly consisting of pure natural hybrid Abies borisii regis (Abies alba Mill. X Abies cephalonica Loud.) stands. The structure of the forest is uneven-aged due to the shade-tolerant properties of the species. In particular, young regeneration is able to survive under the reduced light intensity of the understory for a prolonged period of time until light availability and growing space conditions become appropriate for further development.
Due to the fire-resistance property of the Abies species covering a substantial part of the forest, the fires that occurred in the study area in the past few decades were mainly caused by lightnings and human negligence. Nonetheless, given the increased tourism activity of the last few years and the presence of fire-prone pines plantations (Pinus nigra and Pinus sylvestris), the entire forest, especially the areas surrounding the plantations, requires particular caution in the summer season. In addition, according to Reference [79], climate change has resulted in an increased risk of fire in areas previously non-susceptible to fire (e.g., Northern European countries, high-altitude areas). Taking all the above under consideration, the need for efficient fire prevention planning and management in such forests has increased constantly.
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 23 The entire area covers 3296.59 ha, 2427.62 ha of which is forest or partly forested, predominantly consisting of pure natural hybrid Abies borisii regis (Abies alba Mill. X Abies cephalonica Loud.) stands. The structure of the forest is uneven-aged due to the shade-tolerant properties of the species. In particular, young regeneration is able to survive under the reduced light intensity of the understory for a prolonged period of time until light availability and growing space conditions become appropriate for further development.
Due to the fire-resistance property of the Abies species covering a substantial part of the forest, the fires that occurred in the study area in the past few decades were mainly caused by lightnings and human negligence. Nonetheless, given the increased tourism activity of the last few years and the presence of fire-prone pines plantations (Pinus nigra and Pinus sylvestris), the entire forest, especially the areas surrounding the plantations, requires particular caution in the summer season. In addition, according to Reference [79], climate change has resulted in an increased risk of fire in areas previously non-susceptible to fire (e.g., Northern European countries, high-altitude areas). Taking all the above under consideration, the need for efficient fire prevention planning and management in such forests has increased constantly.

Airborne LiDAR Data
Multispectral LiDAR data were acquired in October 2018 over the entire area of the Pertouli University Forest. The data collection was performed using a RIEGL VQ-1560i-DW laser scanner sensor (Horn, Austria) mounted on an airplane, which flew at an average altitude of 2243 m above the mountainous terrain. The sensor operated at a pulse repetition rate of up to 1 MHz and a

Airborne LiDAR Data
Multispectral LiDAR data were acquired in October 2018 over the entire area of the Pertouli University Forest. The data collection was performed using a RIEGL VQ-1560i-DW laser scanner sensor (Horn, Austria) mounted on an airplane, which flew at an average altitude of 2243 m above the mountainous terrain. The sensor operated at a pulse repetition rate of up to 1 MHz and a maximum scan angle of ±32 • . Information derived from two laser channels of different wavelengths were provided, namely the Green (GR) and Near Infrared (NIR) (532 nm and 1064 nm, respectively). Detailed sensor specifications can be found at the manufacturer's website (http: //www.riegl.com/nc/products/airborne-scanning/produktdetail/product/scanner/55/).

Field Survey
Field measurements were conducted in the beginning of November 2018 and the end of May 2019. Based on the study area's climate, vegetation conditions at the time of the two field surveys were constant. Thus, the temporal gap between LiDAR data acquisition and the field work of May is negligible.
A destructive fuel sampling method was applied to a set of fixed area plots. Fixed area plots offer an equal probability sampling and an improved representation of the actual SFL variation [18]. A plot of a fixed area and shape is used, within which all surface fuels are measured using a fuel load measurement method of choice [18]. The destructive measurement approach was employed including the collection, drying, and weighting of the fuel material. It is considered one of the most reliable fuel load measurement methods, although mostly used for research rather than operational inventory purposes due to its time-consuming and cost-inefficient nature [16,17].
Surface fuels were collected from 35 sample plots (Figure 1), the locations of which were recorded using a handheld GPS with an average horizontal positional accuracy of 3 m. Each plot covered an area of 1000 m 2 (rectangles with dimensions 40 × 25 m) with pure Abies borisii regis overstory. The selection of the specific plot dimensions was based on the official sampling method adopted in the framework of the Pertouli University Forest management plan development. The distribution of plots over the entire study area was based on different topography and vegetation density conditions. Homogeneity of the surface fuels across each plot was an additional criterion that had to be met in order to include it in the sample dataset. Areas which were inaccessible and/or subject to human or natural disturbances between the time period of the LiDAR data acquisition and field measurements were not included in the sampling process.
Although destructive sampling is considered one of the most accurate direct methods for quantifying surface fuel load on the field, it is extremely costly and time-consuming, especially when the area to be sampled is large. Given the impracticality of measuring each and every fuel particle over an area of 1000 m 2 , subsampling surface fuels and summarizing the respective measurements (i.e., averaging the measured weights of the subplots) is a commonly adopted method for the reliable description of surface fuel load [6,17,80]. Therefore, within each plot, five subplots of 1 m 2 were randomly placed and the included surface fuels were collected, categorized according to type, and stored in separate paper bags. Given the vegetation understory conditions, the number and extent of the subplots were considered sufficient to capture all spatial variation encountered within each 1000 m 2 plot and ensure representativeness of the plot as a whole.
The sampled SFTs included litter, grass and forbs, shrubs, low regeneration, and 1-h (<0.64 cm diameter), 10-h (0.65-2.54 cm diameter) and 100-h (2.55-7.6 cm diameter) downed woody debris. The samples were dried in 80 • C for 72 h and weighed [81]. The load of each surface fuel type (SFTL) was calculated as the arithmetic mean of the respective load values measured at the five subplots of each plot (Table 1). Accordingly, the total surface fuel load (TSFL) per plot area was derived from Remote Sens. 2020, 12, 3333 6 of 23 the sum of all SFL values divided by five (i.e., the number of 1 m 2 subplots measured in each plot) ( Table 1). All load measures were expressed in kg/m 2 .

LiDAR Data Preprocessing
Two discrete-return LiDAR datasets were provided by the contractor, namely GEOSYSTEMS HELLAS S.A. (https://www.geosystems-hellas.gr). The first dataset included the raw point cloud, and the second was the already preprocessed one derived by the former dataset. Prior to delivery, the raw LiDAR waveform data had been transformed to discrete returns preserving a maximum of seven returns per transmitted pulse. Further preprocessing of the point cloud for the generation of the second delivered dataset included georeferencing and correction for atmospheric and noise effects. Subsequently, the two channels were merged without retaining the information of the channel that each point belonged to, any duplicated points were removed, and the remaining ones were classified into six classes (i.e., ground, low, medium, and high vegetation, low points-noise). The final point cloud was characterized by an average point density of 82.99 points/m 2 and a pulse spacing of 0.13 m. The higher point density of the final product was generated by merging the two channel point clouds and the overlap between adjacent strips, which ensured a spatial continuous coverage of the study area.
Prior to SFL estimation, we processed the dataset further. The point cloud covering each plot was extracted and height-normalized using the k-nearest neighbor approach with an inverse-distance weighting (kNN-IDW) [82]. Overlapping points derived from multiple flightlines were subsequently eliminated and the resulting point clouds were used for the extraction of last-of-many (lm), first-of-many (fm), and single (s) returns. Finally, a suite of structural descriptive statistics (hereafter referred to as height metrics) for the lm, fm, s, and all returns was calculated using LASTools software and rLiDAR package implemented in R [83,84]. In total, 49 height metrics were computed, including percentiles and bincentiles at relative heights and height value statistics ( Table 2).
Considering the mountainous terrain of the study area and the different reflectance properties of the forest canopy along its vertical profile, a large intensity range variability was expected. Hence, the computation of the intensity descriptive statistics (hereafter referred as intensity metrics) required a prior intensity range normalization. This process was performed on the raw LiDAR data since, as mentioned above, the channels of the already preprocessed point cloud had been merged by the operator without providing the channel information for each point. Table 2. The LiDAR-derived height distribution metrics calculated for the surface fuel load (SFL) estimation. All described metrics were computed for the first-of-many (fm), last-of-many (lm), single (s), and all echoes. According to Reference [85], the intensity differences in pairs of intermediate returns are mostly unrelated to range and, hence, an insignificant improvement of the intensity measurements consistency is expected to be achieved by range normalization. On the contrary, the authors state that range normalization of s or fm returns over ragged terrain is more likely to improve the reliability of intensity measurements in forest applications. Nevertheless, since fm echoes are weaker and located lower in the canopy compared to the single echoes, their intensity differences are mostly related to the high heterogeneity of the variable vegetation materials (i.e., foliage and wood) rather than range [85]. Taking all the above under consideration, range normalization and further analysis of only the single-return intensity values was performed in the present study.

LiDAR Height Distribution Metrics Definition
More specifically, range intensity normalization was performed to the s returns of both GR and NIR channels as follows [85][86][87][88][89]: where I norm is the normalized intensity, I original the observed intensity, R re f a reference range specified by the user, R the actual range (i.e., distance) between a return and the sensor, and f a calibration parameter which can have values from 2 to 4 depending on the target and the return type. In forested areas, suitable values for f range between 2 and 2.5 [90]. Thus, R re f was set to 2198 based on the average flying altitude, and, given the coniferous forest, and the s returns to be calibrated, the f parameter was set to 2.
The intensity normalized single-return point cloud covering each plot was extracted from both channels and height-normalized according to the abovementioned method. The point cloud of each plot was vertically divided into two parts: below 1-m height (ground and near-surface s returns) and above or equal to 1-m height (canopy s returns). The division of the point cloud was performed for detailed examination purposes. The 1-m threshold was selected based on the knowledge of the understory vegetation conditions and the average canopy base height (CBH) characterizing the sample plots. More specifically, CBH is usually higher than 1.5 m across the study area, whereas, in case of a low CBH (i.e., <3 m), there is a high chance of the understory vegetation to be very close or even overlap the canopy layer rendering their discrimination impossible. Consequently, after examining multiple height thresholds, 1 m was selected in order to assure that no canopy layers are considered as near-surface ones. It should be noted that this threshold was considered suitable for our study area and the specific surface fuel types that were analyzed. Therefore, different understory and overstory conditions (e.g., SFT, height, CBH) would require the examination and application of different height thresholds. In total, 28 intensity metrics were calculated using the ground/near-surface, canopy, and the entire single-return point cloud for each plot and both GR and NIR channels (Table 3). Table 3. The LiDAR-derived intensity distribution metrics calculated from GR and NIR channels for surface fuel load (SFL) estimation. All described metrics were computed only for the ground/near-surface (below 1-m height), canopy (at or above 1-m height), and all single (s) returns.

Surface Fuel Load Estimation
Multiple linear regression analysis was applied using the calculated height distribution and single-return intensity metrics (Tables 2 and 3)-separately and combined-for the construction of fifteen models that estimate the TSFL and SFTLs (i.e., litter, grass and forbs, 1-h, and 10-h fuels load) (i.e., one model for TSFL and each SFTL using structural and intensity metrics separately and together). Although shrubs, 100-h fuels, and regeneration were measured during field work, they were not present in all sample plots (i.e., the load of these SFTs in several plots was zero). As a result, the reduced sample size for these SFTs was inadequate for the construction of a reliable prediction model with sufficient variance explanation rate and no overfitting. Therefore, the specific SFTs were eliminated Remote Sens. 2020, 12, 3333 9 of 23 from further analysis. The TSFL, however, was estimated as the sum of all SFTs, including shrubs, 100-h fuels, and regeneration where present.
Due to the large number of the LiDAR-derived metrics to be considered as potential predictive variables, the selection of the most significant ones in terms of correlation with SFL and predictability was performed through the application of the simulated annealing (SA) algorithm [91,92]. SA constitutes an effective and widely used search algorithm for global optimization problems, which, in case of variable selection, tries to minimize relative RMSE (%) by repeatedly solving the predictive model (in our case the linear regression model) [91].
The selection of the most appropriate predictor variables by SA requires their prior quantity indication by the user. Following the general rule of thumb of one predictor per ten observations [93], the potential of using either three or four variables was explored. The inclusion of more predictors in the model would increase the model complexity and, hence, the risk of overfitting the model to the training data. Nevertheless, and in order to further support our choice, we also performed an additional experiment building regression models with an increasing number of predictor variables ranging from one to ten (including both structural and intensity metrics), which is presented in Section 4 below.
Prior to SA, transformations were applied to the field-measured SFL values (log and square root) and all LiDAR-derived metrics (log, square root, and square) in order to account for potential non-linear relationships between the response and predictive variables. Subsequently, the methodology proposed by [92] was implemented in the free and open-source R software, compiled into an automated workflow.

Cross-Validation
All generated models were cross-validated using the leave-one-out cross validation (LOOCV) approach. The selection of the most reliable regression model for total surface fuels and each SFT was based on four standard goodness-of-fit metrics, namely the coefficient of determination (R 2 ) (Equation (2)), Root Mean Square Error (RMSE) (Equation (3)), relative Root Mean Square Error (rRMSE) (Equation (4)), and relative bias (rbias) (Equation (5)): where y i is the measured SFL for plot i,ŷ i is the predicted SFL for plot i, y is the mean measured SFL, and n the number of observations (i.e., the number of sample plots). In case of log or square root transformation of the response variable in the model, the residual variance (Equation (6)) was calculated for bias correction (Equation (7) for log and Equation (8) for square root transformation) of the actual predicted SFL values.
where σ 2 is the residual variance, y i is the measured SFL for plot i,ŷ i is the predicted SFL for plot i, X i is the actual predicted SFL value for plot i, and n is the number of observations (i.e., the number of sample plots).

Results
The implementation of SA variable selection and linear regression analysis resulted in the generation of different models-including three or four independent variables-for SFTL and TSFL prediction, among which the ones with the highest predictive accuracy were eventually selected.
The computed goodness-of-fit statistics (described in Section 3.3) indicated that the use of four predictors in the regression model, instead of three, led to more reliable models in terms of accuracy. Therefore, four predictors participated in all selected regression models presented below.
The selection of the appropriate number of predictor variables to be employed in the models was based on the general rule of thumb of one predictor per ten observations. In order to further support this choice, we devised an experiment where the regression models were built using an increasing number of predictors ranging from one to ten (including both structural and intensity metrics). Figure 3 illustrates an example of the LOOCV results (i.e., R 2 and rRMSE) for the TSFL models. More specifically, Figure 3a shows the increase in the R 2 of the models resulting from the sequential addition of one independent variable while Figure 3b depicts the training and validation rRMSE of the respective models highlighting the change in their numerical difference according to the participating number of predictors. The inclusion of additional predictor variables increases the validation R 2 (up to 0.9). However, the difference between training and validation rRMSE also increases, which is an indication of possible overfitting. The models' accuracy still increases for more than four predictors but at a slower pace than for less than four predictors. Similar results were obtained for the regression models constructed for the individual SFTs and, therefore, they are not presented here for the sake of brevity. Hence, our decision to select four predictors for the final models balances between the rule of thumb of approximately 10 samples per 1 predictor and the experimental observation that the highest rate of accuracy increase (relatively to the simplest model with a single input variable) is achieved with four predictors.
Tables 4-6 present the goodness-of-fit statistics derived from the LOOCV process for the models developed in the present study (i.e., produced using only structural, only intensity metrics and both) Table 7 includes the predictor variables selected by SA in the most accurate models, as indicated by LOOCV. The scatterplots of the observed versus predicted load values and the residuals plots for the latter regression models (Table 7) are illustrated in Figures 2 and 3, respectively.
The goodness-of-fit statistics presented in Table 4 showcase that R 2 values ranging from 0.44 to 0.58 (26.79-48.56% rRMSE) could be achieved by the models developed with the use of only structural metrics for the TSFL, litter, 1-h and 10-h fuels load. Grass load was the most accurately estimated variable in this case with the respective model yielding an R 2 of 0.71 and 36.61% rRMSE. In the case of using only intensity metrics as predictor variables, the performance of the developed models was lower with R 2 values of 0.39-0.55 (23.84-50.36% rRMSE) ( Table 5). Table 6 indicates that the use of both structural and intensity metrics resulted in models with higher estimation performance with R 2 values of 0.59-0.71. In particular, the TSFL and grass/forbs models were the most accurate, both reaching an R 2 of 0.71 and rRMSE of 19.34% and 36.33%, respectively. Litter load was predicted less accurately with 0.69 R 2 and 37.59% rRMSE. The models with the lowest predictive accuracy were the ones of FWD (i.e., 1-h and 10-h fuels). In particular, 59% of the 1-h fuels load variance was explained with an rRMSE of 28.89% while the cross-validation of the 10-h fuels model revealed an R 2 of 0.6 and 31.26% rRMSE. All models presented negative rbiases ranging between −1.4 and −0.26 indicating an overestimation of the field-measured load values.
A comparison of the models developed by the different sets of LiDAR metrics (Tables 4-6) revealed the need for synergy between the height distribution and intensity metrics for reliable SFL prediction except for the case of grass/forbs load estimation, in which, based on the results presented in Tables 4  and 6, the use of only height distribution metrics is adequate. Table 7 showcases that metrics related to both lower and upper fuel layers were used in the models, which indicates that LiDAR data can directly interact with the surface fuel layers and indirectly associate the SFL with the canopy characteristics of the study area, as well. In addition, GR and NIR channels are used-either separately or together-in the estimation of all SFTLs, except for the grass/forbs load. metrics). Figure 3 illustrates an example of the LOOCV results (i.e., R 2 and rRMSE) for the TSFL models. More specifically, Figure 3a shows the increase in the R 2 of the models resulting from the sequential addition of one independent variable while Figure 3b depicts the training and validation rRMSE of the respective models highlighting the change in their numerical difference according to the participating number of predictors. The inclusion of additional predictor variables increases the validation R 2 (up to 0.9). However, the difference between training and validation rRMSE also increases, which is an indication of possible overfitting. The models' accuracy still increases for more than four predictors but at a slower pace than for less than four predictors. Similar results were obtained for the regression models constructed for the individual SFTs and, therefore, they are not presented here for the sake of brevity. Hence, our decision to select four predictors for the final models balances between the rule of thumb of approximately 10 samples per 1 predictor and the experimental observation that the highest rate of accuracy increase (relatively to the simplest model with a single input variable) is achieved with four predictors.
(a) (b) Figure 3. (a) Coefficient of determination (R 2 ) of the regression models for total surface fuel load (TSFL) estimation derived from the leave-one-out cross-validation (LOOCV) process versus the different numbers of predictor variables (i.e., LiDAR-derived structural and intensity metrics) for total surface fuel load (TSFL), (b) relative root mean squared error (rRMSE) of the regression models for TSFL estimation derived from the models' training (blue line) and LOOCV process (orange line) versus the different numbers of predictor variables. Table 4. The goodness-of-fit statistical measures obtained from the leave-one-out cross validation (LOOCV) process and employed transformations of the response variables in each developed regression model including the LiDAR height distribution metrics for the estimation of the total and surface fuel types' load (TSFL and load of each surface fuel type (SFTL), respectively). TSFL includes the load of all surface fuel types (SFTs) sampled in each plots (i.e., litter, grass and forbs, shrubs, regeneration, 1-h, 10-h, and 100-h fuels).   Table 6. The goodness-of-fit statistical measures obtained from the leave-one-out cross validation (LOOCV) process and employed transformations of the response variables in each developed regression model produced by the combined use of structural and intensity metrics for the estimation of the total and surface fuel types' load (TSFL and SFTL, respectively). TSFL includes the load of all SFTs sampled in each plots (i.e., litter, grass and forbs, shrubs, regeneration, 1-h, 10-h, and 100-h fuels).  Table 7. Selected independent variables participating in the regression models for the estimation of the total and surface fuel types' (SFT) load using LiDAR-derived height and intensity distribution descriptive statistics. The total surface fuel load (TSFL) includes the load of all SFTs sampled in each plot (i.e., litter, grass and forbs, shrubs, regeneration, 1-h, 10-h, and 100-h fuels). The definition of the independent variables' names can be found in Table 3, while the lm, fm, s, gr, can suffices refer to the last-of-many, first-of-many, single, single ground/near-surface (i.e., below 1m height), and single canopy returns (i.e., at or above 1 m height), respectively. The absence of these suffices indicates that the respective metric was calculated for the entire point cloud. GR and NIR suffices are included only in the intensity metrics of single returns (only single returns were range normalized) and refer to the Green and Near-Infrared channels. The predictor variables are presented with transformations (logarithmic-log, square root-sqrt and square), as employed in each model. A comparison of the aforementioned variations explained by the most accurately developed models (Table 6) can be also performed by an examination of the observed versus predicted plots depicted in Figure 4. The models' fits to the field-measured load values are similar with the hypothetical 1:1 relationship (Figure 4). Regarding the residual distributions presented in Figure 5, a fair homogeneity can be observed without displaying any strong distribution patterns or obvious systematic trends.

Discussion
In the present study, SA and multiple linear regression analysis were implemented for the prediction of TSFL and SFTL (i.e., litter, grass/forbs, 1-h, and 10-h fuels load) using high-density multispectral LiDAR data in a dense forest over sloped terrain. LiDAR-derived height and single-return intensity distribution metrics were used individually and combined as potential independent variables for the construction of fifteen regression models (with four predictors each). Following the LOOCV approach, the goodness-of-fit statistics showcased that the incorporation of LiDAR intensity information to the regression analysis workflow leads to the construction of models with higher estimation performance compared to the ones including only structural metrics. Overall, the results indicate a significant contribution of the multispectral LiDAR data to reliably predict SFL even in dense canopy and topographically complex conditions.
With respect to the selection of the number of independent variables in the models, the use of more predictors will most likely lead to increased accuracy in terms of explained variance. Nonetheless, the risk of overfitting the regression models increases when more independent variables are included. A number of rules of thumb has been proposed in the literature for the estimation of a minimum acceptable sample size that are based on various factors, such as the nature and structure of the data, number of predictors to be used, and complexity of the predictive model to be constructed [93][94][95][96][97].
Given that a large sample size requires extreme workload and is extremely costly, we collected 35 samples from our study area. Hence, we examined three and four predictors in the model following the general rule of thumb of 10 samples per 1 predictor [93]. Yet, different number of predictor variables were also tested (see Figure 3 and the associated text in Section 4), which showcased that the highest rate of accuracy increase was exhibited up to four variables, after which the difference between training and validation error started increasing. Our suggestion is, therefore, to respect the 10 samples per predictor empirical rule. Eventually, the more ground truth samples are available, the better the control against overfitting a modeler has.
As for the results of the regression analysis, the load variation explained by the models employing structural or intensity metrics alone was 0.44-0.71 (26.79-48.56% rRMSE) and 0.39-0.55 (23.84-50.36% rRMSE), respectively. The 0.71 R 2 was actually achieved by the grass/forbs model while the models for the other SFTs revealed R 2 values lower than 0.6. The incorporation of the pooled metrics in the SA process resulted in more accurate results, based on the load variation explained by the regression models which ranged from 0.59 to 0.71 with rRMSE between 19.34% and 37.59% ( Table 6). The loads of grass/forbs and total surface fuels were mostly accurately predicted (R 2 = 0.71), whereas the FWD (i.e., 1-h and 10-h fuels) load was estimated with the lowest accuracy (R 2 = 0.59 and 0.6, respectively). The low predictive performance of the FWD models can be attributed to the fact that the amount of downed woody debris in a forested area is not only affected by stand characteristics but weather conditions (i.e., strong winds or heavy snow), natural debranching and presence of logging operations (i.e., residues), as well [98,99], information which is not possible to retrieve from LiDAR data. The similar estimation performances of two of the developed grass/forbs models (Tables 4 and 6) is attributed to the fact that although different pools of predictor variables were used in the SA process (i.e., only structural in the first and all metrics in the second pool), intensity metrics were not selected in either case. For the sake of conciseness, the following part of the discussion is focused on the regression models incorporating both the structural and intensity metrics since these are the ones that yielded the highest estimation accuracy.
The LiDAR-derived variables ( Table 7) that were found to be mostly correlated with the reference data (i.e., the field-measured SFL values) were I90TH grNIR for TSFL, I90TH grNIR 2 for litter load, logbb05 lm for grass/forbs load, log H75TH for 1-h fuels and Isd canNIR 2 for 10-h fuels load, presenting Pearson's correlation coefficients of 0.32, 0.28, 0.62, 0.27, and 0.30, respectively. An examination of all independent variables included in the models led to the observation that the first three regression models (i.e., the ones predicting TSFL, litter, and grass/forbs load) incorporated metrics related to the lower part of the canopy (i.e., metrics for the lm returns, low height percentiles, and s returns below 1-m height) indicating that the LiDAR data were able to interact with the surface fuel layers in a direct manner. On the contrary, variables describing the upper canopy (i.e., metrics for the fm returns, high height percentiles, and canopy s returns) were mostly used for the 1-h and 10-h fuels load estimation, revealing the ability of LiDAR to indirectly associate the FWD load with the canopy characteristics of the study area. This can be considered a rather logical association, since stand density and height may indicate the amount of the living tree branches available for falling on the ground, which in turn is linked with the downed branches load. All regression models included GR and/or NIR channel intensity metrics, except for the grass/forbs one, where a bincentile metric, namely the b05 lm , was selected instead. In our study area, grass and forbs were mostly located in forest openings; thus, the respective load was higher in plots with lower canopy cover and density. Given that canopy cover and density had a significant impact on the amount of grass and forbs per plot area, the use of height percentile and bincentile metrics for the fm and lm returns is expected for describing the specific SFT, since they encompass the description of both the upper and lower canopy conditions. All of the remaining SFTs (i.e., litter, 1-h fuels, and 10-h fuels) are downed. Hence, intensity information-which is directly associated with the surface reflectivity of the target-was required in order to retrieve information related to the load of the different downed fuel types by correlating the LiDAR-derived intensity metrics with the respective reference data. In these cases, the selection of the single-return intensity information of the NIR channel is reasonable given that green vegetation strongly reflects energy in the NIR spectrum and can be employed for the discrimination of living and dead surface fuel materials. On the contrary, the GR channel has substantially lower reflectance from vegetation compared to NIR and its larger beam divergence can decrease the number of registered canopy echoes [63]. Therefore, the GR channel information can be considered useful in the examination of the below-canopy forest environments.
With respect to the LiDAR multispectral information, as mentioned in the first paragraph of Section 3.1, the contractor of the LiDAR acquisition provided us with the merged point cloud from the two channels, without maintaining the information about which channel each point was derived from. Intensity measures were derived from the raw discrete-return LiDAR data (provided separately), but we did not have access to the proprietary software employed for correcting atmospheric and noise effects; thus, we could not regenerate the single-channel point clouds. Therefore, the contribution of each channel to the final models cannot be proved experimentally at the present time in our case, but there is at least a strong evidence from the results of Reference [73] mentioned in Section 1. Nevertheless, the existence of a second channel effectively doubles the point cloud density (reaching to an average of 82.99 points/m 2 ), which is highly expected to increase the surface fuel estimation potential, even if that cannot be proven experimentally.
Upon further examination of the selected independent variables, it was observed that nearly all predictors were log, square root or square transformed indicating non-linear relationships with the reference data. Similarly, given the parametric nature of linear regression analysis, transformations of the response variables were also necessary to be involved in order to examine potential nonlinearity with the predictor variables and finally meet the assumptions of the linear regression model. Thus, the logarithmic and square root transformations of the response variables were selected in all load prediction cases to achieve variance homogeneity and normality of the residual plots.
Even though a direct comparison of regression analysis results derived from other studies requires particular caution, it is worth noting that more variation in the load of different surface fuel layers was explained in this work compared to most of the other reported prediction accuracies described in Section 1. Possible reasons may encompass the use of higher-density data (e.g., Reference [47,50]), as well as the different forest composition and topography characterizing other examined areas, which affect the canopy penetration capabilities of the laser [57]. The use of pulse intensity-related variables also had a substantial influence on the results' accuracy since the point cloud structural information (i.e., height metrics) alone is not adequate for reliable load estimation of the different SFTs [47,100]. However, intensity range normalization requires specific sensor-and flight-related information (e.g., smoothed best estimate of plane trajectory) which are not always available to the final user. Plot sizes also differ among studies. The 1000 m 2 sample plot area selected in our work provided an additional advantage by avoiding inherent variability and the subsequent variation of the results due to possible spatial misplacement of the rectangular plots caused by GPS measurement inaccuracies. Moreover, the estimation accuracy is strongly influenced by the employed fuel sampling method (e.g., planar intersect, photoload, destructive) and possible data cleaning.
Contrary to the above mentioned studies, the research of Chen et al. (2017) for SFL estimation in Australia yielded significantly higher prediction accuracy (R 2 of up to 0.89) compared to our work [48]. The authors examined the potential of year since last fire and surface fuel depth variables using both airborne and terrestrial LiDAR data. Except for the abovementioned reasons, the substantial outperformance of these results compared to ours can be also attributed to the available LiDAR data type. In particular, the type of LiDAR instrument being employed can substantially influence the results since each one provides different level of information. While the airborne LiDAR collects data mostly from the upper canopy layers, terrestrial LiDAR can provide much more detailed information from the lower stand layers [101].
Despite the fact that this research has reached its aims, there were some unavoidable limitations. First of all, although the distribution of the sample plots across the study area was performed in the most representative manner possible, most related studies recommend using at least 40 samples in regression analysis and, therefore, it is possible that the limited sample size has caused additional uncertainty in the predictions [102]. Furthermore, in the absence of shrubs, 100-h fuels, and regeneration from some plots, their load estimation was impossible to be performed because the remaining plot number (i.e., where these SFTs were present) was not sufficient for conducting a robust regression analysis. Sampling plots of smaller dimensions (<1000 m 2 ) would be able to decrease the workload probably leading to an increased sample size, which could be an effective solution to both aforementioned limitations. Nevertheless, as in the majority of related studies that employ a rather limited sample size and given the complex forest structure and steep topography, the importance that the samples capture the SFL variability across the area was considered a priority [75].
Finally, the predictive models constructed in the framework of this research cannot be successfully transferred to other forest ecosystems. Although the models showed a reliable predictive performance in our study area, difference in vegetation composition, topographic conditions, and sensor characteristics would cause diverse relationships between the response and LiDAR-derived predictive variables, significantly influencing the results. SFL estimation in another forested area should encompass new reference data collection and development of different regression models. Yet, we argue that the employed methodology offers a robust framework for estimating SFL from LiDAR data, even in dense coniferous forests with complex topography and understory composition.

Conclusions
The present research focused on the estimation of SFL in a dense coniferous forest using high-density multispectral LiDAR data. In particular, the TSFL and individual SFTLs were predicted through SA and a regression analysis workflow. LiDAR-derived height and single-return intensity distribution metrics were used individually and combined as potential independent variables for the construction of fifteen regression models (with four predictors each). LOOCV process was applied for the validation and comparison of the developed models in terms of accuracy. LOOCV results suggest the contribution of multispectral LiDAR information in reliably predicting SFL in our study area which is characterized by dense canopy and steep topography. More specifically: • The use of either structural or intensity metrics alone results in suboptimal accuracy compared to their synergetic use, except for the case of grass/forbs load that can be reliably estimated using only structural metrics.

•
The incorporation of intensity metrics derived from both GR and NIR channels in one of the developed regression models provides some evidence in favor of the contribution of LiDAR multispectral information to estimate SFL.
• TSFL and the grass/forbs load was predicted with the highest accuracy. Cross-validation of the respective regression models revealed their high performance of 0.71 R 2 with rRMSE of 19.34% for TSFL and 36.33% for grass/forbs load prediction. • FWD models are characterized by the weakest performance with R 2 reaching 0.59 and 0.6 for 1-h and 10-h fuels' load estimation, respectively. The lower accuracy compared to the rest of the models can be attributed to the additional effect that extreme weather conditions, logging activities and natural debranching have on the amount of FWD on the forest surface.
Taking all into account, the results suggest the capability of multispectral LiDAR to successfully contribute to effective pre-fire management (i.e., fire risk assessment and behavior prediction in case of potential fire event) through reliable load quantification of forest surface fuels (either total or individual SFTs). Given the limitations of the study described in Section 5, it would be interesting to examine the potential of multispectral LiDAR in SFL prediction with the use of increased sample plots in order to validate the results. Further research could also focus on the use of different LiDAR instruments, such as terrestrial LiDAR, which would provide additional information about the forest understory, or spaceborne LiDAR, which constitutes a rather cost-efficient source of three-dimensional data.