Estimating Stand and Fire-Related Surface and Canopy Fuel Variables in Pine Stands Using Low-Density Airborne and Single-Scan Terrestrial Laser Scanning Data

: In this study, we used data from a thinning trial conducted on 34 different sites and 102 sample plots established in pure and even-aged Pinus radiata and Pinus pinaster stands, to test the potential use of low-density airborne laser scanning (ALS) metrics and terrestrial laser scanning (TLS) metrics to provide accurate estimates of variables related to surface and canopy ﬁres. An exhaustive ﬁeld inventory was carried out in each plot to estimate the main stand variables and the main variables related to ﬁre hazard: surface fuel loads by layers, fuel strata gap, surface fuel height, stand mean height, canopy base height, canopy fuel load and canopy bulk density. In addition, the point clouds from low-density ALS and single-scan TLS of each sample plot were used to calculate metrics related to the vertical and horizontal distribution of forest fuels. The comparative performance of the following three non-parametric machine learning techniques used to estimate the main stand- and ﬁre-related variables from those metrics was evaluated: (i) multivariate adaptive regression splines (MARS), (ii) support vector machine (SVM), and (iii) random forest (RF). The selection of the best modeling approach was based on a comparison of the root mean square error (RMSE), obtained by optimizing the parameters of each technique and performing cross-validation. Overall, the best results were obtained with the MARS techniques for data from both sensors. The TLS data provided the best results for variables associated with the internal characteristics of canopy structure and understory fuel but were less reliable for estimating variables associated with the upper canopy, due to occlusion by mid-canopy foliage. The combination of ALS and TLS metrics improved the accuracy of estimates for all variables analyzed, except the height and the biomass of the understory shrubs. The variability demonstrated by the combined use of both types of metrics ranged from 43.11% for the biomass of duff litter layers to 94.25% for dominant height. The results suggest that the combination of machine learning techniques and metrics derived from low-density ALS data, drawn from a single-scan TLS or a combination of both metrics, may represent a promising alternative to traditional ﬁeld inventories for obtaining valuable information about surface and canopy fuel variables at large scales. Author Contributions: Conceptualization, C.A.-R. and A.D.R.-G.; methodology, C.A.-R., S.A.-P., J.G. Á .-G. and A.D.R.-G.; software, J.G.-H., E.G.-F. and J.G. analysis, .-G.; investigation,


Introduction
The use of accurate data obtained in forest inventories, including data on the canopy, understory, woody debris and litter fuel, is crucial for fire and forest management. Complete spatial and physical descriptions of the forest fuel layer (i.e., loads, horizontal coverage, spatial continuity, etc.) are needed as input variables in most fire behavior simulators [1][2][3]. Despite the essential role of forest fuel characterization in forest fire management and fire behavior, it remains challenging due to high spatial and temporal variability [4]. Traditional field measurements constitute the most accurate method of quantifying these variables, but field-based methods are often expensive and time-consuming and do not capture the range of variation across the landscape; also, in some cases, the data collection methods cannot feasibly capture three-dimensional (3D) forest characteristics [5].
Remote-sensing LiDAR (airborne light detection and ranging) technology with ALS (airborne laser scanning) and TLS (terrestrial laser scanning) sensors has been used as a costefficient method for monitoring a variety of forest inventory and forest fuel variables [6][7][8][9]. LiDAR scanning, with discrete or full-waveform technology, has been widely employed to characterize forest inventory variables and canopy fuel characteristics at a landscape scale using an area-based approach (ABA) in the last two decades [10][11][12][13][14][15][16][17][18]. However, fewer studies have examined the potential of ALS remote sensing for characterizing the lower canopy structure, the understory, and near-ground fuels, owing to the persistent difficulties associated with the presence of taller elements that occlude the underlying elements [19][20][21][22][23][24]. Although full-waveform ALS provides information about the interaction between the signal and the environment it travels through, as opposed to discrete ALS, which provides greater detail about the vertical distribution of the forest components [25][26][27], less attention has been devoted to the former due to the greater complexity of the data and the limited availability of processing tools [27,28].
Terrestrial laser scanning (TLS) has been proven to have great potential for characterizing and quantifying fuel variables that are essential for determining fire risk and fire behavior [29][30][31][32]. Most fuel characterization studies that use TLS data have focused on analyzing shrub vegetation in ecosystems without tree cover, often with the aim of obtaining spatially explicit, three-dimensional data as inputs for physical-based fire-spread models [33][34][35][36]. Such detailed studies require multiple scans in order to reconstruct the vegetation structure via piecewise approximation of the volume or by voxelization [30,34,36,37], which are time-consuming processes in terms of both field acquisition and the post-processing of data. However, single-scan TLS surveys have rarely been used to characterize forest fuels.
As a complement to ALS measurements, the increasingly accessible TLS technology, which scans the canopy structure from below (i.e., upwards), may enable accurate fine-scale assessment of fire fuel [20,38]. Although TLS-derived data are less suited for characterizing the complete forest structure over large areas, accurate information on the below-canopy structure can be a valuable addition to ALS observations, providing a better understanding of the relationship between canopy structure and the spatial distribution of understory and near-surface fuels. Therefore, TLS should be viewed as a completely different approach for characterizing forest structure at the plot scale, offering a high level of detail [39] that can complement the development of estimation models supported by remote sensing data to the required spatial extent, such as those obtained from ALS surveys [20].
In addition, these studies are based on the fusion of ALS and TLS multi-angle scan point clouds to produce complex 3D geometric information at a very detailed scale that exceeds the requirements of most operational fire simulators, such as Farsite and FlamMap [2,3]. Furthermore, such fusions require complex and time-consuming georeferencing, data matching and post-processing protocols [56,59].
Accordingly, the aim of this study was to test the potential use of low-density ALSderived metrics, TLS-derived metrics and a combination of both metrics to provide accurate estimates of surface and canopy fire-related variables to characterize Atlantic pine forest ecosystems at the stand level, comparing three non-parametric machine learning techniques: (i) multivariate adaptive regression splines (MARS), (ii) support vector machines (SVM) and (iii) random forest (RF).

Study Area and Sample Plots
The study area is located in NW Spain ( Figure 1). The data set corresponds to 34 thinning trials established in pure, even-aged stands of maritime pine (Pinus pinaster Ait.) (20 locations) and radiata pine (Pinus radiata D. Don) (14 locations). The thinning trials were carried out in 2009-2010, when the stand age ranged from 12 to 32 years (mean age 17.6 years) for maritime pine and from 12 to 22 years (mean age 16.3 years) for radiata pine. The average slope was 23.8%, with values ranging from 3.9 to 35.2%. Three rectangular plots (25 × 40 m) were established in each location, and a different treatment was randomly applied to each of the three plots: control (C, unthinned), moderate thinning (LT, 20% of the basal area removed) or heavy thinning (HT, 40% of the basal area removed). The same treatment that was applied in the plot was also applied in a 5-m buffer area around each plot. The corners of the sample plots were georeferenced using a Trimble Geo7X GPS (Geospatial, USA) with subsequent post-processing of the information to obtain subcentimeter accuracies of the coordinates. All the trees within the plots were measured before and after thinning and were re-measured in 2011-2012, 2013-2014 and 2015-2016. However, surface fuels were only inventoried in 2015-2016 and, due to the complexity of the approach used to estimate these fuel loads, the moderately thinned plots were not considered. Therefore, only the control treatment and heavy thinning were analyzed for surface fire-related variables (68 sample plots), and the complete trials were used to analyze the canopy fire-related variables (102 sample plots). The thinning trials are described in further detail by Arellano-Pérez et al. [60].

Tree and Stand Variables
The diameter at breast height (d) of all the trees was measured to the nearest 0.1 cm, at two perpendicular angles, with a graduated caliper. Total tree height (h) and height to the base of the live crown (h blc , defined as the lower insertion point of live branches in a tree) were measured to the nearest 0.1 m with a digital hypsometer in a random sample of 30 trees and in an additional sample, including all dominant trees (the proportion of the 100-largest diameter trees per hectare). Generalized h-d models [61] and crown profile models [62,63] developed in the study area for P. radiata and P. pinaster were used to estimate h and h blc for the remaining trees. For each tree, the total volume, the total aboveground biomass, and the fine biomass (consisting of the biomass of needles and fine twigs (particles of diameter < 0.6 cm)) were estimated using the individual tree-volume equations and the compatible biomass systems developed for maritime pine and radiata pine in Galicia [61]. The number of stems per hectare (N), stand basal area (G), mean stand height (h), stand dominant height (H, defined as the mean height of the 100 thickest trees per ha), stand volume (V) and stand total aboveground biomass (W) were calculated from the tree variables for each plot and thinning treatment on a per-area basis (Table 1).

Tree and Stand Variables
The diameter at breast height (d) of all the trees was measured to the nearest 0.1 at two perpendicular angles, with a graduated caliper. Total tree height (h) and heigh the base of the live crown (hblc, defined as the lower insertion point of live branches tree) were measured to the nearest 0.1 m with a digital hypsometer in a random sam of 30 trees and in an additional sample, including all dominant trees (the proportio the 100-largest diameter trees per hectare). Generalized h-d models [61] and crown pro models [62,63] developed in the study area for P. radiata and P. pinaster were used estimate h and hblc for the remaining trees. For each tree, the total volume, the t aboveground biomass, and the fine biomass (consisting of the biomass of needles and twigs (particles of diameter < 0.6 cm)) were estimated using the individual tree-volu equations and the compatible biomass systems developed for maritime pine and rad pine in Galicia [61]. The number of stems per hectare (N), stand basal area (G), mean st height (ℎ ), stand dominant height (H, defined as the mean height of the 100 thickest t per ha), stand volume (V) and stand total aboveground biomass (W) were calculated fr the tree variables for each plot and thinning treatment on a per-area basis (Table 1).

Canopy Fuel Variables
Three structural canopy fuel variables were estimated from tree measurements each plot: canopy base height (CBH), canopy fuel load (CFL) and canopy bulk den (CBD). CBH was calculated as the height from the ground to the base of the live crow trees, as an average per plot; CFL was calculated by dividing the fine biomass of all trees of the sample plot by the plot area, and CBD was calculated by dividing CFL by crown length, estimated as the difference between mean stand height (ℎ ) and CBH. M and standard deviation values for the main stand characteristics and canopy fuel varia for each species and treatment are shown in Table 1.  Three structural canopy fuel variables were estimated from tree measurements for each plot: canopy base height (CBH), canopy fuel load (CFL) and canopy bulk density (CBD). CBH was calculated as the height from the ground to the base of the live crown of trees, as an average per plot; CFL was calculated by dividing the fine biomass of all the trees of the sample plot by the plot area, and CBD was calculated by dividing CFL by the crown length, estimated as the difference between mean stand height (h) and CBH. Mean and standard deviation values for the main stand characteristics and canopy fuel variables for each species and treatment are shown in Table 1.

Understory Fuel Variables
For each sample plot, two transects of 32 m were established from the middle of one of the 40 m sides of the plot to the vertices of the opposite side (see Figure 1). The linear shrub cover, differentiated by species, was recorded along each transect, and the depth of the litter and duff layers and the height of each shrub species were measured at 4 m intervals. The mean understory cover (COVshrub) and depth of the litter and duff layers (d LFH ) of the sample plot were estimated from these measurements. The mean shrub height (h shrub ) was calculated by weighting the heights of understory species according to the respective cover, and the fuel strata gap (FSG) was calculated as the difference between CBH and h shrub . The total understory load (Wshrub) and litter and duff layer fuel load (WLFH) were calculated using the equations proposed by Arellano-Pérez et al. [60] for these types of fuels in pine stands in Galicia. Finally, the downed woody debris load (Wdebris) was estimated by the planar transect method [64,65]. Mean and standard deviation values for the main surface fuel variables are shown in Table 2 for each species and thinning treatment. Each sample plot was scanned once by positioning the TLS equipment in the middle of the left transect coinciding with the field measurements. This position was also georeferenced with a Trimble Geo7X GPS with subsequent post-processing of the information to obtain sub-centimeter accuracies. Three-dimensional point clouds with a density of 7.67 mm point spacing (~130 points/m) at 10 m were collected using the FARO Laser Scanner Focus 3D × 130 (Faro, USA). The TLS is a class-I laser (wavelength = 1550 nm) with a range of 0.6 to 130 m, for objects with 90% reflectivity, and an accuracy of ± 2 mm at 10 m range. The point clouds from the single scan on each sample plot were cropped considering a 10-m radius to prevent incorporating any information from outside the plot. The height above the terrain of each point was calculated from digital elevation models (DEMs) created from each scan, using the "normalize" function in the FORTLS package [66] in R [67]. In the first step, ground points were segmented using the cloth simulation filter algorithm [68], and a DEM was generated via spatial interpolation of the z-value for these points, using a k-nearest-neighbor approach with inverse-distance weighting (KNN-IDW). Normalized heights were then calculated by subtracting the smoothed DEM from the z-values. These values were used to calculate 43 TLS metrics related to the vertical distribution of the point clouds. The metrics and the corresponding descriptions are summarized in Table 3. Table 3. Variables related to the height distribution of the normalized TLS point cloud.

Metric Description
TLShmax ( In addition, three other metrics were derived from the point cloud: (i) the ratio of the number of observed laser returns and the maximum number of laser returns (TLSPmax); (ii) the ratio of the number of observed laser returns above 2 m in height and the maximum number of laser returns above 2 m in height (TLSPA2m_max); and (iii) the ratio of the number of observed laser returns below 2 m in height and the maximum number of laser returns below 2 m in height (TLSPB2m_max). The maximum number of laser returns was estimated as the number of returns that would be obtained, taking into account the laser scanning density on the surface of a cylinder of 10 m radius, with a base on the ground and height equal to the maximum height of the trees in the plot or in the height interval 0-2 m or 2 m-maximum height. We hypothesize that these metrics should be related to the bulk density of the total vegetation and shrubs in the plot, respectively.
Finally, the vertical point distribution was modeled using the two-parameter Weibull probability density function (pdf). For this purpose, the relative frequency of the number of laser returns in each cylindrical layer of height by 1 cm and radius 10 m was calculated by dividing the number of returns in this layer by the total number of laser returns. The scale (Wscale) and shape (Wshape) parameters of the Weibull pdf were then estimated using the first and the second moments of the relative distribution of vertical laser returns [69].
A total of 48 variables (the 43 metrics related to the vertical distribution of the point clouds summarized in Table 3-the three ratios characterizing the bulk density of vegetation and shrubs and the two Weibull pdf parameters) were finally considered as potential independent variables for the modeling of fuel loads.

ALS Data
ALS data set used in this study was obtained in the second round of the Spanish countrywide ALS measurements, which are publicly available in Spain through the National Plan for Aerial Orography (hereafter referred to as the PNOA). Square ALS blocks of 2 km on each side, covering the whole region of Galicia, were obtained from the CNIG (Centro Nacional de Información Geográfica) computer server (http://centrodedescargas.cnig. es/CentroDescargas/index.jsp (accessed on 7 June 2018)). The ALS data were collected in July-September 2015 (West Galicia), August-December 2016 and February 2017 (East Galicia), roughly coinciding with the timing of the field inventories. The scanning sensors used to collect the ALS data in the study area were a LEICA ALS50 (Leica Geosystems AG, Germany) (West Galicia) and a LEICA ALS80 (Leica Geosystems AG, Germany) (East Galicia). The standard laser pulse density was 0.5 first returns m −2 ; the vertical accuracy of the scanning survey varied between 0.15 and 0.2 m and the horizontal accuracy varied between 0.3 and 0.4 m. The ALS datasets were processed using Fusion/LDV software [70]. The software parametrization and the workflow for processing the ALS point cloud are described in detail by González-Ferreiro et al. [12]. Briefly, all ALS echoes classified as "ground" were used to create a digital elevation model (DEM) with a spatial resolution of 2 m. Normalized heights were then calculated by subtracting the smoothed DEM from the z-values. These heights were used to calculate metrics related to the vertical distribution of the point clouds, similar to those obtained for the TLS data (Table 3), and to distinguish tree canopies (echoes above 2 m) and the shrub layer (echoes below 2 m) when computing the ALS height metrics.

Model Development
In a first step, models were developed separately for estimating the main fire-related variables and other stand variables from TLS-and ALS-derived metrics by using three different modeling approaches: multivariate adaptive regression splines (MARS), random forest (RF) and support vector machines (SVM). After selection of the best approach, the models were then refitted by combining ALS and TLS metrics as independent variables to examine the effect on the estimates and the goodness-of-fit statistics. In addition to using the TLS-and ALS-based metrics, a logical variable (I sp ) was also included to discriminate between overstory tree species, with a value of 1 for P. radiata and 0 for P. pinaster.
MARS is a nonparametric technique that was proposed by Friedman [71]. This method constructs a non-linear regression model by fitting piecewise linear regressions in distinct intervals of the independent variable space. The general form of the non-parametric regression model is as follows: where β 0 is the intercept of the model, B i (X) are piecewise linear basis functions with X a vector of independent variables, β i is the coefficient of the ith base, M is the number of basis functions and ε is the error. Each piecewise linear basis function takes on the following two forms: (1) a hinge function with the form max(0, x-cte) with x-cte > 0 or max(0, cte-x) with cte-x > 0 where x is an independent variable and cte is a constant value called "knot"; or (2) a product of two or three hinge functions, which can therefore model the interaction between two or three independent variables (x) defining the model degree (k = 1, 2 or 3). The number of basis functions (M) and the model degree (k) were optimized by performing jack-knife cross-validation so that the training data set consists of all sample plots except one (test data set) and the model is fitted as many times as there are sample plots. The optimal model will be that yielding the lowest root mean square error (RMSE): where y i andŷ i are the observed and predicted values of the dependent variable, respectively, and n is the number of observations. The predicted values used were those obtained in cross-validation for the out-of-bag samples (test data set). The "earth" package [72] in R software [67] was used to fit the MARS models. RF is a widely used non-parametric classification and regression technique proposed by Breiman [73], consisting of an ensemble of decision trees. Bagging is used to create different training subsets and then fit a tree for each one, providing an estimate for those samples not included in the training subset. The final estimate in regression for each sample is obtained as a weighted mean of the estimates of a large number of individual trees [74].
RF also provides a measure of the importance of input features through random permutation, which can be used for feature-ranking or selection [75]. An RF model requires the number of trees to be grown and the number of randomized independent variables selected in each split to be established. In this study, the "randomForest" package [76] in R software [67] was used to fit the RF models by establishing the number of trees as 500 and the number of independent variables selected in each split as the square root of the total number of independent variables. The optimal RF model was again obtained using a cross-validation procedure by minimizing the RMSE obtained, using the residuals of the weighted estimates of the out-of-bag samples.
SVMs have been developed from artificial neural networks [77], and the models are developed by a set of vectors (or hyperplanes, if a higher dimension is requested) that minimize the mean error using kernel functions. The type of kernel used is a parameter that must be correctly established. Kernels also require their own parameterization, which can make employing this type of technique unsuitable for users with limited experience. Three different kernel functions were used in this study: linear (SVM-L), radial (SVM-R) and polynomial (SVM-P). The values of the hyperparameters associated with each of the kernel functions were optimized to minimize the RMSE obtained by jack-knife cross-validation. The "e1071" package [78] of R software [67] was used to fit the SVM models.
Two goodness-of-fit statistics were used to compare the results of the three modeling approaches: the RMSE(%), calculated as the percentage of RMSE over the mean value of the dependent variable, and the R 2 , defined as the square of the Pearson's correlation coefficient between observed and estimated values of the dependent variable (r 2 y iŷi ).

Models Based on TLS-Derived Metrics
The goodness-of-fit statistics for the best models obtained with the three different approaches are shown in Table 4. Overall, models with the highest percentage of variability were explained (R 2 ) and the lowest RMSE were obtained using the MARS approach, whereas the least accurate estimates corresponded to the models obtained by the RF approach. The SVM models produced very similar overall results to MARS models but only when the kernel with the best goodness-of-fit statistics was selected for each dependent variable. The results suggest that parametrization of the radial kernel produces more accurate estimates of variables related to biomass or volume accumulation (W debris , W LFH , W shrub , CFL, W and V), while for height-related variables (h shrub , FSG, h, CBH and H), a simpler linear kernel is accurate enough. The use of the polynomial kernel, which is more complex to parameterize, did not improve the accuracy of the estimates, except with the model used to estimate W LFH . Table 4. Goodness-of-fit statistics of the models fitted using the three different approaches, considering TLS-derived metrics as independent variables. MARS = multivariate adaptive regression splines; SVM = support vector machine with linear (SVM-L), radial (SVM-R) or polynomial (SVM-P) kernels; and RF = random forest. The best results for each dependent variable are highlighted in bold. The explanatory ability of the models obtained with MARS depended on the characteristics of the dependent variable analyzed. For variables related to shrub or tree height, the variability explained by the models ranged from 53.1% for h shrub (RMSE % = 42.55) to values of around 93% for mean (h) and dominant (H) canopy heights (RMSE % = 5.43 and 5.39, respectively), with values of around 77% for FSG and CBH (RMSE % = 13.30 and 11.92, respectively). The shrub height is not an arithmetic mean but a value weighted by the coverage of the different shrub species present, so that estimation from TLS metrics not only depended on the vertical distribution of the point cloud but also on the horizontal distribution. This may explain why a smaller amount of variability was explained than for the other height variables that can be more easily estimated from the metrics of the vertical distribution of the point cloud.
The values of the goodness-of-fit statistics obtained for mean and dominant heights are within the upper range of values reported in the relevant literature: Huang et al. [79] reported explained variability values of 90.25%, with an RMSE% of 6.91, for the estimation of individual tree heights in mixed hardwood stands in Culai Mountain National Forest in China; Liang and Hyyppa [80] obtained RMSE% values ranging from 6.9 to 21.3% in single scans for tree-height estimation in a study including mixed stands of P. sylvestris, Picea abies, and Betula sp. And pine-and spruce-dominated stands in Finland; Srinivasan et al. [81] explained 92% of the observed variability in the height estimation of individual P. taeda trees; and, in a comprehensive study of different TLS algorithms for processing TLS datasets, Liang et al. [8] used 24 sample plots selected from varying forest-stand conditions in boreal forests and obtained RMSE% values for a single scan that ranged from 12 to 23% in "Easy" plots (clear visibility with minimal understory vegetation and low stem densitỹ 600 trees/ha), from 18 to 41% in "Medium" plots (moderate stem densities~1000 trees/ha and sparse understory vegetation), and from 28 to 57% in "Difficult" plots (plots with high stem densities of~2000 trees/ha and dense understory vegetation). Nonetheless, these comparisons are inconclusive as, in all these studies, the goodness-of-fit statistics refer to estimates of individual tree heights and not to stand values. On the other hand, Torralba et al. [82] fitted a model for estimating H from TLS-derived metrics with a single scan that explained 73% of the observed variability, with an RMSE% value of 8% in mixed stands of Mediterranean forest.
Very few studies have characterized the vertical structure of forest fuels using TLS data [83,84]. García et al. [83] fitted CBH and FSG estimation models in Scots pine and larch, as well as mixed stands of oak and birch forests in Cheshire (UK) from a single TLS scan, with RMSE% values of 46.54 and 49.68%, respectively, which are much higher than those obtained in this study (8.29 and 8.57%).
Regarding the variables related to overstory biomass or volume (G, W, V, CFL and CBD), the variability explained by the MARS models ranged from 42.35% for CFL (RMSE% = 20.42) to 58.65% for V (21.84%), with values slightly above 51% for G and CBD (RMSE% = 18.70 and 26.54, respectively), and around 56% for W (RMSE% = 21.29). The RMSE% values are within the lower range of values reported in previous studies, especially for V and W, although, in most of these studies, the stand variables were calculated by aggregating tree values derived from allometric models including tree diameter and height estimates that were mainly obtained from several scans and not directly estimated, as in the present study [85,86]. However, the accuracy of this individual tree-modeling approach is strongly dependent on stem detection, which is affected by occlusion effects, low point density and a high amount of noise at certain locations in the plot, even when the multi-scan mode is used [7,87]. In single scans, stem detection rates vary between 20 and 100%, with values around 73% in forests with densities of between 400 and 1400 stems/ha [80,[88][89][90].
Strahler et al. [91] obtained an error of 15.49% in the estimation of basal area in a P. ponderosa stand in New South Wales (Australia) with a single scan; in a study with 30 scan plots installed in six New England hardwood and conifer forest stands, Yao et al. [92] explained 66% of the observed variability in G, with an RMSE% value of 15.24% and 85% of the variability observed in W, with an RMSE% of 10.69%; Skowronski et al. [51] reported RMSE% values for CBD and CFL estimates of 18.48% and 15.68%, respectively, in wildfire-prone stands in the New Jersey Pinelands; Torralba et al. [82] explained 70% of the observed variability in W with an RMSE% of 21% in Mediterranean forests using TLS-metrics obtained with a single scan similar to those used in this study; in a study of 40 plots in mature P. sylvestris forests in Spain, Molina-Valero et al. [93] estimated G from a single scan with an RMSE% of 21.84%, and explained 75% of the observed variability.
Regarding the MARS models for estimating surface fuel loads (W LFH , W debris and W shrub ), the explained variability is very similar, varying between 34.62% for W shrub and 36.53% for W debris , although the RMSE% values are very different (11.96% for W LFH ; 97.07% for W debris and 44.51% for W shrub ), as a consequence of the large variability observed in the W debris load, especially in the heavily thinned plots.
Most studies estimating shrub biomass using TLS data have focused on reconstructing the shrub shape from the point cloud, either by piecewise approximation of the volume or by voxelization [30,34,36,37]. Studies that use TLS to predict the biomass distribution of shrub and other fuel layers without reconstructing shapes from the returns are scarce. For example, Chen et al. [21] developed a predictive model to estimate surface fuel loads in south-eastern Australian Eucalyptus forests, including the topography and previous fire disturbance as possible independent variables. The best model explained 89% of the observed variability, with litter depth obtained via TLS data, the number of years since the last fire, canopy density obtained with ALS data, and plot elevation as independent variables; when ALS and topographic variables were not considered, the amount of observed variability explained was reduced to 69%. Wallace et al. [94] explained 74% (RMSE% = 16.3%) of the observed variability in aboveground surface biomass in eucalypt-dominated lowland forests and 50% (RMSE% = 41.2%) in floodplain riparian forests in Victoria (Australia) using TLS data. Alonso-Rego et al. [32] developed an equation system for estimating size-disaggregated shrub and litter fuel loads for four treeless shrub communities in Galicia (NW Spain), using metrics derived from a single TLS scan with an RMSE% of 30.55% for W LFH and 9.50% for W shrub . Li et al. [95] obtained a model that explained 69% of the shrub layer biomass in temperate mixed broadleaf-conifer forest in Jilin Province (China), using TLS-derived shrub volume.

Models Based on ALS-Derived Metrics
The goodness-of-fit statistics of the best models obtained with the three different modeling approaches are shown in Table 5. Similar to the models based on TLS metrics, MARS and SVS approaches outperformed the RF approach. Taking these results into account, and to facilitate comparison between those models with different metrics (TLS vs. ALS), the discussion below only considers the MARS model statistics.
As with models based on TLS metrics, the accuracy of estimates from models based on ALS metrics varied greatly, depending on the dependent variable analyzed. For variables related to tree or shrub height, the explained variability was very similar to that obtained with the TLS models for h (89.16%; RMSE% = 6.53%), and H (93.07%; RMSE% = 5.46%); it was much higher for CBH (84.57%; RMSE% = 9.79%) and FSG (85.51%; RMSE% = 10.47%) and much lower in the case of h shrub (32.01%; RMSE% = 51.22%), probably due to the difficulty in adequately characterizing the understory in dense forests using low-density LiDAR data collected from an airborne platform [23,96]. Comparative studies show that ALS yields more accurate canopy-height estimations than TLS [41,97], while TLS yields a more accurate characterization of the understory vegetation [19,98].
Regarding the variables related to overstory volume or biomass, the models based on ALS metrics were similar to those based on TLS metrics for G (R 2 = 0.4939; RMSE% = 19.18%) and much better for the other variables (V−R 2 = 0.6599; RMSE% = 19.80%-; W−R 2 = 0.6452; RMSE% = 19.14%-; CFL−R 2 = 0.5203; RMSE% = 18.63%-; and CBD−R 2 = 0.5587; RMSE% = 25.26%). For surface fuel loads, model estimates based on ALS metrics were slightly more accurate for W debris (R 2 = 0.4069; RMSE% = 93.84%) and W shrub (R 2 = 0.4796; RMSE% = 39.71%) and much worse for W LFH (R 2 = 0.1049; RMSE% = 14.03%). As with models based on TLS metrics, in the case of the latter variable, the RMSE% was low, despite the fact that a very low amount of the variability was explained by the model (≈10%) due to the low variability of litter fuel load in the 68 plots analyzed; by contrast, in the case of debris fuel load, the RMSE% was very high due to the high variability observed in the plots, especially in the heavily thinned plots. Table 5. Goodness-of-fit statistics of the models fitted using the three different approaches, considering ALS-derived metrics as independent variables. MARS = multivariate adaptive regression splines; SVM = support vector machine with linear (SVM-L), radial (SVM-R) or polynomial (SVM-P) kernels; and RF = random forest. The best results for each dependent variable are highlighted in bold. All these goodness-of-fit values are within the range previously reported for lowdensity LiDAR data and forests of similar characteristics. For example, for pure P. radiata forests, the V models developed by González-Ferreiro et al. [10] yielded an RMSE% of 30.4% for LIDAR data, with a density of 0.5 first-returns/m 2 and 24.98% for 8 first-returns/m 2 ; the AGB models used by the same authors yielded RMSE% values of 26.77% for 0.5 firstreturns/m 2 and 23.73% for 8 first-returns/m 2 . Using the same data and applying a hybrid approach based on the combined use of multiple linear regression (MLR) and a genetic algorithm to select the best set of independent variables, García-Gutiérrez et al. [99] reported RMSE% values of 23.03% and 21.29% for G models, using LiDAR data with densities of 0.5 and 8 first-returns/m 2 , respectively; 27.24% and 24.63% for V models fitted to LiDAR data sets of 0.5 and 8 first-returns/m 2 , respectively; and for W models, RMSE% values of 26.77% for 0.5 first-returns/m 2 and 24.04% for a laser pulse density of 8 first-returns/m 2 . In a study in P. radiata forest in Galicia, González-Ferreiro et al. [11] obtained RMSE% values of 6.94% for h; 7.13% for CBH; 12.40% for CFL; and 21.85% for CBD with LiDAR data with a density of 0.5 first-returns/m 2 . In a study of a network of thinning trials in 4 pure and even-aged P. pinaster stands in Asturias (NW Spain), Hevia et al. [14] obtained RMSE% values for h, H and CBH of the same order as those obtained in this study (5.22; 6.50 and 8.43%, respectively), although the results were much better for fuel load variables (8.09% for CFL and 9.78% for CBD); however, LiDAR data density was also much higher (8-16 first-returns/m 2 ) than that used in this study. Jiménez et al. [100] developed models for estimating stand variables in P. pinaster and P. radiata forests in Galicia, using a combination of PNOA LiDAR data and LANDSAT-ETM data in plots of the Fourth Spanish National Forest Inventory (4-SNFI) obtaining RMSE% values for G of 41. 62% for P. pinaster and 27.64% for P. radiata; the H values were 6.31 and 11.76%, respectively, for the same species; for crown biomass, the values were 42.86 and 26%, respectively, and for stem biomass, 46.92 and 23.99%, respectively. When also using 4-SNFI data and PNOA LiDAR data, González-Ferreiro et al. [12] obtained RMSE% values for the estimation of CFL of 44.35% for P. radiata and 45.39% for P. pinaster. In a study of P. pinaster and P. radiata forests in north-western Spain using machine learning approaches, Novo-Fernández et al. [101] obtained RMSE% values for estimating V of 37.95 and 37.23%, respectively, and 38.51 and 38.27%, respectively, for estimating W from data of the 4-SNFI and the same PNOA LiDAR data used in this study.

MARS Models Combining TLS-and ALS-Derived Metrics
The mathematical expressions of MARS models obtained using TLS metrics, ALS metrics, and both, combined as independent variables for estimating surface fire-related variables, canopy fire-related variables and stand variables are shown in Tables A1-A3, respectively (see Appendix A). The combination of ALS and TLS metrics improved the predictive ability of models based on a unique metric type in all cases except for h shrub and W shrub , as shown by the values of the goodness-of-fit statistics ( Table 6) for each set of metrics (TLS, ALS, and TLS + ALS). Improved accuracy of estimates is particularly important for those variables related to overstory volume or biomass, with RMSE reductions of 7.56% for CFL; 9.21% for V; 9.52% for CBD; and 10.72% for G. Table 6. Goodness-of-fit statistics of the models fitted, using the three different sets of metrics (TLS, ALS and TLS + ALS) as independent variables.  These results were expected as these variables are strongly related to canopy size, and its vertical and horizontal distribution and the combination of ALS-derived metrics (considering zenith angle) and ground-based TLS-derived metrics provides much more complete information regarding spatial structure [102]. The models developed for estimating W LFH and W debris that were fitted using metrics from both sensors also showed RMSE reductions of 6.47 and 7.22%, respectively.
The relative importance of each independent variable for the best-fit models is shown in Figure 2. The importance of each variable was estimated on the basis of the RMSE reduction that incorporation in the model implies, assuming that the other independent variables are already included in the model. The relative importance of each independent variable for the best-fit models is shown in Figure 2. The importance of each variable was estimated on the basis of the RMSE reduction that incorporation in the model implies, assuming that the other independent variables are already included in the model.  Overall, the percentile metrics of both ALS and TLS data sets were the variables that contribute most to the variability explained by the models, especially for those dependent variables related to height (ℎ , ℎ , FSG, CBH and H). However, as expected, in the case Figure 2. Relative importance of independent variables for the best-fit models for estimating surface fire-related variables (upper), canopy fire-related variables (middle) and stand variables (lower), respectively. The values were estimated from the RMSE reduction that the incorporation of each variable in the model implies, assuming that the other independent variables were already included in the model.
Overall, the percentile metrics of both ALS and TLS data sets were the variables that contribute most to the variability explained by the models, especially for those dependent variables related to height (h shrub , h, FSG, CBH and H). However, as expected, in the case of variables related to fuel loads, metrics related to profile shape (skewness, interquartile distance, L-moments, or the shape parameter of the Weibull distribution) or to the horizontal distribution of the point cloud (TLSP A2m_max or TLSP B2m_max ) contributed significantly to the reduction of RMSE values. These results were similar to those obtained in multiple studies that used metrics instead of reconstructing 3D structures, especially in the case of ALS point clouds [41,101,[103][104][105].
For all models fitted using TLS-derived metrics, except those related to variables associated with surface fuels (h shrub , W debris , W LFH and W shrub ), the estimates were species-dependent, i.e., the species-related logical variable (I sp ) appeared as a significant independent variable alone and/or modulating the effect of other independent variables. However, this logical variable was not part of the adjusted models using ALS-derived metrics, except in the case of H. By combining both metrics, the species-related logical variable only affected the estimates of the dominant height model. These results were consistent with differences in the resolution and spatial coverage of each LiDAR platform, conditioned by point density and scan positions that affect occlusions, limiting their ability to characterize different canopy zones [49,53]. TLS measures the forest from below (i.e., upward), producing high-resolution data that provide a detailed characterization of the differences in the vertical structure of the stands of these two species, especially in the lower canopy. On the other hand, the low-density ALS used has much lower resolution, and the downward view of the forest leads to many more occlusions and hinders characterization of the lower canopy. Therefore, the greater variability in the TLS point cloud metrics is driven by the inclusion of a species-related factor in the models. In the case of the dominant height models, even the low-density ALS used allows the upper crown structure to be characterized in great detail, which, when combined with the remarkable differences in H between species in the dataset used (16.42 m and 22.39 m for P. pinaster and P. radiata, respectively), results in a great variability of the upper percentiles that is driven by the inclusion of the logical variable affecting the species. Despite the differences in point density of the sensors, models fitted for this variable yielded very similar goodness-of-fit statistics, explaining almost 93% of the observed variability. Similar results have already been reported in benchmarking studies [19,45,98]. The species-independent nature of the fitted models, based on ALS-derived metrics, enables the construction of estimation maps without pre-stratification according to pine species or the added use of classification models using remote sensing data, which could increase the uncertainty of the estimates.
Finally, Figure A1 (Appendix A) shows the scatter plots of observed versus predicted values, obtained with the best-fitting models for estimating surface and canopy fire-related and stand variables. These graphs do not show any evident trends indicating underor over-estimation biases for any of the dependent variables analyzed. In addition, no anomalous trends were observed in the models fitted only with ALS-or TLS-derived metrics (graphs not included).

Conclusions
Rational decision-making related to the management of forest fuels to reduce fire risk and regarding the best alternatives to fight fires, based on predicted fire behavior, must be based on objective and reliable characterization of the variables related to the initiation and spread of forest fires, i.e., surface and, especially, crown fires.
The results obtained in this study indicated that the use of metrics obtained from both low-density ALS and single-scan TLS surveys shows promise for characterizing the main surface and canopy fuel variables related to fire risk. However, each platform has trade-offs, which, unsurprisingly, affect the models' estimations. ALS captures the vegetation structure from above, providing a higher level of detail in the upper canopy and declining accuracy in representing canopy characteristics with increasing canopy depth, especially for species with dense canopies, such as those analyzed here, and with the low-density ALS used in the study. Conversely, TLS captures the vegetation structure from below, providing high-resolution data on the internal characteristics of canopy structural diversity and understory vegetation. However, TLS data are less reliable for estimating upper canopy variables due to occlusion by mid-canopy foliage.
Combining the advantages of both sensors by integrating both sets of metrics as predictor variables in the models yielded an overall improvement in the accuracy of the estimates, clearly indicating the potential of integrating ALS and TLS systems in forest inventories.
Models were developed in this study by integrating independently derived ALS and TLS metrics, without the need to combine both point clouds by complex post-processing methods; however, the combination of these metrics would enable permanent 3D character-ization of the forest structure when required, enabling very accurate subsequent analyses and even characterizing variables not previously considered.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the restrictions of the research projects from which the study is derived.

Acknowledgments:
We are grateful to Mario López Fernández for his inestimable technical support in conducting all the field sampling tasks in the plot network.

Conflicts of Interest:
The authors declare no conflict of interest.