Estimation of Individual Tree Stem Biomass in an Uneven-Aged Structured Coniferous Forest Using Multispectral LiDAR Data

: Stem biomass is a fundamental component of the global carbon cycle that is essential for forest productivity estimation. Over the last few decades, Light Detection and Ranging (LiDAR) has proven to be a useful tool for accurate carbon stock and biomass estimation in various biomes. The aim of this study was to investigate the potential of multispectral LiDAR data for the reliable estimation of single-tree total and barkless stem biomass (TSB and BSB) in an uneven-aged structured forest with complex topography. Destructive and non-destructive ﬁeld measurements were collected for a total of 67 dominant and co-dominant Abies borisii-regis trees located in a mountainous area in Greece. Subsequently, two allometric equations were constructed to enrich the reference data with non-destructively sampled trees. Five different regression algorithms were tested for single-tree BSB and TSB estimation using height (height percentiles and bicentiles, max and average height) and intensity (skewness, standard deviation and average intensity) LiDAR-derived metrics: Generalized Linear Models (GLMs), Gaussian Process (GP), Random Forest (RF), Support Vector Regression (SVR) and Extreme Gradient Boosting (XGBoost). The results showcased that the RF algorithm provided the best overall predictive performance in both BSB (i.e., RMSE = 175.76 kg and R 2 = 0.78) and TSB (i.e., RMSE = 211.16 kg and R 2 = 0.65) cases. Our work demonstrates that BSB can be estimated with moderate to high accuracy using all the tested algorithms, contrary to the TSB, where only three algorithms (RF, SVR and GP) can adequately provide accurate TSB predictions due to bark irregularities along the stems. Overall, the multispectral LiDAR data provide accurate stem biomass estimates, the general applicability of which should be further tested in different biomes and ecosystems.


Introduction
Stem volume and biomass are among the most significant productivity and carbon sequestration factors since they are vital for sustainable forest management and climate change mitigation [1]. According to [2], plant biomass (above and below ground) is the main conduit for CO 2 removal from the atmosphere, storing more than 80% of the total aboveground carbon [3,4].
In recent decades, there has been an increasing demand for accurate and timely vegetation biomass estimation due to the rising consumption of biomass products, especially in managed forests. Above-ground biomass (AGB) is defined as the sum of the dry mass of every tree component standing above the soil level (e.g., stem, leaves, needles, branches, bark), typically expressed as mass at the individual tree level [5][6][7]. Among the different AGB components, the stem biomass is considered to be the most crucial, being the dominant material for timber products and paper production [8]. Traditionally, biomass ness of different machine learning (ML) algorithms for biomass estimation using both single-tree and plot-based approaches on LiDAR-derived point clouds, reporting that SVR produced the most accurate biomass models. In addition, the authors stated that manual crown delineation does not necessarily result in more accurate biomass estimates since the relationship between the accuracy of both crown delineation and biomass estimation is rather complex. In [78], the authors performed a comparative analysis between the tree-level and area-based approach suggesting their combined implementation for reliable AGB estimation in a boreal forest using the nearest neighbor algorithm.
Although BSB is considered to be the most significant parameter for growth models, stem volume, biomass and carbon allocation among tree species [79], most of the LiDARbased studies focus on the TSB and AGB estimation. Recently, [75] presented a comparison between the potential of multispectral single-photon and linear-mode LiDAR systems in plot-based TSB estimation, using GP regression. The results showcased that the linear mode LiDAR systems outperformed the single-photon systems in terms of Root Mean Square Error (RMSE). In [74], the authors predicted TSB in a boreal forest, using measurements derived from 1476 trees, low-density LiDAR data and two regression algorithms (RF and linear). According to their findings, the estimation accuracy was similar using RF and linear regression in terms of RMSE and the coefficient of determination (R 2 ), although the RF provided more consistent results over the trials [74]. The stem volume and TSB of individual pines were also examined using full-waveform [1] and discrete-return LiDAR data [80]. As reported by [1], the full-waveform-derived metrics do not improve the volume and biomass estimates significantly compared to the discrete data, due to the limitations of single-tree delineation. In [80], the authors employed the crown geometric volume (CGV) method on LiDAR point cloud to estimate the TSB for three different tree density classes. According to the results, the low tree density sites provided the most accurate stem volume estimates (R 2 = 0.68) compared to the others (i.e., R 2 = 0.62 for the medium and R 2 = 0.44 for the dense), since the high tree density leads to large tree segmentation errors.
Recent developments in LiDAR technology have led to commercial multispectral systems, which provide essential information on the vertical distribution of physiological processes [81]. The intensity of multispectral LiDAR channels contains complementary information for improved land cover and species classification [82][83][84][85]. Until now, the intensity features have been most widely used for land cover and species classification but few studies have examined their contribution to forest inventory parameters and biomass estimation [75,86]. More specifically, [86] compared the capability of multispectral and conventional LiDAR data for modelling and predicting forest characteristics (e.g., AGB, Ginis coefficient of DBHs, Shannon diversity index, and number of trees per hectare) at the plot level in a boreal forest. The results showcased that the multispectral data improved the AGB estimations (i.e., R 2 = 0.87 for the multispectral and R 2 = 0.72 for the single spectral). In 2020, [87] compared the suitability of monospectral and multispectral LiDAR for canopy fuel parameter estimations in a boreal forest. According to their findings, the multispectral LiDAR provided improved canopy fuel estimates compared to the monospectral LiDAR. Additionally, multispectral LiDAR data are more capable for individual tree detection compared to the monospectral data due to the spectral differentiation of the vegetation objects [88].
Although many different tree species have been examined in the literature, few studies have explored the potential of multispectral LiDAR remote sensing in biomass estimation of the different Abies species [89][90][91], and there are still no studies related to the biomass estimation of the Abies borisii-regis. Furthermore, compared to the vast majority of studies on the biomass estimation in plantations and semi-natural regenerated forests, only a handful of studies have been conducted in uneven-aged structured forests [12,[92][93][94][95][96]. In order to fill this literature gap, the present study aims to estimate TSB and BSB in an uneven-aged structured Abies borisii-regis forest employing multispectral LiDAR-derived predictive models. More specifically, two allometric equations were built to derive TSB and BSB estimates, respectively, which were used as additional reference data. Next, a set of regression algorithms (i.e., Generalized Linear Models-GLMs, GP, RF, SVR and Extreme Gradient Boosting-XGBoost) was examined using LiDAR-derived height and pulse intensity information in order to investigate their potential in reliable TSB and BSB estimation.

Study Area
The study was conducted in Pertouli University Forest, located on the southeast side of Mount Pindos in Thessaly, Greece (latitude 39 • 32 -39 • 35 N, longitude 21 • 33 -21 • 38 E) ( Figure 1). The forest has been under the management of the Aristotle University of Thessaloniki since 1934 for research and education purposes. It covers almost 33 km 2 , and the altitude ranges between 900 and 2050 m, with an average of 1350 m. The climate is characterized as transitional Mediterranean/Mid-European with high variability among seasons. Uneven distribution of the annual precipitation is observed, ranging from 1213.9 mm of rainfall and 328.2 mm of snow in winters to 298.8 mm during the vegetative period. As for the geology, the area's soil type is classified as Cambisol, and the main rocks are flysch and limestones which, along with the mountainous terrain, are quite vulnerable to weathering and erosion. Almost 90% of the forest area is dominated by Abies borisii-regis, and the remaining area is covered by individual trees of Scots pine and black pine (Pinus silvestris and Pinus nigra), beech (Fagus moesiaca), Austrian oak (Quercus cerris), willows (Salix caprea and Salix eleagnos), English yew (Taxus baccata), maple (Acer optusatum), European hophornbeam (Ostrya carpinifolia), laburnum (Cytisus laburnum), linden (Tilia cordata) and plantations of foreign species. The forest is divided into 173 management compartments, with a 10-year management rotation, and the selective harvesting approach is applied.
Trees of the genus Abies are tall evergreen conifers of the Pinaceae family, native to high mountain regions in the North Temperate Zone, and Abies borisii-regis is the natural hybrid between Abies alba and Abies cephalonica [97]. Abies borisii-regis composes allaged stands with many small trees and natural regeneration in the understory and mature trees in the overstory. Trees of the same age can have different heights, mostly due to biological characteristics, tree to tree competition, and soil quality type.

Airborne LiDAR Data
Multispectral LiDAR data were acquired in October 2018 over the Pertouli University Forest, using a RIEGL VQ-1560i-DW laser scanner sensor mounted on an airplane, and included information derived from two spectral channels, namely the Green and Near Infrared (NIR) (532 nm and 1064 nm, respectively). Before their delivery, the preprocessing of the LiDAR data was performed by the provider (i.e., GEOSYSTEMS HELLAS S.A.), including the transformation of the full-waveform dataset (i.e., 19 pulses/m 2 ) to discrete returns. As a result, the delivered data included a georeferenced point cloud in the Universal Transverse Mercator zone 34 coordinates system based on World Geodetic System 1984, along with trajectory files. The point cloud density for both channels was almost identical (i.e., 44.65 points/m 2 for the Near Infrared and 44.85 points/m 2 for the Green), with identical nominal pulse spacing of 0.15 m and 0.18 m, respectively, up to seven returns per pulse and a scan angle range from −32 • to +32 • . In addition, aerial photographs were acquired simultaneously with the LiDAR data, providing high resolution images of the entire forest.

Field Survey
Field measurements were conducted in May 2019 and July 2020. The first field survey included non-destructive sampling of tree clusters in six parts of the forest, while during the second survey, destructive sampling of 32 trees was performed in a predetermined forest compartment ( Figure 2). We chose to enrich the dataset with non-destructive sampling in order to keep the forest disturbance to a minimum, due to the invasive nature of destructive sampling.  (a-f) plots employed for non-destructive sampling (i.e., measurement of DBH for stem biomass estimation using allometric equations); (g) area of interest for destructive sampling.
Non-destructive sampling of stem biomass was carried out in six fixed plots, each of which covered an area of 1000 m 2 (40 m × 25 m). The distribution of the sampling plots across the study area was performed according to the different tree density, DBH distribution, and topographic features. This information was derived from the forest census data, of the 10-year forest management plan conducted in 2018. In each plot, only overstory trees that were detectable by the LiDAR sensor were measured. More specifically, DBH was recorded for each selected tree using a Haglof Mantax Blue caliper. Tree locations were determined using a Garmin eTrex 30x touch with an average horizontal position accuracy of 4 m. As a result, a set of 35 non-destructively sampled trees was recorded and correctly identified in the point cloud.
Destructive stem biomass sampling was carried out in one specific forest compartment due to legal restrictions regarding timber harvesting in the study area. As in the previous sampling method, in each plot the measurement was performed only to the trees that could be detected by the LiDAR sensor. The selection of the specific sample trees was based on the different stem diameters, heights and soil qualities with the aim to collect representative samples from all forest conditions. DBH was recorded on the standing trees using a Haglof Mantax Blue caliper, and their locations were determined using a Garmin eTrex 30x touch. Tree height was obtained after the felling of each tree using a Leica Disto D2 laser distance meter and a ruler. In particular, the laser distance meter was used for the measurement of the distance between the top of the tree and the bottom of the trunk, and the ruler was used to measure the length of the above-ground stump portion. The trunk was then separated from the foliage and divided into three different sections according to its shape (neiloid, cone, and paraboloid) [98,99]. A disk sample was taken from each section and the distance between the disks was measured. Likewise, the fresh weight of the disks was measured using a handheld scale. For smaller trees, with a height lower than 10 m, two samples were taken due to the invariable shape of the trunk.
The bark from each sample was removed and stored in separate paper bags. The circumference and the width of each disk without the bark were measured using a tape measure and converted to diameter under the assumption that the tree's stem is circular [100]. The separated bark samples and the wood were dried at 85 • C for at least 72 h and weighed [98]. The ratio of the measured dry to fresh mass of the stem particles represents the tree wood and bark density. Next, the stem volume was calculated using the following functions (Equations (1)-(3)) [101]: where H is the distance between the two disks, r 1 , r 2, and r 3 are the radius of the first, second, and third disk, respectively, and r s is the radius at the end of the merchantable height. Eventually, the wood and bark density of the samples was multiplied with the volume of each stem partition for the TSB and BSB biomass calculation (Table 1 and Figure 3).

Methods
The methodology for the LiDAR-based TSB and BSB estimation is composed of three separate parts, namely the LiDAR processing, the allometric equations development and the model construction along with its' cross-validation. Figure 4 illustrates the flowchart of the entire workflow, which is described in detail in the following sections (Sections 3.1-3.3).

LiDAR Data Processing
LiDAR data processing was performed using the free and open-source R [102] and Spyder software [103]. First, the collected GPS points from the trees' positions were buffered to 4 m radius, which represents the radius of the GPS error. This buffered area was used for the identification of the exact field-measured tree within the LiDAR point cloud. Considering that large scan angles can have a significant impact on the LiDAR-derived canopy structure metrics [104] and intensity values are highly affected by the large scan angles [105], echoes with scan angles greater than ±15 • were removed to avoid errors originating from off-nadir angles.
Subsequently, range intensity normalization of the point cloud was performed. Since single and first-of-many returns are located higher in the canopy and they are significantly stronger than the intermediate returns, the differences in intensity values are influenced by the large variety of the vegetation materials (i.e., foliage and wood) intercepting with the pulse [106,107]. Furthermore, intensity differences between intermediate and last returns are highly affected by the pulse energy losses sustained at the previous returns and by the pulse discretization procedure [107]. However, it is valid to assume that single and first-of-many returns have no significant transmission loss before being reflected back to the sensor [108]. Therefore, only the single and first-of-many returns were used for the range intensity normalization process, where both incidence angle and target reflectance were considered as constants [9,107,109]. The range normalized intensity (I norm ) was calculated as according to the following equation (Equation (4)): where I obs is the observed intensity, R act is the LiDAR-to-object range, R ref is a reference range and the exponent f represents the rate of energy losses by the pulse due to the travel to and from the target [110]. The reference range parameter was set to the average flight altitude and the parameter f was set according to the radar theory [111,112] as f = 2 for targets larger than the footprint [113]. The intensity normalization was performed for each separate flightline and channel (i.e., Green and NIR). The intensity normalized point cloud was merged with the intermediate and last returns for each tree area and noise removal was performed based on the Statistical Outliers Removal algorithm [114]. The point cloud was classified into ground and non-ground measurements using cloth simulating filtering [115]. Subsequently, the digital terrain model (DTM) was generated, and point cloud height normalization was performed employing the triangulated irregular network (TIN) algorithm. The TIN was preferred over other interpolation algorithms due to its' effectiveness in areas of steep slopes and densely distributed point clouds [116]. The pit-free algorithm was subsequently employed for CHM generation, since it provides significantly improved tree detection accuracy in highdensity LiDAR data [117]. Both the terrain and CHM had a resolution of 0.42 m, which was calculated by the following (5) [118]: Additionally, morphological erosion with a cross-shaped structuring element was applied to the CHM to reduce the over-segmentation phenomenon of the watershed algorithm [119]. It is also essential to mention that, in this study, the segmentation algorithm was employed without providing treetops or any kind of local maxima feature. Next, singletree segmentation was performed using the watershed algorithm and the CHM. It is worth noting that a series of established tree segmentation algorithms provided by the package lidR was tested (e.g., [67,69,75]), and the watershed was found to significantly outperform the other algorithms.
Next, a crown normalization process was employed, inspired by the Ptrees algorithm [68], where the correlation between the reference and normalized elevations of each echo is exploited [117]. A simple subtraction of the terrain height from the vegetation points' height can cause irregularities in the normalized canopy structure in sloped terrains, resulting in a significant impact on the biomass estimation accuracy ( Figure 5) [120]. To eliminate the terrain effects to each segmented tree, the highest reference elevation was denoted as the treetop, and the height difference between the treetop's reference elevation and normalized elevation was calculated. Finally, this height difference was subtracted from the reference height of each echo resulting in the normalized point cloud. In order to confirm that the point cloud of each individual segmented tree actually corresponded to the one measured on the field, we used very-high resolution aerial photographs and field observations, according to the method described in [64]. As a result, all trees measured through destructive sampling and 35 out of 55 tree measurements derived from the non-destructive sampling were successfully identified. Finally, a suite of LiDARderived height and intensity metrics were calculated, a summary of which is presented in Table 2. Table 2. LiDAR-derived metrics calculated for the stem biomass estimation. All the height metrics were calculated using all the echoes of each point cloud (Green and NIR channels), and the intensityrelated metrics were obtained using only first-of-many and single returns.

Single-Tree Biomass Estimation Using Allometric Equations
As mentioned in the introductory section, stem biomass is usually calculated using the DBH and tree height as inputs in empirical allometric equations. However, no allometric equations are available for stem biomass in Abies borissi-regis. As a result, new speciesspecific allometric equations were created, using the destructively sampled trees, for total (wood and bark) and woody stem biomass. The most commonly used allometric equation form for biomass estimation is presented below (6): where a and b are the scaling coefficient and scaling exponent, respectively, y is the estimated biomass and x is a tree dimension variable [121]. The ordinary least squares linear (OLS) regression was used to create two allometric equations based only on DBH. The tree height was not employed so that errors derived by incorrect measurements could be avoided. Such errors can arise since tree height measurement on the field constitutes a rather laborious task, difficult to be performed on standing trees with very high accuracy [122]. Finally, the measured diameters (DBH), TSB and BSB of the 35 non-destructively sampled trees were log-transformed and employed into an OLS regression to determine the scaling coefficients [123]. The performance of the allometric equations was evaluated based on the Relative Square Error (RSE), R 2 , and Adjusted R 2 (adjR 2 ).

Single-Tree Biomass Estimation Using Regression Algorihtms
In this study, five different regression algorithms, namely Generalized Linear Models (GLMs), Gaussian Process (GP), Random Forest (RF), Support Vector Regression (SVR), and Extreme Gradient Boosting (XGBoost), were examined for their potential to reliably predict both TSB and BSB. Prior to the statistical analysis, all the predictors (i.e., reference data) were "Center" scaled and divided into two sets, namely training (70%) and testing (30%). Additionally, a nested k-fold cross-validation was applied in each training set to obtain the training accuracy of each model, using 10-fold in the outer and 3-fold in the inner loop. The best estimation model was selected based on the arithmetic mean of estimates (i.e., Mean Absolute Error-MAE, Mean Square Error-MSE, relative bias-rbias, RMSE, and R 2 ) in the k-fold cross-validation partitions [77]. The following sections (Sections 3.3.1-3.3.6) present a brief description of each examined algorithm and the respective tuning parameters.

Generalized Linear Models
Introduced by Nelder and Wedderburn [124], generalized linear models provide a wide range of statistical analysis methods [125]. In the present research, a normal error distribution was assumed, and the log link function was employed for the GLM. Furthermore, an exhaustive screening variable selection was applied and a maximum threshold of three variables was defined to avoid overfitting due to the small sample size. During this process, the Small-Sample Corrected Akaike Information Criterion [126] was employed to select the best five models in each fold. Hereafter, the five models were imported to the internal 3-fold cross-validation and the model with the minimum MSE was selected. Last, the best model was selected utilizing the statistics MAE, MSE, and rbias.

Gaussian Process
The GP is a non-parametric technique consisting of a group of random variables, any finite number of which has a joint Gaussian distribution, specified by its mean and covariance function [127]. In regression problems, a Gaussian Process Regression (GPR) model can make predictions utilizing a priori knowledge about similarities between pairs of examples in a domain (kernels) [128] and provide uncertainty measures over predictions [129]. In this study, the radial basis function (rbf) and the linear kernel were tested. As far as the rbf kernel is concerned, a grid search for the sigma inverse width parameter was used, in contrast to the linear kernel, where no hyperparameters were optimized.

Random Forest
Random Forest, introduced by [130], combines several randomized decision trees and ensembles their performance by averaging [131]. According to this approach, a large number of trees are grown from a different bootstrap sample of the data and the predictor variables at each node are randomly selected [132]. In this paper, two hyperparameters were tuned: the number of trees to grow and the number of predictors for each tree. The number of trees to be grown was varied from 50 to 500 with a 50 step, and the number of selected predictors was defined as the number of predictors divided by three [133].

Support Vector Regression
Support Vector Machine (SVM) was initially developed for classification and later extended for regression problems [134]. In regression, the use of a kernel function, the Vapnik-Chervonenkis control of the margin and the number of support vectors are considered as the main characteristics of the SVR [135,136]. The epsilon-SVR algorithm was applied in this study, according to which a nonlinear mapping transforms the input data into an m-dimensional feature space, and a linear model is established [137]. In our work, the rbf kernel was employed and two hyperparameters were optimized, namely the cost of violation of restrictions and sigma parameter.

Extreme Gradient Boosting
The Gradient Boosting algorithm (XGBoost) [138] is an ML technique for regression and classification problems, providing a model in the form of linear combinations of simple predictors by solving an infinite-dimensional convex optimization problem [128,139]. The XGBoost provides a linear model solver and tree learning algorithm based on "boosting decision trees" for regression and classification tasks, preventing over-fitting problems [129]. In this study, the linear-based algorithm was employed, and a set of hyperparameters presented in Table 3 was optimized. Table 3. Candidate hyperparameters used for the "XGBoost" algorithm.

Hyperparameter Values
Number of boosting iterations 50 to 1000, with 50 step Maximum depth of the tree 1 to 5 Step size of each boosting step 0.01 to 0.07, with 0.01 step Minimum loss reduction (gamma) 1 to 10 Subsample ratio of columns when constructing each tree 0 to 1, with 0.1 step Minimum sum of instance weight needed in a child 5 to 30, with 5 step Subsample ratio of the training instance 0 to 1, with 0.1 step

Model Performance and Evaluation
The performance of each model was assessed in terms of the MAE, MSE, rbias, R 2 , and RMSE. The regression models were validated to an independent testing set (i.e., 30% of the total number of samples), which was randomly split from the training data (i.e., 32 destructively and 35 non-destructively in total), to obtain the predictive performance of each algorithm. In addition, the relative importance of individual predictors in each model was evaluated using the variable importance plots [140]. The variance importance, calculated using a model-agnostic procedure [141], provides a list of the ten most significant variables for each model, in descending order. The least important variable could potentially be excluded from the predictive models, decreasing the computational cost [77]. However, in the GLM case, the feature importance represents the percentage of GLMs containing each variable after the training procedure.

Results
In the present study, BSB and TSB were estimated using five different regression algorithms. More specifically, allometric equations were initially developed and evaluated for their accuracy to enrich the reference dataset with samples collected by the non-destructive method. Subsequently, five regression algorithms (i.e., GLM, RF, SVR, GP, and XGBoost) were evaluated for their potential in reliably estimating BSB and TSB. In addition, the Pearson's correlation coefficient was used to evaluate the potentially existing linear association between the LiDAR-derived variables employed in each model ( Figure 6) [93]. The results are presented in detail in the following Sections 4.2 and 4.3. Figure 6. Pearson's correlation matrix between predictor variables derived from multispectral LiDAR data.

Allometric Equations
Considering that the number of samples (i.e., 32) collected on the field through the destructive sampling method was rather limited, we developed two allometric equations for the BSB and TSB estimation. The development of both models was solely based on the use of DBH without the tree height due to the possible inaccuracy of height measurements. Table 4 presents the results of the allometric TSB and BSB estimation models. The R 2 and adjR 2 for both models amounted to 0.96; however, the TSB one resulted in higher RSE.

Barkless Stem Biomass
As described in Section 3.3, the BSB of individual trees was predicted using five different regression algorithms (i.e., GLM, RF, SVR, GP, and XGBoost) and the LiDARderived intensity and height-related point cloud metrics (Table 2). Table 5 presents the testing performance of the models. The model fits are illustrated in Figure 7, showing the difference between the regression and the hypothetical 1:1 lines.  In general, all the models achieved a reasonable prediction with R 2 greater than 0.7 and RMSE lower than 195 kg, except for the GP with the linear kernel. More specifically, the developed GLM with three predictors achieved the highest performance in terms of MAE and RMSE, although the highest R 2 was obtained by the RF. The RF provided better results compared to the other algorithms (i.e., LR, GP, SVR, XGBoost), with a MAE of 152.65 kg, −5.08% rbias, RMSE of 175.76 kg, and R 2 of 0.78. The application of the GP algorithm with the rbf kernel led to the lowest MAE (162.23 kg) and RMSE (184.24 kg), while R 2 was higher (i.e., R 2 = 0.75) compared to the linear kernel. Both the SVR and the GLM resulted in similar estimation performance, as evidenced by the R 2 (0.73 and 0.72, respectively). The XGBoost model gave slightly lower results in terms of R 2 (0.69) compared to the RF, GP and GLM algorithms, with the second-highest RMSE (195.25 kg), an MAE of 157.29 kg, and an rbias of −5.08%. Additionally, Table 6 presents the best-and worst-performing linear models, according to the calculated statistical evaluation measures (i.e., MAE, MSE, rbias, RMSE and R 2 ), and the correlation between the variables was examined, showing low correlation (<0.5) between selected predictors (Figure 7). Table 6. The best-and worst-performing GLMs selected for the barkless stem biomass (BSB) estimations, using the exhaustive screening variable selection (described in Section 3.3.1). Regarding the relative predictors' importance for the BSB estimation (Figure 8), the GP (both examined kernels) and the SVM models indicated the Hmax as the most important variable, followed by the p99, p95 and p90. The same height percentiles were selected by the RF algorithm, which proved to be the best-performing algorithm in terms of R 2 , followed by the int_std of the NIR channel. The p90 was considered as the most important variable for the XGBoost model as well. On the contrary to the ML algorithms, the int_std of the Green channel was identified as the most important variable presenting the highest frequency of occurrence, alongside the Hmax, p99 and p95 (Table 7). Overall, regarding the ML models, it is evident that the significance of the height-derived LiDAR metrics is considered higher compared to that of the intensity metrics (i.e., int_std_nir), as opposed to the GLM, where the intensity metrics (i.e., int_std_green and int_ske_green) are among the most valuable for the BSB estimation.

Total Stem Biomass
The performance of the five different regression algorithms for TSB estimation is presented in this section (Table 8). Additionally, the models' fit is illustrated in Figure 9, showing a comparison between the regression and the hypothetical 1:1 line. Table 9 presents the best-and worst-performing linear regression models for TSB estimation.   Table 9. The best-and worst-performing GLMs selected for the total stem biomass (TSB) estimations, using the exhaustive screening variable selection (described in Section 3.3.1). The TSB models showed a decrease in R 2 and significantly higher RMSE and rbias metrics in comparison with the BSB models. More specifically, the RF model resulted in the highest R 2 (0.65) among the other algorithms, but with low MAE (152.65 kg) and RMSE (175.76 kg). The GP with the rbf kernel achieved marginally better R 2 over the linear kernel (0.62 and 0.6, respectively), although the latter resulted in lower rbias (−2.02%), RMSE (164.03 kg) and MAE (219.18 kg). The SVR model provided an R 2 of 0.57, with a MAE of 166.84 kg and high rbias (−6.15%). The GLM with three predictors achieved low performance in terms of RMSE and R 2 , although, it showed the lowest rbias (−1.12%). In addition, the best-performing linear model employed high correlated variables (i.e., Havg and p70), which is evidenced in Table 9. Finally, the XGBoost algorithm achieved the lowest R 2 (0.31) and the highest RMSE (290.89) compared to the rest of the employed algorithms.

Model
The relative predictor importance of the GLM and ML models for TSB is presented in Table 10 and Figure 10, respectively. In terms of the GLM, the int_std of the Green channel, followed by the p70 and b50 were identified as the most consistent variables for the TSB estimation (Table 10). In all cases, the Hstd was identified as the most important variable. More specifically, in the GP (both examined kernels) and SVR models, the significance of the Hstd was followed by the Hmax. In the RF model case, the Hstd, the p80, and int_ske of both the Green and NIR channels are the four selected most important variables. In the XGBoost model, the Hstd variable was followed by the p80, p30 and int_ske of the NIR channel in terms of importance. It can be observed that, contrary to the BSB model predictors' importance, the LiDAR-derived intensity metrics from both channels are among the most significant variables, especially in the GLM and the RF model.

Discussion
The present study focuses on the potential of multispectral LiDAR-derived variables alongside with linear and ML regression algorithms for single-tree stem biomass estimation (i.e., TSB and BSB) in an uneven-aged structured forest. In the absence of species-specific allometric equations for biomass estimation, destructive sampling was performed for the construction of two allometric equations (i.e., for TSB and BSB estimation), so that the reference dataset could be enriched without performing additional destructive sampling. Thus, DBH was measured in the field and used as input data in the developed allometric equations for the generation of additional reference TSB and BSB estimates. All reference data were subsequently employed for the tree-level biomass estimation using five different regression algorithms (i.e., GLM, RF, SVR, GP, XGBoost). Both intensity and height metrics were extracted from the LiDAR point cloud and employed as potential independent variables in the predictive models. According to [92], the intensity-height combined AGB models provided higher R 2 compared to the ones using only the height information, especially in a study area like the Pertouli University Forest, which is characterized by complex structure, high canopy cover and intense topography [106]. Overall, the results demonstrate that BSB can be reliably estimated using the GLM and the ML algorithms (i.e., RF, SVR, GP and XGBoost) in an uneven-aged structured coniferous forest, while TSB estimations are more reliable using the RF, SVR and GP (with both kernels), with the RF providing the best results in both cases (i.e., BSB and TSB).
In the case of the GLM application, a maximum number of three independent variables (i.e., LiDAR-derived height and intensity metrics) was incorporated in the predictive models in order to avoid potential overfitting. In this study, an exhaustive screening variable selection method and the Small-Sample Corrected Akaike Information Criterion were employed for optimal linear model selection, taking into consideration the MAE, MSE, and rbias statistics. The estimation accuracy of the developed GLM for BSB estimation was similar to the ML models, resulting in the best prediction performance in terms of MAE, MSE, rbias, RMSE and R 2 (i.e., MAE = 121.71 kg, MSE = 35585.08 kg, rbias = 1.89%, RMSE = 167.09 kg, R 2 = 0.72). However, the GLM severely underperformed (MAE = 174.55 kg, MSE = 45549.33 kg and R 2 = 0.44) in the case of the TSB estimation compared to the rest of the examined algorithms (i.e., RF, GP and SVR). Even though the Green channel presents lower sensitivity in canopy returns registration, due to its large beam divergence [142], the int_std was the most frequently selected variable in both BSB and TSB linear models (Tables 6 and 9). It is noteworthy that the int_std of the Green channel is the variable selected in the best performing linear models, which indicates the significance of the LiDAR intensity information for stem biomass estimation. In general, height metrics were selected in every GLM but, contrary to the intensity metrics, the best performing BSB model consisted of the least frequent height-related variables (i.e., p10 and p70). These two variables are highly correlated, with a Pearson's correlation coefficient greater than 0.5. Moreover, in the TSB GLM, both the p70 and the b50 were rated as the second and third most frequently selected variables in the best-and worst-performing regression models, respectively (Table 10). A comparison of the BSB and the TSB GLM accuracy reveals that the former performs significantly better. The low predictive performance of the GLM in the TSB estimations can be attributed to the fact that bark volume and biomass variation do not only depend on the tree species but also on the growth rate, the environmental conditions and the genetic constitutions of trees [137]. In addition, allometric equations can introduce uncertainty in the estimated biomass results mainly due to measurement errors, transect size, the fraction of the AGB and the characteristics of the study area [143]. As a result, although the linear models performed well in the BSB predictions, they proved to be inadequate for TSB estimations (i.e., MAE = 173.92 kg, and R 2 = 0.32), and new bark biomass ones need to be developed with the use of a large sample size (e.g., sample size >100).
On the contrary, ML algorithms can reliably predict tree attributes in complex structured forests, which is mostly credited to their better generalization ability compared to the GLM [144]. They also provide additional valuable information for the variables, such as the feature importance, which gives the possibility to exclude the insignificant variables with minor contribution to the predictions. However, one of the main limitations of ML algorithms is that they are considered as a black box to users, since it is not feasible to examine the internal iterations and perform detailed interpretations of all model components [74,145].
In the present work, all the ML models had similar performances in the BSB estimations, with the RF presenting the highest values of the statistical evaluation measures (i.e., MAE, RMSE, and R 2 ). As opposed to the GLM, all the ML models identified only height-related LiDAR-derived metrics as the most important features for the prediction of BSB, such as p99, p90, p95, and Hmax. With respect to TSB estimations, the RF algorithm resulted in an R 2 of 0.65 with among the lowest MAE and RMSE (i.e., MAE = 152.65 kg and RMSE = 175.76 kg), and identified the Hstd, the p80 and int_ske of both the Green and NIR channels as the most important variables of the regression model. On the contrary, the XGBoost algorithm was found to be prone to overfitting due to the small sample size (i.e., MAE = 236.67 kg, RMSE = 290.89 kg and R 2 = 0.31) [146]. According to the literature, the RF algorithm has a significant advantage over the other ML algorithms in terms of its accuracy and its ability to perform with small sample sizes [131]. The RF and GLM-related findings of this work are similar to other studies (e.g., [44,77,146]), although none of them have been conducted in an uneven-aged structured forest.
Although individual tree approaches are becoming increasingly popular due to the rapid development of LiDAR sensors providing high-density point clouds, there is a restricted number of papers for individual tree stem biomass estimation. Consequently, we compared our results with similar reports that exploit either approach, for stem biomass or volume estimation. Despite the complex terrain, the multi-layered forest structure and the denser LiDAR point cloud of [77], we achieved better results in terms of R 2 in all of the tested algorithms (i.e., RF, SVR, XGBoost and GLM). In 2020, [147] tested the RF, SVR, and GLM in plantations, resulting in a higher performance compared to ours in terms of R 2 , but much higher rbias values as well. In addition, [60] used linear models with logarithmic transformations for stem volume estimation in a boreal forest, using primarily height metrics and reported superior results compared to the ones of the present study. This variation in the results can be attributed to the significant difference between the study sites, as the one in our work is a naturally regenerated uneven-aged forest with complex topography, contrary to semi-natural regenerated and hilly forest in Finland.
As for the tree segmentation process, its accuracy was of utmost importance in this study for the subsequent reliable stem biomass estimation since the LiDAR metrics (i.e., potential independent variables for the regression models) are calculated for each individually segmented tree. According to [148], tree segmentation and matching in multi-layered forests is an arduous process resulting in the lowest matching accuracy among the different forest types. More specifically, trees in the lower layers of the forest are difficult to identify and the ability to segment those trees is contextual to the penetration capabilities and pulse characteristics of the LiDAR sensor. Moreover, although multispectral LiDAR sensors provide valuable information on the vertical distribution of physiological processes [81], the higher the degree of canopy overlapping, the lower the capability of tree detection algorithms to accurately identify them [149]. Consequently, only trees able to be detected in the LiDAR point cloud (e.g., dominant, co-dominant) were sampled, aiming to achieve the highest matching rate possible. The destructively sampled trees were all identified in the point cloud, while only 63% of the non-destructively sampled ones were correctly identified. Finally, tree segmentation errors caused by complex forest structure, intense topography, and detection algorithm inaccuracies, could be eliminated by employing a single-tree crown normalization process, based on the watershed segmentation algorithm, and segmenting the point cloud into single trees without prior treetop detection. As a result, the tree height was defined as the highest original point (in the height-unnormalized point cloud), overcoming deformations in the tree crown due to the normalization [120].
Although this research reached its aim, there are some unavoidable limitations. First of all, the sample size is considered small for such a complex structured forest, which has potentially caused additional uncertainty in the predictions [106]. Furthermore, due to legal restrictions, it was not possible to destructively sample a large number of small trees, which could provide better estimates. In addition, this study primarily focused on the trees that could be detected by the LiDAR sensor (e.g., dominant, co-dominant), excluding all the understory trees, which have a significant role in the total forest biomass and dynamics. As a result, the methodology and models can be transferred in similar biomes with the assumption of estimating the stem biomass mainly of the dominant and co-dominant trees. Nevertheless, the segmentation and identification of the understory trees in multi-layered forests with complex terrain is a laborious task due to the overlapping projections of the overstory and the understory tree, which should be studied further.
Overall, the results of our work demonstrated the ability of LiDAR data to estimate individual-tree stem biomass in a multilayered forest, which is critical for forest inventory management and could potentially replace the laborious field measurements required in traditional stem biomass estimation methods. This study also highlights the importance of the pulse intensity metrics in stem biomass estimation, apart from the standard heightderived ones. Besides, this study is the first contribution in the context of LiDAR remote sensing for biomass estimation in an uneven-aged structured Abies borissi-regis forest with complex topography, and one of the few studies that use primarily LiDAR data and destructive sampling for single-tree stem biomass estimation.

Conclusions
In this study, we investigated the retrieval of stem biomass estimates at the single-tree level, using multispectral LiDAR data in a complex structured forest. In particular, the BSB and TSB were predicted through linear regression and ML regression algorithms. The results demonstrated the capability of the multispectral LiDAR data to provide reliable stem biomass estimates in a dense, uneven-aged structured forest with rough topography. The main conclusions are listed below: • BSB can be adequately estimated from all the tested algorithms, contrary to the TSB where only a portion of the selected algorithms (i.e., RF, GP and SVR) provided accurate estimations.

•
All the algorithms provided more accurate BSB estimations compared to the TSB ones.

•
The RF algorithm resulted in the most precise BSB and TSB estimates in terms of R 2 MAE and RMSE, compared to the rest of the examined algorithms.

•
The intensity metrics significantly contribute to accurate BSB and TSB estimation, being among the most important variables in the best performing algorithms.

•
The ML regression algorithms outperformed the GLM in the TSB estimation, which can be credited to the better generalization capacity and ability of the former to model complex relationships.
It is evident that LiDAR data can significantly contribute to the development and management of complex structured forest resources, providing valuable information about the AGB to different spatial extents. This research aids in improving the stem biomass estimates in uneven-aged structured forests with complex topography, where field measurements can be extremely laborious and costly. The promising results of this study will encourage further research on stem biomass estimation using primarily LiDAR data in complex forests with endemic species, significant for effective forest biomass and carbon management.
Given the limitations of this study described in Section 5, it would be interesting to examine the potential of LiDAR in stem biomass estimation with an increased number of sample trees, in order to validate the results of the present study. In addition, the integration of terrestrial and airborne LiDAR data for stem biomass estimations in complex structured forests could also be investigated, providing additional information about the DBH, tree height and the understory trees. Finally, further research could focus on the contribution of Unmanned Areal Vehicle (UAV) LiDAR systems in tree detection and stem biomass estimation over natural forests.

Conflicts of Interest:
The authors declare no conflict of interest.