remote sensing Estimation of Individual Tree Biomass in Natural Secondary Forests Based on ALS Data and WorldView-3 Imagery

: Individual-tree aboveground biomass (AGB) estimation can highlight the spatial distribution of AGB and is vital for precision forestry. Accurately estimating individual tree AGB is a requisite for accurate forest carbon stock assessment of natural secondary forests (NSFs). In this study, we investigated the performance of three machine learning and three ensemble learning algorithms in tree species classiﬁcation based on airborne laser scanning (ALS) and WorldView-3 imagery, inversed the diameter at breast height (DBH) using an optimal tree height curve model, and mapped individual tree AGB for a site in northeast China using additive biomass equations, tree species, and inversed DBH. The results showed that the combination of ALS and WorldView-3 performed better than either single data source in tree species classiﬁcation, and ensemble learning algorithms outperformed machine learning algorithms (except CNN). Seven tree species had satisfactory accuracy of individual tree AGB estimation, with R 2 values ranging from 0.68 to 0.85 and RMSE ranging from 7.47 kg to 36.83 kg. The average individual tree AGB was 125.32 kg and the forest AGB was 113.58 Mg/ha in the Maoershan study site in Heilongjiang Province, China. This study provides a way to classify tree species and estimate individual tree AGB of NSFs based on ALS data and WorldView-3 imagery. AGB map with 0.25 m spatial resolution based on the species, and additive biomass The average individual tree AGB in the and the forest This study took advantage of accurate biomass equations, multiple remotely sensed datasets, and ensemble learning algorithms for the inversion of individual-tree parameters (i.e., DBH and AGB) that could be applied at local and regional scales. The effects of features and algorithms for the tree species classiﬁcation of NSFs were deeply investigated in this study. This study provided spatially detailed AGB information at the individual tree level and demonstrated the reliability of individual tree level AGB estimation using multi-source remote sensing data. The individual tree-based AGB distribution map is a critical basis for quantifying error propagation of AGB estimation from the individual tree level to the plot level and to the landscape level. In addition, employing better quality data (e.g., hyperspectral images and UAV-LiDAR), developing advanced algorithms, and quantifying error sources are still fundamental and vital for AGB estimation in the future.


Introduction
Forests are essential resources and require sustainable management approaches to monitor forest structures and understand the impacts of global climate change on terrestrial ecosystems. Forest biomass is defined as the total dry weight of organic material, both aboveground and belowground, in forests. Aboveground biomass (AGB) is the sum of the dry weight of stems, branches, and foliage above the ground surface [1] and plays a vital role in carbon stock assessment and the carbon cycle [2][3][4]. Due to anthropogenic deforestation, the biomass of natural secondary forests (NSFs) in northeastern China has changed considerably in the past decades [5]. The accurate estimation of forest biomass has substantial significance in understanding the ecological and global changes [6].
There are three main methods of estimating forest biomass: physiological model-based simulation [7], measurements from traditional field survey data [8], and inversion from remote sensing datasets [9,10]. Physiological model-based simulation methods usually estimate forest biomass at local or regional scales and rely on various input variables (e.g., radiation, climate conditions, and altitude) [11]. Forest resource survey methods (e.g., direct destructive harvest measurement or indirect allometric equation models) are the primary sources of forest AGB measurement and provide accurate information for the dynamic management and planning of forest resources [12]. National Forest Inventory (NFI) programs often rely upon the plot-based field inventory and monitoring of forest and extreme gradient boosting (XGboost) [44]. However, the stacking algorithm in the ensemble algorithm is rarely used for individual tree species classification.
Despite the potential for improved biomass estimation at the individual tree level, few published studies estimate individual tree AGB of NSFs in China using multi-source remote sensing data. Compared to an area-based approach, individual tree AGB estimation based on tree species classification can provide spatially explicit AGB distributions and a clear understanding of the AGB distribution of each tree species. It is crucial to consider the value of combining traditional harvesting methods with remote sensing technology to estimate AGB for individual trees, which not only takes advantage of the high accuracy of allometric growth equations, but also has the potential to accurately map the spatial distribution of biomass based on individual trees. This synergistic approach produces a large amount of data and detail expression, which paves the way for precision forestry and is also suitable for management decision-making in small or specific areas [33].
Although researchers have incorporated tree species and allometric growth equations to estimate the AGB of individual trees for specific environments [33], it has not been fully investigated in NSFs of China, especially in terms of estimating and mapping individual tree scale AGB. Since NSFs have complex structures, they pose difficulties for individual tree parameter estimation (e.g., DBH) and tree species classification. This research aims to determine if ensemble learning algorithms can improve the accuracy of tree species classification of NSFs compared to classic machine learning algorithms. In addition, we will investigate whether the integration of tree species with individual parameters (e.g., DBH) will support the fine-scale mapping of AGB in NSFs. The objectives of this study are to: (1) investigate the effects of different features in tree species classification based on Airborne Laser Scanning (ALS) data and WorldView-3 imagery, (2) explore the performance of ensemble learning algorithms in classifying tree species of NSFs, (3) estimate AGB for individual trees and evaluate the estimation accuracy, and (4) map the spatial distribution of AGB at the individual tree scale. An individual tree AGB map with sufficient resolution could highlight spatial details and provide reliable data to support subsequent research and forest management decision-making for the NSFs of China.

Overview of the Proposed Methodology
We implemented a five-step methodology to investigate the effect of ensemble learning algorithms and features from different data sources on tree species classification and to evaluate the accuracy of individual tree AGB estimation of various tree species: (1) data preprocessing, (2) individual tree delineation using a region-hierarchical cross-sectional analysis (RHCSA) algorithm, (3) feature extraction and selection, (4) tree species classification using machine and ensemble learning models, and (5) individual tree AGB estimation and mapping. A flowchart summarizing these five steps is provided in Figure 1.

Study Area
The 3282 ha study area is located in the Maoershan Experimental Forest Farm, Shangzhi City, Heilongjiang Province, China (see Figure 2), which ranges from 127 •

Study Area
The 3282 ha study area is located in the Maoershan Experimental Forest Farm, Shangzhi City, Heilongjiang Province, China (see Figure 2), which ranges from 127°29′E to 127°44′E and from 45°14′N to 45°29′N. The terrain is mountainous, rising from south to north with an average elevation of about 300 m. The study area has a temperate continental monsoon climate, with an average annual temperature and annual precipitation of 2.4 °C and 700 mm, respectively. Maoershan is a typical natural secondary forest in northeastern China and is dominated by various broad-leaved trees, such as Manchurian ash

Field Inventory Data
This study used inventory data of 12 sample plots dominated by the main tree species of the study area obtained in 2019. Nine sample plots were 0.09 ha (30 m × 30 m) and three sample plots were 1 ha (100 m × 100 m). In total, the parameters of 4058 trees were measured, including DBH (cm), tree height (m), and crown width (CW) (m). The tree species,  This study used inventory data of 12 sample plots dominated by the main tree species of the study area obtained in 2019. Nine sample plots were 0.09 ha (30 m × 30 m) and three sample plots were 1 ha (100 m × 100 m). In total, the parameters of 4058 trees were measured, including DBH (cm), tree height (m), and crown width (CW) (m). The tree species, location, and health of trees were recorded. DBH was measured using a perimeter ruler with a starting diameter of 5 cm. Tree height was measured using the Vertex IV ultrasound instrument system, and the location of each tree was recorded using a real-time kinematic global positioning system receiver with positional error estimated to be within 5 cm. Descriptive statistics for the main tree species in the 12 sample plots are listed in Table 1.

ALS Data and Preprocessing
The ALS data used in the study were acquired on September 2015 using the LiCHy airborne observation system designed by the Chinese Academy of Forestry (CAF) and mounted on a fixed-wing aircraft. The system was equipped with a LiDAR sensor (RIEGL LMS-Q680i) with data collected at an average flight altitude of 1200 m above ground level and an average speed of 65 m/s. This system generated laser beams with a wavelength of 1550 nm wavelength, a beam divergence of 0.5 mrad, a laser pulse duration of 3 ns, a laser pulse repetition rate of 200 kHz, and a scan width of 60 • perpendicular to the flight direction. The average point density within each plot was 8.1 pts/m 2 (min = 6.3 pts/m 2 , max = 11.0 pts/m 2 , and σ = 1.5 pts/m 2 ).
The ALS data preprocessing included: (1) noise elimination, (2) classification of ground and non-ground points, (3) canopy height model (CHM) generation, and (4) CHM filling and smoothing. Noise elimination removed the effect of noise points such as air points, low points, and isolated points. Point cloud filtering classified the raw LiDAR points cloud into non-ground and ground points. We used kriging to generate a digital elevation model (DEM) and digital terrain model (DTM) from the LiDAR ground points and nonground points, respectively, and then generated a CHM with 0.25 m × 0.25 m based on the difference between the DTM and the DEM. This ALS data processing was implemented using GreenValley International's LiDAR 360 V5.1. To fill the CHM, we applied a semiautomatic hole-filling algorithm [45]. To mitigate potential invalid values and outliers, the CHM was smoothed using a Gaussian filter with a window size of 3.

WorldView-3 Imagery and Preprocessing
WorldView-3 imagery data were obtained on 9 September 2017 from a fourth-generation Digital Globe satellite. The image had cloud cover of 0.8%, a sun elevation angle of 49.0 • , and a sun azimuth angle of 163.1 • . It comprised four multispectral bands (red, green, blue, Remote Sens. 2022, 14, 271 6 of 23 and near-infrared) with 2 m spatial resolution and one panchromatic band with 0.5 m spatial resolution.
The preprocessing of the WorldView-3 imagery, including radiometric calibration and the atmospheric correction, was implemented using ENVI 5.3 software. The Fast Line-ofsight Atmospheric Analysis of Spectral Hypercube (FLAASH) radiative transfer model was implemented for atmospheric correction and conversion to surface reflectance in the ENVI environment. After preprocessing the WorldView-3 imagery, histogram equalization was applied to adjust image brightness values. Panchromatic bands and multispectral were fused to improve the nominal spatial resolution of the multispectral imagery. The fused imagery was matched to the CHM using the relative geographic referencing with a matching error of less than 1 pixel (0.5 m).

Individual Tree Crown Delineation
This study applied the region-hierarchical cross-sectional analysis (RHCSA) algorithm to segment individual trees. This approach is less affected by forest type or crown size, and has adaptability to the typical NSFs in northeastern China [46]. Similar to other ITCD algorithms (e.g., watershed) [47] and level sets [48], the RHCSA algorithm considers tree canopies as a continuous 3-D topographic surface where a tree crown manifests as a mountain-like uplift on the CHM. The vertices of the tree are equivalent to mountaintops, with height decreasing continuously from the treetop to the boundary of the crown. The CHM was scanned horizontally from top to bottom with a fixed height threshold, and multiple cross-sections were generated for each layer. We determined whether each crosssection appeared for the first time or was produced by multiple intersecting tree crowns. Cross-sections that were generated by multiple trees were divided from high to low. This process was repeated to find the lowest heights of the CHM. The details of the RHCSA were described in [46].
The ITCD accuracy assessment considers correspondence between delineated crowns and reference crowns on a location-by-location basis [35]. From the two perspectives of detected crowns and reference crowns, crown matching considers both the overlapped crown area and the position of treetops. From the reference crown perspective, accuracy metrics reflect how well the reference crowns are detected and delineated, while the detected crown perspective reflects whether detected crowns are represented by reference crowns [46]. To best delineate the reference crowns, rather than the visual interpretation method used in many prior studies, reference crowns were generated in ArcGIS 10.2 (Esri, Redlands, CA, USA) based on the crown sizes and the locations recorded in the field. The individual tree crown matching from the detected crown perspective and the reference crown perspective was statistically evaluated. Producer's accuracy (PA), user's accuracy (UA), and overall accuracy (OA) metrics were also used to assess the detection accuracy: where OA is overall accuracy, N pm and N pnm represent the number of 1:1 matches and near-matches from the reference crown perspective, respectively, N Ref represents the total number of reference crowns, N um and N unm represent the number of 1:1 matches and near-matches for the detected crown perspective, respectively, and N Det represents the total number of detected crowns. The matching relationship between the reference crowns and the detected crowns is shown in [35].

Tree Species Classification
We used the LiDAR-based individual tree crown segmentation results (1:1 matches and near-matches from both the reference and detected crown perspectives) as classification objects to conduct an object-based classification. Then, we extracted spectral features (8), texture features (28), and vegetation indices (8) from the WorldView-3 data, and height features (92), intensity features (92), and canopy features (3) from the ALS point cloud. To investigate the effects of different algorithms on individual tree species classification, three machine learning algorithms and three ensemble learning algorithms were applied and compared using the three feature sets (WorldView-3 features, ALS features, and WorldView-3 + ALS features). The results of individual-tree species classification provided the basis for subsequent individual-tree AGB estimation.

Classification System and Sample Selections
We classified the main tree species identified in the Fifth Forest Resources Inventory for the Maoershan Experimental Forest Farm for the 12 plots measured in the field. In the classification system, we accounted for sample size and the separability of tree species categories. We considered the major tree species (e.g., white birch, Changbai larch, and Korean pine), which are widely distributed, and grouped minor tree species (e.g., Miyabe maple, and lilac) together.
Based on stratified sampling, a representative and evenly distributed sample was selected using tree species as strata. This classification system consisted of seven categories with sample sizes as follows: white birch (280), Manchurian walnut (247), Manchurian ash (152), elm (314), Korean pine (352), Changbai larch (171), and others (205), with the latter including all other tree species sampled. A 10-fold cross-validation method was used to test the classification accuracy. The region of interest (ROI) was manually sketched based on the location of the individual tree and crown width in ArcGIS 10.2.

Feature Extraction and Selection
A total of 44 feature variables were extracted from WorldView-3 data: eight spectral features, 28 texture features, and eight vegetation indices. We extracted eight feature variables from the mean and standard deviation of the grayscale values of four bands (red, green, blue, and near-infrared) using 3 × 3 windows. As a higher-order texture feature, the Gray-Level Co-occurrence Matrix (GLCM) [49] reflects the frequency of different combinations of gray levels at various distances and directions on the image. We extracted 28 GLCM texture features, including the mean, variance, homogeneity, second moment, correlation, entropy, and contrast, for the four image bands (red, green, blue, and near-infrared). Additionally, eight vegetation indices were extracted, including the normalized vegetation index (NDVI), the difference vegetation index (DVI), the perpendicular vegetation index (PVI), the triangular vegetation index (TVI), the modified simple ratio vegetation index (MSR), the atmospherically resistant vegetation index (ARVI), the ratio vegetation index (RVI), and the enhanced vegetation index (EVI) [50].
To eliminate the effects of low shrubs, values above 2 m in the CHM were defined as vegetation hit points [51]. LiDAR height metrics based on the distribution of all returns, such as the mean, standard deviation, minimum, maximum, variance, coefficient of variation, bias, kurtosis, and accumulated height quantile, as well as the height percentile with 5% intervals, were calculated from the ALS data (46 features). Given the value of intensity and echo width distribution for tree species classification in prior studies [52], we also calculated intensity metrics of all returns such as the mean, standard deviation, minimum, maximum, variance, coefficient of variation, bias, kurtosis, accumulated intensity quantile, and intensity percentile with 5% intervals (46 features). In addition, we classified the first or single returns and calculated all features of the first-or-single returns [53]. Three more canopy metrics were calculated, including the canopy cover [54], leaf area index (LAI) [55], and gap fraction [55]. With this combination of metrics, a total of 187 LiDAR features, Previous studies have found that feature selection algorithms are effective for avoiding the dimensional curse [56] or the Hughes effect [57] and improving classification performance [40]. RF filters have achieved good results for feature selection to reduce data redundancy and require two parameters: the number of decision trees (n_estimators) and random_state. We set n_estimators to 1000 and random_state to 10 using python 3.7 and machine learning libraries. We used the Mean Decrease Accuracy (MDA) to assess variable importance. MDA considers the impact of removing individual variables on RF accuracy, i.e., the variable with a more significant decrease in accuracy contributes more to the classification than other variables.

Classification Algorithms
To test the performance of tree species classification with different features and algorithms, ensemble learning algorithms (i.e., Boosting, Bagging, and Stacking), and machine learning algorithms (SVM, KNN, and Convolutional Neural Networks (CNN)) were implemented based on three sets of features (WorldView-3 features, ALS features, and WorldView-3 + ALS features). We executed these algorithms using the machine learning package (Sklearn) and the deep learning package (Tensorflow) in python 3.7.
• SVM: SVM is a generalized linear classifier that performs binary classification of data in a supervised learning manner, where the decision boundary is the hyperplane of maximum margins solved for the learned samples [58]. SVM can perform nonlinear classification by a kernel method; the parameters of this study were set as kernel = 'linear'.
The KNN method is a multivariate nonparametric algorithm that uses a set of predictor feature variables (X) to match each target pixel to a number (k) of the most similar nearest neighbor reference pixels for which values of response variables (Y) are known [59]. This study set the number of nearest neighbors to 5 with uniform weight. • CNN: CNN, first developed in 1995 for the classification of handwritten images [60], is a representative deep learning algorithm. CNN interprets spatial data by scanning with a series of trainable moving windows and has the capability of representation learning in a translation-invariant manner according to its hierarchical structure. In this study, the CNN had a simple structure with an input layer, two hidden layers, and an output layer, and was implemented using an epoch of 1000 and a batch size of 60. • Boosting: The idea of the boosting algorithm is that for a complex task, the result of multiple learners' judgment will be better than that of a single learner. Representative boosting algorithms include adaptive boosting, the gradient boosting decision tree (GBDT), and XGBoost. XGBoost, which is an improvement of the GBDT algorithm [61], was used in this study. The main characteristics of this algorithm are (1) prevention of overfitting by regularization terms, a (2) loss function with first-order derivative and second-order derivatives, and (3) faster-running speed. The classification function was set to "multi: softmax", the depth of the tree was set to 5, and the learning rate was 0.5.

•
Bagging: The RF classifier is a bagging approach that combines multiple decision trees [62]. RF has excellent reported classification performance, requires little human intervention, has fast computational speed, is not predisposed to overfitting, and is robust in dealing with noisy data. The parameters of this study were set as follows: the number of trees was 1000 and the random state was set to 10. • Stacked generalization (SG): Stacking or SG differs from bagging and boosting in two ways. First, stacking usually considers heterogeneous learners (combining different learning algorithms), while bagging and boosting mainly consider homogeneous learners. Second, stacking combines base models with meta models, while bagging and boosting combine weak learners based on deterministic algorithms. Stacking is an ensemble framework for two-layer models. The first layer (base model) consists of several base models, using the original data as model input data to obtain the prediction structure; the second layer (meta model) uses the prediction results of the first layer model as input data for retraining, which constitutes the complete stacking model [50]. We adopted SVM, KNN, and CNN as the base models, and then retrained the models using their predicted values as the input values of the meta model.

Accuracy Assessment
The user accuracy (UA), producer accuracy (PA), and overall accuracy (OA) were used to assess the classification accuracy. The equations were shown as follows: where N represents the total number of samples; N ii is the number of samples assigned to the correct category; N i+ and N +i are the number of samples predicted to be in category i and the number of samples in category i, respectively.
2.6. Estimation of Individual Tree AGB 2.6.1. Individual-Tree DBH Inversion To estimate individual-tree AGB, it is necessary to produce an inverse of DBH based on tree height. In temperate coniferous and broad-leaved mixed forests, commonly used tree height curve models are roughly divided into the nine types shown in Table 2. Based on field inventory data, tree height curve models of seven tree species types were analyzed using R. Then, the accuracy of the nine models was compared, and the best model was selected to inverse DBH. Subsequently, we utilized the individual detected trees to verify the accuracy of the DBH inversion. Table 2. Tree height curve models were compared in this study.

Number
Model Expression Power function H = aD b 4 Schumacher Schumacher Richard H and D are Height and DBH, respectively; a, b, and c are model parameters.
The coefficient of determination (R 2 ) and RMSE were used to assess the goodness-of-fit of models with calculation formulas as follows: where y i represents the measured DBH; y represents the mean of the measured DBH;ŷ i represents the predicted DBH; n represents the number of samples.

AGB Estimation for Individual Trees
Based on the inversion of the optimal tree height curve model for each tree species, an individual-tree DBH map was obtained. Then, we estimated the biomass of the stems, branches, and foliage parts of individual trees using species-specific additive biomass equations [63,64] for northeastern China (Table 3). Finally, individual tree AGB was calculated based on the sum of the stems, branches, and foliage biomass of different tree species in the 12 sample plots. The additive biomass equations follow the structure below: where W a , W s , W b , and W f represent aboveground biomass, stem biomass, branch biomass, and foliage biomass (kg), respectively; D represents diameter at breast height (cm); ln is the natural logarithm; i = [s, b, f], representing stem, branch, and foliage, respectively; ε i is the error term; a i , b i , a i *, and b i * are regression coefficients, which are listed in Table 3 for different tree species and components. Note: The AGB of "Others" categories is the average AGB of other minor tree species. The field estimated AGB was also calculated using this method.

Individual Tree Crown Delineation
Accurately detecting the location and number of trees is critical for estimating AGB at the individual tree level. Since RHCSA was based on CHM for individual tree crown delineation, the resolution of the CHM had a significant impact on the results. Accordingly, this study investigated the effect of the spatial resolution of CHM on the individual tree detection rate and the overall accuracy of the RHCSA algorithm, with the goal of determining the optimal resolution where the numbers of detected individual trees best matched the counts in the field and the overall accuracy was highest. We found (Figure 3a) that when the CHM resolution was 0.25 m, the detection rate was close to 100%, which means that the total number of detected tree crowns using the RHCSA algorithm was closest to the actual numbers of individual trees in the three sample plots. Figure 3b shows that the OA of individual tree matching initially improved with increasing CHM resolution (0.5 m-0.25 m), but OA began to gradually decline with further increases in resolution (0.25 m-0.1 m). Thus, the 0.25 m spatial resolution CHM was adopted for the RHCSA algorithm in this study. 3a) that when the CHM resolution was 0.25 m, the detection rate was close to 100%, which means that the total number of detected tree crowns using the RHCSA algorithm was closest to the actual numbers of individual trees in the three sample plots. Figure 3b shows that the OA of individual tree matching initially improved with increasing CHM resolution (0.5 m-0.25 m), but OA began to gradually decline with further increases in resolution (0.25 m-0.1 m). Thus, the 0.25 m spatial resolution CHM was adopted for the RHCSA algorithm in this study.  We applied the RHCSA algorithm to detect individual trees in the 0.25 m CHM and detected 2.9 million individual trees in the study area. We used the three 100 m × 100 m sample plots (2688 trees in total) to assess the accuracy of individual tree crown delineation. Figure 4 illustrates the individual treetops detected using the RHCSA algorithm (blue lines) with green points representing reference treetops. We applied the RHCSA algorithm to detect individual trees in the 0.25 m CHM and detected 2.9 million individual trees in the study area. We used the three 100 m × 100 m sample plots (2688 trees in total) to assess the accuracy of individual tree crown delinea tion. Figure 4 illustrates the individual treetops detected using the RHCSA algorithm (blue lines) with green points representing reference treetops. Based on the matching conditions listed in [35], PA, UA, and OA were calculated and the accuracy assessment results for the three plots are shown in Table 4. The field inven tories of the three sample plots measured 2688 trees, while the RHCSA algorithm detected 2648 trees with a detection rate of 98.51% and an average match rate (including 1:1 matches and near-matches) of 72.3%. The highest match rate was obtained in Plot 2, pos sibly due to the presence of fewer small trees, which were suppressed by taller trees in other plots. The least accurate match rate was obtained in Plot 3, which may be due to the uneven and very dense distribution of trees in Plot 3. Based on the matching conditions listed in [35], PA, UA, and OA were calculated and the accuracy assessment results for the three plots are shown in Table 4. The field inventories of the three sample plots measured 2688 trees, while the RHCSA algorithm detected 2648 trees with a detection rate of 98.51% and an average match rate (including 1:1 matches and near-matches) of 72.3%. The highest match rate was obtained in Plot 2, possibly due to the presence of fewer small trees, which were suppressed by taller trees in other plots. The least accurate match rate was obtained in Plot 3, which may be due to the uneven and very dense distribution of trees in Plot 3.

Feature Selection
After extracting feature variables from ALS data and WorldView-3, we selected the greatest contributing variables using the RF algorithm and used MDA to rank the variables. In general, the number of features can be determined as the square root of the total number of features. This process led to the selection of 13 ALS feature variables to participate in tree species classification ( Figure 5). Elev_per_95-height 95% quantile; Elev_per_25(first returns)-first-return height 25% quantile; Elev_mean-height mean; LAI-leaf area index; GapFraction-canopy cover; Int_skew-intensity skew; Elev_AIH_20-height 20% cumulative percentile; Int_mean-intensity mean; Int_medianintensity median; Elev_AIH_95-height 95% cumulative percentile; Elev_per_75-height 75% quantile.
To ensure the consistency of the number of WorldView-3 and ALS features and eliminate the influence of the number of features on algorithm comparison, we also selected intensity median; Elev_AIH_95-height 95% cumulative percentile; Elev_per_75-height 75% quantile.
To ensure the consistency of the number of WorldView-3 and ALS features and eliminate the influence of the number of features on algorithm comparison, we also selected 13 WorldView-3 features ( Figure 6).

The Performance of Machine Learning Algorithms in Tree Species Classification
We tested the classification performance for three machine learning methods (SVM, KNN, and CNN) based on three sets of features (WorldView-3 features, ALS features, and WorldView-3 + ALS features).
The WorldView-3 features performed better than the ALS features for tree species classification with the SVM and KNN algorithms when the feature sets were considered independently. The combination of the ALS features and WorldView-3 features outperformed the single feature sets, which indicated the advantage of simultaneously applying LiDAR and passive imagery in tree species classification. Table 5 shows that the CNN classifier had the best performance of the three classifiers regardless of features included, while the SVM classification had the worst performance.  Table 6 shows the performance of the ensemble learning algorithm for tree species classification. Regardless of the features, the accuracy of the ensemble algorithms was slightly higher than that of the machine learning algorithms for tree species classification. The three ensemble algorithms showed some differences under different features: the RF algorithm had the highest classification accuracy (OA = 61.7%) with WorldView-3 features; the SG (CNN) algorithm was the best with ALS features and WorldView-3 + ALS features. The classification performance of the combination of WorldView-3 features and ALS features was superior to a single feature set with the ensemble learning algorithms, except for the RF, where the WorldView-3 features alone achieved similar accuracy. It is worth noting that the CNN algorithm had only slightly lower classification accuracy (OA = 72.8%) than SG (CNN) (OA = 75.0%) with the combination of WorldView-3 features and ALS features, which was higher than the other ensemble learning algorithms.
We selected the best performing algorithm and features (SG (CNN) algorithm and WorldView-3 features + ALS features) and calculated a confusion matrix (Table 7) showing the classification accuracy of each tree species. The PA and UA ranged from 66.9% to 82.1% and from 67.0% to 84.2%, respectively. The highest PA was Manchurian walnut (82.1%) and the lowest PA was elm (66.9%). The highest UA was elm (84.2%) and the lowest UA was Manchurian ash (67.0%). The latter situation was caused by high misclassification between Manchurian ash and elm.

Individual Tree DBH Inversion
We fitted tree height curve models for the seven tree species categories. Table 8 shows the optimal model for each tree species. For Changbai larch (CL), the optimal model was the logistic model, while the other six tree species had a power function model. The R 2 values of the seven models were all above 0.60 and the RMSE was between 1.22 m and 2.29 m. Manchurian walnut (WA), Manchurian ash (AS), and Korean pine (KP) had the best results, while the white birch (WB), elm (EL), and Changbai larch (CL) models were weaker. The model for "Others" had a moderate performance. The accuracy of DBH inversion directly affected the estimation results of individual tree AGB. Figure 7 shows the accuracy of estimating DBH for the seven tree species, with the R 2 values for the estimation of DBH ranging from 0.67 to 0.90, while the RMSE ranged from 1.00 cm to 2.10 cm. Overall, white birch, Manchurian walnut, Manchurian ash, and the "Other" categories had the best estimation results, while elm and the two conifers (Korean pine and Changbai larch) performed poorly (R 2 of approximately 0.6). The weaker performance was likely associated with the suppression of young trees by upper trees, which prevented the LiDAR from accurately detecting tree heights and consequently decreased the accuracy of the DBH inversion. The Korean pine (KP) in the study area were predominantly young trees and the lack of data for larger DBH may have led to fundamental model flaws. The accuracy of DBH inversion directly affected the estimation results of individual tree AGB. Figure 7 shows the accuracy of estimating DBH for the seven tree species, with the R 2 values for the estimation of DBH ranging from 0.67 to 0.90, while the RMSE ranged from 1.00 cm to 2.10 cm. Overall, white birch, Manchurian walnut, Manchurian ash, and the "Other" categories had the best estimation results, while elm and the two conifers (Korean pine and Changbai larch) performed poorly (R² of approximately 0.6). The weaker performance was likely associated with the suppression of young trees by upper trees, which prevented the LiDAR from accurately detecting tree heights and consequently decreased the accuracy of the DBH inversion. The Korean pine (KP) in the study area were predominantly young trees and the lack of data for larger DBH may have led to fundamental model flaws. Figure 7. Accuracy assessment between measured DBH and estimated DBH using ALS data. Shaded areas represent 95% confidence intervals, and solid lines are the 1:1 line. Figure 8 shows scatterplots of the accuracy of individual tree AGB based on the validation samples, with R 2 values ranging from 0.68 to 0.85 and the RMSE ranging from 7.47 kg to 36.83 kg for all tree species. Overall, broad-leaved tree species were generally estimated with better accuracy than coniferous species, with white birch achieving the high- Figure 7. Accuracy assessment between measured DBH and estimated DBH using ALS data. Shaded areas represent 95% confidence intervals, and solid lines are the 1:1 line. Figure 8 shows scatterplots of the accuracy of individual tree AGB based on the validation samples, with R 2 values ranging from 0.68 to 0.85 and the RMSE ranging from 7.47 kg to 36.83 kg for all tree species. Overall, broad-leaved tree species were generally estimated with better accuracy than coniferous species, with white birch achieving the highest AGB prediction accuracy (R 2 = 0.84, RMSE = 11.29 kg). Korean pine had the worst AGB prediction performance (R 2 = 0.68, RMSE = 9.66 kg). As noted previously, the KP plot was dominated by young trees, which led to more significant errors in tree height and DBH estimation due to the shading of taller trees, which ultimately affected the accuracy of the AGB estimation. In practical applications, the CNN algorithm is recommended because it has good accuracy and can improve efficiency by eliminating the integration process of the ensemble algorithm. Therefore, we selected the CNN algorithm and the feature set from the combination of WorldView-3 and ALS for mapping individual-tree AGB in the study area and generated a map with a spatial resolution of 0.25 m showing the distribution of individual tree AGB (Figure 9). The value range of AGB is from 0 to 571 kg, with lower values corresponding to roads, rivers, and villages, and higher values occurring far from roads. The regions of no data (white in Figure 9) were generated by filtering out non-forest areas (including farmland, villages, roads, etc.) using a mask. Note that there are also several voids in this map (see in Figure 2), which correspond to the removal of clouds. In practical applications, the CNN algorithm is recommended because it has good accuracy and can improve efficiency by eliminating the integration process of the ensemble algorithm. Therefore, we selected the CNN algorithm and the feature set from the combination of WorldView-3 and ALS for mapping individual-tree AGB in the study area and generated a map with a spatial resolution of 0.25 m showing the distribution of individual tree AGB (Figure 9). The value range of AGB is from 0 to 571 kg, with lower values corresponding to roads, rivers, and villages, and higher values occurring far from roads. The regions of no data (white in Figure 9) were generated by filtering out non-forest areas (including farmland, villages, roads, etc.) using a mask. Note that there are also several voids in this map (see in Figure 2), which correspond to the removal of clouds. combination of WorldView-3 and ALS for mapping individual-tree AGB in the study area and generated a map with a spatial resolution of 0.25 m showing the distribution of individual tree AGB (Figure 9). The value range of AGB is from 0 to 571 kg, with lower values corresponding to roads, rivers, and villages, and higher values occurring far from roads. The regions of no data (white in Figure 9) were generated by filtering out non-forest areas (including farmland, villages, roads, etc.) using a mask. Note that there are also several voids in this map (see in Figure 2), which correspond to the removal of clouds.

Discussion
In this study, we estimated individual tree AGB of different tree species for NSFs by combining WorldView-3 imagery and ALS data. We generated a fine AGB map with a spatial resolution of 0.25 m by combining species-specific additive biomass equations with remote sensing data.

Individual Tree Crown Delineation Algorithm
In this study, we used three 100 m × 100 m sample plots (2688 total trees) to assess the accuracy of individual tree crown delineation. While our study had a high detection rate (98.51%) using the RHCSA algorithm, the overall accuracy for crown delineation (72.3%) was lower than other studies under similar forest stand conditions [46]. For example, Zhao et al. delineated individual tree crowns in natural secondary forests of northeastern China and achieved an average overall accuracy of 85.12% for coniferous forest plots, 83.86% for deciduous forest plots, and 86.44% for coniferous and broadleaved mixed forest plots [46]. However, one reason could be the difference in the origin of the reference data, which we based on the location of individual trees and crown size measured in the field. In previous studies, the reference tree crowns were frequently delineated using visual interpretation within the CHM or other remotely sensed images [35,46]. This method tends to miss the detection of young trees in the lower canopy, which are often obscured by trees in the upper canopy. The NSF study area has high canopy closure, complex understory conditions, and many young trees. Such conditions frequently lead to merging-matched situations in individual tree crown delineation [35], which may result in slightly lower OA.

Tree Species Classification
From the feature selection results (Figure 5), the spectral and textural features of the green band were particularly significant. This may be because the chlorophyll, carotenoid, anthocyanin, and lutein content of different tree species are closely related to the reflectance of the green band. Previous studies have likewise found that green band spectral and textural characteristics are vital in differentiating tree species [65,66]. We included ALS data in this study because the 3D canopy structure is closely related to point cloud parameters. We found four LiDAR metrics-the coefficient of variation of the first returns' intensity (Int_cv(first returns)), the variance of the first returns' height (Elve_var(first returns)), the 95% quantile of height (Elev_per_95), and the 25% quantile of the first returns' height (Elev_per_25(first returns))-to be most important based on MDA ranking. These four LiDAR metrics were all extracted from the first returns, and their importance was likely because the first return is sensitive to tree crown structure [38] and can be used to distinguish different tree species.
The effects of three machine learning and three ensemble learning algorithms on tree species classification using different features were explored in this study. Similar to previous studies [38,40,67], our study found that combining WorldView-3 and ALS data produced better classification performance than a single data source.
For machine learning algorithms, previous studies found no significant difference between RF and SVM [56]. However, we found that their performance varied in NSFs of China, with SVM performing well below the accuracy of the RF algorithm. The main reasons for this phenomenon are differences in forest conditions and data sources. Additionally, SVM performs optimally in solving binary classification problems and has difficulties in solving multi-class problems [68]. CNN, as a branch of machine learning, is a kind of deep learning algorithm that shows advantages over other classical algorithms in this study (such as KNN and SVM). In recent years, CNN models have been increasingly used in forestry, for example, for predicting forest inventory parameters [50] and identifying different tree species [36].
The ensemble learning algorithm performed well in this study by combining learners to strengthen the classifier. Compared with the individual machine learning algorithms, ensemble learning algorithms performed better in tree species classification, especially RF, XGBoost, and SG (CNN). RF and XGBoost are often used in tree species classification and have achieved excellent performance [69,70]. However, the stacking framework is less common in this application. This approach is characterized by the ability to combine heterogeneous base learners, e.g., the stacking framework can combine boosting and bagging. In this study, we chose SVM, KNN, and CNN as the base models of the SG framework to explore whether ensemble algorithms can effectively improve the accuracy of tree species classification. The use of the SG algorithm showed more advantages in strengthening weak learners (e.g., KNN and SVM) than strong learners (e.g., CNN). This result is similar to the finding of Du et al. [50] in estimating AGB, which supports application of the SG framework for solving regression and classification problems.

Comparison with Other Similar Products
We analyzed the average AGB for each tree species and compared the finer resolution individual tree AGB products of this study to previously published AGB products generated for northeastern China. Figure 10  . Manchurian walnut had the highest average AGB in the study area, whereas the "Others" category consisted mainly of small trees with smaller DBH and, therefore, had the lowest average AGB.
The individual tree AGB of different tree species estimated in this study is slightly lower than the results of previous studies. For example, Kwak et al. [71] reported that the average AGB of Korean pine in low (240 N/ha), medium (370 N/ha), and high (1340 N/ha) density stands at 150.21 kg, 415.98 kg, and 473.31 kg, respectively. The main reason for the difference with the individual tree AGB of Korean pine in this study (138.67 kg) is likely related to the differences in the age of the forest stands. In their study [71], the stands were older than 50 years and the larger DBH of the individual trees resulted in higher AGB values. Liu et al. [72]  We analyzed the average AGB for each tree species and compared the finer resolution individual tree AGB products of this study to previously published AGB products generated for northeastern China. Figure 10 shows the average AGB values for the seven species: WA (142.30 kg) > KP (138.67 kg) > EL (121.43 kg) > AS (109.74 kg) > WB (92.50 kg) > CL (75.52 kg) > Others (61.51 kg). Manchurian walnut had the highest average AGB in the study area, whereas the "Others" category consisted mainly of small trees with smaller DBH and, therefore, had the lowest average AGB. The individual tree AGB of different tree species estimated in this study is slightly lower than the results of previous studies. For example, Kwak et al. [71] reported that the average AGB of Korean pine in low (240 N/ha), medium (370 N/ha), and high (1340 N/ha) density stands at 150.21 kg, 415.98 kg, and 473.31 kg, respectively. The main reason for the difference with the individual tree AGB of Korean pine in this study (138.67 kg) is likely The average individual tree AGB in the area is approximately 125.32 kg and the forest AGB is about 113.58 Mg/ha. These results are greater than previous AGB studies based on sample plot measurements; for example, Zhang et al. showed that the density of the average forest AGB in northeast China was 83.50 Mg/ha [3] and Su et al. estimated the average above-ground biomass to be 92.79 Mg/ha in temperate conifer-broadleaf mixed forests [73]. The higher AGB values in this study could be because the individual-tree level AGB captures finer spatial information than the AGB estimated at a broader scale.

Limitations and Future Research
Although the individual-tree AGB achieved good accuracy in this study, it still has some limitations. First, our AGB estimates were explicitly tied to crown delineation; thus, it is critical to ensure the accuracy of the ITCD process. Although many ITCD studies have been conducted [35,46,74,75], crown delineation is still a major challenge due to the complexity of forest conditions. Therefore, in further research, it is necessary to develop a stable and efficient individual ITCD algorithm that is suitable for NSFs. In addition, there are differences in the properties of various tree species, and the AGB values of individual trees can vary greatly due to wood density and tree height [8]. Improving the accuracy of tree species classification would further enhance the estimation accuracy of individual tree AGB. Spectral resolution and crown structure are important in tree species classification [56]; therefore, we will consider the value of hyperspectral and very-high-spatial-resolution LiDAR data, e.g., from unmanned aerial vehicles (UAVs), for individual tree species classification. We will also explore the effects of different classification algorithms (e.g., Cubist and multiple architectures of CNN algorithms). Various CNN architectures, such as AlexNet, VGG, Inception series, and ResNet, have shown good performance in the image vision field. We expect that the application of these algorithms will lead to a significant improvement in the accuracy of tree species classification.
Another limitation of this study is the reliance on the biomass equations for each tree species and DBH inversion. While we used high accuracy equations developed from harvesting methods, our individual tree AGB estimation approach is dependent on having access to similar highly accurate biomass equations for the study area. While many studies developed biomass models with broad applicability [76], few used tree height; thus, estimation of DBH is particularly critical for individual tree AGB estimation.
Although our analysis demonstrated that each process had good accuracy, we did not consider error propagation of the individual tree AGB estimation in this study. Thus, future studies should systematically assess how the error propagates and explore how the uncertainty of each procedure affects the results of biomass products.

Conclusions
This study evaluated the synergistic utilization of high-spatial-resolution multispectral imagery and low-posting-density LiDAR data for tree species classification and individual tree AGB estimation in NSFs. Similar to previous studies [77], we found that the combination of active and passive remotely sensed data performed better than either single data source in tree species classification. The CNN model was the best machine learning algorithm (OA = 72.8%), while for ensemble learning algorithms, the SG (CNN) model performed best (OA = 75.0%), in both cases using a combination of ALS features and WorldView-3 features. We found that the stacking model particularly improved the accuracy and stability of the base model for weaker learners (e.g., KNN and SVM). The ensemble algorithm generally outperformed the machine learning algorithms for tree species classification with the texture features of the green band and the first returns of LiDAR being the most important for the classification. Most tree species had satisfactory accuracy, with R 2 values ranging from 0.68 to 0.85 and RMSE ranging from 7.47 kg to 36.83 kg for individual tree AGB estimation. Among the species considered, white birch had the best performance in AGB estimation with R 2 = 0.84 and RMSE = 11.29. We also found that the number of detected trees by the CHM-based individual tree crown delineation was closest to the number of trees recorded in the field when the resolution of CHM was 0.25 m. We generated an individual tree-based AGB map with 0.25 m spatial resolution based on the tree species, DBH, and additive biomass equations. The average individual tree AGB in the area was 125.32 kg and the forest AGB was approximately 113.58 Mg/ha. This study took advantage of accurate biomass equations, multiple remotely sensed datasets, and ensemble learning algorithms for the inversion of individual-tree parameters (i.e., DBH and AGB) that could be applied at local and regional scales. The effects of features and algorithms for the tree species classification of NSFs were deeply investigated in this study. This study provided spatially detailed AGB information at the individual tree level and demonstrated the reliability of individual tree level AGB estimation using multi-source remote sensing data. The individual tree-based AGB distribution map is a critical basis for quantifying error propagation of AGB estimation from the individual tree level to the plot level and to the landscape level. In addition, employing better quality data (e.g., hyperspectral images and UAV-LiDAR), developing advanced algorithms, and quantifying error sources are still fundamental and vital for AGB estimation in the future.