Evaluation of Different Algorithms for Estimating the Growing Stock Volume of Pinus massoniana Plantations Using Spectral and Spatial Information from a SPOT6 Image

Precise growing stock volume (GSV) estimation is essential for monitoring forest carbon dynamics, determining forest productivity, assessing ecosystem forest services, and evaluating forest quality. We evaluated four machine learning methods: classification and regression trees (CART), support vector machines (SVM), artificial neural networks (ANN), and random forests (RF), for their reliability in the estimation of the GSV of Pinus massoniana plantations in China’s northern subtropical regions, using remote sensing data. For all four methods, models were generated using data derived from a SPOT6 image, namely the spectral vegetation indices (SVIs), texture parameters, or both. In addition, the effects of varying the size of the moving window on estimation precision were investigated. RF almost always yielded the greatest precision independently of the choice of input. ANN had the best performance when SVIs were used alone to estimate GSV. When using texture indices alone with window sizes of 3 × 5 × 5 or 9 × 9, RF achieved the best results. For CART, SVM, and RF, R2 decreased as the moving window size increased: the highest R2 values were achieved with 3 × 3 or 5 × 5 windows. When using textural parameters together with SVIs as the model input, RF achieved the highest precision, followed by SVM and CART. Models using both SVI and textural parameters as inputs had better estimating precision than those using spectral data alone but did not appreciably outperform those using textural parameters alone.


Introduction
Forest growing stock volume (GSV) is one of the most important forest characteristics, both economically and environmentally, because it is a key determinant of forest productivity [1]. Precisely estimating forest GSVs on large scales is crucial for monitoring forest carbon dynamics, assessing forest ecosystem services, and evaluating forest quality [2][3][4][5]. Traditional ground-based GSV estimation strategies that rely on field measurements of tree height and diameters at breast height (DBH) are time-consuming and labor-intensive [1]. GSV is estimated over large areas almost

Study Area
This study focused on an approximately 7600 ha region of Pinus massoniana plantations near Taizi mountain in Jingshan County of China's Hubei Province (30 • 48 -31 • 02 N and 112 • 48 -113 • 03 E) ( Figure 1). Topographically, the area is characterized by a hilly landscape with moderate slope and its elevation is between 40.3 and 467.4 m above sea level. It has a subtropical monsoon climate with a mean temperature of 16.4 • C. The average annual precipitation is 1094.6 mm, of which over 53% falls between April and August. The dominant soil type at the site is yellow-brown soil. Yellow-brown soil is a transitional soil between yellow, red, and brown soils, which are common in mixed subtropical evergreen broad-leaved and deciduous broad-leaved forests widely. The landscape of the study area is characterized by pure forests, mixed forests, shrub land, water, cultivations, and forest openings. The percentage of forest cover is 85%. The main tree species in the area are Pinus massoniana Lamb. and Quercus acutissima Carr. Less abundant species include Cunninghamia lanceolata (Lamb.) Hook., Ilex chinensis Sims, Platycarya strobilacea Sieb.et Zucc., Liquidambar formosana Hance, Dalbergia hupeana Hance, Pistacia chinensis Bunge., and Celtis sinensis Pers. [26].

Study Area
This study focused on an approximately 7600 ha region of Pinus massoniana plantations near Taizi mountain in Jingshan County of China's Hubei Province (30°48′-31°02′ N and 112°48′-113°03′ E) ( Figure 1). Topographically, the area is characterized by a hilly landscape with moderate slope and its elevation is between 40.3 and 467.4 m above sea level. It has a subtropical monsoon climate with a mean temperature of 16.4 °C. The average annual precipitation is 1094.6 mm, of which over 53% falls between April and August. The dominant soil type at the site is yellow-brown soil. Yellowbrown soil is a transitional soil between yellow, red, and brown soils, which are common in mixed subtropical evergreen broad-leaved and deciduous broad-leaved forests widely. The landscape of the study area is characterized by pure forests, mixed forests, shrub land, water, cultivations, and forest openings. The percentage of forest cover is 85%. The main tree species in the area are Pinus massoniana Lamb. and Quercus acutissima Carr. Less abundant species include Cunninghamia lanceolata (Lamb.) Hook., Ilex chinensis Sims, Platycarya strobilacea Sieb.et Zucc., Liquidambar formosana Hance, Dalbergia hupeana Hance, Pistacia chinensis Bunge., and Celtis sinensis Pers. [26].

Field Data and Stand Volume Estimation
The field sample plots were established in August 2015, August 2018, and November 2019. In total, 68 square plots of the size 20 × 20 m each were established in pure Pinus massoniana plantations by applying the method of combining stratified sampling and random sampling. Considering the standards for division of Pinus massoniana plantation age, there were 17 plots of young forest, 11 of middle-age forest, 15 of near-mature forest, 11 of mature forest, and 14 of over-mature forest. Sample plots in every group of forest ages were distributed randomly in terms of geography. In each plot, trees with DBH over 5 cm were tallied. DBH, tree height (H), and crown diameter (CD) of the individual tree in the plot were all measured. Trees with a DBH below 5 cm were excluded. The LAI-2200 instrument (LI-COR Inc., Li-Cor, Lincoln, NE, USA, 2010) was used to indirectly measure the

Field Data and Stand Volume Estimation
The field sample plots were established in August 2015, August 2018, and November 2019. In total, 68 square plots of the size 20 × 20 m each were established in pure Pinus massoniana plantations by applying the method of combining stratified sampling and random sampling. Considering the standards for division of Pinus massoniana plantation age, there were 17 plots of young forest, 11 of middle-age forest, 15 of near-mature forest, 11 of mature forest, and 14 of over-mature forest. Sample plots in every group of forest ages were distributed randomly in terms of geography. In each plot, trees with DBH over 5 cm were tallied. DBH, tree height (H), and crown diameter (CD) of the individual tree in the plot were all measured. Trees with a DBH below 5 cm were excluded.
The LAI-2200 instrument (LI-COR Inc., Li-Cor, Lincoln, NE, USA, 2010) was used to indirectly measure the leaf area index (LAI) of each plot. Measurements were taken either under overcast conditions, or alternatively within two hours after sunrise or before sunset. At each site, two above-canopy and nine low-canopy readings were taken with an opaque, 180 • , view-restricting cap placed over the sensor in order to mask out the operator [6]. After finding the total number of all trees in a plot, the stand density (Density) was estimated in trees per hectare (trees ha −1 ). The survey also provided information about aspect, slope, and slope position. A differential global positioning unit (Trimble GeoXH6000 GPS units) was used to locate the center and the four corners of each plot, allowing the plots to be geo-referenced against satellite data. The summarized characteristics of the 68 plots are documented in Table 1.
The volumes of each individual growing trees were predicted by applying the allometric model (Equation (1)) based on DBH and H. These models yielded the following expression [27]: where, V is the individual tree volume, DBH is the tree diameter at a height of 1.3 m, and H is the total tree height. The volumes estimated from all individual trees within a sample plot were summed to obtain the GSV of the sample plot, which was then up-scaled to per hectare. The GSV of the plot was measured in cubic meter per hectare (m 3 ha −1 ). The field data revealed that the plot-level GSVs The formula was as follows [28]: where, V is volume of a single tree and A is the stand age.

SPOT 6 Image and Processing
This study was based on a single SPOT 6 panchromatic-multispectral image that was acquired on 6 August 2015 under clear sky conditions. The panchromatic image had a spatial resolution of 1.5 m while the resolution of the multispectral image was 6 m. The panchromatic image was orthorectified using ground control points and digital elevation model (DEM) data. Then, the corrected panchromatic image was used to rectify the multispectral data. The raw digital number values for the multispectral data were converted to spectral radiance values and then into top of atmosphere (TOA) reflectance. Atmospheric correction was performed using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) approach [29]. All image processing was performed using the ENVI 5.1 software package.

Spectral Vegetation Indices and Texture Parameters
The average surface reflectance values from the multispectral data were used to compute eight vegetation indices (VIs) ( Table 2) Table 3) based on the SPOT6 panchromatic band at the maximum spatial resolution of 1.5 m using the GLCM method [30].
Notes: B, R, and NIR represent SPOT6 reflectance in the blue, red, and near-infrared wavelengths, respectively. Parameters L and γ represent the SAVI term (set to 0.5) and the ARVI term (set to 1), respectively. The coefficients used in the EVI algorithm are C1 = 6.0, C2 = 7.5, C3 = 1, and G = 2.5 [17]. Table 3. Formula of texture measurements used in this study [30].

Grey Level Co-Occurrence Matrix Based Texture Parameter Estimation Formula
Here, i and j are the row and column numbers. N is the number of pixels that are summed. µ i , µ j , σ 2 i , and σ 2 j are the means and standard deviations of P i and P j . P(i, j) is the normalized cooccurrence matrix.

Optimum Window Selection
Image texture is a function of the image's spatial resolution. Texture parameters derived with the GLCM method are highly sensitive to the moving window size [20]. Small window sizes are known to exaggerate differences within the window but retain a high spatial resolution, whereas larger windows may cause inefficient extraction of texture information due to over-smoothing of textural variations [39]. There is no consensus as to what moving window size is optimal, so it is necessary to test a range of window sizes to determine which provides the best speed and precision when estimating GSV. Therefore, seven window sizes were used to estimate the stand volume in this work: 3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, and 15 × 15.

Classification and Regression Tree (CART)
The CART algorithm is a basic machine learning method proposed by Breiman et al. [40]. The regression tree algorithm was used to build models because the response variable in this study was the forest GSV. The CART algorithm was implemented with the "rpart" in the R software package (version 3.4.3; R Core Team, Viennna, Austria, 2017). Feature selection and pruning are the two core issues in decision tree algorithms. The Gini index and post-pruning were used as the binary split criterion for the selection of split features and to modify the tree, respectively.

Support Vector Machine (SVM)
Support vector machines (SVM) use a nonlinear kernel function to project input data onto a high-dimensional feature space, where complex non-linear patterns can be simply represented [12,41]. In the new hyperspace, the SVM aims to identify an optimal hyperplane that fits the data and minimizes the training error and the complexity of the model [42]. The Gaussian radial basis kernel function of the form was used in this study [43,44]. The best regularization and bandwidth parameter were determined using the training data. The SVM model was implemented with "e1071" packages in the R software package.

Artificial Neural Network (ANN)
A multi-layer neural network with a back-propagation algorithm, which may be the most popular type, was used in this study [42,45]. In this context, the neurons in the input layer were the predicted variables selected by the Boruta algorithm, while the number of neurons in the hidden layer, which carried weights representing the linkages between the predictors and GSV data, was determined using both the training and validation data. The output layer was a single neuron that represented the output values of the forest GSV. We examined the number of neurons ranging from 1 to 10 in the hidden layer to build different models. The ANN algorithm was implemented with the "neuralnet" package in the R software package.

Random Forest (RF)
Ntree (i.e., to the number of variables) and Mtry (i.e., to the number of variables to randomly sample as candidates at each split) parameters are highly sensitive to prediction robustness of RF algorithms [12]. Wang et al. [46], Breiman [47], and Zhao et al. [4] have all indicated that the default values are often a good choice, and that the influence of user-defined parameters on the sensitivity of prediction is very small. Thus, the Ntree value was set to 500 and the Mtry value was set to one-third of the predictive variables. The RF algorithms were implemented with the "randomForest" package in the R software package.

Model Testing and Comparison
The field data was randomly divided into training (70%) and testing (30%) data. The testing data (n = 20) was used to validate the CART, SVM, ANN, and RF models. A random selection of training samples was repeated 100 times to reduce variability. Three common statistical parameters, namely the coefficient of determination (R 2 ), root mean square error (RMSE), and relative RMSE (rRMSE) were calculated to evaluate the models' performances.
whereŷ i was the predicted forest GSV and y i was the forest GSV observed in the field.

Effects of Moving Window Size on the Precision of GSV
The effect of varying the moving window size on the precision of GSV estimation depended on the algorithm used for variable selection. The R 2 decreased with increasing moving window size when CART, SVM, and RF were used to estimate the GSV. In other words, 3 × 3 and 5 × 5 window yielded the highest R 2 and lowest RMSE and rRMSE when using these methods (Table 4). For the ANN method, the R 2 initially decreased with the moving window size and then increased, the highest R 2 values (0.83) was achieved with this model when using a window size of 15 × 15 in this case (Table 4). CART = classification and regression tree, SVM = support vector machine, ANN = artificial neural network (ANN), RF = random forest, R 2 = the coefficient of determination, RSME = root mean square error, and rRMSE = the relative RMSE.

Performance of CART, SVM, ANN, and RF
To evaluate the performance of CART, SVM, ANN, and RF for predicting the forest GSV when using only textural parameters, four machine learning algorithms were compared in terms of their precision (R 2 ) and RMSE (Figure 2). When the window size was set as 3 × 3, 5 × 5, and 9 × 9, RF demonstrated the best regression performance in terms of model precision and RMSE. The ANN achieved the highest precision when the window size was 11 × 11, 13 × 13, or 15 × 15 ( Figure 2). On the other hand, a comparative analysis based on texture parameters and textural parameters together with SVIs demonstrated that RF was the best method for estimating GSV. The R 2 values for models generated using this method ranged from 0.82 to 0.86 (Figure 3). For models using only SVI data as inputs, ANN yielded the highest precision followed by SVM and RF. When using textural parameters together with SVIs as inputs, SVM achieved the second highest precision (Figure 3).

Performance of Spectral, Texture, and Fusion of Spectral and Texture Information
Regression models using SVIs as inputs achieved worse estimating precision than those using textural parameters independent of the choice of regression method. The R 2 value for the model based on texture data alone (the R 2 values were 0.76, 0.75, 0.66, and 0.82 for CART, SVM, ANN, and RF, respectively) was substantially higher than that for the model based on SVIs alone (Figure 3). Likewise, the RMSE and rRMSE values for the model based on textural parameters alone were considerably lower than those for the model based on SVIs alone. The RMSE values were 36.61 m 3 /ha, 37.96 m 3 /ha, 45.44 m 3 /ha, and 32.36m 3 /ha, and the rRMSE values was 29.29%, 30.37%, 31.45%, and 25.89% for CART, SVM, ANN, and RF, respectively. Fusing spatial and textural information did not greatly increase the estimating precision relative to that achieved using textural parameters alone, although minor improvements were observed with the SVM and RF methods (Figure 3). The R 2 values achieved with the CART methods when using textural parameters alone as inputs were almost On the other hand, a comparative analysis based on texture parameters and textural parameters together with SVIs demonstrated that RF was the best method for estimating GSV. The R 2 values for models generated using this method ranged from 0.82 to 0.86 (Figure 3). For models using only SVI data as inputs, ANN yielded the highest precision followed by SVM and RF. When using textural parameters together with SVIs as inputs, SVM achieved the second highest precision (Figure 3).

Performance of Spectral, Texture, and Fusion of Spectral and Texture Information
Regression models using SVIs as inputs achieved worse estimating precision than those using textural parameters independent of the choice of regression method. The R 2 value for the model based on texture data alone (the R 2 values were 0.76, 0.75, 0.66, and 0.82 for CART, SVM, ANN, and RF, respectively) was substantially higher than that for the model based on SVIs alone (Figure 3). Likewise, the RMSE and rRMSE values for the model based on textural parameters alone were considerably lower than those for the model based on SVIs alone. The RMSE values were 36.61 m 3 /ha, 37.96 m 3 /ha, 45.44 m 3 /ha, and 32.36m 3 /ha, and the rRMSE values was 29.29%, 30.37%, 31.45%, and 25.89% for CART, SVM, ANN, and RF, respectively. Fusing spatial and textural information did not greatly increase the estimating precision relative to that achieved using textural parameters alone, although minor improvements were observed with the SVM and RF methods (Figure 3). The R 2 values achieved with the CART methods when using textural parameters alone as inputs were almost equal to those achieved using textural parameters together with SVIs ( Figure 3). Figure 4 indicates the importance of explanatory variables when the RF algorithm is used. The NDVI and MEAN calculated with 3 × 3 and 5 × 5 window sizes, were of higher importance than the other variables.

Discussion
The GSV of Pinus massoniana plantations was estimated using four machine learning algorithms (CART, SVM, ANN, and RF) together with spectral and textural information from a SPOT6 image. The effect of varying the window size on the precision of GSV estimation was investigated, and the usefulness of textural parameters from SPOT6 images for forest GSV estimation was assessed.
Image texture is a complex aspect of visual perception [48]. The optimal moving window size for extracting textural parameters from the SPOT6 image generally depends on the choice of regression method. In this study, the CART, RF, and SVM methods performed best with 3 × 3 or 5 × 5 moving windows, while 15 × 15 windows were best for ANN. These results are consistent with those of previous studies [6,49], in which small windows (3 × 3 or 5 × 5) were found to be best when using the multiple linear regression method to estimate forest variables. Small window sizes increase sensitivity to interpixel differences in the proportions of tree crown and shadow and may better detect fine-scale variation in pixel brightness, whereas larger windows may extract texture information inefficiently due to over-smoothing of textural variation [50][51][52]. Some researchers have suggested that the window size should match that of the sample plots to increase precision [7,53,54]. Moreover, Chryasfis et al. [5] concluded that no single window size could adequately characterize the texture conditions in different phenological scenes when using the bagging least absolute shrinkage and selection operator (LASSO) algorithm. One of the reasons ANN performed well at 15 × 15 windows, while CART, RF, and SVM performed better at 3 × 3 or 5 × 5 windows, may be related to the autocorrelation of texture parameters from different window sizes.
MLAs have fewer assumptions, higher methodological accuracy, and high non-linear adaptation, and can efficiently model the complex non-linear relationships between forest

Discussion
The GSV of Pinus massoniana plantations was estimated using four machine learning algorithms (CART, SVM, ANN, and RF) together with spectral and textural information from a SPOT6 image. The effect of varying the window size on the precision of GSV estimation was investigated, and the usefulness of textural parameters from SPOT6 images for forest GSV estimation was assessed.
Image texture is a complex aspect of visual perception [48]. The optimal moving window size for extracting textural parameters from the SPOT6 image generally depends on the choice of regression method. In this study, the CART, RF, and SVM methods performed best with 3 × 3 or 5 × 5 moving windows, while 15 × 15 windows were best for ANN. These results are consistent with those of previous studies [6,49], in which small windows (3 × 3 or 5 × 5) were found to be best when using the multiple linear regression method to estimate forest variables. Small window sizes increase sensitivity to interpixel differences in the proportions of tree crown and shadow and may better detect fine-scale variation in pixel brightness, whereas larger windows may extract texture information inefficiently due to over-smoothing of textural variation [50][51][52]. Some researchers have suggested that the window size should match that of the sample plots to increase precision [7,53,54]. Moreover, Chryasfis et al. [5] concluded that no single window size could adequately characterize the texture conditions in different phenological scenes when using the bagging least absolute shrinkage and selection operator (LASSO) algorithm. One of the reasons ANN performed well at 15 × 15 windows, while CART, RF, and SVM performed better at 3 × 3 or 5 × 5 windows, may be related to the autocorrelation of texture parameters from different window sizes.
MLAs have fewer assumptions, higher methodological accuracy, and high non-linear adaptation, and can efficiently model the complex non-linear relationships between forest biophysical parameters and RS data [55][56][57][58][59]. Four machine learning regression algorithms, CART, SVM, ANN, and RF were explored in this study. The RF method yielded more precise GSV estimates than the other tested methods (CART, SVM, and ANN) when texture parameters alone, or a fusion of spectral and texture information, were used. This is consistent with the results of Zhao et al. [4], who found that RF achieved the greatest precision when estimating forest variables in black locust plantations on the Loess Plateau. Similarly, Fallah et al. [60] concluded that SVR and RF outperformed k-NN in volume/ha estimation. These results may be due to the fact that RF is more sensitive to overfitting during model training, and can handle high data dimensionality [47,61,62]. The algorithm of the random forest method naturally includes the interactions of variables which are often ignored in other models because of their complexity [63]. However, Zhang et al. [12] found that ANN and SVM were the best methods for estimating sawgrass marsh AGB, with both methods achieving correlation coefficients (r) above 0.9, while ANN was the best method for total biomass estimation (r = 0.94). ANN achieved the best estimating precision independently of the SVIs of input data in this study. This result may be explained by the fact that ANN has advantages in complex pattern learning and generalizing in noisy environments. For CART, it was observed that for larger values of observed GSV, the predicted values of GSV were systematically lower (i.e., always below the 1:1 line), although high R 2 was obtained when texture parameters or a combination of texture parameters and SVIs were used. This may be contributed to by over-fitting. Another reason for underestimation may be due to optical saturation under the conditions of high biomass in dense forests [64]. Overall, CART, SVM, ANN, and RF each have their own advantages. The selection of MLAs may be related to the type of RS information applied. For example, in this study, ANN could be used when applying SVIs alone for GSV estimation, and RF was a good choice when texture parameters or a fusion of textural and spectral information were used. The main drawback of these MLAs was that they did not reveal the functional relationships between the target and predictor variables. MLAs are often referred to as "black box" approaches [7,12,42,65].
Models using textural parameters as inputs achieved greater precision than spectral models, confirming that high-resolution textural information is more useful than spectral data for forest parameter estimation, especially in dense and complex forests characterized by a mosaic of dense and sparse vegetation cells [6,7,66]. Our results indicate that combining spectral and textural parameters yields, at best, marginal improvements in estimating precision compared to using textural parameters alone when estimating forest GSV. This is consistent with the conclusions of Chrysafis et al. [5], based on their work on Mediterranean forests. Textural information was indispensable in the estimation of forest parameters, especially when a high-resolution image was adopted. However, there were many uncertainties that needed to be considered, such as window size. It may be necessary to determine the optimal moving window size for extracting textural parameters.
In this research, forest GSV tended to be overestimated at lower values and underestimated at higher values ( Figure 3). This uncertainty may be contributed to the inconsistency between the time of in-site data collection and RS image, although we applied the growth model of Pinus massoniana tree for stand volume conversion from 2018 or 2019 to 2015. The growth was calculated from the stand age of forest, which decreased the precision in our study. We suggested that it was better to keep the time of field data surveyed consistent with time of RS acquisition in estimation of forest parameters.

Conclusions
Four machine learning regression algorithms including classification and regression tree (CART), support vector machine (SVM), artificial neural network (ANN), and random forest (RF) algorithms were used to estimate the growing stock volume (GSV) of Pinus massoniana plantations in the northern subtropical area of China based on spectral vegetation indices (SVIs) and textural parameters extracted from a SPOT6 remote sensing image. The RF method was found to offer the best performance for GSV estimation, although SVM, CART, and ANN also performed reasonably well. Increasing the size of the moving window used when extracting data from the SPOT6 image reduced the predictive performance of the CART, SVM, and RF methods, and the highest coefficient of determination (R 2 ) values in these cases were achieved with 3 × 3 or 5 × 5 windows. Using both spatial and textural parameters as model inputs improved estimating precision relative to using spectral data alone. However, the combined approach did not appreciably outperform models using only textural parameters as inputs.
Author Contributions: Y.D. supervised the work and provided advice for preparing and revising the paper. J.Z. proposed the idea, collected field survey data, organized the writing, and revised the paper. Z.Z. contributed to the programming and experiments, and drafted the manuscript. Q.Z. and Z.H. analyzed the data and prepared figures. J.X. collected field survey data and did image preprocessing. P.W. helped to collect field survey data and revise the manuscript. All authors read and approved the final manuscript.
Funding: The research was funded by the national key research and development plan (grant number, 2017YFC050550404) and project 2662020YLPY020 supported by the fundamental research funds for the central university.