A Levenberg – Marquardt Backpropagation Neural Network for Predicting Forest Growing Stock Based on the Least-Squares Equation Fitting Parameters

Traditional field surveys are expensive, time-consuming, laborious, and difficult to perform, especially in mountainous and dense forests, which imposes a burden on forest management personnel and researchers. This study focuses on predicting forest growing stock, one of the most significant parameters of a forest resource assessment. First, three schemes were designed—Scheme 1, based on the study samples with mixed tree species; Scheme 2, based on the study samples divided into dominant tree species groups; and Scheme 3, based on the study samples divided by dominant tree species groups—the evaluation factors are fitted by least-squares equations, and the non-significant fitted-factors are removed. Second, an overall evaluation indicator system with 17 factors was established. Third, remote sensing images of Landsat Thematic Mapper, digital elevation model, and the inventory for forest management planning and design were integrated in the same database. Lastly, a backpropagation neural network based on the Levenberg–Marquardt algorithm was used to predict the forest growing stock. The results showed that the group estimation precision exceeded 90%, which is the highest standard of total sampling precision of inventory for forest management planning and design in China. The prediction results for distinguishing dominant tree species were better than for mixed dominant tree species. The results also showed that the performance metrics for prediction could be improved by least-squares equation fitting and significance filtering of the evaluation factors.


Introduction
To promote the management and long-term development of forest resources, managers and researchers need to retain the latest information about forest resources and track the spatial changes of forest landscapes [1].Forest inventory can represent the forest extent, determine species composition, and track changes from the past to the present.Traditional ground surveys effectively provide objective and reliable information for monitoring and managing forest resources [2,3].However, traditional ground-based field measurements are expensive, time-consuming, labor-intensive, and hard to implement, particularly in mountainous and forest areas.This lay a burden on forest management personnel and researchers [4,5].
In China, two main types of surveys of large-area forest resources are available: national forest inventory (NFI), which is repeated every five years, and inventory for forest management planning and design, which is conducted at 10-year intervals.However, for both ecological monitoring and forest product utilization, the long survey cycles do not satisfy the social needs [6].
Since the 1970s, satellite images, such as Landsat Thematic Mapper (TM) and Satellite Pour l'Observation de la Terre (SPOT), have been routinely employed to estimate continuous forest parameters.By combining remote sensing image with the actual plots data, managers and researchers can estimate forest variables [7][8][9].
Scholars have been trying to find more efficient and credible prediction models [10][11][12] to partially replace the traditional and expensive measuring techniques.
Due to their adaptability and flexibility, artificial neural networks (ANNs) have become a substitutable and effective method to simulate nonlinear and complex ecosystems like forests.With robust data structures and highly interrelated relationships, ANN models have become popular.Being only slightly affected by data quality and bias, ANNs can also learn more complicated models and data tendencies [2,13].ANNs therefore have advantages that can be used to solve difficult problems in forest resource management [13][14][15][16].
Traditional BP algorithms are sensitive to convergence into a local minimum, which may lead to instability of the training or sub-optimal results [17].The Levenberg-Marquardt (LM) algorithm provides is less sensitive to local converges and therefore, it provides a better learning training approach for the back-propagation network.It chooses a balanced tradeoff between the training speed and the stability [18].
As one of the fundamental indicators of forest inventory [19], forest growing stock helps to determine the productive capacity of an area identified as forest available for wood production.The spatiotemporal dynamics of forest growing stock of various tree species play a key role in the balance between a continuous supply of timber and the sustainable development of the whole ecosystem [20].By applying biomass expansion factors, the forest growing stock can be transformed to estimate the aboveground and belowground biomass [21].Moreover, the increase or decrease in forest growing stock is essential to calculate the carbon storage of the forestry department [22].During the estimation or prediction of forest growing stock, data from permanent plots (the basic units of NFI, set up by systematic sampling methods at the intersection of kilometer networks referring to the topographic maps with 1:50,000 map scale, usually with an area of 0.0667 ha) are frequently chosen to validate various models [9,[23][24][25][26].However, in China, the number of subcompartments (the basic unit of inventory for forest management planning and design, divided by the terrain boundaries including ridge line, valleys, roads, etc. or forest ownership boundaries, with the maximum area of 15 ha in South China and 25 ha in other parts of China) is much larger than the number of permanent plots.For example, in Zhejiang province in China (Figure 1), the number of permanent forest plots was 4252 in 2004 [6].In contrast, the number of subcompartments was 39,377 in the city of Longquan (Figure 1) in Zhejiang province in 2007.To reduce the cost of investigating, estimation of forest growing stock based on subcompartments is more valuable than methods based on permanent forest plots.
This study focuses on predicting forest growing stock.First, three models were designed according to whether the study samples were based on mixed tree species or dominant tree species groups and whether the evaluation factors could be fitted by the least-squares equation and filtering by setting the significance p ≤ 0.05.Second, we established an overall evaluation factor system with 17 factors.Third, we integrated the remote sensing images of Landsat TM, digital elevation model (DEM), and the inventory for forest management planning and design into a database.Last, the Levenberg-Marquardt backpropagation (LM-BP) model was used to predict the forest growing stock.

Study Area
The study area, Longquan city, located in the southwest of Zhejiang province, has a total area of 3059 km 2 , which extends from 118°42′ E to 119°25′ E, and 27°42′ N to 28°20′ N. Longquan includes 22 basic forestry management departments that contain 19 townships, two forest farms, and one scenic area (Figure 1).There are 265,667 ha of forest resources, with 14.56 million cubic meters of forest growing stock and 84.2% forest coverage [27].

Study Area
The study area, Longquan city, located in the southwest of Zhejiang province, has a total area of 3059 km 2 , which extends from 118 • 42 E to 119 • 25 E, and 27 • 42 N to 28 • 20 N. Longquan includes 22 basic forestry management departments that contain 19 townships, two forest farms, and one scenic area (Figure 1).There are 265,667 ha of forest resources, with 14.56 million cubic meters of forest growing stock and 84.2% forest coverage [27].

Evaluation Factors
Forest growth is comprehensively influenced by multiple environments that primarily contain three categories: climate, topography, and soil.
Due to the costly data, not all of the environmental factors are suitable for estimating forest growth [28], which has resulted in many researchers trying to choose some factors for their experiments and accordingly make some harvest predictions [29][30][31][32].
To balance the costs and the richness of information, we chose comprehensive evaluation factors (Table 1), which had been applied in a previous paper [3].The inventory data for forest management planning and design, containing 39,377 forest subcompartments with 83,078 subplots, were provided by the Forestry Bureau of Longquan in 2007.In order to eliminate erroneous and incorrect data, nonforest, zero-volume, and zero-canopy density subplot samples were removed by preliminary processing.Therefore, the survey data of forest resources planning finally included 28,707 forest subcompartments that consisted of 38,898 subplots.In the data, four dominant tree species (Chinese fir (Cunninghamia lanceolata (Lamb.)Hook.), Masson pine (Pinus massoniana Lamb.), Taiwan pine (Pinus taiwanensis Hayata), and hard broadleaves accounted for the vast majority of the dominant tree species (Table 2).We separated the research data into four groups (Table 2): Group 1, Chinese fir; Group 2, Masson pine; Group 3, Taiwan pine; and Group 4, hard broadleaves.The remaining 613 subplots, which consisted of 26 dominant tree species that included soft broadleaves, fruit trees, Chinese cryptomeria, etc., accounted for a very small proportion (1.58%) of the total data.Thus, the 613 subplots were removed because the small size and variety of dominant tree species might have caused the experiment results to be unstable and unreliable.

Study Scheme Design
For modeling and prediction, the remaining subcompartments were divided into two parts in which there were 11 basic forestry management departments for modeling and another 11 for predicting.Specifically, the modeling area included Badu town, Baoxi township, Chatian town, Jianchi town, Lanju township, Pingnan town, Shangyang town, city forest farm, Tashi township, Xiaomei town, and Zhuyang township (Figure 1); the remaining 11 basic forestry management departments were used for predicting (Figure 1).We then designed three schemes: Scheme 1 involved modeling and predicting for all mixed dominant tree species; Scheme 2 involved modeling and predicting for the four groups of dominant tree species, i.e., Chinese fir, Masson pine, Taiwan pine, and hard broadleaves; and Scheme 3 involved modeling and predicting for the four groups of dominant tree species based on the evaluation factors fitted by the least-squares equation.

Data Integration
For modeling and prediction, we integrated all the required forest resource survey data into the same relational database.The soil depth, A-layer depth of soil, tree age, canopy density, and forest growing stock data were obtained from the inventory for forest management planning and design.The elevation, slope, aspect, surface curvature, solar radiation index, and topographic humidity index data were first obtained from DEM.In addition, the normalized difference vegetation index (NDVI) and the characteristic values of the bands (Bands 1 to 5 and 7) were initially obtained from Landsat TM remote sensing images.The average unit volume (m 3 /ha) of the forest resources was the only predicted parameter.

Evaluation Factors Fitting
In Scheme 3, we pre-fitted the evaluation factors and the average volume per unit by the least-squares equation.The three evaluation factors-soil depth, A-layer depth of soil, and canopy density-did not need to be fit in advance because their increase in values usually indicates the increase in forest growing stock.
The detailed operating procedure was as follows: Step 1: The normalized values for each evaluation factor of each sample were calculated with the following formula: where x i,j is the original value of the i th evaluation factor of the j th sample, max(x i ) is the maximum of all values for the i th evaluation factor, min(x i ) is the minimum of all values for the i th evaluation factor, and y i,j is the normalized value of x i,j .
Step 2: The average volume per unit (z i,j ) was grouped by rounding the values of y i,j × 100, except for the three evaluation factors of soil depth, A-layer depth of soil, and canopy density.Then, their least-squares fitting equations between z i,j and rounded y i,j × 100 were established (Table 3).
Step 3: In Table 3, R denotes the correlation coefficient between each evaluation and the average volume per unit, and p denotes the significance of the least square fitting regression.There were some evaluation factors with lower values of correlation coefficient (R) and higher values of significance (p) in Table 3. Usually, P has three values, 0.001, 0.01, and 0.05.By letting p equal to 0.05 in this paper, the evaluation factor would be retained if its p ≤ 0.05.Therefore, for the dominant tree species Chinese fir, the remained factors were tree ages, slope, aspect, and band 1; for Masson pine, the remained factors were tree ages, slope, elevation, solar radiation index, NDVI, and band 1; for Taiwan pine, the remained factors were tree ages, curvature, and band 7; and for hard broadleaves, the remained factors were tree ages, slope, band 2, and band 3. Including the 3 retained-factors mentioned above, soil depth, A-layer depth of soil, and canopy density, there were eight evaluation factors remained in modelling and prediction for Chinese fir, ten evaluation factors remained for Masson pine, seven evaluation factors remained for Taiwan pine, and eight evaluation factors remained for hard broadleaves.To all remained evaluation factors, the fitted values of x i,j were recalculated by the fitting equations (Table 3) and were renormalized by Equation ( 1).

Improved BP Neural Network Model Based on LM Algorithm
Although there are many feasible network models, the backpropagation neural network (BPNN) has outstanding assortment ability and pattern identification in practical applications [33].The BPNN model minimizes the mean square error between the predicted values and the expected values by modifying the connection weight of network [34].Usually, a BPNN comprises three types of successive layers: input layer, hidden layer, and output layer.Its operating processes can be described in two steps.First, the input signal is forward-propagated from the input layer, via the hide layer, to the output layer.During this process, the weight value and offset value of the network are maintained constant, and the status of each layer of neurons will only affect the next layer of neurons.The difference between the predicted output and expected output of the network is defined as the error signal.Second, the error signal is backpropagated from the output layer to the input layer in a layer-by-layer manner.During this process, the weight value of network is regulated by the error feedback to make the predicted output of network more closer to the expected one [34].
The LM algorithm is essentially an evolution of the gradient descent algorithm and Newton algorithm, which can considerably speed up the convergence rate by reducing the iteration process and then produce more accurate data.Compared with the disadvantages of traditional BPNNs, such as slow convergence speed and local minimum problems, the convergence rate of the LM algorithm is the fastest of all traditional or improved networks.The improved BPNNs with the LM algorithm have been shown to achieve excellent results, especially when applied to evaluation and prediction [3,[35][36][37][38][39].
Therefore, taking the comprehensive evaluation factors (Table 1) as input variables and the forest growing stock as the output variable, the LM-BP was selected as the modeling and prediction model in this paper.As mentioned above, the values of comprehensive evaluation factors were normalized to inputs by Equation (1), both in Scheme 1 and Scheme 2. In Scheme 3, the inputs were preprocessed by three steps.First, the values of input variables were pre-fitted with the evaluation factors and the average volume per unit from the least-squares method, except for the three evaluation factors of soil depth, A-layer depth of soil, and canopy density.Second, the evaluation factor would be retained if its significance p ≤ 0.05 when fitted by the least square method.Last, the inputs were normalized by Equation (1).
The LM-BP computer program was coded by MATLAB R2011b, and its main steps were as follows: Step 1: The parameters were set as follows: Max_Epochs = 1000 Input_Num = 17 Output_Num = 1 Hidden_Neuron_Num = 2 × Input_Neuron_Num + Output_Neuron_Num TransferFcn = {'tansig' 'purelin'} TrainFcn = 'trainlm' PerformFcn = 'mse' where Max_Epochs is the maximum of training epochs, Hidden_Neuron_Num is the number of nodes in the hidden layer, Input_Neuron_Num is the number of nodes in the input layer, Output_Neuron_Num is number of nodes in the output layer, tansig is the transfer function that transfers values to the hidden layer from the input layer, purelin is the transfer function that transfers values to the output layer from the hidden layer, trainlm is the training function corresponding to the LM algorithm, and mse (mean square error) is the performing function.
Step two: Net creation.Net = newff (I, O, Hidden_Neuron_Num) where I is the input matrix and O is the output vector.
Step three: Net training.
[Net TR] = train (Net, I, O) Step 4: Prediction.y = sim (Net, I_test) where Net is the trained net, I_test is the input matrix of predicting samples, and y is the prediction result.

Model Performance Metrics
The following equations were used to calculate the accuracy of each model.By letting t i represent the measured values, y i represent the predicted values at sample i, t denote the observed mean, and y denote the predicted mean, the first statistical measurement, i.e., the group absolute percentage error (GAPE), was defined with Equation ( 2): where n is the number of samples in the test dataset.
The second criterion is the mean absolute error (MAE) as defined in Equation ( 3): Then, the mean absolute percentage error (MAPE) is defined by Equation ( 4): The root mean squared error (RMSE) is one of the most frequently used factors for assessing the capability of ANNs, which can be obtained with Equation ( 5): The factor of agreement (IA) is a nondimensional factor of calculation that can be obtained with Equation ( 6): The IA distinguishes between the expected and measured values, and it is restricted from 0 to 1. High value of IA means good consistency between values.
The correlation coefficient R 2 is defined with Equation ( 7):

Modeling
During the modeling period, the optimal model was chosen from 10 experiments performed for each scheme.The results are presented in Table 4.  4); MAE is the mean absolute error defined by Equation (3); RMSE is the root mean squared error defined by Equation ( 5); IA is the factor of agreement defined by Equation ( 6); R 2 is the correlation coefficient defined by Equation ( 7).
In the three modeling schemes, the GAPE values were much less than 10%, meaning the accuracies of the group estimation were above 90%.The performance metrics MAE, RMSE, IA, and R 2 in Scheme 1 were all significantly inferior to those in Schemes 2 and 3, and the MAPE in Scheme 1 was generally inferior to that in Schemes 2 and 3. Through a pooled analysis of Scheme 1 to Schemes 2 and 3, the R 2 were significantly improved from 0.6388 to 0.7427 and 0.7272, respectively, and MAPE decreased from 41.0680% to 32.0223% and 33.6355%, respectively.Schemes 2 and 3 were much better than Scheme 1. Except GAPE, other performance metrics of Scheme 3 were slightly worse than those of Scheme 2 in the modelling results (Table 5).Note: GAPE is the group absolute percentage error defined by Equation (2); MAPE is the mean absolute percentage error defined by Equation (4); MAE is the mean absolute error defined by Equation (3); RMSE is the root mean squared error defined by Equation ( 5); IA is the factor of agreement defined by Equation ( 6); R 2 is the correlation coefficient defined by Equation (7).

Predicting
The prediction results are presented in Table 6 and Figure 2. Table 6 shows the sample size and performance metrics of each scheme.Figure 2 shows the scatter plots between predicted values and measured values of forest growing stock.5); IA is the factor of agreement defined by Equation ( 6); R 2 is the correlation coefficient defined by Equation (7).
During prediction, except the scheme for Taiwan pine as the dominant tree, the GAPE values of the other schemes were less than 10%, which means that the accuracies of the group estimation were above 90%.In Scheme 2, the GAPE for predicting Taiwan pine was as much as 15.5888%, so its group estimation precision was 84.4112%.However, in Scheme 3, the GAPE for predicting Taiwan pine has been improved greatly that in Scheme 2.
The pooled analysis for Schemes 2 and 3, achieved by merging the prediction results of the four dominant tree species into a whole of prediction result, and the total performance metrics were recalculated and are shown in Table 5.From Scheme 1 to Schemes 2 and 3, the R 2 were improved from 0.5279 to 0.6460 and 0.6823, respectively, and MAPE decreased from 50.2675% to 41.7724% and 37.5268%, respectively.The other performance metrics, including GAPE, MAE, RMSE, and IA, all improved to some extent from Scheme 1 to Schemes 2 and 3.
The GAPEs in all schemes were less than 5% (Table 5), which indicated that the total group estimation precisions were much greater than the highest standard of the total sampling precision of forest resource inventory in China (GB/T 26424-2010).

Discussion 331
The results of modeling indicated that the performance metrics for distinguishing dominant tree 332 species were much better than that for determining mixed dominant tree species.However, except 333 GAPE, other performance metrics of Scheme 3 were slightly worse than those of Scheme 2 in the 334 modelling results.This probably occurred because we used the values fitted between the evaluation 335 factors and the average volume per unit using the least-squares equations instead of the original 336 values of the evaluation factors before modeling, and remove the non-significant evaluation factors 337 when their P-values were greater than 0.05, meaning the fitted values were closer to their averages 338 than their original values; meanwhile, some detail in the information could have also been lost.339 The prediction results indicated that the performance metrics for distinguishing dominant tree 340 species were also better than for determining mixed dominant tree species, which is similar to most 341 research results [40,41].This result was apparently different from the published forest growing stock 342 simulation [8], which used the k-nearest-neighbor estimation method to estimate the forest growing 343 stock based on Landsat TM images and stand-level field inventory data.The results of that study 344

Discussion
The results of modeling indicated that the performance metrics for distinguishing dominant tree species were much better than that for determining mixed dominant tree species.However, except GAPE, other performance metrics of Scheme 3 were slightly worse than those of Scheme 2 in the modelling results.This probably occurred because we used the values fitted between the evaluation factors and the average volume per unit using the least-squares equations instead of the original values of the evaluation factors before modeling, and remove the non-significant evaluation factors when their P-values were greater than 0.05, meaning the fitted values were closer to their averages than their original values; meanwhile, some detail in the information could have also been lost.
The prediction results indicated that the performance metrics for distinguishing dominant tree species were also better than for determining mixed dominant tree species, which is similar to most research results [40,41].This result was apparently different from the published forest growing stock simulation [8], which used the k-nearest-neighbor estimation method to estimate the forest growing stock based on Landsat TM images and stand-level field inventory data.The results of that study showed that the volumetric estimation errors of different tree species were significantly higher than those of the overall estimations.
Regardless of the modeling results, all the prediction performance metrics of Scheme 3 were better than those of Scheme 2, implying that the prediction results that were based on the least-squares equation fitting and significance p filtering for evaluation factors were more stable and reliable.
In Schemes 1 and 2, the predicted values were systematically lower than the measured values from the modeling results and the prediction results.This may be due to the modeling areas being chosen manually rather than randomly.However, in Scheme 3, for both modeling and prediction, the average predicted values were very close to the average measured values.This may be explained by the fitted factor values obtained using the least-squares equations being closer to their averages than their original values and removing the non-significant evaluation factors to reduce the random noise, meaning the predicted results would be closer to the expected values.
Despite high MAPE values, the group estimation accuracy from our study may be good enough for some purposes.For example, for forest management planning and design at a regional level, the estimated results should be a partial substitution for field investigation.An acceptable level of estimation error for total forest volume is about 20% when the inventory area is larger than 30 ha [8,42].The total performance metrics indicate that all GAPE values were lower than 5%, meaning the overall accuracies were over 95%, which exceed the highest standard of total sampling precision (90%) of inventory for forest management planning and design in China.Especially in Scheme 3, both the modeling results and the predicting results had GAPE values lower than 2%, meaning the overall accuracies were over 98%.Therefore, the results should be used in forest management planning and design in situations where there is a lack of available measured data.
The prediction results could likely be further improved with further research.There are, however, two limitations to our study: overfitting and limited network layers.In our study, we found that higher modeling accuracy did not necessarily mean higher prediction accuracy, which might have been caused by overfitting in the modeling procedure.Due to the full connection network structure, the number of the layers was very limited, therefore, the prediction accuracy was also limited.The next study plan may depend on a new model, such as the deep-learning model.

Conclusions
Three conclusions were drawn from the data analyses in this study.First, the prediction results of all schemes showed that their average predicted values were close to the average measured values and all GAPE values were lower than 5% (Table 5), indicating that group estimation precision of all predicting schemes were much greater than 90%, which is the highest standard of the total sampling precision for volume of forest resource inventory in China.Second, the prediction results for distinguishing dominant tree species were better than for mixed dominant tree species.Lastly, the results showed that the performance metrics for prediction could be improved using least-squares equation fitting for the evaluation factors and by using the significance P filtering the evaluation factors.
We chose one part of an area in the research region as the modeling area and the other as the prediction area instead of randomly choosing modeling samples [3,43].This procedure is more convenient in practice because the field measurements in the inventory for forest management planning and design are recorded in some contiguous areas while others are predicted.This research also reflects the characteristics of sufficient samples and reduced data costs during the prediction in which the

Figure 1 .
Figure 1.Administrative map of the study area.

Figure 1 .
Figure 1.Administrative map of the study area.

Sub•
The point with the measured value as x coordinate and the predicted value as y coordinate (a) Measured values (m 3 /ha) (b) Measured values (m 3 /ha) (c) Measured values (m 3 /ha) (d) Measured values (m 3 /ha) (e) Measured values (m 3 /ha) (f) Measured values (m 3 /ha) (g) Measured values (m 3 /ha) (h) Measured values (m 3 /ha) (i) Measured values (m 3 /ha) Commented [M3]: We changed 1-9 changed the numbers into letters in Please confirm if the .. before the 'Th necessary, if not, please delete the p Response: We have changed the co contents.

Figure 2 .
Figure 2. Scatter plots between predicted values and measured values for Schemes 1, 2 and 3. Subgraph (a) represents Scheme 1 with mixed dominant tree species; subgraphs (b-e) represent Scheme 2 for Chinese fir, Masson pine, Taiwan pine, and hard broadleaves, respectively; and subgraphs (f-i) represent Scheme 3 for Chinese fir, Masson pine, Taiwan pine, and hard broadleaves, respectively.

Table 1 .
Comprehensive evaluation factors for forest growth.

Emission and Reflection Radiometer (a Japanese sensor which is one remote sensory device on board the Terra satellite launched into Earth orbit by NASA in 1999 and has been collecting data since February 2000) in 2009, with 30-m resolution, IMG data type, Universal Transverse Mercator (UTM) projection, and World Geodetic System 1984 (WGS84). The data set was supplied by the International Scientific & Technical Data Mirror Site, Computer Network Information
Center, Chinese Academy of Sciences (http://www.gscloud.cn).

Table 2 .
Groups divided by the dominant tree species and the size of subplots.

Table 3 .
The least-squares fitting equations for the evaluation factors for Scheme 3.
Note: NDVI is normalized differential vegetation index; bands 1 through 5, and band 7, are the bands of the images from Landsat Thematic Mapper.

Table 4 .
Detailed performance metrics of the modeling results for each scheme.

Table 5 .
The total performance metrics for each scheme for modeling and predicting.

Table 6 .
Detailed performance metrics of the prediction results for each scheme.