Improving Aboveground Biomass Estimation of Pinus densata Forests in Yunnan Using Landsat 8 Imagery by Incorporating Age Dummy Variable and Method Comparison

Optical remote sensing data have been widely used for estimating forest aboveground biomass (AGB). However, the use of optical images is often restricted by the saturation of spectral reflectance for forests that have multilayered and complex canopy structures and high AGB values and by the effect of spectral reflectance from underlayer shrub, grass, and bare soil for young stands. This usually leads to overestimations and underestimations for smaller and larger values, respectively, and makes it very challenging to improve the estimation accuracy of forest AGB. In this study, a novel methodology was proposed by incorporating stand age as a dummy variable into four models to improve the estimation accuracy of the Pinus densata forest AGB in Yunnan of Southwestern China. A total of eight models, including two parametric models (LM: linear regression model and LMC: LM with combined variables), two nonparametric models (RF: random forest and ANN: artificial neural network) without the age dummy variable, and four corresponding models with the age dummy variable (DLM, DLMC, DRF, and DANN), were compared to estimate AGB. Landsat 8 Operational Land Imager (OLI) images and 147 sample plots were acquired and utilized. 
The results showed that (1) compared with the two parametric models, the two nonparametric algorithms resulted in significantly greater estimation accuracies of Pinus densata forest AGB, and the increases of accuracy varied from 8% to 32% for 100 modeling plots and from 12% to 35% for 47 test plots based on root mean square error (RMSE); (2) compared with the models without the age dummy variable, the models with the age dummy variable greatly reduced the overestimations for the plots with AGB values smaller than 70 Mg/ha and the underestimations for the plots with AGB values larger than 180 Mg/ha and, thus, significantly improved the overall estimation accuracy by 14% to 42% for the modeling plots and by 32% to 44% for the test plots based on RMSE; and (3) the texture measures derived from the Landsat 8 OLI images contributed more to improving the estimation accuracy than the original spectral bands and other transformations. This implied that two nonparametric models, coupled with the use of the age dummy variable and texture measures, offered a great potential for improving the estimation accuracy of Pinus densata forest AGB.


Introduction
Forest ecosystems play an important role in global carbon cycling and in mitigating atmospheric carbon concentrations, so accurately estimating forest biomass is necessary and has been widely studied [1-8]. Because field data are difficult and costly to collect, especially for belowground biomass, most existing studies have focused on forest aboveground biomass (AGB) [9-17] and on the use of remotely sensed data [18-20]. However, the estimation accuracy of forest AGB depends on many factors, including the remotely sensed images, the independent variables used, and the algorithms used to model the relationship of AGB with the independent variables [9].
As remote sensing technologies develop, image data from Landsat, SPOT, QuickBird, IKONOS, WorldView, ASTER, MODIS, AVHRR, Radarsat, and ALOS PALSAR have become available and are used for AGB estimation [9]. Many studies have also investigated the fusion of images from different sensors to improve AGB estimation [21,22]. For a given study area, choosing the appropriate sensor data has become very challenging over the past several decades [23]. Overall, Landsat imagery has been the most widely used for this purpose because it is free to download, has a long history and large coverage, and offers medium spatial and temporal resolutions [9,21,24-27]. In particular, Landsat 8 provides images with more spectral bands and a higher radiometric resolution than previous Landsat satellites and, thus, has a greater potential for improving AGB estimation [12,28].
Selecting spectral variables that have strong relationships with forest AGB is the key to increasing the accuracy of AGB estimation using remote sensing data [9,12,19,26]. Spectral bands and other variables derived from them, such as vegetation indices, image transformations, texture measures, and fractional images, have been used to construct estimation models of forest AGB [9,24,26]. Generally, models that combine vegetation indices, texture measures, and environmental variables such as elevation, slope, and aspect lead to a higher estimation accuracy than those using the original bands, and this is especially true for forests with complex canopy structures [9,12,19,21].
Various parametric and nonparametric algorithms have been developed for mapping forest AGB using remotely sensed images [9,19]. Simple linear regression models (LMs) and nonlinear models such as power models [29,30] and logistic regression models [31] are often used. The LMs account for the relationships between forest AGB and predictors and are the most popular statistical models [9,32]. However, the relationships are often nonlinear and take power, exponential, or logarithmic forms. Nonlinear regression models, or the linearization of the nonlinear relationship, are the common ways of dealing with this nonlinearity. Even so, the determination coefficients obtained are often smaller than 0.5 [27,33], and the resulting estimates are, thus, not reliable [32].
Nonparametric models are an alternative approach to improving the estimation accuracy of forest AGB using remote sensing images. Many nonparametric algorithms, such as artificial neural networks (ANN) [34,35], random forest (RF) [36,37], k-Nearest Neighbors (kNN) [38], support vector machine (SVM) [39,40], and Maximum Entropy (MaxEnt) [41,42], have been explored to model the relationships between forest AGB and predictors. However, the model structures derived from these algorithms are often difficult to interpret [9,19,32]. The RF, introduced by Leo Breiman [43], can be used either for classifying categorical variables or for estimating continuous variables such as forest AGB [43,44]. It provides the average prediction from a number of regression trees, each built from a randomly selected subset of the training data with a random subset of predictors [45,46]. Recently, RF has been frequently used in AGB estimation by integrating samples from field inventories, remote sensing, and other predictors [47,48]. Moreover, ANN, proposed in 1943, is a mathematical model inspired by biological neural networks. ANNs are regarded as black-box models because of their unknown weights and the difficulty of deciding when to terminate the learning process [49]. They have a strong ability to fit data [50] and provide a robust solution for complex and nonlinear problems due to their universal approximation properties [24]. Many different neural network models have been developed [51] and are widely used in various fields [52]. Unfortunately, the estimation accuracy of nonparametric models is strongly limited by the sample size used.
Moreover, forest AGB estimates derived from remote sensing data are associated with uncertainties [19], which have been widely recognized and substantially researched [53,54]. The uncertainties often result in underestimations for forests with large AGB values and overestimations for forests with small AGB values. The underestimations usually occur when AGB reaches 100-150 Mg/ha [33,55,56], while the overestimations often happen for forests with AGB less than 40 Mg/ha [33]. Many studies have indicated that the uncertainties are related to the structures of forest ecosystems, the topographic characteristics, the remotely sensed data (insensitivity and saturation) and their spatial resolutions, and the methods used [33,56-62]. In particular, because of data saturation, spectral reflectance becomes less sensitive to AGB changes in dense, multilayered forests with complex canopies, which often leads to underestimations of forest AGB [19]. Few studies have analyzed the saturation values of forest AGB for different forest ecosystems, and only a few reports have demonstrated methods to reduce the impact of data saturation on AGB estimation accuracy [20,22,33].
The heterogeneity of complex forest canopy structures may be the major cause of data saturation and of underestimation for forests with high AGB values [22]. Data saturation varies with vegetation type because surface reflectance depends on tree species, ages, and forest canopy structures [19,63]. Zhao et al. [22] analyzed the saturation values using Landsat imagery for different vegetation types, slopes, and aspects in a subtropical region of China; estimated AGB by stratifying forest types and applying a stepwise linear regression; and improved the estimation accuracy of AGB. Stratifying forest types based on tree species and environmental variables can, therefore, improve the estimation of forest AGB [22,32,33]. Moreover, incorporating tree age into the estimation models can also improve AGB and carbon sequestration estimates and reduce the overestimation and underestimation [64-68].
Iizuka and Tateishi [64] and Sanga-Ngoie et al. [68] introduced tree volume-derived age into tree carbon estimation models to improve the estimation accuracy of tree carbon sequestration using optical remote sensing imagery. Zheng et al. [65] used 60 sample trees to obtain a forest stand age map in which most of the forests were less than 21 years old, then developed a polynomial model involving stand age and spectral variables from Landsat ETM+ images, and obtained a high estimation accuracy of AGB. Lefsky et al. [66] mapped forest stand age using a time series of Landsat images, then estimated the AGB of young stands using Lidar data in western Oregon, USA, and found that the estimation models were appropriate for forests in the age classes of 14.5 to 20.5 years but resulted in overestimations for the younger age classes. Based on the years in which the trees were planted, Liu et al. [67] also derived an age variable and added it to their linear regression models to estimate the AGB of plantations using Landsat images. These studies indicate that taking tree age into account can considerably improve AGB estimation. However, they dealt with reducing the overestimation of AGB for young forests, and reports on reducing the underestimation for forests with high AGB values are still lacking.
Overall, overestimation and underestimation of forest AGB are common when optical remote sensing imagery is used. Reducing them is critical for increasing the estimation accuracy of forest AGB but is very challenging. The objective of this study was to develop a novel method of reducing the overestimation and underestimation to improve the accuracy of estimating forest AGB from optical images. The improvement was explored first by using Landsat 8 OLI images and four models to estimate AGB, including two parametric models, LM and an LM with combined variables (LMC), and two nonparametric models, RF and ANN. The stand age group was then introduced into the four models as a dummy variable, which led to four new models with the age dummy variable. Moreover, the eight models were compared for their performance in estimating forest AGB, and the contributions of the age dummy variable to mitigating the overestimation and underestimation were statistically examined. The analyses and examinations were conducted in Pinus densata forests distributed in Yunnan of Southwestern China.

Materials and Methods
Figure 1 illustrates the methodological framework of this study, which consists of the following steps: (1) the selection of the study area; (2) the collection of the sample plot and tree biomass data; (3) the calculation of the tree and plot AGB; (4) the acquisition, preprocessing, and analysis of the Landsat 8 OLI images; (5) the development of the models by incorporating the age dummy variable; and (6) the assessment and comparison of the forest AGB predictions from the models with and without the age dummy variable.

Study Area
The study area is located in Shangri-La City, northwestern Yunnan of Southwestern China, where the AGB of Pinus densata forests was mapped (Figure 2). The study area is characterized by a cold temperate climate with a mean altitude of 3459 m above sea level. The annual mean temperature is about 5.4 °C, with the highest and lowest monthly temperatures of 13.3 °C in July and −3.8 °C in December, respectively. The winters are chilly but sunny because of the high altitude. The annual mean precipitation is about 607 mm. The seasonal distribution of precipitation is uneven, with 70% falling in the rainy season (June-September). The evaporation is about 1671 mm per year, and the relative humidity is about 70%. The soil type is mainly dark brown forest soil [69]. The study area is dominated by cold temperate coniferous forests with dominant tree species of Pinus, Picea, Larix, and Abies.
Pinus densata is an endemic pioneer tree species used for reforestation in the subalpine and alpine areas of the Hengduan Mountains [70]. It is a typical cool-temperate coniferous tree species distributed between the cold-temperate coniferous forests dominated by Picea, Abies, and Larix and the warm-temperate coniferous forests dominated by Pinus yunnanensis and P. armandii. In the study area, most of the forests are pure with a single canopy layer dominated by Pinus densata or are mixed with a small amount of Quercus pannosa, Pinus armandii, Picea likiangensis, Larix potaninii var. macrocarpa, and Betula spp., with the altitude ranging from 3000 m to 3700 m.

Measurement and Calculation of Aboveground Biomass for Sample Trees and Plots
A total of 147 square sample plots were measured in the field in 2017. The plots were selected in the pure P. densata forests of the study area by considering the stand ages, elevation, slope, and aspect, and the sampling distances between the plots were about 1 km (Figure 2). Each plot had an area of 30 m × 30 m, and its location coordinates, elevation, slope, and aspect were measured. Within each plot, the diameter at breast height (DBH, 1.3 m above ground) and height (H) of each tree were recorded. The stand age (SA) of each plot is the mean age of three standard trees having DBH values similar to the average DBH of the plot. The ages of the three trees were measured by counting the annual rings on wood cores extracted with an increment borer at the tree base.
Moreover, a spatial distribution map of the Pinus densata forest stand ages for the study area was obtained from the Shangri-La City forest inventory conducted in 2016 for forest management and planning. In the forest inventory, a spatial distribution map of forest types, that is, stand compartments, was produced in the field using visual interpretation based on aerial photographs. The compartments had similar dominant tree species and homogeneous canopy structures. Within each of the forest compartments, a certain number of 25.8 m × 25.8 m plots were selected, and within each of the plots, the DBH of each tree was measured and the average DBH was obtained. Then, 3 to 5 trees having a DBH similar to the plot average DBH were selected, and their ages were measured by counting the layers of branches along the tree trunk, because Pinus densata trees add one layer of branches each year.
In addition, a total of 100 sample trees were selected from the 147 sample plots to measure tree AGB. The selection of the sample trees was based on their DBH classes from 6 cm to 76 cm with a 2 cm interval and on their elevation, slope, and aspect. At least three sample trees were obtained for each of the DBH classes. The values of the tree AGB components, including wood, bark, branches, and needles, were obtained according to the method of Wang [71]. The biomass samples of wood and bark for each sample tree were collected by cutting a 3 cm thick disk at 2 m intervals along the tree trunk. The biomass values of each stem and its bark were measured by a volume and density method. The volumes of the wood segments and bark were calculated from the segment lengths and the diameters measured over and under the bark. The samples of branches and leaves were selected, collected, and weighed by the method of graded branches. All the samples were oven-dried to constant weight at 105 °C, and the sample density values of wood and bark were measured using a water displacement (drainage) method. Finally, the biomass values of wood and bark for the sample trees were calculated using the volumes and the corresponding sample density values. The branch and needle biomass of the sample trees were obtained using the ratios of fresh weight to the corresponding dry matter.
The AGB data of the individual sample trees (unit: kg) were fit using a power function, and the AGB of each plot (unit: Mg/ha) was obtained by summing the estimated tree AGB values within the plot according to Equation (1), where AGB_s is the AGB of a plot, AGB_i is the estimated AGB of tree i, and n is the number of trees within the plot. Although biomass samples of the tree components were used, the biomass values of the sample trees and plots were still associated with uncertainties and should be considered as reference values.
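The plot-level summation described above can be sketched as follows. This is only an illustration under stated assumptions: the power-function coefficients `a` and `b` are hypothetical placeholders (the fitted values are not given in this section), and the conversion from kilograms per 30 m × 30 m plot to Mg/ha is an assumed step that the text implies through its units but does not spell out.

```python
PLOT_SIDE_M = 30.0  # plot side length stated in the text (30 m x 30 m)


def tree_agb_kg(dbh_cm, a, b):
    """Power-function allometry AGB = a * DBH**b in kg.

    a and b are hypothetical placeholders, not the paper's coefficients.
    """
    return a * dbh_cm ** b


def plot_agb_mg_per_ha(dbh_list_cm, a, b, plot_side_m=PLOT_SIDE_M):
    """Sum the tree-level AGB over a plot and express it in Mg/ha."""
    total_kg = sum(tree_agb_kg(d, a, b) for d in dbh_list_cm)
    area_ha = plot_side_m ** 2 / 10_000.0   # 900 m^2 -> 0.09 ha
    return (total_kg / 1000.0) / area_ha    # kg -> Mg, then per hectare


if __name__ == "__main__":
    # Two illustrative trees with made-up coefficients.
    print(plot_agb_mg_per_ha([10.0, 20.0], a=0.1, b=2.0))
```

Keeping the tree model and the unit conversion in separate functions makes it easy to swap in the actual fitted allometry once its coefficients are known.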

Remote Sensing Data and Preprocessing
Three Landsat 8 Operational Land Imager (OLI) images obtained from the website of the United States Geological Survey (USGS) were used in this research (Table 1). The images were geo-referenced to the Universal Transverse Mercator coordinate system, zone 47 north, with a root mean square error (RMSE) of less than one pixel. An atmospheric correction of the images was conducted using the dark object subtraction approach [72], and a topographic correction was made using the C-correction approach with a digital elevation model (DEM) at a spatial resolution of 30 m × 30 m [73,74]. The images were then mosaicked and clipped to the study area (Figure 2b).
A total of 231 remote sensing variables were derived (Table 2), and they included 8 bands, 22 vegetation indices, 9 image transformations, and 192 textural measures (including mean, variance, homogeneity, contrast, dissimilarity, entropy, angular second moment, and correlation on seven bands with three moving window sizes: 3 × 3, 5 × 5, and 7 × 7 pixels). Vegetation indices and image transformations have been widely used in the estimation of forest AGB, and the texture measures were selected to capture the forest canopy structures [32]. The relationships between the spectral variables and AGB were analyzed using a Pearson correlation analysis, and the spectral variables with significant correlations (p ≤ 0.05) were selected to build the AGB estimation models.
Table 2 (partial).
Image transformations: sum of visible bands (VIS234), albedo (ALBEDO), ratio of RED and albedo (red_ALBEDO), the first three components from the tasseled cap transform (K-T transform), and the first three principal components of principal component analysis (PCA).
Texture measures: grey-level co-occurrence matrix-based texture measures including mean, angular second moment, contrast, correlation, dissimilarity, entropy, homogeneity, and variance using moving window sizes of 3 × 3, 5 × 5, and 7 × 7 pixels.
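To make the grey-level co-occurrence matrix (GLCM) texture measures more concrete, the following is a minimal pure-Python sketch for a single moving window. The quantization to a small number of grey levels, the single horizontal offset, and the symmetric normalization are assumptions chosen for illustration; the section does not state the settings actually used, and remote sensing software packages differ in these details.

```python
import math


def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalized GLCM of an integer-valued window for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count both directions -> symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_measures(p):
    """The eight texture measures named in the text, from a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    variance = sum((i - mean) ** 2 * v for i, _, v in cells)
    covariance = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "mean": mean,
        "variance": variance,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "second_moment": sum(v * v for _, _, v in cells),
        # Assumed convention: correlation is set to 0 for a flat window.
        "correlation": covariance / variance if variance > 0 else 0.0,
    }


if __name__ == "__main__":
    window = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]  # toy 3 x 3 window, 2 grey levels
    print(texture_measures(glcm(window, levels=2)))
```

In practice, such a function would be applied to every pixel's 3 × 3, 5 × 5, or 7 × 7 neighbourhood in each band, which is how one band yields 24 texture layers (8 measures × 3 window sizes).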

Modeling Methods
Four models, LM, LMC, RF, and ANN, were compared to estimate the AGB of the Pinus densata forests. The spectral variables that had statistically significant correlations with plot AGB were selected to develop the LM model by a stepwise regression, with a variance inflation factor (VIF) threshold of 10 used to test the collinearity of the independent variables. In addition to the spectral variables above, logarithmic, quadratic, and cubic transformations were calculated for each spectral variable and used as the combined variables to construct the LMC model using a stepwise regression with a VIF of 10. The introduction of the combined variables provided the potential to model the nonlinear relationship of AGB with the spectral variables. The RF modeling was carried out with the randomForest package in R. Both the number of regression trees (ntree) and the number of input variables per node (mtry) were set: the optimal ntree was determined by bootstrap and RMSE, and mtry was tested starting from one-third of all the independent variables [43]. Finally, we used the neuralnet package in R to develop the ANN model. The number of nodes and layers for the models was set according to the default of the package, and the number of hidden layers was adjusted according to the predicted results. In addition, the independent variables used for ANN and RF were the same as those utilized for LM.
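As an illustration of how the combined variables for LMC could be generated before the stepwise selection, the sketch below builds logarithmic, quadratic, and cubic versions of each predictor. The function name, the naming scheme of the derived variables, the use of the natural logarithm, and the rule of skipping the logarithm for non-positive values are all assumptions made for this example.

```python
import math


def combined_variables(predictors):
    """Return the original predictors plus their log, square, and cube.

    predictors: dict mapping a variable name to a list of plot values.
    """
    out = {}
    for name, values in predictors.items():
        out[name] = list(values)
        out[f"({name})^2"] = [v ** 2 for v in values]
        out[f"({name})^3"] = [v ** 3 for v in values]
        # Assumption: the logarithm is only added when every value is positive.
        if all(v > 0 for v in values):
            out[f"Log({name})"] = [math.log(v) for v in values]
    return out


if __name__ == "__main__":
    candidates = combined_variables({"VA3_4": [1.0, 2.0], "SIPI": [-0.5, 0.5]})
    print(sorted(candidates))
```

The resulting candidate set would then be passed to a stepwise regression with a collinearity check, which is not reproduced here.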

Modeling Methods with Age Dummy Variable
In this study, the sample plots were grouped into young, middle-age, near-mature, mature, and overmature stands according to the stand ages (SA) of the Pinus densata forests. The age intervals of the five groups were SA ≤ 20 years, 20 years < SA ≤ 30 years, 30 years < SA ≤ 40 years, 40 years < SA ≤ 60 years, and SA > 60 years, respectively. The age intervals of the stand groups were determined according to the growth rate and forest management objectives of Pinus densata. In this region, Pinus densata forests with ages of 40 to 60 years are considered mature for timber production, with the maximum annual value, and can be harvested [75,76]. Whether the stand age was included in the AGB prediction models (LM, LMC, ANN, and RF) as a dummy variable and whether neighboring age groups were combined were determined by the following steps:
(1) The models LM, LMC, ANN, and RF were developed and used to predict the AGB values of both the modeling and test plots.
(2) The predictions of AGB were compared with the reference values from the field plots, and, for each stand age group, the mean residual of the predictions was statistically tested for a significant difference from zero at the significance level of 0.05.
(3) Given a stand group and a model, a significant difference implied an overestimation or underestimation, and the age of the stand group was, thus, added into the model as a dummy variable; otherwise, the age dummy variable was not involved.
(4) The models with the age dummy variable were developed, the obtained mean residuals were tested for significant differences from zero, and the reduction of the error was analyzed.
(5) When the models without the age dummy variable led to mean residuals that were not significantly different from zero at the significance level of 0.05 for two neighboring age groups, the sample plots of the neighboring age groups were combined and the above four steps were repeated using the combined dataset.
Based on the results, the forest stands were divided into four groups: young stand, middle-age stand, near-mature stand, and mature and overmature stand. The ages of these stand groups were used as a dummy variable and added into the models.
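The final four-group scheme can be sketched as a small encoding routine. The age thresholds follow the intervals given above, with the mature and overmature groups merged (SA > 40 years). The k − 1 indicator coding with the young stand as the reference level is an assumption for illustration, since the section does not specify how the dummy variable was coded in each model; the group labels are likewise hypothetical.

```python
# Final stand age groups after merging mature and overmature stands.
AGE_GROUPS = ["young", "middle_age", "near_mature", "mature_overmature"]


def age_group(stand_age):
    """Assign a stand age in years to one of the four groups."""
    if stand_age <= 20:
        return "young"
    if stand_age <= 30:
        return "middle_age"
    if stand_age <= 40:
        return "near_mature"
    return "mature_overmature"


def age_dummies(stand_age, reference="young"):
    """Encode the group as k - 1 indicators (assumed coding; reference dropped)."""
    group = age_group(stand_age)
    return [1 if group == g else 0 for g in AGE_GROUPS if g != reference]


if __name__ == "__main__":
    for sa in (15, 25, 35, 70):
        print(sa, age_group(sa), age_dummies(sa))
```

The indicator columns would be appended to the spectral predictors of each plot before fitting DLM, DLMC, DRF, or DANN. Tree-based models such as RF could equally take the group as a single categorical feature.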

Assessment and Validation of Predictions from Models
The obtained AGB models and the corresponding estimates were evaluated using the determination coefficient (R²) and the RMSE between the observed and estimated values of AGB, based on the plot dataset used for model fitting. Moreover, the applicability of the models was validated using the test dataset with the mean error (ME), defined as the average value of the residuals (observed values minus estimated values). The mean error of each age group for the different estimation models was statistically tested for a significant difference from zero at the significance level of 0.05 using a one-sample t-test in SPSS. Among the 147 plots, 100 plots were randomly selected and used for the model fitting, and 47 plots were used for the validation of the models.
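The evaluation measures named above can be written out as a short sketch. The definition of R² as one minus the ratio of the residual to the total sum of squares is an assumption (the section does not give the formula), and the t-test function only returns the statistic and its degrees of freedom, which would then be compared against a tabulated critical value or passed to a statistics package for a p-value.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted AGB."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_error(observed, predicted):
    """Mean of the residuals (observed minus predicted); negative = overestimation."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)


def one_sample_t(residuals):
    """t statistic and degrees of freedom for H0: mean residual = 0."""
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (n - 1))
    return mean / (sd / math.sqrt(n)), n - 1


if __name__ == "__main__":
    obs = [100.0, 150.0, 200.0]    # illustrative values only
    pred = [110.0, 150.0, 190.0]
    print(rmse(obs, pred), r_squared(obs, pred), mean_error(obs, pred))
```

Applying `one_sample_t` to the residuals of each stand age group separately mirrors the per-group significance tests described in the text.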

Statistical Characteristics of Sample Plot Data
The AGB values of the individual sample trees were graphed against their DBH and H values, and the relationships of tree AGB with both DBH and H could be fit using a power function. Figure 3 shows the relationship for DBH, and the corresponding relationship for H was omitted. The obtained power model of DBH had a determination coefficient of 0.992 and an RMSE of 30.778 kg. The AGB values of the trees within each plot were estimated using this model, and the AGB value of each sample plot (unit: Mg/ha) was obtained using Equation (1).
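One common way of fitting a power function AGB = a · DBH^b is ordinary least squares on log-transformed data, sketched below. This is offered as an assumption-laden illustration only: the section does not say whether the authors used log-log regression or nonlinear least squares, and the data in the example are synthetic.

```python
import math


def fit_power_function(dbh, agb):
    """Fit AGB = a * DBH**b by least squares on log(DBH) and log(AGB)."""
    x = [math.log(d) for d in dbh]
    y = [math.log(w) for w in agb]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                          # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)      # back-transformed intercept
    return a, b


if __name__ == "__main__":
    # Synthetic trees generated from made-up coefficients a = 0.05, b = 2.5.
    dbh = [6.0, 20.0, 40.0, 76.0]
    agb = [0.05 * d ** 2.5 for d in dbh]
    print(fit_power_function(dbh, agb))
```

With real, noisy data, the back-transformation from log space introduces a bias that is often corrected with a separate factor; that refinement is omitted here.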

Correlation between Spectral Variables and AGB
In this study, the Pearson correlation coefficients between all 231 spectral variables and the plot forest AGB were calculated. Only 36 spectral variables had statistically significant correlations with plot forest AGB; they are listed in Table 4 and include 28 texture measures, 7 vegetation indices, and 1 component of PCA. The variance texture measure of band 4 using a window size of 3 × 3 pixels (VA3_4) and the angular second moment of band 1 with a window size of 7 × 7 pixels (SM7_1) had the highest correlations. The absolute values of the correlation coefficients varied from 0.200 to 0.326. Overall, the texture measures had higher correlations with AGB than the other spectral variables.
Table 4. The spectral variables that had significant correlations with AGB (ND32: normalized difference vegetation index using G and B bands; ND452: normalized difference vegetation index using RED, NIR, and B bands; SIPI: structure insensitive pigment index; KT_1: the first component of K-T transform; KT_2: the second component of K-T transform; B6_PCA: the 6th component of PCA; MSAVI: modified soil adjusted vegetation index; all other variables are texture measures: the first two capital letters represent the names of texture measures, including mean (ME), angular second moment (SM), contrast (CO), correlation (CC), dissimilarity (DI), entropy (EN), homogeneity (HO), and variance (VA); the first number represents the window size: 3 for 3 × 3, 5 for 5 × 5, and 7 for 7 × 7; and the second number represents the band number of the Landsat images; * and ** indicate significance levels of 0.05 and 0.01, respectively; the significance level of 0.01 is highlighted).
Four models, LM, LMC, RF, and ANN, were used to fit the modeling dataset, and their results are listed in Table 5. By a linear stepwise regression, the correlation texture measure for band 7 using the window size 7 × 7 (CC7_7), the variance texture measure for band 4 using the window size 3 × 3 (VA3_4), the angular second moment texture measure for band 1 using the window size 5 × 5 (SM5_1), the dissimilarity texture measure for band 1 using the window size 5 × 5 (DI5_1), and the structure insensitive pigment index (SIPI) were selected for LM. For LMC, the selected variables included quadratic VA3_4 ((VA3_4)²), cubic SM5_1 ((SM5_1)³), logarithmic entropy for band 1 using the window size 5 × 5 (Log(EN5_1)), and SIPI. The spectral variables used for LM were also utilized for developing the models ANN and RF.

The results showed that both parametric models, LM and LMC, had smaller values of R² and larger values of RMSE than the two nonparametric methods, RF and ANN. The ANN led to the largest R² (0.663) and the smallest RMSE (35.158 Mg/ha). The R² from ANN was statistically significantly larger, and its RMSE was significantly smaller, than those from LM and LMC. However, the R² and RMSE values from ANN were not significantly different from those obtained by RF. The LMC only slightly improved the estimates compared with LM based on their RMSE values. This indicated that ANN provided the most accurate estimates of AGB, followed by RF, LMC, and LM. Similar results were noticed in the scatter graphs of the predicted AGB values against the reference values in Figure 4. The scatter graphs from ANN and RF had a narrower distribution and lay closer to the line y = x than those from LMC and LM. For all four models, however, there were overestimations when the plot AGB values were smaller than about 70 Mg/ha and underestimations when the plot AGB values were greater than about 180 Mg/ha. The only difference was that, compared with LMC and LM, the ANN and RF resulted in slightly smaller overestimations and underestimations.

Models with Age Dummy Variables
Overall, the two nonparametric models, ANN with age dummy variable (DANN) and RF with age dummy variable (DRF), had a larger R² and a smaller RMSE than the two parametric models, LM with age dummy variable (DLM) and LMC with age dummy variable (DLMC) (Table 6). The DRF had the largest R² and the smallest RMSE, followed by DANN, then DLMC, and then DLM. Compared with the models without the age dummy variable, all the models with the age dummy variable greatly increased the R² values and reduced the RMSE values. Relative to the corresponding models without the age dummy variable, the DLM led to the greatest increase of R², followed by DLMC, DRF, and DANN, and the DRF resulted in the greatest decrease of RMSE, followed by DLMC, DLM, and DANN. The improvements were also visible in the scatter graphs of the predicted AGB values against the plot reference values in Figure 5. Overestimations still happened for the plots with AGB values smaller than about 50 Mg/ha, and underestimations occurred for the plots with AGB values larger than 190 Mg/ha for both DLMC and DLM (Figure 5a,b) and 200 Mg/ha for both DRF and DANN (Figure 5c,d). However, the overestimations and underestimations were greatly decreased by the models with the age dummy variable compared with the models without it. In addition, compared with those without the age dummy variable, the models with the age dummy variable slightly reduced the predicted AGB values of the plots at which the overestimations existed and slightly increased the predicted AGB values of the plots at which the underestimations occurred.

Model Validation
In this study, we validated the models with and without the age dummy variable using the test dataset. For both the young and middle-age stands, the negative mean errors of the predictions from all the models without the age dummy variable, except for ANN in the middle-age forests, were statistically significantly different from zero (Figure 6a), indicating that the overestimations were statistically significant. Introducing the age dummy variable into the models significantly reduced the residuals and the resulting mean errors (that is, overestimations); the exception was the mean error from the model DRF, which became not significantly different from zero (Figure 6b). For the near-mature stand, the positive mean errors (underestimations) did not significantly differ from zero for all the models without the age dummy variable, and the models with the age dummy variable slightly degraded the estimates of the plot AGB. For the mature and overmature stand, the LM, LMC, ANN, and RF resulted in positive mean errors, that is, underestimations, which were significantly different from zero at the significance level of 0.01. The models DLM, DLMC, DANN, and DRF greatly mitigated the underestimations, which nevertheless remained statistically significant at the significance level of 0.05 for DLM and DLMC and at the significance level of 0.01 for DANN. For all the test plots, the overall mean errors were not significantly different from zero at the significance level of 0.05 for all the models with and without the age dummy variable.
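The text does not state which test produced Figure 6. Assuming a one-sample t-test of the mean error against zero, the check for each age group could be sketched as follows. Errors are taken here as reference minus predicted, so negative means indicate overestimation, matching the sign convention above. The error values and the function name `mean_error_t` are hypothetical.

```python
import math
import statistics

def mean_error_t(errors):
    """Return (mean error, t statistic, degrees of freedom) for H0: mean = 0."""
    n = len(errors)
    mean = statistics.fmean(errors)
    sd = statistics.stdev(errors)  # sample standard deviation (n - 1)
    t = mean / (sd / math.sqrt(n))
    return mean, t, n - 1

# Hypothetical errors (reference - predicted, Mg/ha) for two age groups.
errors_by_group = {
    "young": [-32.0, -18.5, -41.2, -25.0, -12.3, -29.8],
    "near_mature": [6.0, -9.5, 14.2, -3.1, 8.8, -11.0],
}

for group, errors in errors_by_group.items():
    mean, t, df = mean_error_t(errors)
    # Compare |t| with the two-sided critical value of Student's t for df
    # (about 2.571 at the 0.05 level when df = 5).
    print(f"{group}: mean error={mean:.2f} Mg/ha, t={t:.2f}, df={df}")
```

With these made-up numbers the young group shows a clearly negative mean error, while the near-mature group does not differ noticeably from zero, which mirrors the qualitative pattern described for the models without the age dummy variable.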
Compared with the coefficients of determination R² obtained using the models without the age dummy variable, the models with the age dummy variable increased the correlations between the reference and predicted values of AGB for all the age group stands, but the increases were slight (Table 7). However, for the pooled test dataset, the increases were significant for all the models with the age dummy variable except for the DRF. Moreover, with the three exceptions noted below, all the models with the age dummy variable significantly decreased the values of RMSE compared with those without the age dummy variable for all the age group stands and the pooled dataset. The decreases of RMSE using DLM, DLMC, DANN, and DRF were 69%, 71%, 69%, and 47% for the young stand; 45%, 47%, 53%, and 39% for the mature and overmature stands; and 41%, 44%, 38%, and 32% for the pooled dataset, respectively. The decreases of RMSE for the middle-age stand and near-mature stand were relatively smaller and varied from 12% to 21%. There were also three exceptions: DANN increased the values of RMSE by 4% and 7% for the middle-age stand and near-mature stand, and DRF increased the RMSE value by 6% for the near-mature stand, compared with the corresponding models, ANN and RF. Overall, the RMSE decrease and the portion of R² attributable to the age dummy variable became smaller as the performance of the model without the age dummy variable increased.
Finally, the maps of the predicted AGB values for the Pinus densata forests were generated using all eight models (Figure 7). For the models with the age dummy variable, in addition to the selected spectral variables, the spatial distribution map of the Pinus densata forest stand ages obtained from the Shangri-La City forest inventory was used. Overall, the spatial distributions of the predicted AGB values by the models with the age dummy variable, especially by DANN and DRF, were more heterogeneous than those by the models without the age dummy variable, implying that the models with the age dummy variable reduced the overestimations for the smaller AGB values and the underestimations for the larger AGB values and, thus, increased the heterogeneity of the predicted AGB values.
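The percentage decreases of RMSE reported in this section can be read as relative changes between a model without the age dummy variable and its counterpart with it. The text does not give the formula explicitly, so the sketch below assumes the usual definition, (RMSE without − RMSE with) / RMSE without × 100. The plot values and function names are hypothetical and are not taken from the study.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between paired reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)
    )

def rmse_decrease_percent(rmse_without, rmse_with):
    """Relative RMSE decrease (%); negative values mean the RMSE increased."""
    return 100.0 * (rmse_without - rmse_with) / rmse_without

# Hypothetical test plots (Mg/ha), not values from the study.
reference = [45.0, 80.0, 120.0, 190.0, 240.0]
pred_without_age = [75.0, 95.0, 125.0, 160.0, 185.0]
pred_with_age = [55.0, 85.0, 128.0, 180.0, 220.0]

base = rmse(reference, pred_without_age)
improved = rmse(reference, pred_with_age)
print(f"RMSE without age dummy: {base:.2f} Mg/ha")
print(f"RMSE with age dummy:    {improved:.2f} Mg/ha")
print(f"Decrease: {rmse_decrease_percent(base, improved):.1f}%")
```

Under this definition, the reported RMSE increases for DANN and DRF in some age groups correspond to negative return values of `rmse_decrease_percent`.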

Remote Sensing Variables
Optical remote sensing data are commonly used for biomass estimation due to significant correlations between the spectral variables and biomass [12,19,33]. In particular, Landsat images have been most widely utilized for this purpose because of their free availability, large coverage, and long history [55,56,77-81]. However, the selection of the spectral variables is critical for improving the estimation of forest AGB [19]. In this study, the correlation analysis revealed that the texture measures contributed more to increasing the estimation accuracy of forest AGB than other spectral variables, mainly because the texture measures captured the complex forest canopy structures [63,82]. This finding was also supported by previous studies [12,33,82-86].

Overestimation and Underestimation of Forest AGB
An obvious disadvantage of optical images is the saturation of their reflectance for forests that have complex canopy structures and large values of AGB, which often leads to underestimations of high biomass values. Moreover, because of a low canopy density, the reflectance values of young stands might also be affected by understory shrub, grass, and bare soil, which usually results in the overestimation of low biomass values. These effects are well documented [19,33,56].

Underestimations may occur if the forest biomass values are greater than the saturation value. In this study, it was found that all the original models led to underestimations when the biomass of the Pinus densata forest was greater than 180 Mg/ha. The threshold value at which the underestimation occurred was higher than that (159 Mg/ha) of the pine forests dominated by Pinus massoniana [33] and those (100-155 Mg/ha) of the tropical forests [55,56], mixed forests, Cunninghamia lanceolata forests, and broadleaf forests [33]. The difference might be due to different forest canopy structures caused by biophysical environments [19,33]. Zhao et al. [33] found that pine forests had higher saturation values than broadleaf forests and mixed forests because of their relatively simple canopy structures [66]. In addition, in this study the original models resulted in overestimations of the Pinus densata forest AGB for the forests that had AGB values lower than 70 Mg/ha because of a low canopy density. This threshold value was higher than that (40 Mg/ha) obtained by Zhao et al. [33] for plantations and young broadleaf forests.
In addition, in this study, the overestimation in the young stands and the underestimation in the mature and overmature stands were significant for the models without the age dummy variable. Incorporating the age dummy variable significantly reduced the estimation errors, by 47% to 71% for the young stands and by 39% to 53% for the mature and overmature stands based on RMSE. For the middle-age and near-mature stands, the decreases of RMSE were not significant. This implied that the uncertainties of AGB estimates for the Pinus densata forests using remote sensing could be significantly decreased by including stand age as a dummy variable in the estimation models.

Improvement of AGB Estimation by Incorporating Stand Age as a Dummy Variable
The heterogeneity and complexity of forest canopy structures may be the major reason for the reflectance saturation of optical images [19,33]. Various methods have been proposed to reduce the impact of data saturation on the estimation accuracy of forest AGB. For example, Zhao et al. [33] used the stratification of vegetation types and slope aspects. In this study, stand age was introduced as a dummy variable into the original models for the young forests, middle-age forests, near-mature forests, and mature and overmature forests, which significantly reduced the overestimations for the young stands and the underestimations for the mature and overmature stands. Adding the age dummy variable into the models also lowered the AGB threshold below which overestimations occurred and raised the threshold above which underestimations occurred. Overall, compared with the models without the age dummy variable, the models with the age dummy variable reduced the values of RMSE by 32% to 44%, depending on the model, for the pooled dataset. This might be mainly because stand age was highly correlated with forest growth and canopy structure and, thus, forest AGB. This implied that the use of stand age as a dummy variable offered great potential for improving the estimation of forest AGB. However, the improvement of the forest AGB prediction accuracy obtained by adding the age dummy variable decreased as the performance of the model without the age dummy variable increased, because the better that model already performed, the less room remained for improvement.
It should be pointed out that stand age may interact with the selected spectral variables. Thus, the contribution of the age dummy variable to improving the prediction accuracy of the forest AGB may include both additive and interactive effects. To separate these effects, one potential approach is to compare the R² from a model with the age dummy variable with the sum of the R² values from the model without the age dummy variable and from a model with the age dummy variable only. However, due to limited space and time, this comparison was not conducted in this study, and further study is needed.
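The comparison suggested above was not carried out in the study, but its mechanics can be sketched. The example assumes ordinary least-squares models and synthetic data in which a hypothetical spectral predictor is correlated with the age group; the function names and all numbers are illustrative assumptions only.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (X includes intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

def age_dummies(age_group, n_groups):
    """0/1 indicator columns for every group except the reference group 0."""
    return np.column_stack(
        [(age_group == g).astype(float) for g in range(1, n_groups)]
    )

# Synthetic data (illustrative only): the spectral predictor is correlated
# with the age group, so the two sources of information overlap.
rng = np.random.default_rng(1)
n, n_groups = 100, 4
age_group = rng.integers(0, n_groups, size=n)
spectral = rng.normal(size=n) + 0.5 * age_group
agb = 50.0 + 40.0 * age_group + 10.0 * spectral + rng.normal(scale=15.0, size=n)

ones = np.ones((n, 1))
D = age_dummies(age_group, n_groups)
r2_spectral = r_squared(np.column_stack([ones, spectral]), agb)  # no age dummy
r2_age = r_squared(np.column_stack([ones, D]), agb)              # age dummy only
r2_full = r_squared(np.column_stack([ones, spectral, D]), agb)   # both

# A nonzero gap suggests that the two contributions overlap or interact
# rather than simply add up.
gap = r2_full - (r2_spectral + r2_age)
print(f"R2 spectral only: {r2_spectral:.3f}")
print(f"R2 age only:      {r2_age:.3f}")
print(f"R2 full model:    {r2_full:.3f}")
print(f"Full minus sum:   {gap:+.3f}")
```

A gap of this kind does not by itself distinguish shared variance from a true interaction; an explicit interaction term (age dummy multiplied by the spectral variable) would be needed for that, which is one reason the question is left to further study.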
Moreover, it is sometimes difficult to obtain a spatial distribution map of forest stand ages for large regions. This is especially true for broadleaf and natural forests. Several studies have dealt with methods for obtaining a stand age map. For example, Iizuka and Tateishi [64] and Sanga-Ngoie et al. [68] obtained tree ages using a tree age-volume relationship. Liu et al. [67] obtained stand ages based on the years in which they were planted. Zheng et al. [65] created a tree age map using sample trees. Lefsky et al. [66] derived the forest stand ages using the time series of Landsat images. The remote sensing-based method for estimating stand age appears cost-efficient, but the stand age estimates are associated with uncertainties that might decrease the accuracy of forest AGB estimates [66].
In this study, the Pinus densata forests are distributed over the subalpine and alpine areas of the Hengduan Mountains and lie in one of the three major forest regions in China, which consists of southern Qinghai, western Sichuan, northwestern Yunnan, and southeastern Tibet [70,71,87]. Most of the Pinus densata forests are even-aged pure plantations or Pinus densata-dominated secondary mixed forests [71,72]. Because of the growth characteristics of the species, stand age observations are relatively easily obtained by counting the layers of branches along the trunk of each tree. Moreover, in China, the aforementioned forest inventories for forest management and planning are conducted nationally every five to ten years depending on the area, and they often produce spatial distribution maps of stand age. Thus, obtaining stand age is not difficult. On the other hand, for the forested areas and the tree species for which stand ages are often difficult to obtain, an alternative is the use of tree growth cones. However, this method is often time-consuming and costly and may also damage the trees. Thus, a cost-efficient method to create stand age maps should be developed in future studies. In addition, tropical rainforests often have not only various tree species but also different vertical and horizontal structural heterogeneities of canopy caused by different species, different growth, and different canopy sizes and structures, and their stand ages are difficult to obtain. The application of the proposed modeling method with the age dummy variable is, thus, limited in tropical rainforests. An alternative is modeling forest AGB by tree species or by canopy layer based on dominant species. Such further studies are very challenging but important and urgently needed.
Although there is a limitation for tropical rainforests, the proposed method can be widely applied to improve the estimation accuracy of forest AGB for even-aged pure or single-species-dominated forests, especially planted forests. China has 62,000,000 ha of plantations, which account for 40% of the plantations in the world. The plantations are often even-aged pure or single-species-dominated forests, and their stand ages are easily obtained. The plantations play an important role in mitigating carbon concentrations in the atmosphere. Thus, the proposed method offers great potential for increasing the estimation accuracy of forest AGB for the plantations.

Method Comparison
Parametric and nonparametric algorithms have been widely used for biomass estimation of forests with remotely sensed data [19]. In this study, two parametric models (LM and LMC) and two nonparametric algorithms (ANN and RF) were used to estimate the AGB of Pinus densata forests. It was found that, compared with LM, LMC slightly improved the estimation of Pinus densata forest AGB because of the use of combined variables in LMC. Moreover, compared with the two parametric models, the two nonparametric methods led to significantly higher estimation accuracy of the Pinus densata forest AGB by reducing the overestimations for the forests with smaller AGB values and the underestimations for the forests with larger AGB values. This implied that the two nonparametric methods had a stronger ability to capture the heterogeneity of the Pinus densata forest AGB. This finding was similar to previous studies [22,36,37,79,80,88-92]. The use of the age dummy variable further increased the estimation accuracy of the Pinus densata forest AGB for all the models, but the improvement in the estimation accuracy was relatively smaller for the two nonparametric models because of their higher accuracy when stand age was not used as a dummy variable.

Conclusions
When optical remote sensing images such as Landsat imagery are used to estimate forest AGB, overestimations and underestimations often take place for forests with small and large AGB values, respectively. Reducing the overestimations and underestimations is very important but difficult, mainly due to the mixed structures of tree canopies with shrubs, grass, and soil in young forests and the insensitivity and saturation of spectral reflectance in forests with multilayered canopy structures and high AGB values. In this study, a novel method was proposed to improve the estimation accuracy of Pinus densata forest AGB using Landsat 8 OLI images by reducing the overestimations and underestimations. In this method, a stand age dummy variable was introduced into two parametric models and two nonparametric models, leading to four new models. The models with and without the age dummy variable were compared to investigate the effect of the age dummy variable on reducing the overestimations and underestimations. The results led to the following conclusions: (1) the two nonparametric algorithms performed better in fitting and prediction for the modeling and test plots than the two parametric algorithms, and the prediction differences between the two kinds of models were statistically significant based on RMSE for all age group stands and the pooled dataset; (2) the models with the age dummy variable statistically significantly improved the estimation accuracy of Pinus densata forest AGB compared with the corresponding models without the age dummy variable for all age group stands and the pooled dataset, by greatly reducing the overestimations for the plots with smaller AGB values and the underestimations for the plots with larger AGB values; and (3) the texture measures derived from the Landsat 8 OLI images had higher correlations with the Pinus densata forest AGB than the original spectral bands and other transformations. This implied that the two nonparametric models, coupled with the use of the age dummy variable and texture measures, offered a great potential for improving the estimation accuracy of the Pinus densata forest AGB.

Figure 1. The methodological framework of estimating the forest aboveground biomass (AGB) using models by incorporating the age dummy variable based on the Landsat 8 OLI images and the sample plot data.

Figure 2. (a) The location of the study area; (b) the study area, Shangri-La City, shown by a composite image; and (c) the spatial distribution of Pinus densata forests according to the forest management inventory (FMI) data in 2016 and the sample plots investigated in 2017.

Figure 3. The scatter plot of forest aboveground biomass (AGB) with diameter at breast height (DBH) from 100 sample trees.

Figure 4. The scatter graphs of the predicted plot AGB values against the observed or reference values based on the modeling dataset (n = 100): (a) linear regression model (LM); (b) linear models with combined variables (LMC); (c) artificial neural network (ANN); and (d) random forest (RF).

Figure 5. The scatter plots of the predicted plot AGB values against the observed values based on the modeling dataset (n = 100): (a) the linear regression with the age dummy variable (DLM); (b) the linear models considering combined variables and the age dummy variable (DLMC); (c) the artificial neural network with the age dummy variable (DANN); and (d) the random forest with the age dummy variable (DRF).

Figure 6. The statistical test results of the significant differences of mean errors from zero: (a) for the models without the age dummy variable and (b) for the models with the age dummy variable (LM: linear regression, LMC: linear models considering combined variables, ANN: artificial neural network, RF: random forest; DLM: LM with the age dummy variable, DLMC: LMC with the age dummy variable, DANN: ANN with the age dummy variable, DRF: RF with the age dummy variable; YF: young forest, MA: middle-age forest, NM: near-mature forest, MO: mature and overmature forest; * and ** represent the significance levels of 0.05 and 0.01, respectively).

Figure 7. The spatial distributions of the predicted aboveground biomass (AGB) values of the Pinus densata forests using eight models (LM: linear regression without the age dummy variable, LMC: linear models considering combined variables without the age dummy variable, ANN: artificial neural network without the age dummy variable, RF: random forest without the age dummy variable; DLM: linear regression with the age dummy variable, DLMC: linear models considering combined variables with the age dummy variable, DANN: artificial neural network with the age dummy variable, and DRF: random forest with the age dummy variable).

Table 1. The parameters of the three Landsat 8 Operational Land Imager (OLI) images.

Table 2. The spectral variables (SV) derived from eight bands of the Landsat 8 OLI images.

Table 3. The statistics of the sample plot AGB values (MAS (year): mean age of stands; AGB (Mg/ha): aboveground biomass).

The statistics of the plot AGB values are shown by stand age group in Table 3. The mean stand ages of the modeling plots, test plots, and all the sample plots were 37.77 years, 38.02 years, and 37.85 years, with maximum AGB values of 344.38 Mg/ha, 235.35 Mg/ha, and 344.38 Mg/ha, respectively. Their sample means were 112.15 Mg/ha, 115.43 Mg/ha, and 113.20 Mg/ha, with variation coefficients of 54.25%, 49.21%, and 52.47%, respectively. The sample means were not significantly different from each other at the significance level of 0.05.

Table 5. The evaluation results of the four models without the age dummy variable used to fit the modeling dataset (n = 100; LM: linear regression model, LMC: linear regression model with the combined variables, ANN: artificial neural network, RF: random forest; R²: coefficient of determination, and RMSE: root mean square error).

Table 6. The evaluation results of the four models with the age dummy variable based on the modeling dataset (n = 100; DLM: the linear regression model with the age dummy variable; DLMC: the linear models with combined variables and the age dummy variable; DANN: the artificial neural network with the age dummy variable; DRF: the random forest with the age dummy variable; R²: the coefficient of determination, RMSE: the root mean square error).

Table 7. The coefficients of determination (R²) and root mean square errors (RMSE) between the reference and predicted values of AGB based on the test dataset (LM and DLM are linear regression without and with the age dummy variable, respectively; LMC and DLMC are linear models considering combined variables without and with the age dummy variable, respectively; ANN and DANN are artificial neural network without and with the age dummy variable, respectively; RF and DRF are random forest without and with the age dummy variable, respectively; for each stand age group, the greatest R² values and the smallest RMSE values from the models with and without the age dummy variable are highlighted).