Biomass Estimation Models for Six Shrub Species in Hunshandake Sandy Land in Inner Mongolia, Northern China

Shrub biomass estimation is valuable in assessing ecological health, soil and water conservation capacity, and carbon storage in arid areas, where trees are scattered and shrubs are usually dominant. Most shrub biomass estimation models are derived from models designed for trees, yet shrubs and trees differ significantly in morphology, and biomass estimation methods designed specifically for shrubs are still lacking. This study aimed to test the performance of various predictors in estimating shrub biomass, and in particular to provide an improved cone frustum volume model as a new predictor. Seven variables, including three single measurements and four composite variables, were selected as predictors in allometric models. Six dominant shrub species of different sizes and morphology in the semi-arid Hunshandake Sandy Land in Inner Mongolia were selected as samples to test the performance of the seven predictors in above-ground biomass estimation. Results showed that the single measurements performed poorly and were not suitable for shrub biomass estimation. The allometric models with crown-related volumes as predictors performed much better and were considered ideal for common shrub biomass estimation. The improved cone frustum volume model had a more flexible geometry for shrubs of different shapes and sizes, and showed high fitting accuracy and stability among all the volume predictors. Therefore, we recommend the volume of an inverted cone frustum, with crown diameter and ground diameter as the long and short diameters, as an excellent predictor for shrub biomass estimation, especially when a study involves various shrub species and a general model is needed.


Introduction
Dry lands constitute 40% of the global land area, and 37% of the world's human population occupies dry lands [1]. In China, arid and semi-arid land accounts for 50% of the country's land area, and shrubs there cover an area of 2 × 10⁶ km², twice the forest cover [2]. Shrubs are naturally the dominant species and should be considered ecological key species within many ecosystems in arid areas. They perform a broad range of functions, such as protecting the soil, maintaining biodiversity, sequestering carbon, and providing habitat for animals [3][4][5][6][7].
An ecosystem of Elm (Ulmus pumila L.) woodland characterizes the climatic climax community in the semi-arid Hunshandake Sandy Land in northern China, and the vegetation is represented by sparse Ulmus pumila and abundant shrub communities [8]. Shrubs account for a large proportion of the ecosystem's total biomass and play an essential role in sand-dune stabilization and other ecological functions. A better understanding of the shrub biomass in Hunshandake Sandy Land will be of great significance for evaluating the ecosystem's structure and function. In addition, accurate ground biomass data is essential for correcting remote sensing data to estimate large-scale biomass distribution [9,10].
Rapid and non-destructive methods are needed for biomass estimation because pruning and weighing large plants is laborious and costly, and because damage to the ecological environment should be avoided [11][12][13][14][15]. An allometric model describing the relationship between biomass and easily measured variables is a non-destructive and cost-effective method. It is considered the most accurate indirect method to estimate woody plant biomass and has been widely used in forestry [16,17]. Easily measured variables, such as diameter at breast height, height, crown size, and combinations of these variables, have been widely used to estimate tree biomass [18][19][20][21]. In most cases, successful relationships are developed from these simple measurements for trees [22].
Similarly, biomass estimation for various shrub species was developed using different dimensional measurements [14,23,24]. However, some of the variables, especially stem-related variables and height, have been shown to predict tree biomass accurately but often perform poorly in shrub biomass estimation [21,22][25][26][27]. Shrubs have distinct morphological characteristics compared to trees, with multiple stems from the plant's base and shorter height. Predictors especially suitable for shrubs need to be developed, and many researchers have paid attention to this problem and made great efforts to address it. Ludwig et al. [16] showed that the volume and canopy area were generally suitable variables for estimating shrub biomass in desert regions. Conti et al. [24] tested the performance of crown-related variables relative to stem-related variables in predicting shrubs' above-ground biomass in the semi-arid Chaco forest in central Argentina. Usó et al. [28] used three volume models as independent variables to estimate Mediterranean shrubs' biomass and compared the estimation accuracy of the three models. Huff et al. [25] showed that the crown area had a good prediction capability for shrub biomass, and that shrub height did not increase prediction accuracy. Sternberg & Shoshany [29] used an inverted cone, upper-half spheroid, and cylinder to fit different species' shapes for estimating individual wood biomass in Israel.
Based on previous studies, this study further improved a predictor particularly suitable for shrub biomass estimation and assessed the new predictor's goodness of fit against six other commonly used predictors. This study provides more options for researchers who need an accurate estimation of shrub biomass, and gives some suggestions for choosing the appropriate method.

Study Area
The Hunshandake Sandy Land lies in the middle of the Inner Mongolia Autonomous Region and covers an area of 18,000 km² (42°-44° N, 113°-118° E) (Figure 1). The Sandy Land belongs to the eastern part of the desert belt in northern China, which stretches across the sub-humid, semi-arid, and arid climate zones. The average annual temperature is 0.9 °C-5.5 °C, the average yearly precipitation is 250-400 mm, and the annual evaporation is 2000-2700 mm. More than 98% of dunes are fixed or semi-fixed by vegetation. The Ulmus pumila woodland is the native top plant community in the Hunshandake Sandy Land and is the most widely distributed, especially in the central and eastern regions. Ulmus pumila is the only tree species in the study area, concentrated on the shady slopes of sand dunes and scattered on the gentle slopes between the dunes. Under the canopy of the trees, typical shade-tolerant undergrowth is lacking, so a real forest environment cannot form. Shrubs are distributed mainly in the depressions between sand dunes, where better water conditions are available. Ribes (Ribes diacantha Pall.) (S1), Betula

Above-Ground Biomass Sampling
The sample site, about 100 hectares in area, was established in 2013. All shrubs identified inside the sample site were measured, yielding a total of 30,000 individuals. The six dominant shrub species S1 to S6 were selected as samples, and an average of 20 individuals per species were randomly selected to cover the range of possible plant sizes. Five variables were measured for each individual plant using a long rod with a scale: total height (h, cm), defined as the distance between the ground surface and the highest crown point; maximum crown diameter (a1, cm) and its perpendicular diameter (b1, cm); and maximum diameter at the base (a2, cm) and its perpendicular diameter (b2, cm) (Figure 3). The maximum crown diameter was measured at the height where the crown width was visibly the largest. The maximum diameter at the base was measured at ground level by passing the rod through the bush branches.
The total above-ground live biomass (with leaves) of the 20 shrubs per species was harvested, weighed with a hanging scale (Nops Goldenlark OEM BT-203, accuracy = 0.01 kg), and recorded. A certain proportion of each individual's branches was stored in plastic bags as samples. The samples were transported to a laboratory and dried in a forced-air oven at 80 °C until constant weight. Each individual's water content was then calculated to convert fresh weight into dry weight, and the dry weight was used in the biomass models in this study. The descriptive statistics of the shrub measurements and the biomass data are shown in Table 1.

The Seven Models for Estimating Above-Ground Biomass of Shrub
Seven estimation methods for shrub biomass were classified into two kinds of fitting strategies. The models of the first kind were allometric models with single measurements as predictors, namely, total height (H), crown diameter (CD), and ground diameter (GD). The models of the second kind were allometric models with four different volumes as predictors, namely, V1, the volume of a circular cylinder with CD as the diameter; V2, the volume of a circular cylinder with GD as the diameter; V3, the volume of an elliptical cylinder with a1 and b1 as the long and short diameters, respectively; and V4, the volume of an inverted cone frustum with CD and GD as the long and short diameters, respectively. The mathematical expressions of these predictors are presented in Table 2, and the geometric representations are shown in Figure 3.
The parameters in the formulas correspond to those in Figure 3: a1 is the maximum crown diameter and b1 is its perpendicular diameter; a2 is the maximum ground diameter and b2 is its perpendicular diameter.
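As an illustration of how the four volume predictors could be computed from the five field measurements, the following Python sketch uses the standard geometric formulas for a circular cylinder, an elliptical cylinder, and a cone frustum. Table 2 is not reproduced in this section, so two points are assumptions rather than the authors' stated definitions: CD is taken as the mean of a1 and b1, and GD as the mean of a2 and b2. The function names are hypothetical.

```python
import math


def mean_diameter(d_max, d_perp):
    """Mean of a maximum diameter and its perpendicular diameter.

    Assumption: CD and GD are defined this way; the source table is not shown.
    """
    return (d_max + d_perp) / 2.0


def volume_predictors(h, a1, b1, a2, b2):
    """Return V1-V4 (cm^3) for one shrub; all inputs are in cm."""
    cd = mean_diameter(a1, b1)  # crown diameter (assumed definition)
    gd = mean_diameter(a2, b2)  # ground diameter (assumed definition)
    # V1: circular cylinder with CD as the diameter
    v1 = math.pi * (cd / 2.0) ** 2 * h
    # V2: circular cylinder with GD as the diameter
    v2 = math.pi * (gd / 2.0) ** 2 * h
    # V3: elliptical cylinder with a1 and b1 as the long and short diameters
    v3 = math.pi * (a1 / 2.0) * (b1 / 2.0) * h
    # V4: inverted cone frustum with CD (top) and GD (bottom) as diameters
    v4 = math.pi * h * (cd ** 2 + cd * gd + gd ** 2) / 12.0
    return {"V1": v1, "V2": v2, "V3": v3, "V4": v4}
```

With these formulas, V4 equals V1 when GD equals CD and equals one third of V1 when GD is zero, which matches the cylinder and inverted-cone cases of the frustum.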
Separate H and CD measurements were selected because they are the basic measurements of shrubs and are widely used in studies [25,30]. V1, V2, and V3 were selected because they are widely used in research and have definite geometric significance [7,12,31]. In this study, V4, whose geometry is more similar to that of actual shrubs, was introduced as a new predictor to improve estimation accuracy.
To obtain the most accurate biomass estimation models, the main model forms proposed in the literature, including linear, exponential, logarithmic, and power functions, were tested initially [19,25,32,33]. The power function was finally selected to establish the allometric models because it generally performed well for all six shrub species and predictors:
$$y = a x^{b} \qquad (1)$$
where y is the above-ground shrub biomass, x is the predictor reflecting shrub size, and a and b are the regression coefficients.
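A minimal sketch of how the power function y = a·x^b could be fitted is given below. It uses ordinary least squares on the log-transformed form ln(y) = ln(a) + b·ln(x), which is one common approach. The study reports using SPSS and does not state the fitting procedure in this section, so this is not necessarily the authors' method, and a nonlinear least-squares fit on the original scale can give somewhat different coefficients. The function name and the sample values are hypothetical.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on ln(y) = ln(a) + b * ln(x).

    Requires strictly positive x and y. Returns (a, b).
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(lx, ly))
    b = sxy / sxx                       # slope on the log-log scale
    a = math.exp(mean_ly - b * mean_lx)  # back-transformed intercept
    return a, b


# Hypothetical volumes (cm^3) and dry biomass (kg); not data from the study.
volumes = [1.0e4, 5.0e4, 2.0e5, 8.0e5]
biomass = [0.002 * v ** 0.8 for v in volumes]
a, b = fit_power_law(volumes, biomass)
```

Because the synthetic biomass values follow an exact power law, the fit recovers the coefficients used to generate them; with field data the residuals on the log scale would be nonzero.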

Regression Fitting and Accuracy Evaluation of the Seven Models
Pearson's correlation coefficients between the predictors and the biomass were calculated as a preliminary indication of each predictor's estimation capability. The best statistical model was selected according to the bias-corrected Akaike Information Criterion (AICc), a likelihood criterion that penalizes the number of parameters [34]. We also reported the coefficient of determination (R²), the root mean square error (RMSE), and the regression's p-value as alternative statistics reflecting the fitting accuracy.
The Akaike Information Criterion (AIC) has been widely accepted for measuring the goodness of fit within a cohort of nonlinear models and is frequently used for model selection [7,25,34,35]:
$$\mathrm{AIC} = 2p - 2\ln(L)$$
where p is the number of parameters and ln(L) is the maximum log-likelihood of the estimated model. To provide a fair comparison, we employed an AIC variant that corrects for small sample sizes, the bias-corrected AIC (AICc) [34,35]:
$$\mathrm{AICc} = \mathrm{AIC} + \frac{2p(p+1)}{n - p - 1}$$
where n is the sample size and p is the number of parameters. The best statistical model, with the highest R² and the lowest RMSE and AICc, was selected [7,12,32]. All the statistical analyses were performed using IBM SPSS Statistics (Version 25.0, Armonk, NY, USA).
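The fit statistics named above can be sketched in Python as follows. The AIC is computed as 2p − 2 ln(L) and the AICc adds the small-sample term 2p(p + 1)/(n − p − 1). Two choices here are assumptions and may differ from what SPSS reports: the log-likelihood is the Gaussian least-squares form computed from the residual sum of squares, and the RMSE divides by n rather than by the residual degrees of freedom. The function name is hypothetical.

```python
import math


def fit_statistics(observed, predicted, p):
    """Return R^2, RMSE, AIC and AICc for a fitted model with p parameters.

    Assumes Gaussian errors, so that the maximum log-likelihood is
    ln(L) = -n/2 * (ln(2 * pi * RSS / n) + 1). Requires n > p + 1 and RSS > 0.
    """
    n = len(observed)
    rss = sum((o - q) ** 2 for o, q in zip(observed, predicted))
    mean_obs = sum(observed) / n
    tss = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - rss / tss
    rmse = math.sqrt(rss / n)
    log_l = -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)
    aic = 2.0 * p - 2.0 * log_l
    aicc = aic + 2.0 * p * (p + 1) / (n - p - 1)
    return {"R2": r2, "RMSE": rmse, "AIC": aic, "AICc": aicc}
```

For the two-parameter power function, p = 2 counts a and b; some conventions also count the error variance as a parameter, which shifts every model's AICc by the same amount when the models share n and p, and so leaves the ranking unchanged.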

Correlation Analysis between Predictors and Shrub Biomass
Pearson's correlation coefficients, shown in Table 3, gave a preliminary indication of each predictor's ability to estimate shrub biomass. Among the three single measurements (H, CD, and GD), CD had the highest correlation coefficient with the biomass of the six shrub species, a result consistent with previous research [25,30]. Compared with the single measurements, the volume models V1, V3, and V4 generally had a better correlation with shrub biomass. Among all the predictors, the improved volume model V4 had the highest correlation with shrub biomass.
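For reference, Pearson's correlation coefficient between a predictor and biomass can be computed as in the short sketch below. The function name is hypothetical, and the calculation is the textbook definition rather than output from the study's SPSS analysis.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    sxx = sum((u - mean_x) ** 2 for u in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)
```

Pearson's r measures linear association only, so a predictor related to biomass through a strongly curved power function can show a lower r than its fitted allometric model would suggest.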

Fitting Accuracy Evaluation of the Seven Models
Based on the allometric model (Equation (1)), seven fitting models with different variables as predictors were established, and the model parameters are shown in Table 4. The block diagram in Figure 4 shows the mean and variation range of AICc, R², and RMSE, indicating the fitting accuracy and the fitting stability across shrub species. Among the seven fitting models, the volume models (except V2) performed best, with lower AICc, an average R² > 74%, and RMSE < 0.50. V2 did not perform as well as the other three volume models, which may be related to the weak correlation between GD and shrub biomass (Table 3). V4 further improved the fitting accuracy, with a higher average R² and lower AICc and RMSE than V1 and V3. The variation of R², AICc, and RMSE of the V4-biomass models was relatively small across the six shrub species, which indicated that the V4-biomass model had a more stable fitting accuracy for different shrub species. Among the three single measurements, CD (average R² > 69%, RMSE < 0.54, AICc < −20) performed better than H and GD (average R² < 52%, RMSE > 0.70, AICc > −11). Nevertheless, compared to the volume-biomass models (except V2), the AICc, R², and RMSE of the CD-biomass models varied widely across the six shrub species, suggesting that the CD-biomass model may not be accurate when fitting some shrub species.
Correlation analysis and fitting accuracy assessment showed that the V4-biomass allometric model gave the best estimates of shrub biomass among the seven models. The fitting curves and parameters of the V4-biomass allometric models for the six species are shown in Figure 5; the mathematical structure and the model parameters are the same as in Equation (1). All equations were statistically significant (p < 0.01), and R² indicated that the models explained 63%-86% of biomass variability for the six shrub species. RMSE showed that approximately 95% of the observations fell within ±1.23% (double RMSE) of the fitting line.

Discussion
In recent years, with growing attention to global warming and the carbon cycle, more and more researchers have begun to estimate the carbon sequestration capacity of vegetation in arid regions, where the vegetation is generally dominated by shrubs and herbs [36]. To study shrubs' carbon sequestration capacity, it is first necessary to estimate their biomass more accurately [4]. However, the current shrub biomass estimation methods are not yet mature, and most of them are derived from methods of tree biomass estimation. For trees, a single measurement such as diameter at breast height or height can generally give good prediction results, but this does not apply to shrubs because the shape of shrubs is very different from that of trees and is more variable. Therefore, various compound predictors have appeared to meet researchers' needs for shrub biomass estimation [30,33].
Nevertheless, researchers are easily confused when choosing among predictors because the studied shrub species may be very different, and it is hard to determine which predictor is suitable. This research attempts to find and improve predictors that are suited to different shrub species. Therefore, we selected six shrub species with different shapes and sizes and tested the performance of seven predictors for biomass estimation, including the new predictor (V4) that we improved.
Among the seven predictors, the single measurements performed relatively poorly compared to the volume models. Brown [37] established a suitable relationship between individual stem diameter and biomass for shrubs in the northern Rocky Mountains. Still, in our study, it was impractical to measure each stem's basal diameter because shrub species in our study area usually had dozens of stems at the base. Hence, we used the diameter of the area occupied by the shrub stems on the ground as GD rather than the diameter of the individual shrub stems. Among the three single measurements, CD performed much better than GD and H according to correlation analysis, which was consistent with the results of many other studies [5,23]. However, all three single measurements performed poorly in the fitting analysis. The error varied significantly, indicating that the accuracy of models with single measurements is unstable across shrub species. The reason might be that the morphology of the shrub species was quite different, and it was difficult to accurately characterize the size of a shrub with a single measurement. Therefore, compound predictors have to be considered when simulating shrub biomass.
Among the four compound predictors, V1, V3, and V4 performed well, showing a good correlation with biomass and high fitting accuracy. The volume model V2, which is associated with GD, did not perform as well as the three other volume models, which was related to the weak correlation between GD and biomass (Table 3). V1 and V3, which are associated with CD, performed much better in biomass estimation, consistent with the strong correlation between CD and biomass. Conti et al. [24] also confirmed the superior performance of CD-related variables in estimating shrub biomass in their studies.
Murray and Jacobson [12] believed that surface areas and volumes could be derived from individual plants' shapes, thereby further improving the development of biomass predictors for shrub species. Sternberg & Shoshany [29] tried different volume models to improve the estimation accuracy of shrub biomass in Israel. In this study, the V4-biomass model showed the best fitting accuracy and stability for different shrub species. Our study's six shrub species had significantly different shapes and sizes, and the V4 model showed a good fitting effect on most of them. This is because the geometry of the V4 model is flexible and can be adapted to different shrub shapes. For example, if the GD is close to 0, the geometric shape is similar to an inverted cone; if the GD is less than the CD, it is an inverted cone frustum; and if the GD is close to the CD, the geometric shape is similar to a cylinder. These geometric shapes can represent most shrub morphologies, suggesting that V4 can be more widely used in different shrub species.
In many studies, the target shrubs belong to multiple species whose shapes are very different. Researchers therefore need to use different volume models to estimate the volume of all shrub species and obtain better biomass estimates. The flexibility of V4 can satisfy the volume simulation of different shrub shapes, provided that the shrubs are fully measured according to the routine, without worrying about the differences between individual shrubs. Hence, if V4 is used as the shrub volume estimation method and the volume is then used as a predictor to establish a V4-biomass model, the research method can be simplified, and a more accurate and reliable biomass estimate can be obtained.
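The limiting cases of V4 described in the discussion can be checked against the standard volume formula for a cone frustum. The expression below assumes that CD and GD are the top and bottom diameters and h is the height; it is the textbook frustum formula, not a reproduction of the study's Table 2.

```latex
V_4 = \frac{\pi h}{12}\left(CD^{2} + CD \cdot GD + GD^{2}\right)

% GD -> 0: inverted cone with base diameter CD
\lim_{GD \to 0} V_4 = \frac{\pi h}{12}\, CD^{2} = \frac{1}{3}\,\pi \left(\frac{CD}{2}\right)^{2} h

% GD = CD: circular cylinder with diameter CD
V_4\big|_{GD = CD} = \frac{\pi h}{12}\cdot 3\, CD^{2} = \pi \left(\frac{CD}{2}\right)^{2} h
```

Under this assumption, V4 moves continuously from one third of the cylinder volume to the full cylinder volume as GD increases from 0 to CD, which is the flexibility the discussion attributes to the predictor.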
However, V4 requires more measurements, which may be considered laborious in practice. Thus, we believe that if shrub measurements are sufficient, V4 would be the best predictor for shrub biomass estimation. Otherwise, the crown-related volumes V1 and V3 would be appropriate predictors.
In this study, we uniformly adopted a power function to establish allometric models. The linear function is also commonly used, and the relationship between volume (V1, V3, V4) and shrub biomass also appeared approximately linear in this study (Figure 5). This article focused on selecting predictors and did not discuss the functional form in detail, although we tested linear, exponential, logarithmic, and polynomial functions before choosing power functions to construct the allometric models. No matter which functional form we chose, the goodness of fit followed the same order: crown-related volumes > CD > H > GD. Therefore, we believe that, in addition to the functional form, the selection of independent variables (predictors) is of great significance to the effectiveness of the model and deserves sufficient attention.

Conclusions
Our study showed that the single measurements were not suitable for estimating shrub biomass. The crown-related volumes performed much better than single measurements when used as predictors for shrub biomass estimation. Among the crown-related volumes, the improved predictor, the volume of an inverted cone frustum with crown diameter and ground diameter as the long and short diameters, performed best in terms of fitting accuracy and stability across shrub species. Consequently, if a study involves various shrub species and a general model is needed to estimate shrub volume accurately and then use that volume as a predictor of biomass, we suggest that the volume of an inverted cone frustum with crown diameter and ground diameter as the long and short diameters would be an ideal choice.