Allometric Models Based on Bayesian Frameworks Give Better Estimates of Aboveground Biomass in the Miombo Woodlands

The miombo woodland is the most extensive dry forest in the world, with the potential to store substantial amounts of biomass carbon. Efforts to obtain accurate estimates of carbon stocks in the miombo woodlands are limited by a general lack of biomass estimation models (BEMs). This study aimed to evaluate the accuracy of most commonly employed allometric models for estimating aboveground biomass (AGB) in miombo woodlands, and to develop new models that enable more accurate estimation of biomass in the miombo woodlands. A generalizable mixed-species allometric model was developed from 88 trees belonging to 33 species ranging in diameter at breast height (DBH) from 5 to 105 cm using Bayesian estimation. A power law model with DBH alone performed better than both a polynomial model with DBH and the square of DBH, and models including height and crown area as additional variables along with DBH. The accuracy of estimates from published models varied across different sites and trees of different diameter classes, and was lower than estimates from our model. The model developed in this study can be used to establish conservative carbon stocks required to determine avoided emissions in performance-based payment schemes, for example in afforestation and reforestation activities.


Introduction
The miombo is the largest continuous dry deciduous forest in the world.It extends across much of Central, Eastern and Southern Africa including parts of Angola, the Democratic Republic of Congo, Malawi, Mozambique, Tanzania, Zambia and Zimbabwe [1].The woodlands are rich in plant diversity and have the potential to contain a substantial amount of carbon, up to 39.6 Mg ha 1 in aboveground biomass [2][3][4].Though this is only 20% of Sub-Saharan African equatorial forests, which are estimated to stock between 72 and 152 tonnes of carbon per hectare [5], this carbon pool could be significant considering that miombo covers 2.7 million km 2 [6].However, there is significant uncertainty in the amount of biomass carbon in miombo woodlands; only a few studies report reliable estimates, making the estimation of carbon balance within the ecosystem uncertain.
The miombo woodland vegetation is highly heterogeneous [1,7,8], a phenomenon that makes it difficult to assess biomass.The density, patch size and shape (configuration) of tree stands are heavily influenced by variations in environmental conditions and human-induced disturbance factors (e.g., selective harvesting, frequent fires and grazing).These factors reshape the community structure and geometry of individual trees [9].Naturally, there are distinct woodland types, such as wet miombo and dry miombo, with different floristic characteristics [6].Most are dominated by Brachystegia, Julbernardia and Isoberlinia, but others are dominated by Colophospermum mopane (Benth.),Vachelia, Combretum, and Terminalia.In addition to the natural variations, the miombo woodlands have also undergone changes due to heavy use for supply of wood fuel.Carbon stocks in the miombo woodlands show spatial variability because of these differences in growth conditions and species composition [10].
Efforts to integrate information on the miombo woodlands' carbon stocks in national forest assessment programs are often limited by a lack of BEMs.Biomass models present advantages for biomass estimation because of their capacity for non-destructive estimation, which, when developed, yields high accuracy, applications over large areas of calibration, and a potential for follow-up measurements.Biomass models are typically developed and tested from destructive sampling, although there have been recent attempts with non-destructive approaches such as fractal branch analysis [11].A large number of BEMs have been developed for sub-Saharan Africa from destructive harvest data [12], with models archived in the GlobAllomeTree database [13].The wide range of BEMs presents challenges to users including the choice between: (1) existing models and developing new ones; (2) species-specific and mixed-species models; (3) simple bivariate power-law functions and models with multiple predictors; and (4) single predictors (e.g., DBH) and groups of predictors (e.g., DBH in combination with height, and/or wood density) [14].Ultimately, the choice of model affects the accuracy of the resultant carbon estimates.This study builds on those discussions to address the challenge of choosing between existing models and developing new ones.
A study by Kuyah et al. [15] reported a general lack of biomass models for tree species and environmental conditions in Malawi, possibly because earlier efforts in the region were directed towards wood volume models [16,17].Of the more than 800 models reviewed by Henry et al. [12], only 21 volume models (thirteen species specific and eight generalized) are available for Malawi.These models thoroughly cover the species commonly found in the miombo woodland, i.e., Brachystegia boehmii (Taub.),Brachystegia floribunda (Benth.),Brachystegia spiciformis (Benth.),Brachystegia utilis (Hutch.& Burtt Davy), Julbernardia paniculata (Benth.)and Pterocarpus angolensis (DC.).However, they focused mainly on the bole and the merchantable compartments, and estimate biomass through conversion of modeled bio-volume using wood density.Estimation of biomass in the country therefore largely relies on general purpose models cited in Brown [18][19][20], and those from neighboring miombo countries such as Mozambique [21], and Tanzania [8,22,23].
The lack of BEMs can constrain a country's ability to develop a database on the biomass available within and outside forests, the potential to participate in the global carbon trade through Reducing Emission from Deforestation and Forest Degradation (REDD+), and the national obligation to communicate to the United Nations Framework Convention on Climate Change emissions and removals from all sources and sinks.There is need therefore to either identify appropriate allometric models from the literature or develop new valid models that account for the diversity of trees within the miombo woodland ecosystem in Malawi.We develop a simple diameter-based power-law model for estimating aboveground biomass in the miombo woodlands and associated land use systems.We further demonstrate how the model compares with existing regional models commonly used in the miombo eco-region of southern Africa.

Study Site
The study was conducted in the districts of Kasungu, Salima and Neno in Malawi.Kasungu and Salima are located in the central region while Neno is located in the southern region of Malawi.Table 1 provides a detailed description of the location, climatic conditions and edaphic characteristics of each site.The natural vegetation across the three sites consists mainly of miombo woodland interspersed with marsh and grassy river channels in Kasungu, scattered fallows with dry grasslands in Salima, and montane grassland in Neno.The three sites are characterized by declining tree cover, particularly on communal land.Reducing tree cover is attributed to increasing pressure from agricultural expansion from an increasing population.Trees are also felled for wood fuel and charcoal, which supplies most of the domestic energy needs, for curing bricks in Neno, and for curing and construction of tobacco-curing structures in Kasungu.

Selection of Trees for Harvesting
Trees for building allometric models were sampled from three 10 km ˆ10 km sites located in the districts of Kasungu, Salima and Neno.The sites are part of 100 km 2 benchmark designed by the Africa Soil Information Service (AfSIS) for surveillance of land degradation in Malawi [24].Each 100 km 2 site is divided into 16 clusters of 2.5 km ˆ2.5 km; each cluster has 10 randomly placed centroids that can be used to draw plots.Out of the 10 centroids per cluster, 3 centroids were randomly selected and located using a GPS device.Plots measuring 30 m ˆ30 m were drawn for selected centroids, and an inventory of all trees within the plot boundary generated.This inventory was used to inform the diversity of species and sizes, and to create the list of trees for destructive sampling.
Trees inventoried in each site were grouped into six diameter classes of below 10 cm, between 10-20, 20-30, 30-40, 40-50, and above 50 cm to cover the entire range of sizes spanned by the trees in the area.A randomized pre-sample of trees from each site was generated from the inventory list with respect of the stratified diameter class.Trees to be harvested were chosen through a blind selection without tree species association.In addition to random selection of trees for harvesting in each diameter class from the inventory list, larger trees which could not be found in the inventoried plots were purposively sought in the landscape.Tree DBHs (measured at 1.3 m above the ground) greater than 2 cm were considered for harvesting.The lower DBH threshold of 2 cm was adopted to take into account the high number of low biomass trees in the study [25,26].

Tree Measurements and Biomass Sampling
A total of 88 trees ranging in DBH from 5.1 to 105 cm were harvested in 28 plots: 6 plots in Kasungu, 17 plots in Salima, and 5 plots in Neno.There were 33 species of tree.A summary of the characteristics of the dataset are presented in Table 2.The DBH of each tree was measured using diameter tape.The diameters were determined over-bark to the nearest 0.1 cm, with the tape held horizontally and tightly at the stem.Trees with abnormalities, including multi-stemmed trees were measured using guidelines described by West (2009).To determine the crown area, the largest extension of the crown was visually identified, and the crown edge of tall trees was located using a clinometer and a percent scale.The crown diameter was measured twice with a measuring tape: the crown diameter at its widest point (l), and the distance directly perpendicular to the maximum extension, at the same height (w).The crown area (CrA) in m 2 was then calculated using the formula of an ellipse: The species name was recorded, and, when scientific names could not be established in the field, the local name was provided by the local people who participated in the data collection.
Selected trees were felled by cutting at the lowest possible point using a chainsaw.The length of the felled tree was measured along the longest axis with a measuring tape.This measurement was used as the total tree height in the analysis.Felled trees were separated into stem, branches and twigs (leaves and small branches).The stem and larger branches were cut into smaller pieces of approximately 1 m in length.The fresh weights of the smaller stem and branch sections, as well as that of the leaves, were determined in the field on a 300-kg weighing scale to the nearest 0.1 decimal.

Sub-Sampling and Drying
Sub-samples were taken from the stems, branches and twigs for the determination of their fresh weight in the field and dry weight in the laboratory.Three discs, about 3 cm thick, were taken around the DBH, at the middle of the stem and towards the end of the stem axis.Similarly, three small discs, about 3 cm thick, were cut from the main branches at the proximal, middle and end of the branch.The three discs were taken across the stem and branch axis to account for possible variation in wood density.The samples were weighed on a 3 kg balance, labeled and transported to the laboratory for dry weight determination.In the laboratory, the labeled samples were put in the oven and allowed to dry for about 24 h at 105 ˝C [27].Samples were subsequently removed from the oven and weighed to establish any change in weight, after which the dry weight was recorded.The ratio of sub-sample dry weight to fresh weight was used to convert component fresh weight to dry weight.

Development and Evaluation of Biomass Models
The total biomass of aboveground woody parts and the DBH of each tree were used to fit regression models to the logarithmic form (Model 2), or to the power-law function (Model 3), where a and b, respectively, are the proportionality and power coefficients of the models.The models were built for aboveground biomass with DBH alone as the predictor variable (Model 2), DBH in combination with the height (Model 4), and DBH in combination with the crown area (Model 5).
Allometric coefficients for biomass models were estimated from the data using hierarchical Bayesian analysis, and compared with those constructed using ordinary least squares (OLS) regression.In the hierarchical Bayesian, the analysis included a hierarchical structure to explicitly account for the nested structure of the dataset, i.e., trees nested within sites to account for dissimilarities among the sites in climate (Table 1) and vegetation structure (Table 2).Bayesian analysis was chosen because violation of statistical assumptions of error can lead to biased point estimates if OLS or maximum likelihood estimation are applied, especially with small samples [14].Bayesian estimation methods need a much smaller sample size to estimate parameters accurately [28], and the hierarchical Bayesian approach has recently been proposed for estimating allometric parameters [29,30].Unlike the traditional approaches, in the Bayesian framework, the model parameters are treated as random variables, and inferences are based on their posterior distributions given the data.Thus, the conditional probability of a, b and the dispersion parameter (ϕ) were estimated from the posterior distribution of samples using the GENMOD procedure of the SAS system.GENMOD uses a Markov Chain Monte Carlo simulation by Gibbs sampling to simulate samples from posterior distributions.By default, GENMOD assumes a uniform prior distribution on the regression coefficients.Since we did not have prior knowledge of the distribution of a and b, we compared various non-informative priors (normal, uniform, Jeffrey's) and decided on the one that gave the smallest deviance information criterion (DIC).We assumed the prior distribution of ϕ to be inverse gamma and we monitored convergence by running two chains with initial values of the parameters from the maximum likelihood estimation.We specified a burn-in period of 5000 iterations, and additional numbers of Monte Carlo chains (NMCs) ranging from 10,000 to 500,000 after each burn-in depending on the diagnostics (Gelman-Rubin, Geweke, Raftery-Lewis and Heidelberger-Welch).In some instances, the Raftery-Lewis diagnostic indicated failure to produce a valid first-order Markov chain in the posterior samples of the parameters after 100,000 NMCs.Therefore, the NMC was increased until the test passed the set criteria.The parameter estimates and their credible intervals were considered valid only when the analysis passed all the diagnostics.
The literature was reviewed to identify potentially suitable biomass models from countries with agro-ecological settings similar to Malawi.Only power-law models with information on the sample size, the range of diameters and the tree components included in building the allometry were selected for evaluation.These descriptive metadata were considered important since they indicate the valid range within which the models can be used.Table 3 provides a summary of biomass models considered useful, while noting that the differences in ecological conditions of the source data could be a major cause of bias in the application.The accuracy of published and developed models was evaluated by calculating the mean relative error (MRE), the root mean square error (RMSE), and the mean absolute percentage error (MAPE) in the total biomass of the harvested trees.Better models have lower MRE, RMSE, DIC, and MAPE.

Relationship between Aboveground Biomass and Predictor Variables
Trees harvested for development of allometric models thoroughly captured the variability of the miombo woodland vegetation in Malawi in terms of species diversity and tree sizes.This is important because the parameters of allometric models (a and b) have been shown to vary with tree species and ecological conditions [29].It has been recommended that allometric models be developed using a dataset of about 50 sample trees because sample sizes below this lead to underestimation of the allometric exponent [14].The sample size used was considered adequate to develop a generalizable model that is applicable to a large set of species of different sizes across the miombo woodland in Malawi.
The diameter at breast height was significantly correlated with the aboveground biomass of harvested trees, accounting for over 95% (p < 0.001) of the variation in aboveground biomass for trees harvested across the three sites (Figure 1a).The scatter plots of height and crown area against aboveground biomass show positive correlation, but also reveal greater variance compared to those of DBH (Figure 1b,c).The positive correlation observed between tree height or crown area and aboveground biomass is expected because of a fundamental allometric scaling law that operates between these variables.The greater variance between the height/crown area and aboveground biomass scatter is attributed to the variable tree height and crown area, even for trees of the same diameter class and species.This variation can be explained by differences in environmental condition and the historical disturbance levels at the different sites.For example, extraction of wood for fuel and charcoal production, livestock grazing and conversion of woodlands to cultivated fields are the main factors that change community structure and geometry of individual trees in the miombo [7,8,31].

Performance of New Allometric Models
Generalizable allometric models were developed for estimating biomass in the miombo woodlands in Malawi (Table 4).Evaluation of the models indicates that the quality of biomass estimates varies among models and depends on how the models are developed.A common approach to developing allometric models is to regress total tree biomass or the biomass of components against the DBH using ordinary least squares.A major limitation with such simple linear regression is heteroscedasticity-the variation in tree weight increases with increasing diameter.The data are therefore log transformed, although back transformation of logarithmic values to arithmetic units introduces bias that must be corrected [32].Model development techniques, such as the hierarchical Bayesian approach, have been recommended for circumventing many problems involved in allometry development [14].In this study, the Bayesian method performed better compared to ordinary least square regression.The DIC of 59.02 for the model built with ordinary least squares (Model 2) was larger than 51.04 for Bayesian (Model 1).The MAPE of 27.98% for ordinary least squares was larger than MAPE of 27.20% for Bayesian.This means that the Bayesian estimation is more accurate because even a small difference in absolute terms in the performance of two models indicates a better performance.Figure 2 also indicates that the model built using ordinary least squares tended to overestimate aboveground biomass.Previous studies have shown that the Bayesian approach provides a robust approach for estimating allometric coefficients [29,30].

Performance of New Allometric Models
Generalizable allometric models were developed for estimating biomass in the miombo woodlands in Malawi (Table 4).Evaluation of the models indicates that the quality of biomass estimates varies among models and depends on how the models are developed.A common approach to developing allometric models is to regress total tree biomass or the biomass of components against the DBH using ordinary least squares.A major limitation with such simple linear regression is heteroscedasticity-the variation in tree weight increases with increasing diameter.The data are therefore log transformed, although back transformation of logarithmic values to arithmetic units introduces bias that must be corrected [32].Model development techniques, such as the hierarchical Bayesian approach, have been recommended for circumventing many problems involved in allometry development [14].In this study, the Bayesian method performed better compared to ordinary least square regression.The DIC of 59.02 for the model built with ordinary least squares (Model 2) was larger than 51.04 for Bayesian (Model 1).The MAPE of 27.98% for ordinary least squares was larger than MAPE of 27.20% for Bayesian.This means that the Bayesian estimation is more accurate because even a small difference in absolute terms in the performance of two models indicates a better performance.Figure 2 also indicates that the model built using ordinary least squares tended to overestimate aboveground biomass.Previous studies have shown that the Bayesian approach provides a robust approach for estimating allometric coefficients [29,30].The power law model (Model 1) was better than the polynomial model with DBH and square of DBH.Model diagnosis showed that parameters of Model 1 were stable, and there were also not many outliers and influential observations as indicated by studentized residuals.The scaling exponent of this model is close to the theoretical value of the Metabolic Scaling Theory (8/3) [30].The use of power law models is supported in the literature because they are simple to construct and use, and their parameters have biological meaning [33,34].Although polynomial regression models with DBH and square of DBH (not shown) present lower DIC (48.83), diagnostics showed that the model was unstable, as indicated by highly inflated variance (variance inflation >10) and the standard error for ln(DBH 2 ).
Model 1 with DBH alone as the predictor variable was also better than models that include height and crown area as additional predictor variables (Table 4).The null hypothesis that parameter estimates for height and crown area are not significantly different from zero could not be rejected.Inclusion of these variables in the DBH model also did not substantially reduce the MAPE; instead, the bias (MRE) changed from underestimation to overestimation.The parameters of models with height and crown area were also unstable due to large values of influence statistics.Our findings that height and/or crown area does not substantially improve the predictive power of diameter models contradict recent findings in Peru, which show that crown radius greatly improves model performance, especially for the largest trees [35].The majority of allometric models in the miombo ecoregion use DBH as the sole predictor variable, e.g., Chamshama et al. [22], Ryan et al. [21] Chidumayo et al. [8] and Mugasha et al. [23].There are few models that include height as a second variable to DBH, e.g., Chamshama et al. [22].The lack of models that include height and AGB curve represents a trend line fit against harvested aboveground biomass (AGB).
The power law model (Model 1) was better than the polynomial model with DBH and square of DBH.Model diagnosis showed that parameters of Model 1 were stable, and there were also not many outliers and influential observations as indicated by studentized residuals.The scaling exponent of this model is close to the theoretical value of the Metabolic Scaling Theory (8/3) [30].The use of power law models is supported in the literature because they are simple to construct and use, and their parameters have biological meaning [33,34].Although polynomial regression models with DBH and square of DBH (not shown) present lower DIC (48.83), diagnostics showed that the model was unstable, as indicated by highly inflated variance (variance inflation >10) and the standard error for ln(DBH 2 ).
Model 1 with DBH alone as the predictor variable was also better than models that include height and crown area as additional predictor variables (Table 4).The null hypothesis that parameter estimates for height and crown area are not significantly different from zero could not be rejected.Inclusion of these variables in the DBH model also did not substantially reduce the MAPE; instead, the bias (MRE) changed from underestimation to overestimation.The parameters of models with height and crown area were also unstable due to large values of influence statistics.Our findings that height and/or crown area does not substantially improve the predictive power of diameter models contradict recent findings in Peru, which show that crown radius greatly improves model performance, especially for the largest trees [35].The majority of allometric models in the miombo ecoregion use DBH as the sole predictor variable, e.g., Chamshama et al. [22], Ryan et al. [21] Chidumayo et al. [8] and Mugasha et al. [23].There are few models that include height as a second variable to DBH, e.g., Chamshama et al. [22].The lack of models that include height and crown area may be attributed to difficulties associated with measuring these variables and an inability to measure them accurately-errors are pronounced for taller trees and trees with intersecting crowns [36]-and to the marginal value in improving the accuracy of allometric models when included as predictor variables along with DBH [33,37].

Performance of Published Allometric Models
Published models overestimated biomass with an MRE between 17% and 73%, and biomass with a MAPE between 33% and 81% (Table 3).The accuracy of the models varied across the evaluated sites and trees of different diameter classes (Figure 3).In the absence of local or country-specific allometric models, mixed species models by Chamshama et al. [22] and Ryan et al. [21] appear as the most appropriate for estimating aboveground biomass in Kasungu and Neno, which were both dominated by small diameter trees.Breaking down the error into DBH size classes showed that all published models overestimated biomass with a greater bias in trees with a DBH greater than 30 cm.The model by Brown [18] was best for trees with a DBH between 10 and 30 cm in diameter.
Forests 2015, 6, page-page were both dominated by small diameter trees.Breaking down the error into DBH size classes showed that all published models overestimated biomass with a greater bias in trees with a DBH greater than 30 cm.The model by Brown [18] was best for trees with a DBH between 10 and 30 cm in diameter.Published models overestimated biomass in Salima and larger diameter class, indicating that models should be used within their DBH range.For example, 38% (18 out of 47) of the trees from Salima exceeded the maximum DBH of trees used to build the models by Brown [18] and Chidumayo [8] (Table 3).This demonstrates the need to consider the DBH range in applying biomass modelsapplying models outside their DBH range will result in bigger errors, especially for the larger trees.Information on error breakdown is important since uncertainty in the resultant biomass depends on Published models overestimated biomass in Salima and larger diameter class, indicating that models should be used within their DBH range.For example, 38% (18 out of 47) of the trees from Salima exceeded the maximum DBH of trees used to build the models by Brown [18] and Chidumayo [8] (Table 3).This demonstrates the need to consider the DBH range in applying biomass models-applying models outside their DBH range will result in bigger errors, especially for the larger trees.Information on error breakdown is important since uncertainty in the resultant biomass depends on the size of the tree, and the individual trees of a particular size.For example, though the small diameter trees hold a small fraction of biomass in most landscapes [15,33], their contribution is significant when aggregated in areas where they occur in large number.Selection of models can therefore be informed by uncertainty introduced in the biomass estimated in trees of different diameter classes.

Conclusions
The power law model with DBH alone as the predictor variable developed using the Hierarchical Bayesian performed better than both the polynomial model with DBH and models that include height and crown area as additional predictor variables.The model can be used to establish conservative carbon stocks required to determine carbon sequestration in performance-based payment schemes, for example in afforestation and reforestation activities.This model should be evaluated for its applicability across land-use systems where logging and extensive extraction for fuel wood and wild fire are known to affect the structure of trees in the miombo woodlands.

Model 1 -
2 are built from the diameter at breast height (DBH) alone using Bayesian and least square regression, respectively.Models 3-5 were built with DBH in combination with height and crown area (CrA) using Bayesian methods.The criteria for the choice of model included the deviance information criterion (DIC), mean absolute percentage error (MAPE) and variance inflation factor (VIF). ‡ represents the 95% credible intervals in Model 1,2, 3-5 and the standard error of the parameter estimate in Model 2. NS Parameters are not significantly different from zero.

Figure 2 .
Figure 2. Performance of biomass models developed using ordinary least squares and hierarchical Bayesian methods.AGB curve represents a trend line fit against harvested aboveground biomass (AGB).

Figure 2 .
Figure 2. Performance of biomass models developed using ordinary least squares and hierarchical Bayesian methods.AGB curve represents a trend line fit against harvested aboveground biomass (AGB).

Figure 3 .
Figure 3.The mean relative error and mean absolute percentage error of the models applied to each of the sites and the trees of different size class.

Figure 3 .
Figure 3.The mean relative error and mean absolute percentage error of the models applied to each of the sites and the trees of different size class.

Table 1 .
Biophysical and climatic characteristics at the study areas in the districts of Kaungu, Salima and Neno.

Table 2 .
Summary of variables measured and biomass of trees harvested across the three sites in Malawi.CV is the coefficient of variation.

Table 3 .
Allometric models commonly used for estimating aboveground biomass (ABG) in the miombo woodlands.N is the number of trees (sample size) used for constructing the model; MRE is the mean relative errors, RMSE is the root mean square error, and MAPE is the mean absolute percentage error; * Calculate the mass of carbon in the trees.The estimate is therefore multiplied by 2.13 (1/0.47) to obtain dry mass of trees.