Estimating the Above-Ground Biomass in Miombo Savanna Woodlands (Mozambique, East Africa) Using L-Band Synthetic Aperture Radar Data

The quantification of forest above-ground biomass (AGB) is important for such broader applications as decision making, forest management, carbon (C) stock change assessment and scientific applications, such as C cycle modeling. However, there is a great uncertainty related to the estimation of forest AGB, especially in the tropics. The main goal of this study was to test a combination of field data and Advanced Land Observing Satellite (ALOS) Phased Array L-band Synthetic Aperture Radar (PALSAR) backscatter intensity data to reduce the uncertainty in the estimation of forest AGB in the Miombo savanna woodlands of Mozambique (East Africa). A machine learning algorithm, based on bagging stochastic gradient boosting (BagSGB), was used to model forest AGB as a function of ALOS PALSAR Fine Beam Dual (FBD) backscatter intensity metrics. The application of this method resulted in a coefficient of correlation (R) between observed and predicted (10-fold cross-validation) forest AGB values of 0.95 and a root mean square error of 5.03 Mg·ha−1. However, as a consequence of using bootstrap samples in combination with a cross validation procedure, some bias may have been introduced, and the reported cross validation statistics could be overoptimistic. Therefore and as a consequence of the BagSGB model, a measure of prediction variability (coefficient of variation) on a pixel-by-pixel basis was also produced, with values ranging from 10 to 119% (mean = 25%) across the study area. It provides additional and complementary information regarding the spatial distribution of the error resulting from the application of the fitted model to new observations.


Introduction
Forests play an important role in the global carbon (C) cycle, and their relation to anthropogenic and climate changes has been recognized in the literature (e.g., [1,2]). The importance of accurately reporting the C content (biomass) of forested lands over time has also been acknowledged by several studies (e.g., [3,4]) and is a requirement of international conventions (e.g., the United Nations Framework Convention on Climate Change, UNFCCC). Such information is critical, as it forms the basis of reporting to mechanisms developed under the UNFCCC, such as the initiatives focusing on Clean Development Mechanisms (CDM) and the voluntary post-Kyoto Protocol policy mechanism on Reducing Emissions from Deforestation and forest Degradation (REDD) in developing countries [5]. REDD is an economic instrument that provides incentives for reducing emissions relative to the forest Reference Emissions Levels/Reference Levels (RELs/RLs) benchmarks, while providing co-benefits in terms of biodiversity and livelihoods [6,7], and it is a core issue under the ongoing climate negotiations [8].
Sensors onboard orbital platforms provide the only means of observing the Earth from a global and systematic perspective and, particularly, of assessing its different components, namely, land-use and land-cover change (LULC), forest monitoring and C stocks [9,10]. Mapping and understanding the spatial distribution of forest above-ground biomass (AGB) using remote-sensing methods is an important and challenging task [11–13]. These maps can be used to monitor forests (deforestation, regrowth and degradation processes) and to estimate and model greenhouse gas emissions and the effects of conservation actions, sustainable management and enhancement of C stocks [4,14]. However, most attempts to estimate forest AGB are approximations relying on a combination of land cover type and corresponding mean C values derived from field surveys, instead of spatially explicit biomass maps (e.g., [15]).
A few key active sensors onboard orbital and aerial platforms are providing usable information for forest AGB estimation, which could support both countries and industry in their endeavor of setting benchmarks and assessing their performance in implementing a series of forest-related activities. Relationships have been established between forest AGB and (a) the backscattering coefficients of Synthetic Aperture Radar (SAR) data (different frequencies and polarizations, e.g., [16]) and (b) the vertical/horizontal distribution of Light Detection and Ranging (LIDAR) returns (e.g., [17]). The common link between these sensors is that they offer information on the three-dimensional distribution of plant elements because of their penetrative capability [15]. Other approaches have utilized tree or canopy height, from LIDAR, SAR interferometry (InSAR) and SAR polarimetric interferometry (PolInSAR), as a surrogate for biomass estimation (e.g., [18–20]). However, few approaches for consistent and reliable retrieval of forest AGB are available, as most suffer from, for example, saturation of the signal above certain biomass levels, insufficient coverage, inconclusive or non-repeatable relationships caused by seasonal variation in leaf cover or moisture content, and a lack of in situ data to support their calibration/validation (e.g., [15]). Consequently, there is great uncertainty related to the estimation of forest above-ground biomass, especially in the tropical and sub-tropical regions (e.g., [21]). In this study, we present an example of a human-modified, low-carbon ecosystem in southern Africa dominated by Miombo woodlands, the largest savanna in the world [22,23]. This ecosystem is strongly influenced by anthropogenic fires and supports the livelihoods of over 100 million people, while at the same time it is greatly threatened by desertification processes, deforestation, degradation of land and water resources and loss of biodiversity.
The main goal of this study was to test a combination of field data and Japan's Advanced Land Observing Satellite (ALOS) Phased Array L-band Synthetic Aperture Radar (PALSAR) backscatter intensity data to reduce the uncertainty in the estimation of forest AGB in the Miombo savanna woodlands of Mozambique (East Africa). The penetrative capability of L-band synthetic aperture radar (SAR) data and the resulting interaction with vegetation structure is the main motivation for using these data for modeling forest AGB, with a view to proposing an alternative methodological approach that could be applied for the assessment of national, regional or local forest C stocks with a reduced uncertainty in the reported estimates.
The paper is structured as follows. Section 2 provides the framework and the scientific background of the research, including a brief description of the study area. Section 3 describes the field data collection, as well as the remote sensing data used in this study. Section 4 includes a description of the machine learning method used to model forest above-ground biomass as a function of ALOS PALSAR data. The results and corresponding discussion are presented in Section 5, and the study is concluded in Section 6.

Background and Study Area
The study area is located in the central region of Mozambique, Zambézia province, district of Lugela, near the village of Mocuba (Figure 1) (16°04′S-17°16′S, 36°33′E-37°28′E). The climate is sub-humid, according to the Thornthwaite climate classification system. The dry season extends from March to November, the average annual precipitation ranges from 850 mm to 1,300 mm and the mean temperature from 20 °C to 27 °C.
The region is mostly covered by Miombo forests, the most extensive tropical savanna woodland formation of Africa, which extends across some of the world's poorest countries [22] and directly supports the livelihoods of the local populations. The main economic activity in the Lugela district is agriculture, which accounts for around 82% of the economically active population, followed by hunting activities, wood collection (for construction timber, charcoal production and domestic fuel), collection of medicinal plants, palm wine extraction, sand extraction and fishing. During the past year, an energy company has been making some efforts to understand and support the livelihoods of local populations (who are still recovering from decades of war) in the context of a suitability assessment for the production of biofuel in the region. As a first step, missing infrastructure was built, namely roads and a bridge. A detailed assessment was carried out to determine the impact of a plantation with Jatropha curcas L. (a non-edible plant whose oil-rich seeds can be processed into biodiesel). The best location for establishing the plantation in a ~10,000 ha parcel of land was decided in compliance with the sustainability criteria listed in Directive 2009/28/EC of the European Parliament [24]. Under this Directive, the energy from biofuel cannot be derived from raw material obtained from land with a high carbon stock. Therefore, all continuously forested areas, i.e., land spanning more than one hectare with trees higher than five meters and a canopy cover of more than 30%, could not be used as plantation areas, and evidence of the low carbon stock of the area with a canopy cover between 10% and 30% before conversion must be provided, thus contributing to the sustainable management (social and environmental) of this area of interest and, ultimately, to the company's carbon emissions reduction targets as a whole. The ~10,000 ha parcel of land was therefore divided into different classes of land use and tree canopy cover (see Section 3.1 for details) to assess the feasibility of the plantations and biofuel production in this area. This study focused on the areas with a tree canopy cover below the 50% threshold, corresponding to 16.5% of the total area.

Field Measurements
Sampling was designed to accurately account for the total biomass C stocks in the selected carbon pools and was stratified using ancillary data provided from on-screen classification of very high spatial resolution satellite images (WorldView 2) into four strata with different tree crown cover (0-10, 10-30, 30-40, 40-50%). A systematic grid of points with a random origin was created covering the Miombo area with a tree canopy cover between 10% and 50% in the ~10,000 ha parcel of land and used as a basis for plot location. A total of 51 circular plots with a 20-m radius were randomly selected over the grid and measured during a field campaign that took place in July and August 2011. At the plot level, the following data were recorded: geographic coordinates (with global positioning system, GPS), physiographic location, dominant aspect and slope. At the stand level, data recorded included the tree crown cover, litter layer, down dead wood, sampling of live non-tree vegetation (herbaceous plants and shrubs) and soil organic carbon, up to 30 cm. At tree level, species were identified, tree vitality classified and diameter at breast height (DBH) measured for all trees with a DBH greater than 5 cm.
Several C pools were estimated, namely the live above- and below-ground tree biomass and soil biomass. Root biomass (i.e., below-ground biomass, BGB) was not measured in the field, but assessed through a root-to-shoot ratio (R:S = 0.42) reported by Ryan et al. [25]. As this study focuses only on the estimation of forest AGB, all the other pools measured in the field (down dead wood, litter, soil) were not included in the following sections.
(1) The allometric equation developed by Ryan et al. [25] for estimating the C content of the AGB of each tree is given by Equation (1), where B_stem is the tree AGB (kg C), DBH is the diameter at breast height (cm) and ln is the natural logarithm. C content was estimated by using a conversion factor of 0.47 [25]. The coefficient of determination (R²) of this fitted equation was 0.93 and the root mean square error was 0.52 ln(kg C) (n = 29).
(2) Chidumayo [26] developed equations relating AGB to DBH (reported in Williams et al. [29]), which are presented in Equations (2) and (3), where AGB is in kg and DBH in cm.
(3) Chave et al. [27] developed separate equations for the estimation of AGB according to climate, primarily the mean monthly evapotranspiration (ET) and rainfall (R), for wet (ET > R in less than one month per year), moist (ET > R more than one month and less than five months per year) and dry (ET > R more than five months per year) forests. These criteria basically define the extent of the growing period (when R is greater than ET) as a proportion of the year and on a monthly basis. Thus, the same criteria can be rearranged in terms of growing period (GP) as (i) wet (GP > 11/12), (ii) moist (11/12 > GP > 7/12) and (iii) dry (GP < 7/12). Using the GP dataset produced by Silva et al. [14], the forests of the study area could be considered as belonging to the dry category. Therefore, Equation (4) was used, where AGB is in kg, ρ is the wood specific density (oven-dried on a green volume basis, g·cm−3) and DBH is in cm.
(4) Brown et al. [28] developed another equation (Equation (5)) for the dry forest life zone:
$$\mathrm{AGB} = 34.4703 - 8.0671\,\mathrm{DBH} + 0.6589\,\mathrm{DBH}^{2} \qquad (5)$$
where AGB is in kg and DBH is in cm. The model has an R² of 0.67 and a mean square error of 0.02208 (n = 32).
In the end, the AGB of each tree was computed as the arithmetic mean of the values derived from these four equations.
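The averaging and plot-level scaling can be illustrated with a minimal Python sketch. It assumes the quadratic form of the Brown et al. dry-forest equation (the signs between the three printed coefficients are an assumption), the 20 m plot radius and the 0.47 carbon fraction given in the text; the function names are hypothetical, and the other allometric equations are represented only by whatever per-tree estimates the caller supplies.

```python
import math

PLOT_RADIUS_M = 20.0      # circular plot radius reported in the text
CARBON_FRACTION = 0.47    # biomass-to-carbon factor reported in the text


def brown_dry_agb_kg(dbh_cm):
    """Tree AGB (kg) from DBH (cm); quadratic dry-forest form (signs assumed)."""
    return 34.4703 - 8.0671 * dbh_cm + 0.6589 * dbh_cm ** 2


def mean_of_estimates(estimates_kg):
    """Arithmetic mean of the AGB estimates given by several equations."""
    return sum(estimates_kg) / len(estimates_kg)


def plot_agb_mg_per_ha(tree_agb_kg, radius_m=PLOT_RADIUS_M):
    """Scale the summed tree AGB (kg) of a circular plot to Mg per hectare."""
    plot_area_ha = math.pi * radius_m ** 2 / 10_000.0
    return sum(tree_agb_kg) / 1000.0 / plot_area_ha


def agb_to_carbon(agb, carbon_fraction=CARBON_FRACTION):
    """Convert biomass to carbon content with a fixed carbon fraction."""
    return agb * carbon_fraction
```

A 20 m radius plot covers about 0.126 ha, so 1 Mg of measured tree biomass in a plot corresponds to roughly 8 Mg·ha−1.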

ALOS PALSAR
The Japan Aerospace Exploration Agency (JAXA) launched ALOS on 24 January 2006, placing the satellite in a polar, sun-synchronous orbit at ~700 km and ensuring a 46-day repeat cycle. PALSAR, one of the instruments onboard ALOS, is an enhanced version of the Japanese Earth Resources Satellite (JERS-1) SAR instrument [30]. PALSAR has a center frequency of 1,270 MHz (23.6 cm, i.e., L-band) and a chirp bandwidth of 14 MHz or 28 MHz. The instrument operated in five different observation modes: (i) Fine Beam Single (FBS), (ii) Fine Beam Dual (FBD), (iii) Polarimetric (PLR), (iv) ScanSAR and (v) Direct Transmission (DT). In this study, only data acquired in the FBD mode were used; this mode provides HH/HV or VV/VH polarizations (14 MHz bandwidth). Whilst 18 alternative off-nadir viewing angles were available (from 9.9° to 50.8°), only data acquired at 34.3° (HH and HV polarizations) were selected by JAXA and used here; these were acquired in ascending node with a 70 km swath width. Rosenqvist et al. [30] provide a complete reference and description of the ALOS PALSAR sensor.
The European Space Agency (ESA) catalogue was used to select the two ALOS PALSAR FBD (HH and HV polarizations) scenes required to cover the entire study area. The data were acquired on 21 June 2010, with an off-nadir angle of 34.3° (ascending orbits) (Figure 1), and were provided by ESA in level 1.1 (single look complex, SLC). The SARScape software (version 4.3.001) produced by Sarmap SA (http://www.sarmap.ch) was used for all ALOS PALSAR processing, which followed standard SAR processing (e.g., [31–34]). Prior to geocoding, the SLC data were converted to multi-look intensity (MLI) format. To obtain approximately square pixels in ground range coordinates, a multi-look factor of 1 in range and 5 in azimuth was used. The resulting MLI data were geocoded to a ground resolution of 15 m (both in range and azimuth). No filtering for speckle noise reduction was performed. The geocoding of the MLI data, which refers to a transformation from the slant-range/azimuth geometry to map projection geometry, was performed to obtain geocoded terrain corrected (GTC) images. GTC requires a digital elevation model (DEM), which was applied to the scenes acquired over the study area. A 90 m DEM retrieved from the Shuttle Radar Topography Mission (SRTM) over the study area was used. This DEM (version 4) was provided by the International Centre for Tropical Agriculture (CIAT) and obtained from the European Commission (EC) Joint Research Centre (JRC). To obtain GTC images, a backward solution is usually implemented, in which an input DEM is used to convert the positions of the backscatter elements into slant range image coordinates. The transformation of the three-dimensional object coordinates (given in a cartographic reference system) into the two-dimensional row and column coordinates of the slant range image is performed by rigorously applying the Range and Doppler equations [32]. When precise satellite orbits are available, which is the case for ALOS, the geocoding process is run in a fully automatic way, and pixel accuracy can be achieved. Please refer to Meier et al. [32] for a more comprehensive description of the GTC method used in SARScape.
Radiometric calibration was carried out by following the radar equation and involved corrections for the scattering area, the antenna gain pattern and the range spread loss [35,36]. A DEM is required to properly determine all required geometric parameters of the radar equation, so the calibration was performed during the data geocoding step, where the required parameters had already been calculated. The calibrated value is a normalized dimensionless number (linear units, m²·m⁻²), and the corresponding value in the dB scale was generated by applying 10·log10 to the linear value; the backscattering coefficient was generated as gamma nought (γ°). Even after a rigorous radiometric calibration, backscattering coefficient variations are clearly identifiable in the range direction and in the presence of topography, which requires a radiometric normalization. These variations are an intrinsic property of each imaged object and, thus, can be compensated but not corrected in absolute terms. In this study, a cosine correction method based on a modified cosine model [31] was applied to the backscattering coefficient to compensate for range variations. After the two scenes had been geocoded, they were combined to create a mosaic. Figure 1c shows the mosaic of the two ALOS PALSAR FBD geocoded scenes over the study area.
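The conversion between linear backscatter and the dB scale described above can be written as a short Python sketch. The function names are hypothetical and the snippet covers only the 10·log10 scaling, not the calibration or cosine normalization performed in SARScape.

```python
import math


def linear_to_db(gamma0_linear):
    """Convert a linear backscatter value (m^2 m^-2) to decibels."""
    if gamma0_linear <= 0:
        raise ValueError("linear backscatter must be positive")
    return 10.0 * math.log10(gamma0_linear)


def db_to_linear(gamma0_db):
    """Convert a backscatter value in decibels back to linear units."""
    return 10.0 ** (gamma0_db / 10.0)
```

For example, a linear value of 0.1 corresponds to −10 dB. Because the transform is logarithmic, the mean of dB values generally differs from the dB value of the linear mean, so the scale on which pixels are averaged should be stated explicitly.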

Extraction of ALOS PALSAR FBD Data at Field Plot Locations
Each plot measured in the field had a 20 m radius. Although the coordinates of each plot center were collected with GPS, there are always positional errors, especially when differential corrections are unavailable (errors up to 8-10 m are common). Although pixel accuracy can be achieved with ALOS PALSAR data geocoding (15 m ground pixel spacing), some error will always be present. Therefore, to compensate for these two sources of position errors, a buffer with a 50 m radius was created around each plot center. This buffer size was selected because anything larger would have impinged on areas that were not homogeneous in terms of the plot tree canopy cover. All ALOS PALSAR pixels inside each 50 m buffer were extracted, and several metrics were computed (mean, minimum, maximum and standard deviation) and used to establish relationships with the AGB at the plot level. As the original ALOS PALSAR FBD mosaic had a 15 m spatial resolution and the buffer around each plot center was set to 50 m, the extracted values per plot were those located approximately on a 6 × 6 pixel window centered on each plot center, i.e., a 90 × 90 m window.
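A minimal Python sketch of the buffer extraction follows. It assumes that a pixel belongs to a plot when its center lies within the 50 m radius and that the sample standard deviation is used; both are illustrative assumptions, since the selection rule and estimator applied by the GIS software are not stated. The function name and input layout are hypothetical.

```python
import math
import statistics


def buffer_metrics(pixels, centre_x, centre_y, radius_m=50.0):
    """Summarise pixel values whose centres fall inside a circular buffer.

    pixels: iterable of (x, y, value), with pixel-centre coordinates in metres.
    """
    inside = [
        value
        for x, y, value in pixels
        if math.hypot(x - centre_x, y - centre_y) <= radius_m
    ]
    return {
        "n": len(inside),
        "mean": statistics.fmean(inside),
        "min": min(inside),
        "max": max(inside),
        "std": statistics.stdev(inside),  # sample standard deviation (assumed)
    }
```

On a regular 15 m grid with the plot center coinciding with a pixel center, 37 pixel centers fall inside the 50 m buffer; other alignments between plot and grid give slightly different counts.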
To assess the amount of speckle in the processed ALOS PALSAR FBD MLI data (15 m ground resolution), the equivalent number of looks (ENL) (Equation (6)) (e.g., [33,34]) was estimated over a set of 30 homogeneous Miombo regions of interest spread over the study area:
$$\mathrm{ENL} = \frac{\mu^{2}}{\sigma^{2}} \qquad (6)$$
where µ and σ² are the mean and variance of the backscatter intensity values (original scale). The estimated mean ENL values were 5.26 and 4.91 for the HH and HV polarizations, respectively. These values are approximately the same as the number of looks used to produce the MLI data (1 look in range and 5 looks in azimuth, resulting in a 5-look image), meaning that the amount of speckle anticipated from multilooking is comparable to that estimated by Equation (6). However, as mentioned above, the ALOS PALSAR FBD data extracted per plot were equivalent to data located on a 6 × 6 pixel window centered on each plot center. Therefore, the amount of speckle present in the ALOS PALSAR FBD data used to model AGB should be that of the ALOS PALSAR FBD data averaged to a 90 m spatial resolution. The ENL corresponding to these averaged data was estimated using the same set of 30 homogeneous areas, and the mean values were 48.86 and 37.82 for the HH and HV polarizations, respectively. This shows that the amount of speckle was substantially reduced by spatially averaging the ALOS PALSAR FBD data.
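The ENL estimate can be sketched in Python as follows, assuming the standard definition ENL = µ²/σ² computed on intensity values in linear units over one homogeneous region. The use of the population variance is an assumption; the function name is hypothetical.

```python
import statistics


def equivalent_number_of_looks(intensities):
    """ENL = mean^2 / variance of linear-scale intensities in a homogeneous area."""
    mu = statistics.fmean(intensities)
    variance = statistics.pvariance(intensities)  # population variance (assumed)
    return mu ** 2 / variance
```

In practice the function would be applied to each region of interest and the resulting values averaged, as done for the 30 regions described above.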

Contribution of Different ALOS PALSAR Polarizations and Metrics
Several scatterplots were produced to display and evaluate the strength of the relationship between ALOS PALSAR backscatter intensity (γ°) and AGB estimated from field data. ALOS PALSAR backscatter intensity (γ°) for the HH and HV polarizations was displayed against AGB. As the ALOS PALSAR data were extracted within a 50 m radius of each plot center, several metrics were computed, namely, (i) mean, (ii) minimum, (iii) maximum and (iv) standard deviation. Also, the coefficient of correlation (R) between the ALOS PALSAR metrics and the AGB was computed using both parametric (Pearson's R) and non-parametric (e.g., Spearman's rank R) approaches [37].
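Both correlation coefficients can be computed with a few lines of Python; the sketch below is a plain implementation for illustration (the function names are hypothetical, and ties are handled with average ranks, which is the usual convention but an assumption here).

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson's R applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

A monotonic but non-linear relationship yields a Spearman coefficient of 1 while Pearson's R stays below 1, which is why reporting both is informative for backscatter and biomass data.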

Regression with Stochastic Gradient Boosting (SGB) and Bagging SGB (BagSGB)
Traditionally, the attempt to explain a given variable (dependent variable) as a function of one or more variables (predictor or independent variable(s)) has relied on parametric statistical models, such as linear (simple or multiple) regression models. These models have several assumptions, including normal distribution of errors and variables, as well as homoscedasticity (e.g., [38]). However, in the past decades and largely because of problems presented by large arrays of data, several other methods and algorithms have been developed, which are commonly referred to as machine learning methods. These methods have a large scope of application, from prediction (e.g., [39,40]) to classification (e.g., [41,42]) problems and can have different formulations (e.g., neural nets, ensembles of trees, support vector machines). A comprehensive review of these methods can be found in Hastie et al. [38].
SGB [43,44] originated from decision tree theory (e.g., [38,45]). Decision tree theory relies on partitions of the space of all possible predictor variables. Starting with the whole predictor space (at the root of the tree), the space is successively split using a series of rules such that, in the end, each terminal node of the tree is assigned to the most probable response class (classification trees) or the mean response in that node (regression trees) [45,46]. Some advantages over traditional parametric classification methods include the non-parametric nature of the classifier, quantification of variable importance, disclosure of non-linear and hierarchical relationships between predictor variables, and acceptance of missing values [47,48]. However, classification and regression trees are sensitive to small perturbations in the training data, which may cause large changes in the resulting outputs [49]. Therefore, the accuracy of these unstable methods can be improved with perturbing and combining techniques. These generate multiple perturbed versions of the classifier (a.k.a., ensemble or committee) and combine them into a single predictor [50]. These methods can be divided into two types: those that adaptively change the distribution of the training set based on the performance of previous classifiers (e.g., boosting) and those that do not (e.g., bagging) [51].
SGB [43,44] combines the advantages of both bagging and boosting and can be used in regression and classification problems (e.g., [52–55]). It typically uses a base learner (in our case, binary decision trees) and constructs additive regression (or classification) models by sequentially fitting the chosen base learner to current "pseudo"-residuals by least squares at each iteration, using a random fraction of the training data drawn without replacement [44]. This process has been shown to substantially improve the prediction accuracy and execution speed, making the approach resilient to overfitting [44]. Furthermore, Suen et al. [56] demonstrated that building and combining (by averaging in the case of regression) several SGB models on samples randomly drawn with replacement from the original training dataset (bootstrap samples) performed significantly better than a single SGB model and concluded that this was accomplished by variance reduction. Therefore, two approaches were followed: (i) generate an SGB model from the original training dataset and (ii) generate several SGB models fitted to bootstrap samples (with replacement) of the original training dataset, which were then combined by averaging. This allowed us to compare SGB against bagging SGB (BagSGB).
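The two ideas, boosting on random subsamples drawn without replacement and averaging models fitted to bootstrap samples, can be illustrated with a toy Python sketch. It is not the R gbm implementation used in the study: it assumes a single predictor, regression stumps (tree complexity of 1), a squared-error loss and a fixed number of trees, and all function names are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # every x in the subsample is identical: no split possible
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_sgb(xs, ys, n_trees=200, shrinkage=0.05, bag_fraction=0.8, rng=None):
    """Stochastic gradient boosting with stumps fitted to current residuals."""
    rng = rng or random.Random(0)
    init = sum(ys) / len(ys)
    preds = [init] * len(ys)
    stumps = []
    k = max(2, int(bag_fraction * len(ys)))
    for _ in range(n_trees):
        idx = rng.sample(range(len(ys)), k)  # random fraction, without replacement
        stump = fit_stump([xs[i] for i in idx], [ys[i] - preds[i] for i in idx])
        stumps.append(stump)
        preds = [p + shrinkage * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return init, shrinkage, stumps


def predict_sgb(model, x):
    init, shrinkage, stumps = model
    return init + shrinkage * sum(predict_stump(s, x) for s in stumps)


def fit_bag_sgb(xs, ys, n_bags=25, seed=0, **sgb_options):
    """Fit one SGB model per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_bags):
        idx = [rng.randrange(len(ys)) for _ in ys]
        models.append(
            fit_sgb([xs[i] for i in idx], [ys[i] for i in idx], rng=rng, **sgb_options)
        )
    return models


def predict_bag_sgb(models, x):
    """BagSGB prediction: the average of the member SGB predictions."""
    member_predictions = [predict_sgb(m, x) for m in models]
    return sum(member_predictions) / len(member_predictions)
```

The spread of the member predictions around their average is what makes a per-observation measure of prediction variability available from a bagged model.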
Model fitting under SGB has a number of options that were tested to select the best model. This was done by developing in-house R code [57] that implements a loop using Ridgeway's R gbm package [58] and the R code of Elith et al. [40]. These options include: (i) distribution (Gaussian, i.e., the loss function, whose measure of deviance to be minimized is the mean squared error), (ii) bagging fraction (i.e., the fraction of the training data that is randomly selected to build each decision tree; 0.5, 0.6, 0.7, 0.8 and 0.9), (iii) tree complexity (i.e., the number of nodes in each decision tree; 1, 2, 3, 4, 5, 7 and 9) and (iv) shrinkage rate (i.e., controlling the learning rate of the algorithm; 0.01, 0.005, 0.0025, 0.001 and 0.0005) [58]. Therefore, the selection of the best SGB model is the result of evaluating 175 candidate individual SGB models (5 bagging fraction values × 7 tree complexity values × 5 shrinkage rate values). Elith et al. [40] noted that small shrinkage rates result in a smaller contribution of each tree and, therefore, are more directed at building a model that will provide a more consistent estimate of the dependent variable; as a rule of thumb, they indicate that models should be fitted with a minimum of 1,000 trees. Accordingly, to select the best SGB model out of the various candidates, we selected the model that had the lowest 10-fold cross-validated deviance (i.e., lowest mean squared error), provided that a minimum of 1,000 trees were used to generate that model. This procedure was carried out for each bootstrap sample of the original training dataset when generating the BagSGB model. We used 25 bootstrap samples to build a BagSGB model, as Breiman [49] suggests that a higher number of replicates tends not to produce a significant test set error reduction. The final model (bagging SGB or BagSGB) was built by averaging the predictions from the 25 selected SGB models fitted to the bootstrap samples. Also, as each predicted observation is the result of averaging over a set of 25 bootstrap samples, a measure of prediction variability can be built. The coefficient of variation (CV) (e.g., [37]), calculated as the standard deviation of a predicted observation divided by the corresponding mean value, was used as a measure of the uncertainty associated with each prediction.
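The size of the candidate grid and the coefficient of variation of a bagged prediction can be checked with a short Python sketch. The parameter values are those listed in the text; the function names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import itertools
import statistics

BAG_FRACTIONS = [0.5, 0.6, 0.7, 0.8, 0.9]
TREE_COMPLEXITIES = [1, 2, 3, 4, 5, 7, 9]
SHRINKAGE_RATES = [0.01, 0.005, 0.0025, 0.001, 0.0005]


def candidate_grid():
    """All (bag fraction, tree complexity, shrinkage rate) combinations."""
    return list(itertools.product(BAG_FRACTIONS, TREE_COMPLEXITIES, SHRINKAGE_RATES))


def prediction_cv_percent(member_predictions):
    """Coefficient of variation (%) of the ensemble members' predictions."""
    mean = statistics.fmean(member_predictions)
    return 100.0 * statistics.stdev(member_predictions) / mean  # sample SD (assumed)
```

`len(candidate_grid())` gives the 175 candidates, and `prediction_cv_percent` would be applied to the 25 member predictions available for each observation or pixel.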
AGB was chosen as the dependent variable, and the mean, minimum, maximum and standard deviation of the ALOS PALSAR HH and HV backscatter intensity values extracted for each plot (50 m buffer around each plot center) were selected as independent variables. The number of pixels used to compute those metrics ranged from 28 to 37, depending on the location of the plot relative to the 15 m ALOS PALSAR pixel grid.
The SGB and BagSGB models were compared using a traditional bias and variance decomposition of the root mean square error (RMSE) (Equations (7)–(9)). A 10-fold cross validation approach was followed, as the number of observations was not large enough to evaluate model performance with an independent subset (e.g., [38]).
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^{2}} = \sqrt{\sigma^{2} + b^{2}} \qquad (7)$$
where RMSE is the model root mean square error, e_i the error (difference between the observed and predicted values) of the ith observation, σ² the error variance, b the error bias and n the number of observations;
$$\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n}\left(e_i - \bar{e}\right)^{2} \qquad (8)$$
where ē is the mean error;
$$b = \bar{e} = \frac{1}{n}\sum_{i=1}^{n} e_i \qquad (9)$$
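The decomposition can be verified numerically with a minimal Python sketch. It assumes that the error is observed minus predicted and that the variance is normalized by n, which is the convention under which RMSE² equals the variance plus the squared bias exactly; the function name is hypothetical.

```python
import math


def rmse_decomposition(observed, predicted):
    """Return (rmse, error_variance, bias), with rmse**2 == variance + bias**2."""
    errors = [o - p for o, p in zip(observed, predicted)]
    n = len(errors)
    bias = sum(errors) / n
    variance = sum((e - bias) ** 2 for e in errors) / n  # normalised by n (assumed)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return rmse, variance, bias
```

The same identity can be used as a consistency check on reported statistics: a variance of 24.96 and a bias of 0.58 give a squared RMSE of about 25.30, i.e., an RMSE of about 5.03.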

C Stocks and Comparison with Published Biomass Maps
Mean AGB and C stocks per tree canopy cover class under 50% were estimated, as well as the total AGB and C stock in the Miombo forests of the broader study area (~10,000 ha). The default carbon fraction of dry matter of 0.47 obtained from the Intergovernmental Panel on Climate Change [59] was used to convert Miombo forest AGB to C content. This default value is consistent with the dry mass fraction determined by Ryan et al. [25] with subsamples from both the trunk and branches of 19 trees from a Miombo region in central Mozambique.
A comparison with published biomass maps, especially when derived from distinct data sources, is important to assess differences in the AGB and C stock estimates for the same ecosystem and region. For that, two published maps were selected: Baccini et al. [60] and Saatchi et al. [12]. Baccini et al. [60] mapped AGB across tropical Africa (1 km spatial resolution) using an ensemble of regression tree-based models (random forests) that relate in situ measurements with data acquired by the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard NASA's Terra and Aqua satellites between 2000 and 2003; a cross-validation approach showed that the model explained 82% of the variance in AGB, corresponding to an RMSE of 50.5 Mg·ha−1 (for AGB values between 0 and 454 Mg·ha−1). Saatchi et al. [12] mapped forest C stocks, AGB plus below-ground biomass (BGB), in the pan-tropical belt (~1 km spatial resolution) around the year 2000 using a data fusion algorithm based on the maximum entropy approach (MaxEnt) to model field-based measurements as a function of data acquired by the Geoscience Laser Altimeter System (GLAS) onboard the Ice, Cloud and land Elevation Satellite (ICESat) and extrapolating to the landscape using data from other optical and microwave sensors; the prediction variability ranged from ±6% to ±53% on a pixel basis, but when scaling up to project and country scales, the errors decreased to ±5% and ±1%, respectively.
The uncertainty at the study area level (U_P) was estimated using the error propagation approach (Equation (10)) (e.g., [61]), which is the same equation used to assess overall uncertainty in REDD projects and national greenhouse gas inventories [59]:
$$U_P = \frac{\sqrt{\sum_{i=1}^{N}\left(U_i \cdot \mathrm{AGB}_i\right)^{2}}}{\left|\sum_{i=1}^{N}\mathrm{AGB}_i\right|} \qquad (10)$$
where AGB_i and U_i are the forest AGB and uncertainty at the ith pixel, respectively, and N is the number of pixels in the area being assessed.
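A minimal Python sketch of this error propagation follows, assuming the standard form for combining the uncertainties of summed quantities (square root of the sum of squared absolute uncertainties, divided by the absolute total), with per-pixel uncertainties expressed in percent. The function name is hypothetical.

```python
import math


def propagated_uncertainty(agb, uncertainty_percent):
    """Area-level uncertainty (%) from per-pixel AGB and uncertainty (%)."""
    numerator = math.sqrt(sum((u * a) ** 2 for a, u in zip(agb, uncertainty_percent)))
    return numerator / abs(sum(agb))
```

With N pixels of equal AGB and equal uncertainty U, the result reduces to U/√N, which illustrates why area-level uncertainty under this formula is much lower than the pixel-level values.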

Forest AGB from Field Data
As mentioned in Section 3.1, the four selected allometric equations were used to produce estimates of the AGB of each tree, which were subsequently averaged to produce a single AGB estimate for each measured tree. Table 1 depicts the average AGB C stock per tree canopy cover class.
Table 1. Average above-ground biomass (AGB) (tC·ha−1) values per tree canopy cover class (standard error of the mean in parentheses). A factor of 0.47 was used to convert from biomass to C content [25,59].

Contribution of ALOS PALSAR Polarizations and Metrics
The relationship between ALOS PALSAR backscatter intensity data (HH and HV polarizations) and forest AGB data is shown in Figure 2

BagSGB Modeling
The final BagSGB model was the result of combining the 25 SGB models built from the corresponding bootstrap samples of the original training dataset. The forest AGB values predicted from the 10-fold cross-validation of each of the 25 models were aggregated (averaged) and the corresponding statistics calculated (Table 3). The comparison between observed and cross-validation predicted forest AGB values is displayed in Figure 3(a). Each cross-validation predicted forest AGB value was the result of averaging between 16 and 36 times, as each bootstrap sample was built with replacement. A close correspondence (R = 0.95) between observed and cross-validation predicted forest AGB values was identified, although AGB above 40 Mg·ha−1 was slightly underestimated. However, there is no evidence that this trend is consistent with L-band SAR saturation, as only four plots have AGB values higher than that value. As a comparison, the same combinations of parameters (i.e., distribution, bag fraction, tree complexity and learning rate) were tested to build the best SGB model fitted to the original training dataset. This model was based on 4,700 trees, resulting from a bag fraction of 0.8, a learning rate of 0.0005 and a tree complexity of 2.
The 10-fold cross-validation results of this single SGB model were poorer than those of the ensemble of SGB models (i.e., the BagSGB model). The cross-validation RMSE of this model was 12.04 Mg·ha−1 (5.03 Mg·ha−1 in the BagSGB model), the variance 144.81 (24.96 in the BagSGB model), the error bias −0.32 Mg·ha−1 (0.58 Mg·ha−1 in the BagSGB model) and the linear coefficient of correlation (R) between observed and cross-validation predicted AGB values 0.48 (0.95 in the BagSGB model). Figure 3(b) shows the scatterplot between observed and cross-validation predicted forest AGB values for this SGB model. The error bias is small and of similar magnitude in both models; therefore, the RMSE decrease (from 12.04 to 5.03 Mg·ha−1) was a consequence of variance reduction from 144.81 in the SGB model to 24.96 in the BagSGB model. This is in agreement with the results of Suen et al. [56] already mentioned in Section 4.2.
On the basis of its better performance in the cross-validation assessment, the BagSGB model was applied to the entire study area. The ALOS PALSAR FBD backscatter intensity data used to build the model were extracted in a 50 m buffer around each plot center, which is equivalent to a spatial resolution of approximately 100 m. Therefore, to create the forest AGB map of the study area, the original ALOS PALSAR FBD backscatter intensity data at 15 m spatial resolution were aggregated to a 90 m spatial resolution, and the minimum, maximum, mean and standard deviation values were computed. For display purposes, the continuous AGB map was transformed into a five-class map (Figure 6). As mentioned in Section 4.2, the type of algorithm used allowed the production of a map displaying the prediction variability in each pixel, by computing the coefficient of variation (%) on a pixel-by-pixel basis; for the same reason, this map was also converted into a five-class map (Figure 7). As mentioned above, although the 10-fold cross-validation procedure that was implemented may have resulted in overoptimistic values of model assessment, the prediction variability at the pixel level provides additional and complementary information regarding the spatial distribution of the model error that could be used to assess its applicability to new observations.
Figure 7. Forest AGB uncertainty classes map of the study area (outlined) obtained with the coefficient of variation (%) resulting from the application of the fitted BagSGB model. The minimum and maximum values presented in the legend are for the area encompassing the mosaic of the two ALOS PALSAR scenes used. In the study area (~10,000 ha), the minimum and maximum forest AGB coefficient of variation values were 10% and 119%, respectively.
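The variance-reduction argument in this section rests on the identity RMSE² = variance + bias² (with the variance taken over the prediction errors). A minimal Python check, using only the cross-validation statistics reported in the text, is sketched below; the function name is illustrative.

```python
import math

def rmse_from_parts(variance, bias):
    """RMSE implied by an error variance and an error bias: sqrt(var + bias^2)."""
    return math.sqrt(variance + bias ** 2)

# Cross-validation statistics reported in the text (bias and RMSE in Mg/ha).
models = {
    "SGB": {"variance": 144.81, "bias": -0.32, "rmse": 12.04},
    "BagSGB": {"variance": 24.96, "bias": 0.58, "rmse": 5.03},
}

for name, s in models.items():
    implied = rmse_from_parts(s["variance"], s["bias"])
    # Fraction of the mean squared error that is due to variance rather than bias.
    share = s["variance"] / (s["variance"] + s["bias"] ** 2)
    print(f"{name}: implied RMSE {implied:.2f} (reported {s['rmse']:.2f}), "
          f"variance share of squared error {share:.1%}")
```

For both models the implied value agrees with the reported RMSE to two decimals, and the variance term accounts for nearly all of the squared error, which is consistent with attributing the RMSE decrease to variance reduction.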

C Stocks and Comparison with Published Biomass Maps
Mean forest AGB and derived C stocks, as well as uncertainty per tree canopy cover class in the study area (~10,000 ha), were calculated on the basis of the forest AGB map produced in this study and, in the same manner, on the basis of the maps of Baccini et al. [60] and Saatchi et al. [12]. Differences in the time frame of each study (2008 for this study, 2000-2003 for Baccini et al. [60] and early 2000s for Saatchi et al. [12]) were not addressed here. The forest AGB carbon stock obtained for the entire study area in this study (143,444 Mg C) is roughly one third of the values obtained with data from Baccini et al. [60] (498,570 Mg C) and Saatchi et al. [12] (405,986 Mg C). The mean AGB obtained with the data from this study is 30.6 Mg·ha−1, while it is 106.3 Mg·ha−1 in Baccini et al. [60] and 86.6 Mg·ha−1 in Saatchi et al. [12]. Mitchard et al. [63], in their evaluation of the data published by Baccini et al. [60], used an independent dataset of 1,154 field measurements obtained in 16 African countries and concluded that large errors were associated with this AGB map; more significantly for our study, there was large underestimation in forests with higher AGB and overestimation in woodland savannas, resulting in an RMSE of 145 Mg·ha−1. However, Saatchi et al. [12] also reported higher values than those reported here. As mentioned in Section 3.1, in our study only the area with a tree canopy cover between 10% and 50% was sampled for forest AGB estimation; it is therefore not unexpected that the derived C stock and mean AGB are lower than those reported in studies using field data spanning the entire tree canopy cover range. It is also important to note that the Miombo forests are diverse in terms of tree canopy cover and, therefore, C content. For example, Glenday [64] reported an average forest AGB value of approximately 92 Mg·ha−1 for Brachystegia sp. dominated dry forests in Kenya. Malimbwi et al. [65], using field data collected in the Miombo woodlands of Tanzania, reported an average forest AGB value of around 18 Mg·ha−1. Ryan et al. [25] estimated an average forest AGB value of 45 Mg·ha−1 for a Miombo woodland in Mozambique. Shirima et al. [66] measured several plots in Miombo forests of Tanzania and estimated a mean forest AGB value of approximately 48 Mg·ha−1. The study of Ryan et al. [25] was carried out in Miombo formations similar to those addressed in the present study, so its value of 45 Mg·ha−1 is the most direct comparison with the 31 Mg·ha−1 estimated here. The difference is probably related to the restriction of field sampling in this study to Miombo with a tree canopy cover of 10-50%.
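The totals quoted above appear consistent with the product of mean AGB, area and the 0.47 biomass-to-carbon factor given with Table 1. The sketch below back-calculates the area implied by each pair of reported values; the helper name is hypothetical, and the calculation assumes that the totals were derived in this way.

```python
CARBON_FRACTION = 0.47  # biomass-to-carbon factor stated in the text

def implied_area_ha(total_c_mg, mean_agb_mg_ha, carbon_fraction=CARBON_FRACTION):
    """Area (ha) implied by a total C stock (Mg C) and a mean AGB (Mg/ha)."""
    return total_c_mg / (carbon_fraction * mean_agb_mg_ha)

# (total AGB C stock in Mg C, mean AGB in Mg/ha) as reported in the text.
reported = {
    "This study": (143_444, 30.6),
    "Baccini et al. [60]": (498_570, 106.3),
    "Saatchi et al. [12]": (405_986, 86.6),
}

for source, (total_c, mean_agb) in reported.items():
    print(f"{source}: implied area = {implied_area_ha(total_c, mean_agb):,.0f} ha")
```

All three pairs imply an area of roughly 9,970 to 9,980 ha, in line with the ~10,000 ha study area, so the three-fold difference in C stock reflects the difference in mean AGB rather than in the area considered.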
The estimated forest AGB uncertainty at the study area level was 0.21% (coefficient of variation) in this study and 2.82% (error) in Saatchi et al. [12]. The values of uncertainty on a pixel basis ranged from 10% to 119% (mean = 25%) in this study and from 21% to 31% (mean = 27%) in Saatchi et al. [12]. Although higher at the pixel level, the values decrease substantially when the measure of uncertainty is obtained at the study area level, as expected when per-pixel uncertainties are propagated over a large number of pixels. This is also the case in Saatchi et al. [12].
Furthermore, the mean forest AGB and the total C stocks were estimated for the subset of the study area with a tree canopy cover between 10% and 50% (1,157 ha), which was the area sampled during the field campaign that took place in the 2011 dry season. The estimates are depicted in Table 4 by tree canopy cover class. Similarly to the results presented in Table 4, the estimates of mean AGB and total AGB C stock per tree canopy cover class derived from this study are substantially lower than those estimated in Baccini et al. [60] and Saatchi et al. [12].
Table 4. Mean AGB, total AGB, AGB C stock and uncertainty per tree canopy cover class in the study area with tree canopy cover between 10% and 50% (1,157 ha), derived from this study (BagSGB model), Baccini et al. [60] and Saatchi et al. [12] (¹ the uncertainty refers to the coefficient of variation (%) (this study) and error (%) [12]; n.a., not available).
Romijn [67] concluded that the land use change associated with the introduction of Jatropha curcas L. in Miombo savanna woodlands will only have a positive feedback (i.e., atmospheric carbon sequestration) when introduced in wastelands or severely degraded lands; however, he acknowledges that the data used in his study have a high degree of uncertainty, especially due to substantial regional and local variations in soil, biomass and climate characteristics. Nevertheless, for industry and for the purpose of complying with the objectives of environmental sustainability in biofuel plantations, it is important to guarantee that the C content of the areas to be converted is sufficiently low at the start, so that the biofuel produced actually corresponds to savings in carbon emissions when compared to fossil fuels and thus effectively contributes to the countries' and companies' renewable energy targets. Given uncertainties in carbon estimations, good practice demands that a conservative approach be applied. Thus, a sufficiently accurate, expedient and cost-effective method for spatially explicit carbon quantification whose performance risk is systematic underestimation may constitute a helpful planning tool.

When dealing with international funds for performance-based payments to developing countries in the context of REDD, the use of the conservativeness principle for estimating emissions from deforestation and forest degradation is always required. In such cases, establishing a reference emission level using a conservative carbon stock is mandatory. Ryan et al. [68], in their study in areas of Miombo savanna woodland located in central Mozambique, concluded that deforestation activities were not occurring preferentially in high-biomass areas. Therefore, by using estimates of forest AGB exclusively from data retrieved from plots that had a tree canopy cover between 10% and 50%, we could actually be encompassing the areas more prone to deforestation and, therefore, provide more conservative estimates of forest AGB for REDD projects. Additionally, the methodology used in this study could be easily adapted to produce spatially explicit conservative estimates of C stocks in the AGB pool. The methodology uses n (in this case n = 25) SGB models and then averages them to estimate the AGB at the pixel level. Therefore, a conservative estimate of AGB at a given pixel could be obtained by using a given percentile (lower than 50%) of the distribution of possible forest AGB values for that pixel. Alternatively, the coefficient of variation that is produced, also on a pixel-by-pixel basis, could be used to discount the value of the estimated forest AGB.
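The two conservative options described above can be sketched for a single pixel as follows. The 25 values are hypothetical stand-ins for the predictions of the n = 25 SGB models; the choice of the 25th percentile and the discount rule AGB × (1 − CV) are assumptions made only for illustration, since the text does not fix either.

```python
import statistics

def pixel_summary(predictions, lower_pct=25):
    """Mean, CV (%) and two conservative AGB estimates for one pixel."""
    mean = statistics.fmean(predictions)
    if mean <= 0:
        raise ValueError("mean AGB must be positive to compute a CV")
    sd = statistics.stdev(predictions)  # sample standard deviation
    cv_pct = 100.0 * sd / mean
    # Percentile below the median; "inclusive" interpolates within the data range.
    cuts = statistics.quantiles(predictions, n=100, method="inclusive")
    percentile_estimate = cuts[lower_pct - 1]
    # Assumed discount rule: reduce the mean by its relative variability,
    # floored at zero because the CV can exceed 100%.
    discounted = max(0.0, mean * (1.0 - cv_pct / 100.0))
    return {"mean": mean, "cv_pct": cv_pct,
            "percentile": percentile_estimate, "discounted": discounted}

# Hypothetical predictions (Mg/ha) from 25 SGB models for one pixel.
preds = [28.0 + 0.5 * i for i in range(25)]  # 28.0, 28.5, ..., 40.0
summary = pixel_summary(preds)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

Both options return a value below the ensemble mean. The floor at zero matters for the discount rule because the coefficient of variation reported for the study area reaches 119%.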

Conclusions
Consistent forest carbon monitoring methods are required at various levels, namely for developing countries that wish to address international conventions and to access carbon-based financial mechanisms associated with climate change mitigation. A method for reducing the uncertainty in the estimation of forest above-ground biomass (AGB) in Miombo savanna woodlands in southeast Africa (Zambézia province, Mozambique) has been presented. The advancement of this study relied on the use of a machine learning algorithm to establish a relationship between in situ forest AGB and L-band Synthetic Aperture Radar (SAR) backscatter intensity (gamma nought, γ°) data obtained from the Phased Array L-band SAR (PALSAR) sensor onboard the Advanced Land Observing Satellite (ALOS). This algorithm, bagging stochastic gradient boosting (BagSGB), is unique in that it also allows the production of spatially explicit estimates of prediction variability and an indication of the importance of each predictor variable. Estimates of forest AGB with a root mean square error (RMSE) of 5.03 Mg·ha−1, based on 10-fold cross-validation, were produced with this modeling approach. Also, the coefficient of correlation (R) between observed and predicted (from 10-fold cross-validation) forest AGB values was 0.95. The variable contributing the most to this model was the mean backscatter intensity for the HH polarization, which was explained by the low tree canopy cover characterizing Miombo savanna woodlands, thus invoking scattering mechanisms associated with this polarization (e.g., trunk-ground scattering). Furthermore, it was recognized that the optimistic overall validation results (RMSE and R) might be a consequence of the 10-fold cross-validation procedure, especially when dealing with bootstrap samples that were drawn with replacement. Nevertheless, this algorithm was unique in producing estimates of prediction variability (coefficient of variation) on a pixel-by-pixel basis. These estimates ranged from 10% to 119% across the study area, with a mean value of 25%. This map of prediction variability (Figure 7) is a useful instrument to assess how well the model is predicting new observations. One of the reasons for the observed disagreement between the mean forest AGB values and total forest AGB carbon (C) stocks generated from this study and those resulting from the two available forest AGB maps (i.e., [12,60]) could be that only the forest areas with tree canopy cover between 10% and 50% were sampled for the collection of in situ data. Therefore, subsequent work will rely on sampling the Miombo areas with tree canopy cover greater than 50%, which will allow a better characterization of the Miombo savanna woodlands in the region and provide more in situ observations to produce an updated version of the BagSGB model.

Figure 1. (a) Location of Mozambique in Africa; (b) Mozambique provinces and location of the study area in the Zambézia province; (c) mosaic of the two ALOS PALSAR Fine Beam Dual (FBD) scenes (HH polarization 90 m mosaic) and limits (in white) of the study area. ALOS PALSAR FBD zoom over the ~10,000 ha study area: (d) HH polarization; (e) HV polarization.
The mean, minimum, maximum and standard deviation were computed based on the ALOS PALSAR backscatter intensity data extracted over a 50 m radius.

Table 2. Parametric Pearson's coefficient of correlation (R) and non-parametric Spearman's rank coefficient of correlation (R_rank) between forest AGB and several metrics derived from the ALOS PALSAR FBD (HH and HV polarizations) backscatter intensity data (* significant at a significance level of 0.05).