New Forest Aboveground Biomass Maps of China Integrating Multiple Datasets

Abstract: Mapping the spatial variation of forest aboveground biomass (AGB) at the national or regional scale is important for estimating carbon emissions and removals and contributing to global stocktake and balancing the carbon budget. Recently, several gridded forest AGB products have been produced for China by integrating remote sensing data and field measurements, yet significant discrepancies remain among these products in their estimated AGB carbon, varying from 5.04 to 9.81 Pg C. To reduce this uncertainty, here, we first compiled independent, high-quality field measurements of AGB using a systematic and consistent protocol across China from 2011 to 2015. We applied two different approaches, an optimal weighting technique (WT) and a random forest regression method (RF), to develop two observationally constrained hybrid forest AGB products in China by integrating five existing AGB products. The WT method uses a linear combination of the five existing AGB products with weightings that minimize biases with respect to the field measurements, and the RF method uses decision trees to predict a hybrid AGB map by minimizing the bias and variance with respect to the field measurements. The forest AGB stock in China was 7.73 Pg C for the WT estimates and 8.13 Pg C for the RF estimates. Evaluation with the field measurements showed that the two hybrid AGB products had a lower RMSE (29.6 and 24.3 Mg/ha) and bias (−4.6 and −3.8 Mg/ha) than all five participating AGB datasets. Our study demonstrated that both the WT and RF methods can be used to harmonize existing AGB maps with field measurements to improve the spatial variability and reduce the uncertainty of carbon stocks. The new spatial AGB maps of China can be used to improve estimates of carbon emissions and removals at the national and subnational scales.


Introduction
With the implementation of national ecological restoration projects, total forest biomass in China has increased rapidly since the 1980s [1]. Fang et al. [2] and Piao et al. [3] have found that forest biomass carbon (C) increased at a rate of about 0.075 Pg C/year from 1981 to 2000, which accounted for about 11.2% of China's fossil fuel C emission over that period. This forest C sink was estimated to have increased to 0.13 Pg C/year by the early 2000s [4][5][6]. Given this rapid increase in the forest C sink, it is important to quantify the distribution of C stocks in forest aboveground biomass (AGB) to provide a better constraint on the national C budgets in China [7].
In general, forest AGB can be estimated by forest inventories or field measurements [2,8,9], remote sensing techniques [10][11][12], or ecosystem models [3,[13][14][15]. Forest inventories or field measurements provide the most direct estimates of forest AGB by using the biomass expansion factors or allometric models. However, this approach is labor intensive at the national scale and often needs to be implemented at regular intervals [16,17]. Ecosystem models can be used to quantify forest C budgets and dynamics across different spatial and temporal scales and can help us understand the mechanisms driving forest AGB change [18][19][20]. However, the performance of these models critically depends on the field measurements used in model calibration. Large discrepancies were found in the estimated forest AGB by different models [21]. Compared with the above two methods (in situ measurements and models), remote sensing techniques that use a calibrated model to relate the remote sensing signal to field measurements can improve the efficiency and accuracy of forest AGB mapping at regional and global scales [22][23][24]. Remote sensing products from MODIS [25], Landsat [26], and Synthetic Aperture Radar (SAR) [27] data are often used. Recently, Light Detection and Ranging (LiDAR) data that estimate forest height have been shown to improve the accuracy of forest AGB mapping [28][29][30]. However, the use of these satellite data for estimating AGB at a large scale also raises some significant issues, such as the lack of systematic and consistent field measurements at the national scale [31] and the saturating response of remote sensing signals to AGB in dense forests [32]. Thus, it is not surprising that significant differences have also been reported among different remote sensing AGB products [33,34].
Disagreement in the estimated forest AGB results from uncertainties in both forest area and forest AGB density. Discrepancies in forest area largely result from different definitions of forest used among different studies, as discussed by Li et al. [35]. Differences in the estimated forest AGB density can be attributed to differences in the quality of field measurements data used to calibrate models and the quality of remote sensing data. Among all these factors, the lack of sufficient and reliable field measurements for model calibration or evaluation may well be dominant, as the field measurements are often done by different groups over different periods using different methodologies [36,37].
Several forest AGB maps have been produced for China, with spatial resolutions ranging from 30 m to 1 km by integrating remote sensing data and field measurements using machine learning methods [38,39]. These studies used similar methodologies and input datasets, and yet, their results differ significantly in both AGB stocks and density. It is not clear which estimates over which regions are more reliable than others. For example, Su et al. [40] estimated a mean forest AGB density of 120 Mg/ha for the whole of China, which is close to the mean forest AGB density of some highly productive subtropical forests (123 Mg/ha) in China as estimated by Zhang et al. [41]. Comparison with ground observations identified significant differences between these AGB maps, in particular in areas where only limited numbers of field measurements are available [31,33,34]. It is reasonable to assume that each dataset has its own merits and limitations and that determining the most reliable one can be quite subjective. An alternative and promising way for improved AGB mapping is to make the best use of existing information and optimally integrate existing AGB maps [42]. This has been highlighted by Scholze et al. [43] and Zhang and Liang [44], who found that merging multiple products with observational data is an effective strategy for improving model estimates and predictions.
Bishop and Abramowitz [45] and Abramowitz and Bishop [46] developed a novel weighting technique (WT) that computes a weighted average of an ensemble of existing datasets, with weights based on the ability of the participating AGB datasets to match field measurements while also accounting for their error dependencies. This technique was used to derive hybrid global evapotranspiration (ET) [47] and runoff datasets [48], as well as the other components of the surface energy budgets [49]. Another important feature of the WT is its ability to derive uncertainty estimates associated with the hybrid product. This uncertainty varies spatially, which reflects the actual discrepancy between the hybrid product and the field measurements. A similar merging approach has been implemented to derive a fused pantropical forest AGB map [42,50] and a global forest AGB map [44] by integrating existing local or regional AGB maps.
Another technique that has the potential to merge available forest AGB estimates with field measurements is Random Forest regression (RF) [51]. RF builds a model from a range of features such that the biases of a predicted feature with respect to a response variable are minimized. Using gridded AGB datasets as features in the RF model and field measurements as the response variable, the RF algorithm builds a model that produces a new AGB dataset that minimizes the bias when compared with observed AGB [44,52]. Furthermore, a key component of the RF algorithm is feature subsampling, i.e., each decision tree is built by randomly selecting a subspace of the features. The final results are obtained by averaging predictions from all individual decision trees. This mechanism enables RF to cope with highly correlated features [53].
Using high-quality field measurements that cover all major forest types in China from 2011 to 2015, this study applied both the WT and RF methods to derive two observationally constrained hybrid AGB maps from five existing AGB maps using 75% of the high-quality data. We evaluated the derived AGB maps and their participating AGB datasets against the remaining 25% of the high-quality data. Finally, we discussed the uncertainties of these hybrid maps derived using the WT and RF techniques and the limitations of these two techniques.

Field AGB Data
The field measurements of forest AGB data used in this study were obtained from a systematically designed and consistent inventory system across China between 2011 and 2015 [54]. To ensure that the field measurements are representative of different forest types in different regions across China, mainland China was divided into grids of three sizes (100 km², 400 km², and 900 km²; the 100 km² grid was used for the tropical and subtropical forests, which have higher variability in forest AGB density) based on the 1:1,000,000 vegetation map [55]. In total, 12,161 grids were identified in forest regions of mainland China, and 3.1% of those grids (covering an area of 160,000 km²) were randomly chosen for field measurements. In each chosen grid, about 7 sites were established according to the complexity of forest types. The location of each site was determined based on forest origin (natural or planted forests), forest age, and forest types. Overall, 7800 forest sites were established for the field measurements. At each site, one 1000 m² plot (600 m² in a few cases of forest plantations) consisting of 10 subplots (10 m × 10 m) was established.
Within each subplot, the diameter at breast height (DBH; breast height = 1.3 m) and the height of all trees with a DBH ≥ 5 cm in the entire plot were measured. The AGB density of each subplot was estimated with 158 sets of allometric equations that were developed based on local biomass harvest data. The total AGB density of each site was then estimated from the measurements of AGB density of the 10 subplots. Further details can be found in Tang et al. [54].
Based on the method of Avitabile et al. [42], we first applied a screening procedure to select the forest sites that satisfied the following criteria: (1) the site should have detailed geographical information; (2) it should not suffer from any local disturbance that is not observed within the 1 km² pixel. Applying these criteria reduced the number of field sites potentially suitable for this study from 7800 to 4904. To reduce the errors caused by the spatial mismatch between field sites and AGB maps, we only selected those field sites located within relatively homogeneous 1 km² pixels. A 1 km² pixel was considered homogeneous when the standard deviation of tree cover from Hansen et al. [56] was less than 15%. The threshold of 15% was determined through visual interpretation based on high-resolution Google Earth images (Figure S1). For each 1 km² pixel that includes multiple sites, the average value of all sites located within the same pixel was used. Finally, a total of 1562 representative sites over 1 km² pixels were retained as high-quality field measurements for model calibration and evaluation in this study (Figure 1).
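The homogeneity screen and per-pixel averaging described above can be illustrated with a minimal sketch. The function name `screen_sites`, the input layout (a pixel index per site and a list of fine-resolution tree cover values per pixel), and the toy numbers are all assumptions made for illustration; only the 15% standard deviation threshold and the averaging of sites within a pixel come from the text.

```python
import numpy as np

def screen_sites(site_pixel, site_agb, tree_cover_blocks, max_sd=15.0):
    """Return {pixel index: mean site AGB} for homogeneous pixels only.

    site_pixel        : index of the 1 km pixel that contains each field site
    site_agb          : AGB density (Mg/ha) measured at each site
    tree_cover_blocks : hypothetical dict, pixel index -> fine-resolution
                        tree cover values (%) falling inside that pixel
    max_sd            : homogeneity threshold on the tree cover SD (15% in the text)
    """
    site_pixel = np.asarray(site_pixel)
    site_agb = np.asarray(site_agb, dtype=float)
    kept = {}
    for pix in np.unique(site_pixel):
        cover = np.asarray(tree_cover_blocks[int(pix)], dtype=float)
        if cover.std() < max_sd:
            # Average all sites that fall within the same homogeneous pixel.
            kept[int(pix)] = float(site_agb[site_pixel == pix].mean())
    return kept

# Toy data: pixel 1 is heterogeneous (mixed low and high tree cover) and is dropped.
blocks = {0: [80, 82, 79, 81], 1: [10, 90, 20, 85], 2: [60, 65, 62, 58]}
result = screen_sites([0, 0, 1, 2], [100.0, 120.0, 50.0, 80.0], blocks)
print(result)
```

In this toy case the two sites in pixel 0 are averaged into a single value, which mirrors how multiple sites within one pixel are treated in the screening step.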
It is well known that the forest AGB density varies with vegetation type, soil, and climatic conditions [32]. Based on ecozone and forest type maps [55,57], the forested area in China was divided into five regions in this study (Figure 1). They are: (A) temperate needleleaf and needleleaf-broadleaf mixed forest, (B) temperate steppe/desert and Qinghai-Tibet Plateau alpine vegetation, (C) temperate deciduous-broadleaf forest, (D) subtropical evergreen broadleaf forest and tropical monsoon forest-rain forest, and (E) Yunnan-Guizhou Plateau evergreen broadleaf and alpine needleleaf forest.
The Saatchi map provides forest AGB density for the pantropics at a 0.00833 (~1 km) degree resolution for the early 2000s [58]. Here we used an updated version of this map, which provides global vegetation AGB density for 2015 [62]. Similarly, the Baccini map estimates the AGB for the pantropics at a resolution of 500 m for the period of 2007-2008 [59]. The Baccini map used in this work is also an updated version that presents global AGB density at approximately a 30 m resolution [63]. The Santoro map estimates forest AGB globally at a 0.01 degree (~1 km) resolution for 2010 [60]. The Su and Huang maps provide estimates of nationwide forest AGB in China circa 2005 at 1 km and 30 m spatial resolutions, respectively [40,61].
These AGB maps shared similar methods and input datasets, and they were produced by integrating multiple remote sensing data with field measurements using machine learning approaches [64]. Despite the similar frameworks, these AGB maps differed in their LiDAR metrics, allometric models, remote sensing layers, and training models. In the present study, these AGB maps were aggregated into 1 km resolution through simple averaging and were resampled to use the same reference system and unit (AGB density in Mg/ha). Additionally, we applied the same forest mask based on the land-use map from Liu et al. [65] to the above five AGB maps.
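Aggregation to a coarser grid through simple averaging can be sketched as a block mean. This is only an illustration under stated assumptions: the function name `aggregate_mean` is hypothetical, the fine grid is assumed to divide evenly into the coarse grid by an integer factor (which is not exactly true for 30 m to 1 km in practice), and masked non-forest cells are assumed to be stored as NaN and ignored, a choice the text does not specify.

```python
import numpy as np

def aggregate_mean(fine, factor):
    """Block-average a 2D AGB grid by an integer factor, ignoring NaN cells."""
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    # Reshape so that each coarse cell owns a (factor x factor) block of fine cells.
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))

# Toy 4 x 4 grid of AGB density (Mg/ha); NaN marks a masked (non-forest) cell.
fine = np.array([
    [10.0, 20.0, 30.0, 40.0],
    [10.0, 20.0, 30.0, 40.0],
    [0.0, np.nan, 50.0, 50.0],
    [0.0, 0.0, 50.0, 50.0],
])
coarse = aggregate_mean(fine, 2)
print(coarse)
```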

AGB Estimation Method
We applied two different techniques (WT and RF) to integrate the five existing AGB maps (referred to as participating AGB datasets) with high-quality field measurements of forest AGB in China. The WT constructs a linear combination of the participating AGB datasets such that the sum of the squared differences between the resulting dataset and the field measurements is minimized while also accounting for the error dependence between their participating AGB datasets. The RF constructs a model consisting of nonlinear multiple regressions between AGB in the five participating AGB datasets and field measurements. The descriptions of the model parameters are provided in Supplementary Texts 1 and 2.
For each of the five regions in Figure 1, we used each of the two techniques (i.e., WT and RF) to derive an estimate of forest AGB at a spatial resolution of 1 km separately. We then combined the 1 km estimates of forest AGB for all five regions and for each technique (WT and RF) to derive a nationwide forest AGB map across China, hereafter referred to as WT-AGB and RF-AGB, respectively.

AGB Estimation Using the Weighting Technique
We first used the WT method developed by Bishop and Abramowitz [45]. This technique finds the linear combination of participating AGB datasets that minimizes its mean square difference (MSD) to the field measurements. The hybrid AGB estimate is expressed as

$$AGB_{w}^{j} = \mathbf{w}^{T}\mathbf{x}^{j} = \sum_{k=1}^{K} w_{k}\, x_{k}^{j}$$

where $x_{k}^{j}$ is the value of the kth bias-corrected AGB dataset (i.e., after subtracting the mean error from the dataset) at the jth grid cell. The weights $\mathbf{w}^{T}$ provide an analytical solution to the minimization of

$$\sum_{j=1}^{J} \left( AGB_{w}^{j} - AGB_{obs}^{j} \right)^{2}$$

subject to the constraint that $\sum_{k=1}^{K} w_{k} = 1$, where $j \in [1, J]$ indexes the grid cells with field measurements, $k \in [1, K]$ represents the participating AGB product, and $AGB_{obs}^{j}$ is the observed AGB at the jth grid cell.
The solution is expressed as

$$\mathbf{w} = \frac{\mathbf{A}^{-1}\mathbf{1}}{\mathbf{1}^{T}\mathbf{A}^{-1}\mathbf{1}} \qquad (4)$$

where $\mathbf{1}^{T} = [1, 1, 1, 1, 1]$, and $\mathbf{A}$ is the 5 × 5 error covariance matrix of the gridded products. The WT method accounts not only for the performance differences between the participating AGB datasets but also their error dependencies. It uses the error correlation coefficient as a metric for error dependencies where errors are computed from bias with respect to observational data. Figure 2 illustrates the error correlation in the participating AGB datasets and shows a relatively high (>0.6) error correlation between the datasets, particularly among the Baccini, Santoro, and Huang maps.
We followed the method described in Hobeichi et al. [47] to compute the spatial uncertainty of the hybrid AGB product. First, we quantified the discrepancy between the hybrid AGB product (i.e., $AGB_{w}$) and the field measurements ($AGB_{obs}$). We denote this as error variance $s_{AGB}^{2}$:

$$s_{AGB}^{2} = \frac{\sum_{j=1}^{J} \left( AGB_{w}^{j} - AGB_{obs}^{j} \right)^{2}}{J - 1}$$

Next, we transformed the participating AGB datasets so that their variance about $AGB_{w}$ at any given location, $\sigma_{AGB}^{2j}$, averaged over all locations where we have field measurements, is equal to $s_{AGB}^{2}$:

$$s_{AGB}^{2} = \frac{1}{J} \sum_{j=1}^{J} \sigma_{AGB}^{2j}$$

In general, the variance of the participating AGB datasets will not satisfy this equation. To address this, we applied a mathematical transformation that involves first modifying the coefficients from Equation (4), so that they are guaranteed to be all positive.
We define:

$$\tilde{\mathbf{w}}^{T} = \frac{\mathbf{w}^{T} + (\alpha - 1)\,\dfrac{\mathbf{1}^{T}}{K}}{\alpha}$$

where $\alpha = 1 - K \min(w_{k})$, $\min(w_{k})$ is the smallest negative weight (and $\alpha$ is set to 1 if all $w_{k}$ are non-negative), and $K$ represents the number of the participating AGB datasets. We then transformed the ensemble using

$$\tilde{x}_{k}^{j} = AGB_{w}^{j} + \beta \left( \bar{x}^{j} + \alpha \left( x_{k}^{j} - \bar{x}^{j} \right) - AGB_{w}^{j} \right)$$

where $\bar{x}^{j}$ is the mean value of the corrected AGB datasets at the jth grid cell and, following [45,47], the scaling factor $\beta$ is

$$\beta = \sqrt{ \frac{ s_{AGB}^{2} }{ \dfrac{1}{J} \sum_{j=1}^{J} \sum_{k=1}^{K} \tilde{w}_{k} \left( \bar{x}^{j} + \alpha \left( x_{k}^{j} - \bar{x}^{j} \right) - AGB_{w}^{j} \right)^{2} } }$$

Next, we defined the weighted variance estimate as

$$\sigma_{AGB}^{2j} = \sum_{k=1}^{K} \tilde{w}_{k} \left( \tilde{x}_{k}^{j} - AGB_{w}^{j} \right)^{2}$$

This process ensures the spread of the transformed ensemble $\sigma_{AGB}^{2j}$ varies in space and accurately reflects uncertainty in those grid cells where field measurements are available [45]. We then used the transformed ensemble standard deviation $\sigma_{AGB}^{j}$ as an estimate of the uncertainty of $AGB_{w}$.
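The weighting and uncertainty steps above can be sketched numerically. This is a minimal illustration on synthetic data, not the authors' implementation: the function names `wt_weights` and `wt_uncertainty`, the simulated products, and the use of `np.cov` for the error covariance matrix are assumptions, and the scaling factor follows the form given in the cited WT literature [45,47].

```python
import numpy as np

def wt_weights(products, obs):
    """Analytical WT weights from K products sampled at J field sites.

    products : (J, K) array of product AGB values at the field sites
    obs      : (J,) array of observed AGB
    """
    bias = (products - obs[:, None]).mean(axis=0)
    x = products - bias                      # bias-corrected products
    errors = x - obs[:, None]                # zero-mean errors per product
    A = np.cov(errors, rowvar=False)         # K x K error covariance matrix
    ones = np.ones(A.shape[0])
    a_inv_one = np.linalg.solve(A, ones)     # A^{-1} 1 without explicit inverse
    return a_inv_one / (ones @ a_inv_one), x

def wt_uncertainty(x, obs, w):
    """Hybrid estimate and transformed-ensemble standard deviation per site."""
    J, K = x.shape
    mu = x @ w                                          # hybrid AGB estimate
    s2 = np.sum((mu - obs) ** 2) / (J - 1)              # error variance
    alpha = 1.0 - K * min(w.min(), 0.0)                 # 1 if no negative weight
    w_tilde = (w + (alpha - 1.0) / K) / alpha           # non-negative, sums to 1
    xbar = x.mean(axis=1, keepdims=True)
    spread = xbar + alpha * (x - xbar) - mu[:, None]
    beta = np.sqrt(s2 / np.mean((spread ** 2) @ w_tilde))
    x_tilde = mu[:, None] + beta * spread               # transformed ensemble
    sigma2 = ((x_tilde - mu[:, None]) ** 2) @ w_tilde   # weighted variance
    return mu, s2, np.sqrt(sigma2)

# Synthetic example: five products with different biases and noise levels.
rng = np.random.default_rng(0)
obs = rng.uniform(20.0, 200.0, 300)
noise = rng.normal([15, -20, 5, -10, 30], [40, 35, 30, 45, 50], (300, 5))
products = obs[:, None] + noise

w, x = wt_weights(products, obs)
mu, s2, sigma = wt_uncertainty(x, obs, w)
mse_hybrid = np.mean((mu - obs) ** 2)
mse_products = np.mean((x - obs[:, None]) ** 2, axis=0)
print(w.round(3), mse_hybrid.round(1), mse_products.round(1))
```

By construction, the weights sum to one, the in-sample MSD of the hybrid estimate is no larger than that of any single bias-corrected product, and the site-averaged variance of the transformed ensemble equals the error variance.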

AGB Estimation Using the Random Forest Regression Method
RF is an ensemble learning method that constructs a collection of decision trees and then outputs the average of the predictions of the individual trees. The RF model was used to produce a new hybrid AGB map by integrating the five existing AGB datasets. For each decision tree, a subset of predictor variables (i.e., participating AGB datasets) and a subsample of the field AGB measurements were randomly selected. These selections change as the trees are grown, following a random sampling with replacement approach. Furthermore, the algorithms involved in different decision trees are run in parallel. Ultimately, both the random sampling procedure and the parallelism in algorithm operations mean that the predictor blocks in RF are built independently.
As for the WT method, we built an RF model for each of the five regions separately using the available data in each region. We used the R caret package [66] to develop RF models and evaluated the model results with different parameter values using cross-validation. Ultimately, we applied 10-fold cross-validation and selected the model parameter values that minimize absolute RMSE against field measurements. This includes determining the number of predictors (mtry) available for splitting at each tree node and the number of trees (num.trees) used to build the model. Table 1 lists these parameters, the number of samples, and the participating AGB datasets used in each region. The number of samples ranges from 205 in Region B to 533 in Region D. The RF model was built using all five datasets in all regions, except in Region E, where only the Saatchi and Su maps were used. We excluded the other three datasets (i.e., Baccini, Santoro, and Huang maps) after finding by trial and error that including them as predictors in the RF model did not offer improvements in the estimated forest AGB over the participating AGB datasets. Finally, we computed the uncertainty associated with the derived hybrid AGB map from the standard deviation of the individual tree predictions that were built for each RF model.

Table 1. The participating AGB datasets, the number of samples available for calibrating and evaluating the RF models in each region, and the tuned hyperparameters that were used to build the RF model in each region, including the number of decision trees (num.trees) and the number of products available for splitting at each tree node (mtry).
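The RF workflow described in this section (tuning by 10-fold cross-validation on RMSE, then taking the spread of individual tree predictions as the uncertainty) can be sketched as follows. Note the substitution: the study used the R caret package, whereas this sketch uses Python's scikit-learn, where `max_features` and `n_estimators` play roles analogous to mtry and num.trees. The synthetic data, parameter grid, and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for one region: 200 sites, five participating products.
rng = np.random.default_rng(42)
agb_obs = rng.uniform(20.0, 200.0, 200)
products = agb_obs[:, None] + rng.normal(
    [15, -20, 5, -10, 30], [40, 35, 30, 45, 50], (200, 5)
)

# 75% of sites for calibration, 25% held out for evaluation.
X_cal, X_val, y_cal, y_val = train_test_split(
    products, agb_obs, test_size=0.25, random_state=0
)

# 10-fold cross-validation over a small, hypothetical parameter grid,
# selecting the combination with the lowest RMSE.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"max_features": [1, 2, 3], "n_estimators": [50, 100]},
    scoring="neg_root_mean_squared_error",
    cv=10,
)
search.fit(X_cal, y_cal)
rf = search.best_estimator_

# Hybrid estimate = mean over trees; uncertainty = SD over trees.
tree_preds = np.stack([tree.predict(X_val) for tree in rf.estimators_])
agb_rf = tree_preds.mean(axis=0)
agb_sd = tree_preds.std(axis=0)
print(search.best_params_, agb_rf[:3].round(1), agb_sd[:3].round(1))
```

The per-tree standard deviation depends on how different the trees are from one another, which is one reason a model with very few predictors can show a different uncertainty pattern from one built on all five products.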

Model Evaluation
We conducted an out-of-sample test to assess the performance of the hybrid AGB products. This involved randomly dividing the field measurements into a calibration set (75% of the observation sites; i.e., in-sample) and a validation set (the remaining 25%; i.e., out-of-sample). The validation set was used to assess the performance of the hybrid AGB products and the five participating AGB datasets. We calculated the absolute and relative root mean square error (RMSE), bias, and relative standard deviation difference (RSD) of the hybrid AGB maps in each of the five regions. These metrics are computed from $y_i$, the observed AGB at the field sites; $\hat{y}_i$, the corresponding modeled AGB from the hybrid AGB products or the participating AGB datasets; $n$, the number of validation sites; $\sigma$, the standard deviation of the observed AGB; and $\hat{\sigma}$, the standard deviation of the modeled AGB from the hybrid products or the participating AGB datasets.
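For orientation, the absolute metrics can be sketched with their conventional definitions. These forms are assumptions rather than the study's stated equations: bias is taken here as modeled minus observed, and RSD as the difference in standard deviations relative to the observed standard deviation. The relative RMSE and relative bias are omitted because their normalization cannot be recovered from the text. The function name `evaluate` and the toy values are hypothetical.

```python
import numpy as np

def evaluate(obs, mod):
    """Conventional RMSE, bias, and RSD (assumed definitions, see lead-in)."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    diff = mod - obs                                      # modeled minus observed
    rmse = np.sqrt(np.mean(diff ** 2))
    bias = np.mean(diff)
    rsd = (np.std(mod) - np.std(obs)) / np.std(obs)       # relative SD difference
    return {"rmse": rmse, "bias": bias, "rsd": rsd}

# Toy validation set of three sites (Mg/ha).
metrics = evaluate([100.0, 150.0, 200.0], [110.0, 140.0, 230.0])
print(metrics)
```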

Performance of the Two Hybrid Products
In terms of RMSE and bias, both the WT and RF hybrid products showed a significant overall improvement, as compared with the five participating AGB datasets across Regions A to D: the out-of-sample test indicated that the hybrid products achieved a lower RMSE and bias than any of the participating AGB datasets for all regions, except in Region E (Figure 3, Table S4). The RMSE of the hybrid maps varied from 7.8 to 75.7 Mg/ha for WT-AGB and from 10 to 35.3 Mg/ha for RF-AGB among the five regions. The bias of the hybrid maps was less than 10 Mg/ha in most cases. In Region E, WT-AGB presented a lower RMSE than the Baccini, Santoro, and Huang maps but a greater RMSE than the Saatchi and Su maps. RF-AGB had the lowest RMSE (33.7 Mg/ha) in Region E, followed by the Saatchi (40.3 Mg/ha) and Su (54.6 Mg/ha) maps. Similarly, the relative RMSE of the two hybrid products (ranging from 1.4% to 7.4%) was lower than that of any of the five participating AGB datasets (ranging from 4% to 25%) for all regions, except in Region E (Table S5). The two hybrid products also had a lower relative bias (ranging from −8.2% to 20.7%) as compared with any of the five participating AGB datasets (ranging from −45.2% to 85.6%) across Regions A to D. In terms of RSD, significant improvements were found for the two hybrid products in Regions B, C, and D, with little improvement in Regions A and E.
Both the WT and RF approaches agreed better with the field measurements than any of the five participating AGB datasets (Figure 4). When evaluating the performance for the whole of China using the out-of-sample test, the RMSE values of WT-AGB and RF-AGB were 29.6 Mg/ha and 24.3 Mg/ha, respectively, which were significantly lower than the other datasets (with RMSE varying from 49.5 Mg/ha for Santoro AGB to 74.7 Mg/ha for Saatchi AGB). We also compared the performance of the two hybrid products with the participating AGB datasets using all available field measurements (Figure S2). The results were similar to those based on the out-of-sample tests.

Evaluation of the Two Hybrid Products over China
The spatial patterns of the WT and RF estimates in China were very similar (Figure 5). Overall, the average AGB density in southern China was higher than that in northern China, ranging from 53 Mg/ha to 140 Mg/ha among the different regions. The maximum AGB density as estimated using the WT and RF approaches was over 420 Mg/ha in the southwest subalpine region of China, where alpine coniferous forests are dominant. Conversely, the average forest AGB density was relatively low in northern China, which is mainly covered by temperate coniferous forest, coniferous-broadleaf mixed forest, temperate steppe, and desert. In Region C, where the warm deciduous-broadleaf forests are dominant, AGB density was generally low (<100 Mg/ha).
The estimated forest AGB stock in China was 7.73 Pg C for WT-AGB and 8.13 Pg C for RF-AGB. The two hybrid AGB maps differed from the five participating AGB datasets in most regions (Figure 6). The total AGB stock of China as estimated by the WT approach was 8.8%, 14.7%, and 21.2% lower than those from Baccini (8.48 Pg C), Su (9.06 Pg C), and Saatchi (9.81 Pg C), respectively, but exceeded those from Santoro (5.04 Pg C) and Huang (5.44 Pg C) by 34.8% and 29.6% of the WT estimate, respectively. Across the different regions, the mean forest AGB densities of both hybrid products were higher than any of the participating AGB datasets for Region A. In Region B, where nonforest vegetation is dominant and forest AGB density is very low, the differences were relatively large between the two hybrid products and the five participating AGB datasets. In Region C, estimates of AGB densities by the two hybrid products were generally lower than Saatchi and Su and higher than the other three products. In Regions D and E, the estimated forest AGB densities by the two hybrid products were generally lower than those by Saatchi and Su, similar to the Baccini estimates, and higher than the Santoro and Huang estimates (see Figure 6).
Between the two hybrid maps, the agreement over Region A was good with relative differences <10% and was poorest in Region B (Figure 7, Table 2), where temperate steppe and desert are dominant. The mean AGB densities of both hybrid products were higher than the five participating AGB datasets in Region A and had a similar range for the other four regions (see Figure 7). All seven products showed the highest mean AGB density in Region E, where the mean AGB density estimated by Su was more than 75% higher than that of the two hybrid products. Furthermore, the spread of data (bar height in Figure 7) did not show noticeable differences between the two hybrid datasets and was smaller than the spread of estimates in the Saatchi and Baccini estimates for most regions. Across the different regions in China, the relative error (defined as the ratio of standard deviation to mean) varied from 19% in Region D to 44% in Region A for the RF approach and from 11% in Region A to 53% in Region B for the WT approach (Table 2).
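The relative differences in total stock quoted in this section can be reproduced from the reported stocks with a few lines of arithmetic. The choice of denominator (the larger of the two stocks being compared) is inferred from the numbers rather than stated in the text, and the helper name `pct_diff` is hypothetical.

```python
# Reported national AGB stocks (Pg C).
wt_stock = 7.73
others = {"Baccini": 8.48, "Su": 9.06, "Saatchi": 9.81, "Santoro": 5.04, "Huang": 5.44}

def pct_diff(wt, other):
    """Absolute difference as a percentage of the larger of the two stocks."""
    return 100.0 * abs(other - wt) / max(wt, other)

diffs = {name: round(pct_diff(wt_stock, stock), 1) for name, stock in others.items()}
print(diffs)
```

Under this convention, the "lower than" figures are relative to the participating product, whereas the figures for Santoro and Huang are relative to the WT estimate itself.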

Uncertainties of the Two Hybrid Products
The uncertainties of the two hybrid products differed spatially (Figure 8). Across the different regions in China, the uncertainty associated with the WT approach was lower than that of the RF approach for Regions A and E and was similar to the RF approach for Regions C and D. The estimated uncertainty was less than 230 Mg/ha across the WT-AGB map and less than 90 Mg/ha across the RF-AGB map, with average values of 21 Mg/ha for WT-AGB and 28 Mg/ha for RF-AGB. In terms of relative uncertainty, the average values were 27% for WT-AGB and 31% for RF-AGB. The differences in uncertainty between the two hybrid products were particularly large in Region A and part of Region E. In Region A, the uncertainty in the RF approach was higher than that in the WT approach, with average uncertainties of 42 Mg/ha and 10 Mg/ha, respectively. In Region E, the large uncertainty associated with the RF approach is expected, given that it was derived from the standard deviation of predictions from three tree blocks with two predictors only, i.e., Saatchi and Su maps. Increasing the number of decision trees (i.e., 150 trees used in Region A) had little impact on the estimated uncertainty.

AGB Estimation in China's Forests
This study demonstrated that the hybrid AGB maps agreed better with the ground observations than the five participating AGB products over China. This could be interpreted as a consequence of (1) the high-quality field measurements we used for calibration and evaluation, acquired from a systematic and consistent protocol across China; (2) the hybrid methods being very effective in finding the optimal estimates by minimizing the errors between the participating AGB datasets and field measurements.
The average forest AGB density in China from our hybrid maps was close to the estimates by Tang et al. [54]. A previous study [9] estimated the forest AGB stock in China as 7.27 Pg C, with a forest area of 165 × 10⁶ ha. Our hybrid maps were masked by the land-use map from Liu et al. [65], with a forest area of 170 × 10⁶ ha. Tang et al. [54] and Xu et al. [67] reported the forest area to be 188 × 10⁶ ha and 196 × 10⁶ ha, respectively. Considering the differences in forest area found among the previous studies, our estimated forest AGB stock in China was more consistent with the results based on forest inventories or field measurements than the five participating AGB datasets.
Our results suggest that the Saatchi and Baccini maps tend to overestimate AGB in southern China for AGB values >120 Mg/ha (Figure 6). This result agrees with that of Avitabile et al. [42], who identified an overestimation of 9-18% of total stocks in the Saatchi and Baccini maps across the tropics. This overestimation may arise because the Saatchi and Baccini maps were not calibrated against any field measurements from China, and the observational data and allometric models they used for dense tropical forests were not representative of forest conditions in China. This may also explain the low accuracy identified here in the Saatchi (RMSE = 74.7 Mg/ha) and Baccini (RMSE = 60.8 Mg/ha) maps (see Figure 4). The Santoro map was based on the growing stock volume (GSV) derived from SAR with the BIOMASAR algorithm, which was then converted to AGB using wood density and allometric relationships from in situ data [68]. Their validation indicated that forest AGB was underestimated in the temperate and subtropical zones for reference AGB values >150 Mg/ha, which may explain the underestimation of AGB across China. The observational data used in the Huang map were collected from three decades of publications, with two-thirds of the site observations acquired before 2000. This temporal mismatch is likely to lead to an underestimation of AGB in China.
The hybrid products derived from the WT and RF approaches result from the fusion of the in situ measurements and the participating AGB datasets used here. Because of spatial heterogeneity and the scale mismatch between in situ measurements and remote sensing products, it is important to ensure that the ground observations are representative of the pixel size of the gridded products [69]. In this study, as detailed in Section 2, we carefully selected the field measurements to minimize the scale mismatch between the AGB of field measurements and that of remote sensing products. A strength of this study is the use of a set of high-quality field measurements that were taken using a consistent protocol across China over a relatively short period (2011-2015) and that were not used in calibrating any of the participating AGB datasets. This set of field measurements thus provides an independent evaluation of all participating AGB datasets, which increases the credibility and reliability of our hybrid results.
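To make the weighting idea behind the WT approach more concrete, the sketch below computes minimum-variance weights for a linear combination of bias-corrected products from the covariance of their errors against field measurements. It assumes the standard closed form w = A⁻¹1 / (1ᵀA⁻¹1), where A is the error covariance matrix. This is an illustrative assumption and may differ in detail from the formulation used in this study; all function names and numbers are hypothetical.

```python
def error_covariance(products, obs):
    """Return (biases, covariance matrix) of product errors against obs.

    products: list of lists, one list of AGB values per product,
              sampled at the field-measurement locations.
    obs:      list of field-measured AGB values at the same locations.
    """
    n = len(obs)
    errors = [[p_i - o_i for p_i, o_i in zip(p, obs)] for p in products]
    biases = [sum(e) / n for e in errors]
    centred = [[e_i - b for e_i in e] for e, b in zip(errors, biases)]
    cov = [
        [sum(a * b for a, b in zip(ej, ek)) / (n - 1) for ek in centred]
        for ej in centred
    ]
    return biases, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular error covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def optimal_weights(cov):
    """Weights that sum to one and minimize the combined error variance."""
    z = solve(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]


def hybrid_estimate(values, weights, biases):
    """Weighted sum of bias-corrected product values for one pixel."""
    return sum(w * (v - b) for w, v, b in zip(weights, values, biases))


# Two products with independent errors and variances of 1 and 4:
# the more accurate product receives the larger weight.
weights = optimal_weights([[1.0, 0.0], [0.0, 4.0]])
print(weights)
```

Because the off-diagonal terms of the covariance matrix enter the solution, two products whose errors are strongly correlated are not both rewarded, which is how error dependence among products can be taken into account.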

Limitations of the Present Study
Although the hybrid maps were more accurate than all five participating AGB datasets for most regions in China, this study has a number of limitations. First, our results showed that RF-AGB could not reproduce the exceptionally high AGB values observed in Region E (Figure S2). This could result from the very small fraction of training data with high AGB (>350 Mg/ha). Second, the out-of-sample performance (see Figure 3) suggested that the merging techniques could not outperform the participating AGB datasets in the RSD metric. This result may be explained by the fact that (a) the merging techniques could not capture the extremely high values in the field measurements of AGB density, and (b) we would expect variance to be reduced as a result of the averaging process. This issue was also highlighted in Hobeichi et al. [47]. There are also some common sources of error that should be noted in our AGB estimations:

1. Temporal and spatial mismatch. The field measurements used here were collected between 2011 and 2015, while the participating AGB datasets were derived for different times between 2000 and 2015, with spatial resolutions ranging from 30 m to 1 km. This temporal mismatch between field measurements and participating AGB datasets likely contributes to increasing the uncertainty of the hybrid results [70,71]. We tried to minimize the temporal mismatch by excluding field measurements where forest canopy cover had significantly changed, identified by clear land changes presented in the high-resolution Google Earth images. However, the forest growth and degradation events that did not significantly affect the forest canopy cover may not be detected by visual analysis, which may introduce some uncertainties into the hybrid AGB products. The spatial mismatch between the AGB maps (1 km²) and the field measurements (1000 m²) could also introduce additional uncertainties into the hybrid AGB products. A common solution is to screen out the field measurements that are not representative of the forest pixels [41,69]. The spatial mismatch problems in our study were minimized by selecting only the datasets that satisfied certain quality criteria and by further screening through visual analysis. However, some errors may still remain in our hybrid AGB products, particularly in regions where the local variability in forest AGB was high [72]. While the temporal and spatial mismatch can limit the success of our present hybrid AGB products, application of the hybrid approaches by combining field measurements with new remote sensing products, such as airborne LiDAR measurements, will generate more robust hybrid results in the future.

2. Impact of different definitions of forest. As shown in Sexton et al. [73], the discrepancies in estimates of forest area mainly result from the definition of 'forest'. For example, the Chinese National Forest Inventory defines a forest as woodland with canopy cover >20%, whereas the FAO defines forests by the criterion of tree cover >10%. Li et al. [35] showed that the total forested area in China can differ by up to 10.7% using different definitions for the period 2000 to 2013. This variation in the definition can directly influence the estimated amount of forest AGB [74]. In this study, to ensure the comparability of the different products, we applied the same forest mask based on the land-use data from Liu et al. [65] to all participating AGB products. However, caution should be exercised when comparing our estimated amount of forest AGB with other studies.

Implications for National C Budgets
As discussed by Avitabile et al. [42], the hybrid method presented in this study allows for the optimal integration of AGB maps with additional field measurements when they become available. For example, the proposed method may be applied at the national scale using existing forest inventory data and local maps that cover only part of the country. The resulting forest AGB maps may be used to provide an additional constraint on the national C budgets.
To ensure that the field measurements were representative at the pixel scale, we implemented the screening and averaging methods outlined in the Methods section. While pixel-level averaging and screening reduced the biases from scale mismatch and ensured the representativeness of the field measurements, this approach also significantly reduced the number of usable field measurements. In this study, screening reduced the number of pixel-scale measurements from 4904 to 1562, which may have degraded the quality of our products. Therefore, it is necessary to compile field measurements from multiple sources for direct comparison with remote sensing products. Indeed, the Forest Observation System (FOS) initiative has implemented a global in situ forest AGB database that offers reliable and representative field measurements for calibration and validation [75]. Similarly, the International Land Model Benchmarking (ILAMB) project seeks to identify observations that can be used for evaluating model performance [76]. A new generation of remote sensing technologies will largely fill the need for regional and global estimates of forest structure and AGB. These include NASA's GEDI [77] and NISAR [78], ESA's BIOMASS [24,62], HJ-1C [79], NovaSAR-S [80], and airborne SAR [81]. The resulting AGB maps can be combined with the temporal changes in the vegetation optical depth (VOD) derived from passive microwave satellites for monitoring annual AGB dynamics at regional and global scales [82][83][84]. With an increasing number of ground observations and gridded AGB products becoming available, the WT and RF approaches will be good candidates for further improving the accuracy of AGB mapping in the future.

Conclusions
In this study, we applied both the WT and RF methods to integrate five existing forest AGB products by minimizing the errors with respect to field measurements. The forest AGB stock in China was 7.73 Pg C for the WT estimates and 8.13 Pg C for the RF estimates. Unlike previous studies that used field measurements from multiple sources with variable quality over different periods, our field measurements were acquired over the same period using a unified measurement protocol, minimizing inconsistency and measurement errors [7,85]. Moreover, the WT method used here accounted for both the performance against field measurements and the error dependences among the participating AGB datasets. We evaluated our estimates using out-of-sample data and compared them with previous estimates. The results indicate that our hybrid AGB products corresponded reasonably well with the results from ground observations and the national forest inventory. Considering possible issues due to spatial autocorrelation in AGB datasets [86], we suggest that evaluation against a larger number of new independent field measurements is required to validate the results shown here. This study demonstrated that the hybrid methods are an effective way to integrate the increasing number of AGB products and observations and to improve forest AGB mapping at regional and country scales.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/rs13152892/s1, Figure S1: Example of visual selection of the field plots using Google Earth images, Figure S2: Scatterplots of all the available field AGB (x-axis) and estimated AGB (y-axis) of the (a) WT and (b) RF approaches and the (c) Saatchi, (d) Su, (e) Baccini, (f) Santoro, and (g) Huang maps. RMSE is given in Mg/ha. Table S1: Overview of the datasets used in this study, Table S2: Weights assigned to the participating AGB products and bias correction in Mg/ha (in brackets) computed at each region, Table S3: Importance of the five participating AGB products in the RF model in each region, Table S4: Results of the out-of-sample test, showing the RMSE, bias (in brackets), and relative standard deviation difference (RSD), all computed against the observational data. The RMSE and bias are given in Mg/ha, Table S5: Results of the out-of-sample test, showing the relative RMSE and relative bias (in brackets). The relative RMSE and relative bias are given in %.