Edinburgh Research Explorer Carbon Stocks and Fluxes in Kenyan Forests and Wooded Grasslands Derived from Earth Observation and Model-Data Fusion

: The characterization of carbon stocks and dynamics at the national level is critical for countries engaging in climate change mitigation and adaptation strategies. However, several tropical countries, including Kenya, lack the essential information typically provided by a complete national forest inventory. Here we present the most detailed and rigorous national-scale assessment of aboveground woody biomass carbon stocks and dynamics for Kenya to date. A non-parametric random forest algorithm was trained to retrieve aboveground woody biomass carbon (AGBC) for the year 2014 ± 1 and forest disturbances for the 2014–2017 period using in situ forest inventory plot data and satellite Earth Observation (EO) data. The ecosystem carbon cycling of Kenya’s forests and wooded grassland were assessed using a model-data fusion framework, CARDAMOM, constrained by the woody biomass datasets from this study as well as time series information on leaf area, ﬁre events and soil organic carbon. Our EO-derived AGBC stocks were estimated as 140 Mt C for forests and 199 Mt C for wooded grasslands. The total AGBC loss during the study period was estimated as 1.89 Mt C with a dispersion below 1%. The CARDAMOM analysis estimated woody productivity to be three times larger in forests (mean = 1.9 t C ha − 1 yr − 1 ) than wooded grasslands (0.6 t C ha − 1 yr − 1 ), and the mean residence time of woody C in forests (16 years) to be greater than in wooded grasslands (10 years). This study stresses the importance of carbon sequestration by forests in the international climate mitigation e ﬀ orts under the Paris Agreement, but emphasizes the need to include non-forest ecosystems such as wooded grasslands in international greenhouse gas accounting frameworks.

area index (LAI) is a strong indicator of potential carbon accumulation across Kenya [33]. The actual accumulation of photosynthates in the biomass depends on their residence time, particularly that of woody material, the most long-lived plant tissue. This residence time will depend on plant traits like wood density, that are linked to mortality, and also on disturbance factors, such as fire frequency. However, the fractions of photosynthate allocated to new wood growth and wood residence times are both poorly known, providing a major source of uncertainty in global carbon cycling models and forecasts [34].
Estimates of ecosystem carbon dynamics can be constrained at the site level by combining multiple biometric and ecosystem carbon flux measurements with process models [35][36][37][38]. However, intensive repeated measurements (e.g., AGBC, dead organic matter) are challenging to collect over long time periods and across large areas. At the landscape scale, carbon cycle estimates have larger uncertainties due to the scarcity of data and data uncertainties introduced by statistical upscaling approaches. EO data provide a means of strengthening the upscaling approach by providing full spatial coverage and time series information. Combining the EO of ecological parameters (e.g., AGBC and LAI) with ecosystem carbon cycle models through model-data fusion approaches is a powerful methodology to quantify ecosystem properties, carbon cycling and their uncertainty, from plot scale to global scale [38][39][40][41][42]. These analyses can determine how carbon cycling varies in space with climate, which can provide insights into how climate change influences future carbon stocks.
This study has the overall objectives of: (i) providing an improved quantification of biomass carbon (C) in wooded Kenyan ecosystems; (ii) quantifying the rate of C loss due to deforestation processes; and (iii) providing a better understanding of the dynamics and underlying ecosystem processes governing biomass carbon stocks in the country's forests and wooded grasslands. We first estimate the baseline AGBC stock for Kenya's wooded vegetation for the year 2014 using a combination of nationally collected field data and EO datasets, and then map forest loss between 2014 and 2017. Finally, we combine the AGBC dataset, the forest loss data and other data (LAI, soil C, climate) with a process-based carbon cycle model within the CARbon DAta MOdel FraMework (CARDAMOM) [39,40] to map C accumulation (NPP allocated to wood) and the residence time of biomass. We focus our analysis on comparing forest and woody grassland landscapes in Kenya to understand their contribution to national C stocks and to quantify their contributions to C cycling.

Reference Datasets
In situ forest inventory plots were established following a stratified random sampling across four forest ecosystems (Montane and Western Rain forest, Dryland Forest, Coastal and Mangrove and the Plantation Forest) and three canopy density classes (dense, moderate and open) across Kenya [43,44]. The data was collected by the Kenya Forest Service (KFS) during a pilot forest inventory taken between 2014 and 2016. Species identification of trees, bamboos and lianas, as well as non-destructive measurements of diameter at breast height and total height were acquired within each plot [43]. Approximately 61% of the plots were in forest and woodlands, 22% in dryland forest and wooded grasslands and the remaining 17% in mangroves and other coastal forest. This dataset consisted of 266 circular plots 30 m diameter (0.07 ha). We used all these plots for training and validation of the AGB map. Plots were sometimes gathered in 4-plot clusters (0.28 ha per cluster) where plots were set in a straight line and separated by different distances (Figure 1). This sampling design allowed us to validate the AGB map at the cluster level as well (0.28 ha). Some plots stood alone or were in clusters containing fewer than 4 plots for a variety of reasons (e.g., no access due to private property). Only clusters with 4 plots (i.e., 40 clusters) were considered for validation.
Due to the small number of plots available, LiDAR footprints from the Geoscience Laser Altimeter System (GLAS) onboard the ICESat satellite acquired between 2004 and 2008 were also used for validation. The footprints sampled over forested areas were acquired from a spatially balanced sample Due to the small number of plots available, LiDAR footprints from the Geoscience Laser Altimeter System (GLAS) onboard the ICESat satellite acquired between 2004 and 2008 were also used for validation. The footprints sampled over forested areas were acquired from a spatially balanced sample at the national level [45,46]. GLAS data consist of full waveform LiDAR acquisitions, with each footprint having a diameter of approximately 65 m (0.33 ha) (Figure 1). The canopy height was estimated for all LiDAR footprints using the full waveform parameters and the method proposed by Lefsky [47], while aboveground woody biomass (AGB) was derived using the allometric model proposed for Africa by Saatchi et al. [22]. For Kenya, 40 footprints were available, which we could increase to 64 by including footprints in close proximity from Tanzania, Somalia and Uganda. Due to the temporal discrepancy between the GLAS LiDAR footprints (2004)(2005)(2006)(2007)(2008) and this study (2014 ± 1), we screened the footprint locations for disturbances in the dates between that time interval using the Hansen et al. [3] forest loss product. From the final selection of footprints, 5 were discarded as forest loss was detected. A dataset consisting of 500 polygons (60 m × 60 m) depicting stable classes according to the Kenya land cover map 2014 [48], i.e., stable forest (SF), stable other vegetation (SOV), stable other land (SOL) and stable water (SW), together with forest loss (Floss) during the 2014-2017 period, was generated by means of stratified random sampling and the use of time series of very high resolution (VHR) imagery available through Google Earth Pro. Each sample was visually interpreted and labelled. The dataset was then randomly divided into training and validation subsets (70% and 30%, respectively) and used to map these classes over the study period (2014-2017).

Landsat 8 Operational Land Imager (OLI)
Landsat 8 OLI surface reflectance Tier 1 imagery from the U.S. Geological Survey (USGS) acquired between 2014 and 2017 over Kenya was used to produce cloud-free annual 30 m resolution composites. Landsat 8 OLI data are atmospherically corrected using the Land Surface Reflectance Code (LaSRC) [49] by the USGS and includes a cloud, shadow, water and snow mask derived by the C Function of Mask (CFMASK) algorithm. We only retained bands 3 (Green; 0.533-0.590 μm), 4 (Red; 0.636-0.673 μm), 5 (Near Infrared, NIR; 0.851-0.879 μm), 6 (Shortwave Infrared 1, SWIR1; 1.566-1.651 Figure 1. Location of in situ forest inventory plots and the representative national sample of GLAS LiDAR footprints (left). Histograms of relative frequency of aboveground woody biomass (AGB) for both datasets, and the design of the in situ forest inventory plots within a given cluster (showing the 10m buffer around the plots), and a GLAS LiDAR footprint (right).
A dataset consisting of 500 polygons (60 m × 60 m) depicting stable classes according to the Kenya land cover map 2014 [48], i.e., stable forest (SF), stable other vegetation (SOV), stable other land (SOL) and stable water (SW), together with forest loss (F loss ) during the 2014-2017 period, was generated by means of stratified random sampling and the use of time series of very high resolution (VHR) imagery available through Google Earth Pro. Each sample was visually interpreted and labelled. The dataset was then randomly divided into training and validation subsets (70% and 30%, respectively) and used to map these classes over the study period (2014-2017).

Landsat 8 Operational Land Imager (OLI)
Landsat 8 OLI surface reflectance Tier 1 imagery from the U.S. Geological Survey (USGS) acquired between 2014 and 2017 over Kenya was used to produce cloud-free annual 30 m resolution composites. Landsat 8 OLI data are atmospherically corrected using the Land Surface Reflectance Code (LaSRC) [49] by the USGS and includes a cloud, shadow, water and snow mask derived by the C Function of Mask (CFMASK) algorithm. We only retained bands 3 (Green; 0.533-0.590 µm), 4 (Red; 0.636-0.673 µm), 5 (Near Infrared, NIR; 0.851-0.879 µm), 6 (Shortwave Infrared 1, SWIR1; 1.566-1.651 µm) and 7 (Shortwave Infrared 2, SWIR2; 2.107-2.294 µm). In addition, a set of spectral indices was included: normalised difference vegetation index (NDVI) [50], soil adjusted vegetation index (SAVI) [51], enhanced Remote Sens. 2020, 12, 2380 5 of 24 vegetation index (EVI) [52], normalised difference moisture index (NDMI) [53] and normalised burn ratio (NBR) [54]. For each annual composite image, we selected for each pixel the median value of all observations with low confidence cloud and low confidence cirrus over land, water and snow/ice. These annual composites are the basis for mapping stable classes and detecting forest loss in the 2014-2017 period. Additionally, a median value composite and vegetation indices were generated using the same procedure with data from 2014 ± 1 yr. Imagery from 3 years (i.e., 3340 scenes) was needed to generate a completely cloud-free composite for the whole of Kenya. This composite was used as input for generating the AGBC map.  [55] consist of dual-polarisation, pre-processed, incidence-angle corrected radar backscatter expressed as γ 0 in dB as follows: where DN is the pixel digital number and CF is the calibration factor (−83.0 dB). The pre-processing of these data products [55] involves calibration, multi-looking (16 looks), projection, orthorectification and slope correction using the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) data, along with a de-striping process [56]. We applied a multi-channel filter [57] on the annual mosaics (2007-2010, and 2015-2017) with a 5 × 5 window to reduce speckle. After performing the filtering, only the 2015 mosaic was used. The registration of PALSAR/PALSAR-2 mosaics showed more than 80 m displacement in some regions. This error comes in the processing of the mosaics, resulting sometimes in shifts and sometimes in deformation ( [58], personal communication to K&C team). Therefore, an additional fine co-registration of the 2015 PALSAR-2 mosaic was performed using a Sentinel-1 mosaic as reference. This was preferred over a SAR-to-optical approach because of the different viewing geometry of optical and SAR images, which can lead to inaccurate co-registration. In addition to both backscatter polarisations (γ 0 HH and γ 0 HV ), we calculated the cross-polarisation ratio (CpR = γ 0 HV /γ 0 HH ) and the Radar Forest Degradation Index (RFDI = γ 0 HH − γ 0 HV /γ 0 HH + γ 0 HV ) [59].

Land Cover Data
The Government of Kenya initiated a process for developing a system for the estimation of land-based emissions in Kenya [4], and among the products developed were the national land cover classification maps [48] based on IPCC guidelines [60]. The Kenya land cover map is a 10-class map with 30 m spatial resolution developed using Landsat imagery. The classes are dense forest (canopy cover > 65%), moderate forest (canopy cover 40-65%), open forest (canopy cover 15-40%), wooded grassland, open grassland, perennial cropland, annual cropland, mangroves and vegetated wetlands, open water and other land. The Kenya land cover product for the year 2014, in combination with the JRC Global Surface Water product [61], was used to constrain the AGBC retrievals to only woody vegetation classes: dense forest, moderate forest, open forest, wooded grassland and mangroves and vegetated wetland. As previously mentioned, the land cover map was also used to guide the stratified selection of reference observations of stable land cover classes in the 2014-2017 period.

Global Forest Change
The Global Forest Change (GFC) dataset v 1.6 [3] is a 30 m Landsat-based product that depicts tree canopy cover for the year 2000, annual forest cover loss, and total forest cover gain for the period of 2000-2018. The tree canopy cover 2000 and the loss/gain data from the 2000-2014 period were used to generate a 2014 tree canopy cover product for Kenya used as input to generate the AGBC map. Additionally, the validation subset was used to evaluate the accuracy of the GFC annual forest loss for the period 2014-2017 over Kenya in order to assess its value in constraining the CARDAMOM analysis to undisturbed woody classes only (see Section 3.4 for details).

Leaf Area Index
The Copernicus Global Land Service (CGLS) Collection 300 m Leaf Area Index (LAI) Version 1 dataset was used as the input for the CARDAMOM analysis. This dataset is based on the daily top-of-atmosphere reflectance measured in the blue, red and near infrared channels [62,63]. These were transformed into estimates of LAI using a neural network in an approach similar to Baret et al. [64] and Camacho et al. [65]. The daily estimates were smoothed, gap-filled and composited over 10-day periods starting in January 2014. The uncertainty of the composited product is reported as its root-mean-squared difference (RMSD) from daily estimates in the compositing period. The data were taken from the CGLS online portal (http://land.copernicus.eu/).

Burned Area and Soil Data
The Global Fire Emissions Database version 4 (GFEDv4) was used to provide time series information on the burned area at monthly time steps and 0.25 • × 0.25 • spatial resolution over Kenya [66]. This was used to impose observed fire disturbance on the CARDAMOM analyses. CARDAMOM also requires a prior estimate of the magnitude of the organic soil carbon stock, extracted from the SoilGrids (250 m × 250 m) product [67]. SoilGrids was chosen over other products due to a recent review of global soil maps against independent data indicating a greater degree of skill in the SoilGrids (see https://www.soil-journal.net/5/137/2019/for details). For details on the data aggregation, see Section 3.4.

Methods
The methodology adopted here has the following components: (i) in situ AGB estimation using pantropical allometries, (ii) AGBC map production, (iii) mapping stable classes and forest loss and (iv) carbon cycle analysis using a model-data fusion framework ( Figure 2). Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 24 for the period 2014-2017 over Kenya in order to assess its value in constraining the CARDAMOM analysis to undisturbed woody classes only (see Section 3.4 for details).

Leaf Area Index
The Copernicus Global Land Service (CGLS) Collection 300 m Leaf Area Index (LAI) Version 1 dataset was used as the input for the CARDAMOM analysis. This dataset is based on the daily topof-atmosphere reflectance measured in the blue, red and near infrared channels [62,63]. These were transformed into estimates of LAI using a neural network in an approach similar to Baret et al. [64] and Camacho et al. [65]. The daily estimates were smoothed, gap-filled and composited over 10-day periods starting in January 2014. The uncertainty of the composited product is reported as its rootmean-squared difference (RMSD) from daily estimates in the compositing period. The data were taken from the CGLS online portal (http://land.copernicus.eu/).

Burned Area and Soil Data
The Global Fire Emissions Database version 4 (GFEDv4) was used to provide time series information on the burned area at monthly time steps and 0.25° × 0.25° spatial resolution over Kenya [66]. This was used to impose observed fire disturbance on the CARDAMOM analyses. CARDAMOM also requires a prior estimate of the magnitude of the organic soil carbon stock, extracted from the SoilGrids (250 m × 250 m) product [67]. SoilGrids was chosen over other products due to a recent review of global soil maps against independent data indicating a greater degree of skill in the SoilGrids (see https://www.soil-journal.net/5/137/2019/for details). For details on the data aggregation, see Section 3.4.

Methods
The methodology adopted here has the following components: (i) in situ AGB estimation using pantropical allometries, (ii) AGBC map production, (iii) mapping stable classes and forest loss and (iv) carbon cycle analysis using a model-data fusion framework ( Figure 2).  We used a non-parametric machine learning random forest algorithm [68] to perform regression and supervised image classification. Random forests (RF) have been used extensively to retrieve biophysical parameters and land cover from remote sensing data [19,[69][70][71]. RF is built as an ensemble of binary decision trees, where each tree is fitted to a bootstrap sample of the training dataset (with replacement) and the final prediction is generated as the average of all the estimates (for regression) or by majority vote (for classification). The observations not selected for fitting a RF model (the out-of-bag (OOB) sample) are combined to give the overall estimate of error. A sensitivity analysis was carried out to find the combination of hyper-parameters generating the lowest OOB error.

AGB Estimation Using Allometric Models
The AGB for in situ plots was estimated using pantropical tree allometries [72] with diameter at breast height, tree height and wood specific gravity as input variables (Equation (A1)) for all the trees within each plot. The AGB of bamboos was estimated using Muchiri and Muga [73] allometries (Equations (A2) and (A3)), while the AGB of lianas was estimated using the volume of the equivalent cylinder and the mean wood density of 0.6 g m −3 [74]. For the LiDAR footprints, AGB was estimated using the Saatchi et al. [22] footprint allometric model for sub-Saharan Africa, which used GLAS-derived canopy height as predictor variable (Equation (A4)). The histograms of AGB values from both datasets were similar ( Figure 1), with an average AGB from the in situ plots and GLAS dataset of 90.7 t ha −1 and 97.6 t ha −1 respectively.

Biomass and Carbon Mapping
As the in situ plots were relatively small (30 m diameter) and geographical coordinates obtained with a GPS under a forest canopy can have substantial geolocation error (5-10 m) [43], the AGB field data may not correspond exactly to the overlapping pixel values at a given location. Therefore, we added a 10 m buffer around the plot boundaries to obtain a representative 50 m diameter circular area ( Figure 1). We then extracted the average value of all the EO pixels from the different predictors overlapping those areas to train the supervised regression. This had the aim of averaging out any geolocation errors in the plots.
The AGB map was generated by combining the in situ AGB measurements with L-band SAR ALOS-2 PALSAR-2 (γ 0 HV , γ 0 HH , RFDI and CpR), multispectral Landsat data (Green, Red, Near infrared, both Shortwave infrared bands, NDVI, NBR, NBR2, NDMI and SAVI) and a Landsat tree canopy cover product, and running RF within a stratified k-fold cross-validation framework. This framework was designed to make the best use of the limited data available for training, to validate the map and, at the same time, to estimate our prediction error at pixel level based on the standard deviation of the k predictions performed by the models calibrated with different groups of data (see Appendix A).
The Kenya land cover product [48] and JRC Global Surface Water product [61] were used to constrain the predictions to only forests, wooded grasslands and savannahs, mangroves and vegetated wetlands, so water, croplands, open grasslands and bare land classes were set to zero AGB. The AGB and SD maps in units of t ha−1 were then converted to AGBC units (i.e., t C ha −1 ) by using a 0.47 conversion ratio [75]. We also estimated the belowground biomass carbon (BGBC), i.e., the biomass carbon of live roots, using a root-shoot ratio of 0.24 [75].
For accuracy assessment of the AGBC map we quantified the difference between our map and the reference data using the mean bias difference (MBD), the mean absolute difference (MAD), the root mean square difference (RMSD) and the relative squared difference (RSD) (Appendix A). We also performed an accuracy assessment stratified by the AGBC range as seen in Rodríguez-Veiga et al. [14] to evaluate random errors and biases using the Coefficient of Variation of the bias (CV bias ) (Appendix A).

Forest Loss Mapping
The previously described 500 squares of 60 m × 60 m from the training dataset depicting stable classes (SF, SOV, SOL and SW; according to the Kenya land cover map-see Section 2.1) and forest loss Remote Sens. 2020, 12, 2380 8 of 24 were used to extract the average value from each Landsat 8 OLI band and derive spectral indices. A RF classifier was then used to perform supervised classification on a 70% random sample of that dataset (n = 350). This map was validated using the validation subset (n = 150). The validation results are reported using the testing subset to generate the corresponding confusion matrix, which was corrected using the mapped area of each class according to Olofsson et al. [76].
RF gives the proportion of votes in each class on a pixel-by-pixel basis, which can be viewed as an indicator of classification confidence. A likelihood scale can then be defined according to these values using a similar approach to that developed by IPCC (2010). The likelihood of an outcome could be defined as "unlikely" (probability < 0.33), "about as likely as not" (probability = 0.33-0.66) and "likely" (probability > 0.66). We applied these intervals to the predictions from the RF model and generated three levels of likelihood per class. We then used the "likely" (probability > 0.66) F loss class to update the forest cover area to focus only on undisturbed forest during the study period.

Carbon Cycle Analyses: CARbon DAta MOdel fraMework (CARDAMOM)
We used the CARDAMOM [38,40,41] model-data fusion framework to analyse the terrestrial carbon cycle in Kenya at 0.25 • spatial resolution and monthly time step. CARDAMOM combines the Data Assimilation Linked Ecosystem Carbon version 2 (DALEC2), which is an intermediate complexity model of the terrestrial carbon cycle (Bloom and Williams [39]), with biophysical observations and their uncertainties, plus meteorological information. Ecological and dynamic constraints (EDCs) are used in CARDAMOM to ensure sensible parameter searches within a Bayesian framework using an Adaptive Proposal-Markov Chain Monte Carlo (AP-MCMC). This process retrieves ensembles of location-specific model parameters, providing probabilistic estimates of fluxes and pools of C over the analysis period. The EDCs ensure that carbon cycle dynamics and trait retrieval are consistent with ecological theory (e.g., ensuring litter turnover is faster than that of soil organic matter), as described in Bloom and Williams [39]. DALEC simulates photosynthesis, respiration, the allocation of photosynthate to foliar, root and wood pools, their turnover and the subsequent decomposition in the litter and soil carbon pools.
CARDAMOM was used to estimate the carbon cycle of undisturbed Kenyan forests and wooded grasslands for 2014-2017 under meteorology extracted from ERA-Interim re-analysis climate data [77]. Fire was imposed using GFEDv4 (see Section 2.7) following the approach of Exbrayat et al. (2018). Time series of LAI from the Copernicus Land Monitoring Service [64] were assimilated into CARDAMOM to constrain canopy phenology and productivity, while the maps of AGBC (this study) and soil organic matter carbon (SoilGrids; see Section 2.7) constrained the initial status of the corresponding carbon pools.
The CARDAMOM analysis was conducted at 0.25 • × 0.25 • resolution as this was the resolution of the coarsest dataset used (GFEDv4). However, many of the other observational constraints have finer spatial resolutions. Using the map of undisturbed forest (this study) we resampled the AGBC estimate, LAI time series and soil organic matter information from their native resolution to 0.25 • following [78]. Each 0.25 • grid cell was analysed by CARDAMOM using the average status of the undisturbed forest within that pixel after removing contamination by other land cover types. Overall, a total of 761 grid cells were simulated to represent the mean undisturbed forest state. To ensure convergence, CARDAMOM's AP-MCMC procedure was repeated three times at each location for 100 million iterations. Convergence was then tested using the Gelman-Rubin convergence criterion. Grid cells which did not converge across the three repeated analyses were re-run. The results presented here were derived by subsampling 250 parameter sets from the second half of each chain. Pixel level uncertainties are estimated directly from the probability density function estimated from the ensemble of simulations from the parameter sub-sample.
Estimates of the following land-atmosphere fluxes are reported in Section 4: gross primary productivity (GPP), autotrophic respiration (Ra), net primary productivity (NPP = GPP − Ra), heterotrophic respiration (Rh), net ecosystem exchange (NEE = −NPP + Rh) and net biome exchange Remote Sens. 2020, 12, 2380 9 of 24 (NBE = NEE + Fire). For NEE and NBE, negative values indicate a net C uptake from the atmosphere. CARDAMOM also generates estimates of ecologically important information on the partitioning of photosynthate allocated to plant tissues and their mean residence times, as well as carbon cycling.

Biomass Carbon Stocks
The AGBC map for Kenya and its associated uncertainty map ( Figure 3) shows a concentration of AGBC in the western mountainous areas of the country and in the eastern coastal region.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 24 heterotrophic respiration (Rh), net ecosystem exchange (NEE = −NPP + Rh) and net biome exchange (NBE = NEE + Fire). For NEE and NBE, negative values indicate a net C uptake from the atmosphere. CARDAMOM also generates estimates of ecologically important information on the partitioning of photosynthate allocated to plant tissues and their mean residence times, as well as carbon cycling.

Biomass Carbon Stocks
The AGBC map for Kenya and its associated uncertainty map ( Figure 3) shows a concentration of AGBC in the western mountainous areas of the country and in the eastern coastal region. The accuracy of the AGBC map was assessed using in situ plots, plot clusters and GLAS LiDAR footprints ( Figure 4). The assessments at plot and cluster level are the result of the k-fold crossvalidation framework, whereas the GLAS LiDAR footprints give an additional independent validation. The accuracy assessment at plot level showed a coefficient of determination (R 2 ) of 0.47, RMSD of 39.0 t C h −1 , RSD of 52.9%, and MBD in the overall mean of 1.4 t C ha −1 . At cluster level, R 2 was 0.88, the RMSD was 18.0 t C ha −1 , the RSD was 13.5% and the MBD was 3.2 t C ha −1 . Validation with the GLAS footprint data gave similar results, with R 2 of 0.45, RMSD of 39.1 t C ha −1 and MBD of the mean of −4.3 t C ha −1 .
The error analysis indicated that random error, as measured by the CVbias, was dominant in the low and middle AGBC ranges predicted by the map for plots and clusters. In the case of plot and GLAS level assessments, bias dominated the error in ranges above 75 t C ha −1 for GLAS and 135 t C ha −1 for plots, which resulted in an underestimation of AGBC ( Figure 4 and Table A1). However, this effect is reduced when assessing the errors using larger units (i.e., clusters).
Using the AGBC and the uncertainty map, we computed an average AGBC of 36.69 ± 17.04 t C ha −1 for forests (open to dense forests and mangroves) and 6.01 ± 2.98 t C ha −1 for wooded grasslands. The highest AGBC densities estimated by the map in Kenya are in the dense forest areas, with values up to 250.50 t C ha −1 . The lowest AGBC values are found in the wooded grassland areas. Kenya is dominated by arid and semi-arid conditions, leading to a dominance of low AGBC densities (i.e., The accuracy of the AGBC map was assessed using in situ plots, plot clusters and GLAS LiDAR footprints (Figure 4). The assessments at plot and cluster level are the result of the k-fold cross-validation framework, whereas the GLAS LiDAR footprints give an additional independent validation. The accuracy assessment at plot level showed a coefficient of determination (R 2 ) of 0.47, RMSD of 39.0 t C h −1 , RSD of 52.9%, and MBD in the overall mean of 1.4 t C ha −1 . At cluster level, R 2 was 0.88, the RMSD was 18.0 t C ha −1 , the RSD was 13.5% and the MBD was 3.2 t C ha −1 . Validation with the GLAS footprint data gave similar results, with R 2 of 0.45, RMSD of 39.1 t C ha −1 and MBD of the mean of −4.3 t C ha −1 .
The error analysis indicated that random error, as measured by the CV bias , was dominant in the low and middle AGBC ranges predicted by the map for plots and clusters. In the case of plot and GLAS level assessments, bias dominated the error in ranges above 75 t C ha −1 for GLAS and 135 t C ha −1 for plots, which resulted in an underestimation of AGBC ( Figure 4 and Table A1). However, this effect is reduced when assessing the errors using larger units (i.e., clusters).
Using the AGBC and the uncertainty map, we computed an average AGBC of 36.69 ± 17.04 t C ha −1 for forests (open to dense forests and mangroves) and 6.01 ± 2.98 t C ha −1 for wooded grasslands. The highest AGBC densities estimated by the map in Kenya are in the dense forest areas, with values up to 250.50 t C ha −1 . The lowest AGBC values are found in the wooded grassland areas. Kenya is dominated by arid and semi-arid conditions, leading to a dominance of low AGBC densities (i.e., wooded grassland, Figure A1). However, the large area covered by wooded grassland means that we estimate~59% of Kenya's total AGBC stocks are found there.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 24 wooded grassland, Figure A1). However, the large area covered by wooded grassland means that we estimate ~59% of Kenya's total AGBC stocks are found there. The total AGBC stocks for the forests and wooded grasslands in Kenya was estimated to be 339.20 Mt C with a coefficient of variation at national level of <1%. Stratifying by cover type, the total AGBC stock in forests (open to dense forests and mangroves) was 140.70 Mt C, and in wooded grasslands 198.50 Mt C, representing 41% and 59% of the total stocks respectively. The total BGBC was estimated as 81.41 Mt C, resulting in a total biomass carbon stock of 420.61 Mt C.
The AGBC map derived by this study was compared in terms of accuracy and carbon stocks to previous global and pan-tropical studies from Avitabile et al. [24], Saatchi et al. [22], Baccini et al. [23] and Santoro et al. [79] over Kenya at 1 km spatial resolution ( Figure A2). These products correspond to different years between 2000 and 2010, while our map corresponds to 2014. We used the GLAS footprints and the clusters to assess the accuracy of these products. We made sure these were not affected by forest disturbances between 2000 and 2015, according to Hansen et al. [3]. This study showed higher accuracy and lower errors than any other product over Kenya (Table A2).
In terms of carbon stocks, Saatchi et al. [22] calculated almost twice the amount calculated by this study or by Baccini et al. [23] for Kenya, and almost three times the stock estimated by Avitabile et al. [24]. Dense, moderate and open forest areas showed similar AGBC values in all the studies, except that of Avitabile et al. [24] with lower AGBC values for all vegetation types ( Figure 5). The biggest disagreements between the maps occur for wooded grasslands and mangroves and vegetated wetlands ( Figure 5). Each of the AGBC maps indicates that wooded grassland has the largest total AGBC stocks, ranging from 48% to 76% of the total AGBC stock for the country ( Figure 5).  The total AGBC stocks for the forests and wooded grasslands in Kenya was estimated to be 339.20 Mt C with a coefficient of variation at national level of <1%. Stratifying by cover type, the total AGBC stock in forests (open to dense forests and mangroves) was 140.70 Mt C, and in wooded grasslands 198.50 Mt C, representing 41% and 59% of the total stocks respectively. The total BGBC was estimated as 81.41 Mt C, resulting in a total biomass carbon stock of 420.61 Mt C.
The AGBC map derived by this study was compared in terms of accuracy and carbon stocks to previous global and pan-tropical studies from Avitabile et al. [24], Saatchi et al. [22], Baccini et al. [23] and Santoro et al. [79] over Kenya at 1 km spatial resolution ( Figure A2). These products correspond to different years between 2000 and 2010, while our map corresponds to 2014. We used the GLAS footprints and the clusters to assess the accuracy of these products. We made sure these were not affected by forest disturbances between 2000 and 2015, according to Hansen et al. [3]. This study showed higher accuracy and lower errors than any other product over Kenya (Table A2).
In terms of carbon stocks, Saatchi et al. [22] calculated almost twice the amount calculated by this study or by Baccini et al. [23] for Kenya, and almost three times the stock estimated by Avitabile et al. [24]. Dense, moderate and open forest areas showed similar AGBC values in all the studies, except that of Avitabile et al. [24] with lower AGBC values for all vegetation types ( Figure 5). The biggest disagreements between the maps occur for wooded grasslands and mangroves and vegetated wetlands ( Figure 5). Each of the AGBC maps indicates that wooded grassland has the largest total AGBC stocks, ranging from 48% to 76% of the total AGBC stock for the country ( Figure 5).
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 24 wooded grassland, Figure A1). However, the large area covered by wooded grassland means that we estimate ~59% of Kenya's total AGBC stocks are found there. The total AGBC stocks for the forests and wooded grasslands in Kenya was estimated to be 339.20 Mt C with a coefficient of variation at national level of <1%. Stratifying by cover type, the total AGBC stock in forests (open to dense forests and mangroves) was 140.70 Mt C, and in wooded grasslands 198.50 Mt C, representing 41% and 59% of the total stocks respectively. The total BGBC was estimated as 81.41 Mt C, resulting in a total biomass carbon stock of 420.61 Mt C.
The AGBC map derived by this study was compared in terms of accuracy and carbon stocks to previous global and pan-tropical studies from Avitabile et al. [24], Saatchi et al. [22], Baccini et al. [23] and Santoro et al. [79] over Kenya at 1 km spatial resolution ( Figure A2). These products correspond to different years between 2000 and 2010, while our map corresponds to 2014. We used the GLAS footprints and the clusters to assess the accuracy of these products. We made sure these were not affected by forest disturbances between 2000 and 2015, according to Hansen et al. [3]. This study showed higher accuracy and lower errors than any other product over Kenya (Table A2).
In terms of carbon stocks, Saatchi et al. [22] calculated almost twice the amount calculated by this study or by Baccini et al. [23] for Kenya, and almost three times the stock estimated by Avitabile et al. [24]. Dense, moderate and open forest areas showed similar AGBC values in all the studies, except that of Avitabile et al. [24] with lower AGBC values for all vegetation types ( Figure 5). The biggest disagreements between the maps occur for wooded grasslands and mangroves and vegetated wetlands ( Figure 5). Each of the AGBC maps indicates that wooded grassland has the largest total AGBC stocks, ranging from 48% to 76% of the total AGBC stock for the country ( Figure 5).   [23] and Santoro et al. [79]. The comparison was made after aggregating all the products to 1 km spatial resolution.

Deforestation and Carbon Loss
The accuracy assessment of the "likely" forest loss class refers to the testing subset and is given as proportions, following Olofsson et al. [76]. Two products were validated using this subset: (i) the globally calibrated forest loss product for 2014-2017 from the GFC dataset [3]; (ii) the locally calibrated forest loss map generated in this study for the 2014-2017 period.
Our locally calibrated undisturbed land cover and forest loss product for the 2014-2017 period in Kenya ( Figure 6) showed an overall accuracy of 0.92 and κ of 0.83. The commission error for this forest loss product was estimated at 3%, while the omission error was 0%. For comparison, the globally calibrated GFC forest loss dataset for the 2014-2017 period in Kenya gave a commission error of 0%, and a much higher omission error of 99%, for an overall accuracy of 0.89 and κ of 0.01 (Tables A3  and A4).

Deforestation and Carbon Loss
The accuracy assessment of the "likely" forest loss class refers to the testing subset and is given as proportions, following Olofsson et al. [76]. Two products were validated using this subset: (i) the globally calibrated forest loss product for 2014-2017 from the GFC dataset [3]; (ii) the locally calibrated forest loss map generated in this study for the 2014-2017 period.
Our locally calibrated undisturbed land cover and forest loss product for the 2014-2017 period in Kenya ( Figure 6) showed an overall accuracy of 0.92 and κ of 0.83. The commission error for this forest loss product was estimated at 3%, while the omission error was 0%. For comparison, the globally calibrated GFC forest loss dataset for the 2014-2017 period in Kenya gave a commission error of 0%, and a much higher omission error of 99%, for an overall accuracy of 0.89 and κ of 0.01 (Tables  A3 and A4).
As our locally calibrated forest loss product gave much higher accuracy, it was used for the subsequent analysis. The total forest loss calculated using this dataset for the 2014-2017 period in Kenya was estimated as 30,802 ha (deforestation rate of 10,267 ha yr −1 ), of which 87% occurred in dense forests (Table 1).
These figures change if wooded grassland is included in the calculation. Then, the total loss would be 43,736 ha (14,579 ha yr −1 ), with 61% corresponding to dense forest and 30% to wooded grassland. The combination of the forest loss product with the AGBC map allows us to estimate the AGBC loss rates due to deforestation for the 2014-2017 period (Table 1). In this period, Kenyan forests lost approximately 2.09 Mt C, with an annual loss rate of approximately 0.70 Mt C yr −1 , while approximately 0.29 Mt C (0.10 Mt C yr −1 ) was lost from wooded grassland due to deforestation. The deforestation rate of wooded grasslands was half that of dense forest, but five times more AGBC was lost in the latter. The CV for the total AGBC losses for all forest types was below 1%. As our locally calibrated forest loss product gave much higher accuracy, it was used for the subsequent analysis. The total forest loss calculated using this dataset for the 2014-2017 period in Kenya was estimated as 30,802 ha (deforestation rate of 10,267 ha yr −1 ), of which 87% occurred in dense forests (Table 1). These figures change if wooded grassland is included in the calculation. Then, the total loss would be 43,736 ha (14,579 ha yr −1 ), with 61% corresponding to dense forest and 30% to wooded grassland.
The combination of the forest loss product with the AGBC map allows us to estimate the AGBC loss rates due to deforestation for the 2014-2017 period (Table 1). In this period, Kenyan forests lost approximately 2.09 Mt C, with an annual loss rate of approximately 0.70 Mt C yr −1 , while approximately 0.29 Mt C (0.10 Mt C yr −1 ) was lost from wooded grassland due to deforestation. The deforestation rate of wooded grasslands was half that of dense forest, but five times more AGBC was lost in the latter. The CV for the total AGBC losses for all forest types was below 1%.

Ecosystem Carbon Cycling Properties and Dynamics
CARDAMOM accurately simulated the time series of observed LAI and single estimates of AGBC and soil carbon used to constrain its initial conditions (R 2 > 0.95 and mean absolute error <10%; Figure A3). Consistent with the AGBC mapping, CARDAMOM estimated that ecosystem productivity (GPP) and growth (NPP) are~2.5 times greater per hectare for forest ecosystems than for wooded grasslands. However, the more extensive wooded grasslands are overall three times more productive than the forest area at the national scale ( Table 2). Carbon losses from both forests and wooded grasslands are dominated by respiration, with fire losses estimated to represent only~1% of the net biome exchange. Respiration has a similar magnitude from autotrophs and heterotrophs. Given the currently assimilated information, the analysis cannot confidently identify either forest or wooded grassland as a net source or sink of carbon for the analysis period ( Table 2). The mean wood stock estimates in 2014 for forests is more than 4 times larger than in wooded grassland. Similarly, the mean magnitude of NPP allocated to wood is~3 times larger in forests than wooded grassland, while the residence time of wood is~70% longer in forests than in wooded grassland (Table 3).

Discussion
In this study we have combined field inventory and EO information within a machine learning approach to create Kenya-wide improved estimates of AGBC (Figures 3-5), forest cover loss ( Figure 6, Table 1) and their uncertainties. These and other data have then been used to constrain a process-oriented Remote Sens. 2020, 12, 2380 13 of 24 model of the terrestrial carbon cycle to improve our understanding of the properties and carbon cycling of the Kenyan ecosystem (Tables 2 and 3).

Biomass Carbon Stocks
The total AGBC stock estimated for Kenya is 339.20 Mt C. The coefficient of variation for AGBC at national level was calculated to be <1%, which agrees with Saatchi et al. [22]. However, this result must be treated with caution, as the CV of a sum of measurements can be made arbitrarily small by adding more measurements/pixels (i.e., increasing the area of interest). This is because the precision of the estimated total AGBC improves as the number of pixels increases (see Equations (A8) and (A9)), but the accuracy of the estimate may increase or decrease as new measurements are added if there are systematic biases in the measurements. The potential error depends on how bias depends on true AGBC and the histogram of the true AGBC for the measurements being summed.
Previous EO-based pantropical and global maps correspond to different time-periods and used different reference datasets and methods. These studies have estimated Kenya's biomass carbon stocks to be 174-556 Mt C, a range encompassing the estimate from this study [21][22][23][24]. However, all the studies give an AGBC stock that is a lot smaller than the 694 Mt C reported in the Forest Resource Assessment (FRA) [75]. The discrepancies between this study and the abovementioned EO-orientated studies is due to the AGBC stocks estimated for wooded grasslands ( Figure 5). In the studies, the forest AGBC stock ranges from 88.82 Mt C to 139.21 Mt C, whereas the AGBC from wooded grasslands presents a wider range from 84.22 Mt C to 424.95 Mt C. The lowest estimate for wooded grassland comes from Avitabile et al. [24]; this is not surprising as this used the GLC2000 map [80] as a forest mask, which excluded a large proportion of wooded grassland and other vegetation types from their estimation, as seen in Figure 5. These large differences at national level, in the order of hundreds Mt C, cannot be explained by inter-annual AGBC variation. Nevertheless, all studies suggest that the largest pool of AGBC in Kenya is in wooded grasslands.
However, due to their low tree density wooded grasslands are not considered forest under the Kenyan forest definition and are therefore not included within the framework of national and international mechanisms aiming to protect carbon stocks, such as REDD+. This is an indication that the role of wooded grasslands in the C cycle is not fully understood and that this role is undervalued, as recently stressed by McNicol et al. [31], who for the first time identified: (i) degradation as the main driver of C loss in the highly dynamic southern African savannahs and woodlands; and (ii) the importance played by extensive regrowth, making this region a significant sink of carbon. Even mechanisms such as the Bonn Challenge, which aims to reforest 150 Mha by 2030, include some wooded grasslands and savannahs as targets for its afforestation agenda [81], implying that these ecosystems need to be converted to forest, thereby underestimating not only their climate mitigation capacity in their own right but also their ecological importance. Furthermore, fluxes from degradation and regrowth, including recent reforestation efforts made by the government of Kenya to increase the forest cover of the country to 10% of the national territory by 2022 [2], are unaccounted for.

Deforestation and Carbon Loss
We compared the accuracy of the locally calibrated forest loss product ( Figure 6) generated in this study with a globally calibrated forest loss product (GFC) [3] for Kenya. The GFC product does not perform well for Kenyan forests and has a large omission error, indicating that it does not detect most of the deforestation occurring in the country (Table A4). This has implications when calculating the overall carbon budget for Kenya. As Kenya's carbon stocks are relatively low, a miscalculation of the deforestation might change the estimate from a net C source to net C sink and vice versa. Likewise, Milodowski et al. [29] found that the GFC provides robust estimates of forest loss for large-scale deforestation areas, but has serious problems detecting small-scale disturbances (<2 ha).
The estimated deforestation rates in Kenya between 1990 and 2015, as reported by the Food and Agriculture Organization (FAO), are approximately 12,400 ha yr −1 [1], which roughly agrees with the average deforestation rate of 10,459 ha yr −1 (14,265 ha yr −1 if wooded grassland is included) reported in this study for the 2014-2017 period. Based on these deforestation rates we estimated that 2.38 Mt C was lost due to deforestation in that period (annual rate of 794,972 t C yr −1 ).
It should be noted that the deforestation rates reported in this study are only based on forest cover loss, as satellites do not see if regrowth is allowed or there is land cover conversion after forest removal. Additionally, our products do not differentiate harvesting occurring in forest plantations or concessions from illegal logging of protected forests.

Ecosystem Carbon Cycling Properties and Dynamics
The CARDAMOM analysis emphasises the importance of carbon cycling in wooded grassland ecosystems for Kenya's overall biosphere carbon balance (Tables 2 and 3). Wooded grasslands cover approximately 62% of the country while forest only covers approximately 7%, so the national carbon fluxes both into and out of wooded grasslands are much larger ( Table 2). Despite their limited tree cover, our analysis indicates that at national scale more carbon is fixed into wood each year in the wooded grasslands than in forests (Table 3). Thus, while forest ecosystems may be more productive per unit area, wooded grasslands must be considered in Kenya's carbon management strategy. The role of fire within both ecosystems was estimated to be <1% of the net biome exchange of carbon (Table 2). However, this estimate should be treated with caution given the widely reported difficulty in detecting small fires from space, which has led to significant efforts to improve small fire detection in recently released products, e.g., [28].
Given the currently available observational datasets, the remaining uncertainties in net carbon exchange make it impossible to identify either forests or wooded grasslands as net sources or sinks of carbon (Table 2). Large uncertainty in the net carbon balance is expected given the lack of repeat measurements of woody biomass [38] or additional information that constrains the woody residence time and/or net carbon exchange. Future CARDAMOM studies will be able to take advantage of new EO-based estimates of biomass to provide additional constraints, such as the annual assessment of woody biomass and its changes, e.g., ESA Biomass mission e.g., ESA Biomass mission [82]. In the near-term, subsequent studies should investigate the usefulness of assimilating net biome exchange estimates derived from atmospheric inversion analysis [83,84].
The CARDAMOM analysis focuses only on forest and wooded grasslands that were undisturbed between 2014 and 2017 and therefore does not account for degradation, reforestation or afforestation activity which occurred prior to the analysis period or degradation during this period. Additionally, the analysis of vegetation dynamics does not represent the complex dynamics in this ecosystem due to large herbivores, pastoral practices and small-scale fires not detected in the GFEDv4 burned area product.

Conclusions
The assessment of forest carbon fluxes at national level requires precise and detailed carbon stock estimations. However, the limited availability of in situ data together with the lack of accurate forest carbon maps with fine enough spatial resolution for Kenya have until recently made this particularly challenging.
The AGBC map and the forest loss products generated by this study are the most accurate and detailed published for Kenya to date, giving estimates for every 30 m pixel along with an estimate of the associated pixel-level uncertainty. They complement other Landsat-based national forest land cover methods, and therefore facilitate national and international reporting on carbon dynamics. The AGBC map quantifies carbon stocks not only for forests but also for wooded grasslands, which are often neglected in this type of analysis. The AGBC and forest loss maps allowed us to produce the most complete analysis of carbon stocks and fluxes for the forests and wooded grasslands in Kenya. This study can therefore be considered a benchmark for studying carbon fluxes at national level.
Our results emphasise the importance of other vegetated ecosystems (i.e., wooded grasslands) besides forests for carbon sequestration, and stresses the importance of considering those in Kenya's carbon management strategy. Wooded grasslands represent 59% of the AGBC stocks in Kenya due to the large area of this ecosystem type. However, unlike forests, international mechanisms for climate mitigation, such as REDD+, do not recognise the importance of wooded grasslands by providing economic incentives for their preservation and management. Moving from a concept of a discontinuous landscape described by categorical land cover classes to a continuous landscape paradigm describing landscapes through biophysical parameters such as percentage canopy cover will provide a way forward for such international mechanisms.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A.1. AGB Estimation Using Allometric Models
The allometric model from Chave et al. [72] was used to estimate tree AGB: where D is diameter at breast height in cm, H is tree height in m and ρ is wood specific gravity in g cm −3 .
The allometries from Muchiri and Muga [73] were used for AGB of bamboos: AGB = 1.04 + 0.06 × D × 1.11 + 0.36 × D 2 for bamboos with D > 3cm (A2) The volume of the equivalent cylinder and the mean wood density of 0.6 g m −3 [74] was used to estimate the AGB of lianas.
In the case of GLAS LiDAR footprints, AGB was estimated using a footprint level allometric model developed by Saatchi et al. [22] for sub-Saharan Africa including woodland savanna: where H L is the basal area weighted canopy height (i.e., Lorey's height).

Appendix A.2. K-Fold Cross Validation and AGB Error Mapping
The original sample of in situ data was stratified into 3 AGB ranges (high, medium, low) and each range (stratum) was randomly partitioned into k equally sized subsamples (k = 10). Of the k subsamples, a single subsample is retained from each stratum as the validation data for testing the RF model, and the remaining k − 1 subsamples are used as training data. The cross-validation process was then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. Therefore, all observations were used for both training and validation, and each observation was used for validation exactly once. As a result of this framework, the k AGB maps were generated, from which the mean value of all estimates for each pixel was used as the final AGB retrieval (ÂGB): where AGB i is the prediction of each of the k AGB estimates. Accordingly, the standard deviation (SD) of all k AGB estimates is defined as: The total SD (ε AGB ) for our final retrieval (ÂGB) is then propagated as explained in Rodriguez-Veiga et al. [15], where: where ε measurement is the SD from the measurement of tree level parameters averaged at plot scale, ε allometry is the SD from the use of allometric models and ε sampling is the SD originating from the variability of AGB within the pixel. The ε prediction accounts for errors that arise if the sampling sites are not truly representative of the of the distribution of AGB in the region [22]. In order to estimate these parameters, we assumed the Coefficient of Variation of the measurement (CV measurement = ε measurement /ÂGB) was 10% [85], CV allometry = ε allometry /ÂGB was 11% [86], and CV sampling = ε sampling /ÂGB was 31%, based on Réjou-Méchain et al. [87].
difference (MBD), the root mean square difference (RMSD) and the relative squared difference (RSD) defined as follows: where n is the number of plots or footprints, p i is the AGBC estimated by the map, a i is the AGBC estimated on the plot or footprint and a is the sample average of the latter. The accuracy assessment was also carried out by stratifying the reference data into AGBC ranges following the design shown in Rodríguez-Veiga et al. [14]. The selected ranges varied by reference dataset, depending on the maximum biomass observed for the dataset and the need to have a sufficient number of reference data within each range. The CV of the bias (CV bias ) for a given AGBC range was estimated as follows: where SD and MDB are the standard deviation and the mean bias difference. When the CV bias exceeds 1, the RMSD is dominated by random error, but when it is less than 1 the dominant error source is bias in the estimator. An accuracy assessment was also carried out for the previous AGBC maps of Avitabile et al. [24], Saatchi et al. [22], Baccini et al. [23], and Santoro et al. [79] over Kenya. For comparability purposes, all the maps were aggregated to 1 km spatial resolution (including the map developed in this study). The maps were then validated using GLAS footprints and the NFI clusters.

Appendix B. Additional Results
Remote Sens. 2020, 12, x FOR PEER REVIEW 17 of 24 where n is the number of plots or footprints, is the AGBC estimated by the map, is the AGBC estimated on the plot or footprint and is the sample average of the latter.
The accuracy assessment was also carried out by stratifying the reference data into AGBC ranges following the design shown in Rodríguez-Veiga et al. [14]. The selected ranges varied by reference dataset, depending on the maximum biomass observed for the dataset and the need to have a sufficient number of reference data within each range. The CV of the bias (CVbias) for a given AGBC range was estimated as follows: where SD and MDB are the standard deviation and the mean bias difference. When the CVbias exceeds 1, the RMSD is dominated by random error, but when it is less than 1 the dominant error source is bias in the estimator. An accuracy assessment was also carried out for the previous AGBC maps of Avitabile et al. [24], Saatchi et al. [22], Baccini et al. [23], and Santoro et al. [79] over Kenya. For comparability purposes, all the maps were aggregated to 1 km spatial resolution (including the map developed in this study). The maps were then validated using GLAS footprints and the NFI clusters. Figure A1. Histograms of AGBC map predictions for broad vegetation cover type: (a) forest, (b) mangroves and vegetated wetlands, and (c) wooded grassland. Note the different scale of the y axis for wooded grasslands. Table A1. Summary accuracy assessment stratified by reference AGBC range: Mean Bias Difference (MBD), Root Mean Square Difference (RMSD), Standard Deviation (SD), and Coefficient of Variation of the bias (CVbias) (when CVbias > 1 the random error dominates, when CVbias < 1 the bias does). Units are in t C ha −1 , except for CVbias which is a ratio.    Figure A2. AGBC maps derived from (a) this study, (b) Avitabile et al. [24], (c) Saatchi et al. [22], (d) Baccini et al. [23], and (e) Santoro et al. [79]. All the maps were aggregated to 1 km spatial resolution and masked using the land cover map for Kenya to constrain prediction to only forests, wooded grasslands, and mangroves & vegetated wetlands.  Figure A2. AGBC maps derived from (a) this study, (b) Avitabile et al. [24], (c) Saatchi et al. [22], (d) Baccini et al. [23], and (e) Santoro et al. [79]. All the maps were aggregated to 1 km spatial resolution and masked using the land cover map for Kenya to constrain prediction to only forests, wooded grasslands, and mangroves & vegetated wetlands. Stable Water (SW) 0.00% Forest Loss (Floss) 0.00% Table A4. Summary of accuracy assessment of the forest loss products derived by GFC and this study.

Forest Loss Product Commission Error (%) Omission Error (%)
GFC forest loss 0 99 This study 3 0 Figure A3. Comparison of the assimilated information and CARDAMOM estimates of mean leaf area, AGBC and soil organic matter for both stable forest and wooded grassland (savannah). Leaf area index information is drawn from the ESA Copernicus product, while the prior on the initial soil C status is drawn from the SoilGrids database. The constraint on initial woody biomass is drawn from this study. Warmer colours indicate higher point density. The red solid line corresponds to the y = x line. Figure A3. Comparison of the assimilated information and CARDAMOM estimates of mean leaf area, AGBC and soil organic matter for both stable forest and wooded grassland (savannah). Leaf area index information is drawn from the ESA Copernicus product, while the prior on the initial soil C status is drawn from the SoilGrids database. The constraint on initial woody biomass is drawn from this study.
Warmer colours indicate higher point density. The red solid line corresponds to the y = x line.