A Cluster of CO2 Change Characteristics with GOSAT Observations for Viewing the Spatial Pattern of CO2 Emission and Absorption

Satellite observations can be used to detect changes in CO2 concentration at global and regional scales. With the column-averaged CO2 dry-air mole fraction (XCO2) data derived from satellite observations, the issue is how to extract and assess these changes, which are related to anthropogenic emissions and biosphere absorption. We propose a k-means cluster analysis to extract the temporally changing features of XCO2 in Central-Eastern Asia using data from 2009 to 2013 obtained by the Greenhouse Gases Observing Satellite (GOSAT), and assess the effects of anthropogenic emissions and biosphere absorption on CO2 changes in combination with emission data and vegetation net primary production (NPP) data. As a result, 14 clusters, which represent 14 types of XCO2 seasonal changing patterns, are obtained in the study area using the optimal clustering parameters. These clusters are generally in agreement with the spatial pattern of underlying anthropogenic emissions and vegetation absorption. According to correlation analysis with emissions and NPP, these 14 clusters are divided into three groups: strong emission, strong absorption, and a tendency toward balance between emission and absorption. The clustering approach proposed in this study provides a potential way to better understand how the seasonal changes of CO2 concentration depend on underlying anthropogenic emissions and vegetation absorption.


Introduction
The global carbon cycle has been changed by human activities since the beginning of the industrial era [1]. Anthropogenic emissions of CO2, especially those from burning fossil fuels, are considered to be a major cause of the continual increase of atmospheric carbon dioxide (CO2) concentrations [2,3]. This is the leading driving factor of climate change and global warming [4]. It is well known that CO2 is a long-lived greenhouse gas, and the gradients generated by local fluxes are relatively small compared with background concentrations [5]. To better understand the carbon budget and combat climate change, it is extremely important to know where CO2 is released into the atmosphere and from where it is removed [6,7]. Therefore, effective approaches for obtaining high-quality observations of atmospheric CO2 concentrations are essential. For a long time, ground-based observations were the only reliable way of obtaining stable, highly accurate data on CO2 concentrations in the atmosphere, and they have helped us understand the global and latitudinal variations of atmospheric CO2 concentration [8,9]. However, the sparseness of current ground-based measurement stations [7,9,10] has limited our knowledge of the global carbon cycle [11].
With the development of atmospheric remote sensing technology, satellite observation, with its high spatial and temporal resolution, has become one of the effective approaches for monitoring changes of greenhouse gases at regional and global scales [11–14]. Space-based remote sensing observations are expected to complement, rather than replace, ground-based measurements [15] for detecting how CO2 concentration changes in space and time and where CO2 is emitted and absorbed [16]. In the past few years, particularly since GOSAT was launched, satellite observations have contributed a large amount of data that helps detect the changing characteristics of atmospheric CO2 concentrations at global and regional scales.
GOSAT was designed to observe spectra from space in the short-wave infrared band (SWIR) and thermal infrared band (TIR), where CO2 absorption lines are located [17,18]. The CO2 concentrations, expressed as column-averaged volume mixing ratios of CO2 (XCO2) [19], are derived from GOSAT observed spectra and auxiliary parameters [11], and are sensitive to the atmospheric boundary layer [10,20]. These retrievals have been compared and validated with ground-based measurements [21–23] and model simulations [24,25] in many studies. Satellite observations covering the globe can help us better understand the spatio-temporal changes of atmospheric CO2 concentrations and thereby identify carbon sources and sinks.
Variations of XCO2 depend on terrestrial biosphere fluxes, anthropogenic emissions, and external fluxes transported by atmospheric wind fields [26]. Accordingly, the spatial distributions of anthropogenic emissions and biosphere fluxes can be characterized based on the XCO2 variations observed by satellite. At present, there are two main approaches, model-driven and data-driven, for detecting sources and sinks. The model-driven approach applies inverse modeling, incorporating an atmospheric transport model to derive surface CO2 fluxes for sources and sinks from satellite observations [27–29]. Based on prior fluxes and meteorological data, this method uses grid cells to quantify CO2 sources, sinks, and transport exchanges, and helps us understand the mechanism of CO2 changes. However, because of the uncertainty and low spatial resolution of the initial data used as model inputs and prior fluxes, and the implicit relationships between emissions/absorption and concentrations, our understanding of CO2 fluxes is not exact in space and time. The data-driven approach detects sources and sinks directly, using multi-source data including satellite observations to detect changes in CO2 concentration. This direct analysis of the change characteristics of CO2 concentration using satellite observations, based on proper regional divisions, has been shown to be an effective way to study CO2 changes induced by emission and absorption. For example, Kort et al. [30] explored the enhancement of atmospheric CO2 concentrations over Los Angeles using GOSAT data. Moreover, Keppel-Aleks et al. [31] used GOSAT data to compare the differences in CO2 concentrations between emission and upwind regions to analyze fossil fuel emissions. The data-driven approach is intuitive and may reveal the real changes, since satellite observations provide direct and instantaneous measurements of CO2 concentrations.
Clustering is an effective analysis tool for extracting valuable information from data by grouping large datasets according to their similarity [32]. This approach has been widely applied in global climate change analysis [33–35] but has not yet been applied to the study of CO2 concentrations. In this paper, we propose a clustering approach for satellite CO2 observation data based on the temporally changing characteristics of CO2 concentrations. The objective of this approach is to study the spatial patterns from the clustering that may indicate CO2 emission and absorption. Auxiliary datasets, including bottom-up emission datasets, net primary production (NPP) datasets, and land cover datasets, are used to evaluate the performance of the clustering results.

Used Data
We chose Central-Eastern Asia (from 18°N to 55°N in latitude, and from 70°E to 140°E in longitude) as the study area. The area covers Eastern China and Northern India, where CO2 emissions are rapidly increasing [3] on account of high population densities, rapid economic development, and significant energy consumption [36,37].
(1) Gap-filled XCO2 Dataset
GOSAT XCO2 data are irregularly distributed and have many gaps in space and time because of the limitations of the GOSAT observation mode, cloud blocking, and data screening. To view the space-time continuous changes of CO2 concentrations, we applied a spatio-temporal kriging interpolation method [38–40] to fill the gaps and generated a mapping dataset in 1° × 1° grid cells at a ten-day interval (from 18°N to 55°N in latitude, from 70°E to 140°E in longitude, and from 1 June 2009 to 15 May 2013 in time). We generated two datasets from this gap-filled data. One is the monthly averaged XCO2 dataset (M-XCO2), containing 1415 grid cells with 47 month-averaged XCO2 values for each grid cell. The other is the seasonally averaged XCO2 dataset (S-XCO2), containing 1561 grid cells with 15 season-averaged XCO2 values for each grid cell. Both datasets have been filtered based on the integrity of the XCO2 time series.
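The temporal averaging described above can be illustrated with a minimal sketch. It assumes each grid cell carries a list of (date, value) samples at ten-day spacing, with `None` marking a gap, and it assumes a simple integrity rule (every expected month must have at least one valid value). The function names, the toy values, and the integrity rule are illustrative assumptions, not the filtering actually used for M-XCO2 and S-XCO2.

```python
from collections import defaultdict
from datetime import date

def monthly_means(series):
    """Average (date, value) samples by calendar month; None values are skipped."""
    buckets = defaultdict(list)
    for day, value in series:
        if value is not None:
            buckets[(day.year, day.month)].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}

def is_complete(means, expected_months):
    """Assumed integrity rule: every expected month must have a monthly mean."""
    return all(month in means for month in expected_months)

# Toy ten-day series for one grid cell (ppm values are made up for illustration).
series = [
    (date(2010, 1, 1), 388.0), (date(2010, 1, 11), 389.0), (date(2010, 1, 21), 390.0),
    (date(2010, 2, 1), 391.0), (date(2010, 2, 11), None), (date(2010, 2, 21), 393.0),
]
means = monthly_means(series)
keep_cell = is_complete(means, [(2010, 1), (2010, 2)])
```

Seasonal means could be obtained the same way by mapping each month to a season key before bucketing.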
The GOSAT XCO2 retrievals used here are the Atmospheric CO2 Observations from Space (ACOS) XCO2 retrieval products (v3.3) from June 2009 to May 2013 (http://mirador.gsfc.nasa.gov). The ACOS XCO2 dataset was produced by the Orbiting Carbon Observatory (OCO) team of the US National Aeronautics and Space Administration (NASA) using a full physics algorithm for retrieving data from the calibrated spectra measurements (Level 1B) of GOSAT's onboard Thermal And Near-infrared Sensor for carbon Observation Fourier Transform Spectrometer (TANSO-FTS). To ensure high reliability, only land high-gain data were used, after the screening and systematic bias correction described in the ACOS Level 2 Standard Product Data User's Guide, v3.3 [41].
(2) Auxiliary Data
In order to evaluate the performance of the clustering results and assess the impact of different underlying surfaces on the variations of CO2 concentrations, we collected bottom-up anthropogenic CO2 emissions, net primary productivity (NPP) data, and land cover data.
The 0.1° × 0.1° gridded annual estimates of CO2 emissions from EDGAR 4.2 FT2010 (http://edgar.jrc.ec.europa.eu/) were collected [42]. The EDGAR 4.2 FT2010 database was jointly developed by the Joint Research Center (JRC) and the Netherlands Environmental Assessment Agency (PBL). It was generated by applying the emission factors and the calculation method from the 2006 IPCC Guidelines to international statistics on energy production and consumption, industrial manufacturing, agricultural production, waste treatment and disposal, and burning of biomass [43].
Land cover types in the study area vary. The spatial distributions of land cover in the study area were obtained from MODIS Land Cover Type data (MCD12C1) using the MODIS-derived LAI/FPAR scheme (https://lpdaac.usgs.gov). For statistical analysis, we derived the averaged percentage of each land cover type in 1° × 1° grid cells from the original data.
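Bringing a finer auxiliary grid onto the 1° × 1° XCO2 grid amounts to block averaging. The sketch below is a simplified illustration: it assumes the fine grid is a rectangular list of rows aligned with the coarse grid, and it uses a plain arithmetic mean that ignores the change of cell area with latitude. The function name `block_mean` and the toy grid are hypothetical.

```python
def block_mean(grid, factor):
    """Average a 2-D grid over non-overlapping factor x factor blocks.

    For a 0.1-degree grid aggregated to 1 degree, factor would be 10.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            # Collect every fine cell that falls inside this coarse cell.
            cells = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
            ]
            coarse_row.append(sum(cells) / len(cells))
        coarse.append(coarse_row)
    return coarse

# Toy 4 x 4 grid aggregated with factor 2 (values are made up).
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
coarse = block_mean(fine, 2)
```

Applied to a 0/1 indicator grid for one land cover type, the same average gives the fraction of that type in each coarse cell.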

Clustering Approach Based on Multi-Temporal XCO2 Data
To make use of the temporally changing characteristics of XCO2, we used a robust clustering method to classify XCO2 data based on the similarity of the temporally changing patterns of CO2 concentrations. K-means is an iterative algorithm used to partition a given dataset into k clusters, where k is a user-specified number [46,47]. The clustering result should meet the conditions that the intra-cluster similarity is high, while the inter-cluster similarity is low. The clustering process was implemented in the following steps:
(1) Combine multi-temporal XCO2 into a characteristic vector x for each spatial grid cell. The count of the grid cells is n, and "Z-score" measures are used to standardize the data. The "Z-score" is the distance of a data point from the mean in terms of the standard deviation [48];
(2) Randomly select k objects from the n grid cells as the initial cluster centroids. The characteristic vector of the centroid of cluster Ci is expressed as mi;
(3) Assign or reassign each grid cell to the cluster to which it is most similar, based on the Euclidean distance criterion (Equation (1)) between the grid cell and the cluster centroid;
(4) Update the cluster centroids by averaging the characteristic vectors of the grid cells in each cluster;
(5) Repeat Steps (3) and (4) until the sum of square error (SSE) criterion (Equation (2)) converges, that is, until the cluster centroids no longer change.

$$d(x, m_i) = \sqrt{\sum_{j=1}^{p} (x_j - m_{i,j})^2} \qquad (1)$$

$$SSE = \sum_{i=1}^{k} \sum_{x \in C_i} d(x, m_i)^2 \qquad (2)$$

Here p is the length of the characteristic vector. Because the results of k-means clustering can be influenced by the initial cluster centroids, and the SSE (Equation (2)) may fall into a local optimum, the clustering process was repeated ten times, and the run with the lowest SSE was chosen to obtain stable clustering results.
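The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the study: the z-score is assumed to be applied to each grid cell's own time series, empty clusters simply keep their previous centroid, and the function names, the fixed random seed, and the toy series are all hypothetical.

```python
import math
import random

def zscore(vector):
    """Standardize one characteristic vector to zero mean and unit variance."""
    mean = sum(vector) / len(vector)
    std = math.sqrt(sum((v - mean) ** 2 for v in vector) / len(vector))
    return [(v - mean) / std if std > 0 else 0.0 for v in vector]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kmeans_once(data, k, rng, max_iter=100):
    """One k-means run; returns (labels, centroids, sse)."""
    centroids = [list(c) for c in rng.sample(data, k)]  # Step 2: random initial centroids
    labels = None
    for _ in range(max_iter):
        # Step 3: assign every grid cell to its nearest centroid.
        new_labels = [min(range(k), key=lambda i: sq_dist(x, centroids[i])) for x in data]
        if new_labels == labels:  # Step 5: stop once assignments are stable
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its members.
        for i in range(k):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(sq_dist(x, centroids[lab]) for x, lab in zip(data, labels))
    return labels, centroids, sse

def kmeans_best_of(data, k, restarts=10, seed=0):
    """Repeat k-means with different random starts and keep the lowest-SSE run."""
    rng = random.Random(seed)
    runs = [kmeans_once(data, k, rng) for _ in range(restarts)]
    return min(runs, key=lambda run: run[2])

# Toy example: six grid cells with two distinct temporal shapes (values are made up).
raw = [
    [388, 392, 386, 390], [389, 393, 387, 391], [390, 394, 388, 392],
    [392, 388, 390, 386], [393, 389, 391, 387], [394, 390, 392, 388],
]
data = [zscore(v) for v in raw]
labels, centroids, sse = kmeans_best_of(data, k=2)
```

In the toy data the first three cells share one temporal shape and the last three another, so after standardization they separate into two clusters regardless of their different mean levels.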
To test the effects of different time intervals of the mapping XCO2 dataset, we applied the clustering process described above to both M-XCO2 and S-XCO2. Grid cells with similar XCO2 temporal patterns were merged to generate the clustering results for different k (from 3 to 40); the month-interval clustering results are referred to as M-Clusters and the season-interval clustering results are referred to as S-Clusters.
With the increase in cluster number k from 3 to 40, the averages, standard deviations, and non-negative ratios of silhouette values are calculated as measures of how appropriately the data have been clustered. Silhouette values are used for validating the clustering results [49,50]. The silhouette is a measure of how similar each grid cell is to the grid cells in its own cluster compared with the grid cells in the neighboring clusters. It is given by

$$s(x) = \frac{b(x) - a(x)}{\max\{a(x), b(x)\}} \qquad (3)$$

where a(x) is the average distance from x to all other grid cells in the same cluster, and b(x) is the minimum average distance from x to the grid cells in any other cluster. s(x) ranges from −1 to +1. If s(x) is close to −1, the clustering of x is incorrect and x should belong to its neighboring cluster. In contrast, if s(x) is close to 1, the current cluster is suitable for x.
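A minimal sketch of the three silhouette summaries follows, assuming Euclidean distance, at least two clusters, a population standard deviation, and the common convention that a grid cell alone in its cluster gets s(x) = 0. The function names and the toy data are illustrative only.

```python
import math

def euclid(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def silhouette_values(data, labels):
    """Silhouette s(x) for every point; requires at least two clusters."""
    clusters = sorted(set(labels))
    values = []
    for i, x in enumerate(data):
        # Mean distance from x to the members of each cluster (excluding x itself).
        mean_dist = {}
        for c in clusters:
            dists = [euclid(x, y) for j, y in enumerate(data) if labels[j] == c and j != i]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: s(x) = 0 by convention
            values.append(0.0)
            continue
        a = mean_dist[own]
        b = min(d for c, d in mean_dist.items() if c != own)
        values.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return values

def summarize(values):
    """Return (average, standard deviation, non-negative ratio) of silhouette values."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    ratio = sum(1 for v in values if v >= 0) / n
    return mean, sd, ratio

# Toy example: two well-separated one-dimensional clusters.
points = [[0.0], [1.0], [10.0], [11.0]]
point_labels = [0, 0, 1, 1]
sil = silhouette_values(points, point_labels)
sil_mean, sil_sd, sil_ratio = summarize(sil)
```

Running this for each candidate k and comparing the three summaries mirrors the evaluation described in the text.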

Optimal Number of Clusters
To validate the clustering results and choose the best clustering parameters, numerical evaluation indicators, including the average silhouette values and the non-negative ratios of the silhouette values, were used. Figure 1 shows the variations of the averages and ratios of s(x) as the number of clusters k ranges from 3 to 40. The error bars represent one standard deviation of the silhouette values. The plots show that the ratio of incorrectly clustered grid cells remains low and that the clustering results are valid for both M-Clusters and S-Clusters. Specifically, the averages remain at approximately 0.4; the standard deviations are approximately the same; and all the ratios are above 95%. Based on Figure 1a,b, we found that S-Clusters are slightly better than M-Clusters. In addition, auxiliary data on emissions and absorption were used to test the effectiveness of the clustering approach. The average and the amplitude were considered the two most typical characteristics of temporal XCO2 variations. Before clustering, we calculated four correlation coefficients based on the gridded data introduced in Section 2: Ravg_e between the XCO2 average and emissions, Ravg_n between the XCO2 average and NPP, Ramp_e between the XCO2 amplitude and emissions, and Ramp_n between the XCO2 amplitude and NPP. Ravg_e, Ravg_n, Ramp_e, and Ramp_n were 0.32, 0.26, 0.13, and 0.52, respectively. These results show that it is difficult to directly obtain and quantify the spatial distribution of CO2 emissions using XCO2 data at the grid scale. Moreover, NPP, which is representative of the absorption capability, showed a relatively strong influence on the XCO2 temporal fluctuation.
After clustering, Ravg_e and Ramp_n were recalculated at the cluster-region scale based on the clustering results. They are illustrated in Figure 2. As shown in the figure, these correlation coefficients remain at a relatively high level when k ranges from 3 to 40, although there is a slight decrease with increasing k when a linear or logarithmic model is used. These results indicate strong positive correlations at the cluster scale between XCO2 averages and CO2 emissions, and between XCO2 amplitudes and NPP. Compared with the analysis results before clustering, these results reveal the distribution information of both anthropogenic emissions and NPP.
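The difference between the grid-scale and cluster-scale correlations can be illustrated with a short sketch: values are first averaged within each cluster, and the Pearson coefficient is then computed on the cluster means. The function names and all numbers below are made up for illustration and are not taken from the study.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_means(values, labels):
    """Mean of the grid-cell values inside each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {label: sum(vs) / len(vs) for label, vs in sorted(groups.items())}

# Hypothetical grid-cell data: mean XCO2 (ppm), emissions, and cluster labels.
xco2_avg = [390.0, 391.0, 389.0, 388.5, 388.0, 389.5]
emission = [5.0, 1.0, 0.5, 2.0, 0.1, 0.3]
cell_labels = [0, 0, 1, 1, 2, 2]

r_grid = pearson(xco2_avg, emission)  # correlation at the grid scale

xco2_by_cluster = cluster_means(xco2_avg, cell_labels)
emis_by_cluster = cluster_means(emission, cell_labels)
keys = sorted(xco2_by_cluster)
r_cluster = pearson([xco2_by_cluster[c] for c in keys],
                    [emis_by_cluster[c] for c in keys])  # cluster-region scale
```

Averaging within clusters suppresses cell-level noise, which is one plausible reason the cluster-scale coefficients are higher than the grid-scale ones.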
In general, these results show that the satellite-based observations are correlated with regional CO2 emissions and NPP at the cluster-region scale, and that the regional division by temporal clustering plays a role in exploring these relationships. The regional XCO2 averages can be regarded as a potential indicator for quantifying the diversity of CO2 emission intensity at the cluster-region scale, while regional XCO2 amplitudes can be regarded as a potential indicator for analyzing NPP.
In addition, p-values were calculated in the correlation analyses and applied in significance tests. Here, the p-value is used for testing the null hypothesis of no correlation against the alternative that there is a nonzero correlation. As shown in Figure 2, where k is greater than 10, the p-values are less than 0.05, and the correlation analysis results can be considered reliable. Based on the above results and analyses, k can be appropriately chosen from the range of 10 to 40; the average silhouette value and the non-negative ratio peak, and the standard deviation reaches its minimum, when k is equal to 14 in S-Clusters. Therefore, k = 14 in S-Clusters (average = 0.417, SD = 0.195, and ratio = 98.014%) is regarded as the optimal clustering scheme, and is described and analyzed in more detail below.

XCO2 and Anthropogenic Emissions
Based on the optimal clustering result shown in Figure 3, an overlay analysis was carried out with the CO2 concentrations and the emission data. Figure 4a,b show annual XCO2 and CO2 emissions in 2010, respectively, overlaid with polylines of the 14 clusters. Figure 4b is the average of CO2 emissions in 1° × 1° grid cells derived from the original emission data; the emission units are converted to kg CO2/m2. In addition, Figure 5 depicts the correlation between XCO2 and CO2 emissions. The results indicate a positive correlation between emissions and XCO2 at the cluster-region scale. Cluster 3, an outlier in Figure 5, has abnormally high XCO2 but low emissions.

XCO2 and NPP
The seasonal changes of CO2 concentrations are mainly caused by vegetation absorption. The amplitude of seasonal variations in a year depends on vegetation coverage and growth activities in the northern parts of the northern hemisphere [51–53]. Figure 6a presents the seasonal amplitude of XCO2 in 2010, which is calculated as the difference between the maximum and minimum of the monthly averaged XCO2 of each grid cell. Figure 6b shows the annual NPP in 2010 for comparison. It is evident that the spatial distribution of XCO2 amplitudes is generally in agreement with the spatial trend of NPP. Clusters 1, 4, 7, and 14, with large amplitudes of XCO2, correspond to high NPP, while Clusters 3, 5, and 11, with small amplitudes of XCO2, correspond to low NPP. This indicates the impact of NPP, a measure of vegetation activity, on the seasonal amplitude of XCO2.
The correlation coefficients between the seasonal amplitudes of XCO2 and NPP for each cluster are shown in Figure 7. The correlations between the XCO2 amplitude and NPP are mostly positive; moreover, the larger the NPP is, the higher the correlation coefficient is, except for Clusters 1 and 2, which are the maximum emission areas (shown in Figure 4b), and Cluster 9. The regression analysis between the cluster-averaged NPP and XCO2 amplitude is shown in Figure 8. The coefficient of determination between them is 0.53 for all clusters, which is similar to the coefficient of determination (r² = 0.52) between XCO2 and the emissions (shown in Figure 5). The results imply that the variations of XCO2 can be explained to a similar degree by the underlying anthropogenic emissions and by vegetation absorption in the whole study area.
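The two quantities used in this analysis, the seasonal amplitude and the coefficient of determination of a linear regression, can be sketched as follows. The ordinary least-squares fit is an assumption about the regression model behind Figure 8, and the function names and toy numbers are illustrative.

```python
def seasonal_amplitude(monthly_values):
    """Seasonal amplitude: maximum minus minimum of the monthly means."""
    return max(monthly_values) - min(monthly_values)

def linear_fit(x, y):
    """Ordinary least squares y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Toy check with made-up numbers.
amplitude = seasonal_amplitude([386.0, 390.5, 388.0])
slope, intercept, r_squared = linear_fit([0, 1, 2], [1, 3, 5])
```

In the setting of the text, x would hold the cluster-averaged NPP and y the cluster XCO2 amplitude, one pair per cluster.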

Attribution of XCO2 Clusters
Figure 9 shows the variation of the monthly mean XCO2 in each cluster. Each cluster demonstrates a different seasonal changing pattern of XCO2. Clusters 7, 8, 12, and 14, located in the northern part of the study area, present two high XCO2 peaks, which may be caused by vegetation growth activity and temperature variations. On the other hand, Clusters 3, 5, and 11 show smaller seasonal amplitudes, which can be attributed to the sparse vegetation in these regions. Table 1 lists typical statistics of XCO2 and auxiliary data in each cluster, including XCO2 variations, anthropogenic emissions, and NPP. The contrast of XCO2 in Table 1 is calculated by subtracting the average XCO2 of the study area (389.10 ppm) from the cluster-averaged XCO2, to indicate the difference of each cluster from the regional background XCO2 value. The XCO2 amplitude is calculated as the maximum monthly cluster-averaged XCO2 minus the corresponding minimum in each cluster. The strength of emissions is the emissions in the target cluster divided by the maximum emission value of all clusters (Cluster 1). The strength of NPP is calculated in the same way.
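As a worked illustration of the Table 1 definitions, the sketch below computes the contrast and the relative strength. The regional mean of 389.10 ppm and the cluster means of 390.6 ppm and 390.3 ppm are quoted in the text; the emission values passed to `relative_strength` are hypothetical placeholders, and both function names are illustrative.

```python
def contrast(cluster_mean, regional_mean):
    """Contrast: cluster-averaged XCO2 minus the study-area average (ppm)."""
    return cluster_mean - regional_mean

def relative_strength(values_by_cluster):
    """Scale each cluster's value by the maximum over all clusters."""
    peak = max(values_by_cluster.values())
    return {cluster: value / peak for cluster, value in values_by_cluster.items()}

REGIONAL_MEAN = 389.10  # study-area average XCO2 in ppm, as given in the text

contrast_c1 = round(contrast(390.6, REGIONAL_MEAN), 1)
contrast_c2 = round(contrast(390.3, REGIONAL_MEAN), 1)

# Hypothetical emission totals for three clusters, in arbitrary units.
strength = relative_strength({1: 4.0, 2: 3.0, 7: 1.0})
```

The two contrasts reproduce the 1.5 ppm and 1.2 ppm reported for Clusters 1 and 2, and the cluster with the largest value always receives a strength of 1.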
The fraction of land cover is the fraction of low vegetation (L), forest (F), and non-vegetated areas (N) in each cluster, derived from the land cover data. The correlation coefficients (p-value < 0.05) of the XCO2 amplitude versus the NPP of all grid cells within a cluster are also listed in this table. The correlation between XCO2 and emissions of all grid cells within a cluster is very small and not significant, and is therefore not listed in Table 1.
* The land cover types "grasses/cereal", "shrubs", "broadleaf crops" and "savanna" were grouped into type "L"; "evergreen broadleaf forest", "deciduous broadleaf forest", "evergreen needleleaf forest" and "deciduous needleleaf forest" were grouped into type "F"; "unvegetated" and "urban" areas were grouped into type "N".
In this section, the anthropogenic emissions and vegetation absorption described in Sections 3.1 and 3.2, and the seasonal changing patterns of XCO2 of each cluster shown in Figure 9, are combined to analyze the attribute characteristics of the 14 clusters listed in Table 1. As a result, the 14 clusters can be divided into three flux types, as outlined below.
(1) Strong Emission Type: Clusters 1 and 2
Clusters 1 and 2, as a group, strongly tend to represent source regions, as they show the highest XCO2 (390.6 ppm and 390.3 ppm on average, as shown in Table 1) and the largest positive contrast (1.5 ppm and 1.2 ppm). The bottom-up inventory of carbon emissions (EDGAR) shows that these two cluster regions contain many large fossil-fuel power plants and dense populations, which correspond to large emissions (Table 1). The strong anthropogenic emissions can disturb the temporal variations of XCO2. Consequently, the XCO2 amplitude and NPP are not significantly correlated, although these clusters show a large XCO2 amplitude (9.3 ppm and 7.8 ppm), which is likely due to the vegetation coverage of more than 90% in these regions.
These results indicate that both Clusters 1 and 2 can be attributed to sources with strong emissions. The two clusters show different temporal variations of XCO2 (Figure 9), even though both belong to the strong emission type. Cluster 1 is located in the major grain-producing areas of China, mixed with many residential areas. Intensive farming and crop growth have resulted in a significant seasonal amplitude of XCO2 (9.3 ppm in Table 1) due to crop CO2 absorption. The temporal variations show that XCO2 increased sharply from August to October over this cluster compared with Cluster 2, which is likely due to the crops being harvested and large amounts of straw being burned during this period. Cluster 2, located in paddy fields mixed with fragmented forests and residential areas, shows an XCO2 amplitude of 7.8 ppm, which is slightly less than that of Cluster 1. Their different temporally changing patterns of XCO2 indicate the effects of different anthropogenic emitting activities on variations of XCO2.
(2) Strong Absorption Type: Clusters 7, 12, 13, and 14
Clusters 7, 12, 13, and 14, as a group, tend to be strong sinks, according to the negative contrasts, strong correlations between XCO2 amplitude and NPP, high fractions of vegetation, and low anthropogenic emissions shown in Table 1. XCO2 values over these cluster regions are lower than the overall average; the contrasts are −0.1 ppm, −0.6 ppm, −0.8 ppm, and −1.0 ppm, respectively (Table 1). These clusters are located north of 45°N and are covered with dense vegetation and forest. Accordingly, the correlations between XCO2 amplitudes and NPP are larger than 0.55 over these clusters, which demonstrates the effects of vegetation absorption on XCO2 variations. With strong vegetation absorption in addition to low anthropogenic emissions over these clusters (Figure 4), they tend to be sinks of atmospheric CO2. These clusters show different temporally changing patterns of XCO2 (Figure 9), which reflect the effects of vegetation absorption. Among these clusters, Clusters 7 and 14 show the highest amplitudes of XCO2 seasonal variation (11.3 ppm and 8.7 ppm, respectively, shown in Table 1), which is likely induced by the highest coverage of mixed broadleaf and needleleaf forest. Clusters 12 and 13 show strong correlations between XCO2 amplitude and NPP (0.61 and 0.70, respectively), although they have less forest coverage and a high fraction of grasslands. Moreover, as shown in Figure 9, the values of XCO2 over these clusters are almost unchanged from October to January, or present a low peak around December, which is likely due to the low temperature in the high-latitude region.
(3) Tending to Balance Type: Clusters 4, 5, 6, 8, 9, 10 and 11
Clusters 4, 5, 6, 8, 9, 10 and 11 are defined as a "tending to balance" type. All of their mean XCO2 values are close to the average level of the whole study area. In addition, Clusters 5, 6, 8, 9, 10 and 11 do not show significant correlations between XCO2 amplitude and NPP, and the anthropogenic emissions over these regions are small. Cluster 4 shows slightly higher XCO2 than the average level, with a 0.4 ppm contrast, mainly due to its large anthropogenic emissions. However, the correlation between the XCO2 amplitude and NPP is 0.71, which implies that the CO2 enhancement from anthropogenic emissions may be in equilibrium with vegetation absorption. Cluster 11, located on the Qinghai-Tibet Plateau, shows a lower XCO2 amplitude than the other regions, which is probably owing to its low NPP and high altitude.
Cluster 3 cannot be clearly assigned to any of these types because of its abnormally high XCO2 despite low anthropogenic emissions (Figure 4). Cluster 3, located in a desert area in China, shows high XCO2 values, which are likely owing to uncertainties of the satellite XCO2 retrievals over bright desert surfaces [9]; further verification is needed in this region.

Conclusions
In this paper, a k-means cluster analysis method based on the temporally changing features of XCO2 was proposed and applied to the gap-filled ACOS XCO2 dataset to view the spatial pattern of CO2 emission and absorption in Central-Eastern Asia. Fourteen clusters were obtained by optimizing the clustering results, and were evaluated using the characteristics of XCO2 variations combined with emission data, NPP data, and land cover data. The clustering result demonstrates that each cluster can be related to the typical features of anthropogenic emissions and vegetation absorption.
The relationships between the seasonal variations of XCO2 and the underlying anthropogenic emissions and vegetation absorption were analyzed separately. Cluster-averaged XCO2 tends to correlate with regional emissions, while the seasonal amplitude of XCO2 is highly related to vegetation NPP. Strong anthropogenic emissions may disturb the relationship between the seasonal amplitude of XCO2 and NPP. Consequently, the 14 clusters can be divided into three types: strong emission, strong absorption, and a "tending to balance" type. Different clusters, corresponding to different temporally changing patterns of XCO2, indicate that XCO2 values increase with anthropogenic activities and that XCO2 reduction is caused by vegetation absorption on a regional scale.
This study shows that the developed cluster-analysis approach based on the temporal variation of XCO2 can reveal the spatial patterns of underlying anthropogenic emissions and vegetation absorption, and can therefore enable us to better understand how the seasonally changing pattern of CO2 concentrations is affected by anthropogenic emissions and vegetation absorption. The clustering result can provide significant monitoring targets for anthropogenic emissions and vegetation absorption to support the implementation of regional carbon emission reductions.

Figure 1. Averaged silhouette values and non-negative ratios of silhouette values based on clustering results of: (a) M-Cluster; (b) S-Cluster.

Figure 2. Correlation analysis: (a) Ravg_e between the XCO2 average and CO2 emissions based on the clustering results; (b) Ramp_n between the XCO2 amplitude and NPP based on the clustering results.

Figure 3 presents the 14 clusters obtained using the clustering method described in Section 2.2, and the mean XCO2 in each cluster. As shown in the figure, the XCO2 values of Clusters 1 and 2 in Central and Eastern China are the highest, followed by those of Cluster 3 in Xinjiang province in Western China. The XCO2 values of Clusters 12, 13, and 14, located around Inner Mongolia, Mongolia, and Northeastern Kazakhstan, are the lowest.

Figure 3. Clustering result based on the S-XCO2 dataset with 14 clusters and the corresponding cluster-averaged XCO2 data from 2010 to 2012.
Figure 6b presents the annual mean NPP in 2010 of each grid cell, which can indicate the ability of vegetation to absorb CO2. It is the average of CO2 fluxes in 1° × 1° grid cells derived from the original NPP data.

Figure 6. (a) Seasonal amplitude of XCO2; (b) grid map of annual NPP in 2010 overlapped with the optimal cluster result (black lines).

Figure 9. Variation of monthly averaged XCO2 in each cluster (number of clusters k = 14). Blue, green, and red lines correspond to average, minimum, and maximum XCO2 values in each cluster, respectively, derived from the gap-filled XCO2 dataset. Error bars represent one standard deviation of XCO2, and the grey scatter points correspond to original observations from the ACOS XCO2 dataset.

Table 1. Attribute characteristics of each cluster from the 2010 data.