Geostatistical Analysis of CH4 Columns over Monsoon Asia Using Five Years of GOSAT Observations

The aim of this study is to evaluate the Greenhouse gases Observation SATellite (GOSAT) column-averaged CH4 dry air mole fraction (XCH4) data by using geostatistical analysis and conducting comparisons with model simulations and surface emissions. Firstly, we propose the use of a data-driven mapping approach based on spatio-temporal geostatistics to generate a regular and gridded mapping dataset of XCH4 over Monsoon Asia using five years of XCH4 retrievals by GOSAT from June 2009 to May 2014. The prediction accuracy of the mapping approach is assessed by using cross-validation, which results in a significantly high correlation of 0.91 and a small mean absolute prediction error of 8.77 ppb between the observed dataset and the prediction dataset. Secondly, with the mapping data, we investigate the spatial and temporal variations of XCH4 over Monsoon Asia and compare the results with previous studies on ground and other satellite observations. Thirdly, we compare the mapping XCH4 with model simulations from CarbonTracker-CH4 and find their spatial patterns very consistent, but GOSAT observations are more able to capture the local variability of XCH4. Finally, by correlating the mapping data with surface emission inventory, we find the geographical distribution of high CH4 values correspond well with strong emissions as indicated in the inventory map. Over the five-year period, the two datasets show a significant high correlation coefficient (0.80), indicating the dominant role of surface emissions in determining the distribution of XCH4 concentration in this region and suggesting a promising statistical way of constraining surface CH4 sources and sinks, which is simple and easy to implement using satellite observations over a long term period.


Introduction
Atmospheric methane (CH 4 ) is the second most important anthropogenic greenhouse gas after carbon dioxide (CO 2 ). Under the driving influence of human activities, such as fossil fuel combustion, global averaged CH 4 concentration has increased from pre-industrial levels of 722 ppb to 1833 ppb in 2014 [1] and has kept an average growth rate of about 6 ppb per year since 2007 [2]. CH 4 has a stronger global warming potential than carbon dioxide; therefore, monitoring CH 4 might be a more effective way to mitigate the short-term greenhouse effects [3].
Current ground-based observation station networks have provided long-term greenhouse gas observations with high precision; however, the stations are still sparse and inhomogeneous [4]. Satellite Previous studies based on satellite and ground-based observations [7,[23][24][25] have showed that Monsoon Asia is one of the highest CH4 emission regions in the world, due to its strong anthropogenic emissions, such as those from rice paddies, and natural emissions, such as those from natural wetlands. The spatial and temporal variations of CH4 over this region have been investigated in several previous studies. Xiong et al. [26] investigated the CH4 enhancement in the middle to upper troposphere in South Asia from July to September by comparing satellite CH4 retrievals from the Atmospheric Infrared Sounder (AIRS) on the Earth Observing System (EOS)/Aqua platform with the model simulations from global tracer transport model version 3 (TM3), and concluded that the enhancement is contributed by both local surface emissions and convective transport processes. This study is complemented with the work by Qin et al. [27], which analyzed the spatio-temporal variations of XCH4 over the Sichuan Basin using GOSAT data, and attributed the consistent high XCH4 values in the basin to the strong regional paddy field emissions and typical closed topography of the basin. Moreover, Ishizawa et al. [28] analyzed the large XCH4 anomaly in summer 2013 over Northeast Asia, as observed by GOSAT and ground-based instruments, and attributed the anomaly to the transport of CH4-rich air to Japan from strong CH4 source areas in East China. A more comprehensive study by Hayashida et al. [25] conducted detailed analysis on the characteristics of XCH4 data in Monsoon Asia using SCIAMACHY retrievals, and concluded that the distribution of high XCH4 values agreed well with regional emissions over this region. Though insights on CH4 distribution and variation were gained from these studies, they were only focusing on some specific regions and using simple data binning and averaging methods to generate monthly or seasonal data for their analysis. The binning and averaging method ignores the XCH4 variation within the binning blocks and lacks quantitative uncertainty assessment [15]. In addition, the GOSAT XCH4 data, which have higher precision than SCIAMACHY data, over the whole Monsoon Asia region, has not been studied and assessed, mainly due to the limitation of the data coverage. In this study, we aim to evaluate five years of GOSAT XCH4 retrievals over Monsoon Asia by first proposing the use of spatiotemporal geostatistics for mapping the XCH4 data over the Monsoon Asia region using five years of GOSAT data. This spatio-temporal geostatistical approach has the advantage of utilizing the joint spatial and temporal dependences between observations and including a larger dataset both in space and time to support stable parameter estimation and prediction [11,29]. With the mapping XCH4 data, we further analyze the spatio-temporal variations of XCH4, and compare the results with model simulations and surface CH4 emissions from bottom-up estimation.
In Section 2, we describe all the used data, and in Section 3, we introduce the spatio-temporal geostatistics and evaluation method of mapping precision. Sections 4 and 5 demonstrate results and discussions from analysis of the spatio-temporal XCH4 variations, comparison with model simulations and correlation with surface emissions. The summary is presented in Section 6. Previous studies based on satellite and ground-based observations [7,[23][24][25] have showed that Monsoon Asia is one of the highest CH 4 emission regions in the world, due to its strong anthropogenic emissions, such as those from rice paddies, and natural emissions, such as those from natural wetlands. The spatial and temporal variations of CH 4 over this region have been investigated in several previous studies. Xiong et al. [26] investigated the CH 4 enhancement in the middle to upper troposphere in South Asia from July to September by comparing satellite CH 4 retrievals from the Atmospheric Infrared Sounder (AIRS) on the Earth Observing System (EOS)/Aqua platform with the model simulations from global tracer transport model version 3 (TM3), and concluded that the enhancement is contributed by both local surface emissions and convective transport processes. This study is complemented with the work by Qin et al. [27], which analyzed the spatio-temporal variations of XCH 4 over the Sichuan Basin using GOSAT data, and attributed the consistent high XCH 4 values in the basin to the strong regional paddy field emissions and typical closed topography of the basin. Moreover, Ishizawa et al. [28] analyzed the large XCH 4 anomaly in summer 2013 over Northeast Asia, as observed by GOSAT and ground-based instruments, and attributed the anomaly to the transport of CH 4 -rich air to Japan from strong CH 4 source areas in East China. A more comprehensive study by Hayashida et al. [25] conducted detailed analysis on the characteristics of XCH 4 data in Monsoon Asia using SCIAMACHY retrievals, and concluded that the distribution of high XCH 4 values agreed well with regional emissions over this region. Though insights on CH 4 distribution and variation were gained from these studies, they were only focusing on some specific regions and using simple data binning and averaging methods to generate monthly or seasonal data for their analysis. The binning and averaging method ignores the XCH 4 variation within the binning blocks and lacks quantitative uncertainty assessment [15]. In addition, the GOSAT XCH 4 data, which have higher precision than SCIAMACHY data, over the whole Monsoon Asia region, has not been studied and assessed, mainly due to the limitation of the data coverage. In this study, we aim to evaluate five years of GOSAT XCH 4 retrievals over Monsoon Asia by first proposing the use of spatio-temporal geostatistics for mapping the XCH 4 data over the Monsoon Asia region using five years of GOSAT data. This spatio-temporal geostatistical approach has the advantage of utilizing the joint spatial and temporal dependences between observations and including a larger dataset both in space and time to support stable parameter estimation and prediction [11,29]. With the mapping XCH 4 data, we further analyze the spatio-temporal variations of XCH 4 , and compare the results with model simulations and surface CH 4 emissions from bottom-up estimation.
In Section 2, we describe all the used data, and in Section 3, we introduce the spatio-temporal geostatistics and evaluation method of mapping precision. Sections 4 and 5 demonstrate results and discussions from analysis of the spatio-temporal XCH 4 variations, comparison with model simulations and correlation with surface emissions. The summary is presented in Section 6.

GOSAT XCH 4 Retrievals
We collected five years of GOSAT Level 2 XCH 4 data products (version 02.xx) for General Users [8], ranging from June 2009 to May 2014 over the land area of the Monsoon Asia region, as shown in Figure 1a. Validated by the Total Carbon Column Observing Network (TCCON), the GOSAT XCH 4 data show a mean global bias of´5.9 ppb and standard deviation of 12.6 ppb [30]. GOSAT has a recurrent cycle of three days, and, therefore, in this study we set the basic time unit of the mapping data to be three days. The instantaneous field of view of TANSO-FTS is 15.8 mrad, corresponding to a nadir footprint diameter of 10.5 km approximately. Figure 1 shows the spatial distribution and temporal variation of the available data number from satellite observations. The total number of retrievals is irregularly distributed both in space and time. From Figure 1a, it can be seen that the number of retrievals is larger over northern China and India than the western China and high latitude areas, mainly due to the limitations of clear sky conditions or large solar zenith angle. From Figure 1b, we can see the number of retrievals is larger in winter than in summer when Monsoon brings much more cloudiness in this region.

CH 4 Simulations from CarbonTracker
CarbonTracker-CH 4 is an assimilating system for atmospheric CH 4 developed by the National Oceanic and Atmospheric Administration (NOAA). Coupled with the atmospheric tracer transport model version 5 (TM5), CarbonTracker-CH 4 assimilates global atmosphere CH 4 observations from surface air samples and tall towers to track global CH 4 sources and sinks [31,32]. The current release of CarbonTracker-CH 4 , CT2010, produces estimates of global atmospheric CH 4 mole fractions and surface-atmosphere fluxes from January 2000 to December 2010. The model produces regularly gridded CH 4 at a spatial resolution of 4˝in latitude by 6˝in longitude with 34 vertical levels, and with a temporal resolution of one day. In comparison with surface observations, the model outputs show a mean bias of´10.4 ppb [32]. The CH 4 mole fractions profile data from the model are transformed to XCH 4 by using the pressure-averaged method described by Connor et al. [33]. When comparing XCH 4 data from GOSAT and CT2010, the difference due to averaging kernel effect [34] is not considered here, since the difference between applying and not applying the averaging kernel smoothing on model simulations is very small, which is less than 0.1% in XCO 2 as shown in Cogan et al. [35].

Emission Inventory from EDGAR
CH 4 Emission inventory data are derived by bottom-up estimates using CH 4 emissions statistics from human activities and natural processes, which corresponds to anthropogenic sources and natural sources, respectively. In this study, we collected annual CH 4 emissions data in 2010 from Emission Database for Global Atmospheric Research (EDGAR) in version 4.2 FT2010, released by the European Commission's Joint Research Centre and Netherlands Environmental Assessment Agency [36]. The inventory data have a spatial resolution of 0.1˝. The EDGAR CH 4 emission data have been used as a fundamental reference for many surface emission related studies [37].

Methodology
We applied the spatio-temporal geostatistics to the five years of GOSAT XCH 4 retrievals to produce the XCH 4 mapping data product in spatial resolution of 1˝by 1˝and temporal resolution of three days. A theoretical framework of the spatio-temporal geostatistical mapping approach for XCO 2 can be found in Zeng et al. [19]. Here, we introduce the essential procedures of implementing spatio-temporal geostatistics and describe some adjustments necessary for XCH 4 . Spatio-temporal geostatistical methods are concerned with analyzing the spatial-temporal correlation structure of regional variables using the variogram, and optimal prediction (kriging) of the variable at an unsampled location and time [11]. The implementation of spatio-temporal geostatistics in this study can be divided into the following three main steps, analysis of XCH 4 trend in space and time, modeling of correlation structure, and space-time kriging prediction.

Analysis of XCH 4 Trend in Space and Time
The spatio-temporal variations of XCH 4 data can be represented by a variable Z " tZ ps, tq|s P S, t P Tu that varies within a spatial domain S and time interval T. We decompose the spatio-temporal variations into an inherent and deterministic trend component m ps, tq and a stochastic residual component R ps, tq, as shown in Equation (1), Z ps, tq " m ps, tq`R ps, tq . ( The deterministic spatial trend is determined by the spatial distribution of CH 4 sources and sinks, while the deterministic temporal trend, which consists of the inherent seasonal CH 4 cycle and an annual CH 4 increase, is determined by temporal variation of CH 4 sources and sinks [2]. As in Zeng et al. [19], we used a linear surface function for modeling the spatial trend, and a set of annual harmonic functions with a simple linear function for the temporal trend, as shown in Equation (2), m ps, tq " ra 0`a1˚s p1q`a 2˚s p2q s`ra 3˚t`ÿ 4 i"1 pβ i sin piωtq`γ i cos piωtqqs, where ω " 2˚π 12 since the period is 12 months, s p1q and s p2q are latitude and longitude of the spatial location, t is time in months, and a 0´3 , β 1´4 and γ 1´4 are parameters to be estimated using the least-squares technique. The modeled trend components in space and time are shown in Figure 2. An increasing spatial trend from north to south can be clearly seen, which is determined by the distribution of surface emissions [25]. On the other hand, the temporal trend shows a clear annual increase and a seasonal cycle. The trend component is then subtracted from the full dataset to produce the residual component for the following analysis. 5 unsampled location and time [11]. The implementation of spatio-temporal geostatistics in this study can be divided into the following three main steps, analysis of XCH4 trend in space and time, modeling of correlation structure, and space-time kriging prediction.
The deterministic spatial trend is determined by the spatial distribution of CH4 sources and sinks, while the deterministic temporal trend, which consists of the inherent seasonal CH4 cycle and an annual CH4 increase, is determined by temporal variation of CH4 sources and sinks [2]. As in Zeng et al. [19], we used a linear surface function for modeling the spatial trend, and a set of annual harmonic functions with a simple linear function for the temporal trend, as shown in Equation (2), where ω = * π since the period is 12 months, (1) and (2) are latitude and longitude of the spatial location, t is time in months, and a , β and γ are parameters to be estimated using the leastsquares technique. The modeled trend components in space and time are shown in Figure 2. An increasing spatial trend from north to south can be clearly seen, which is determined by the distribution of surface emissions [25]. On the other hand, the temporal trend shows a clear annual increase and a seasonal cycle. The trend component is then subtracted from the full dataset to produce the residual component for the following analysis.

Modeling Correlation Structure of XCH4
The spatio-temporal correlation structure of the XCH4 data describes the correlation between any two points separated in space and time. The correlation structure can be characterized by the spatiotemporal variogram model [11]. The experimental variogram at spatial lag ℎ and temporal lag ℎ , 2 (ℎ , ℎ ), is computed by Equation (3), where (ℎ , ℎ ) is the number of pairs in the space and time lags. Following Zeng et al. [19], we set the intervals for spatial lag and temporal lag to be 100 km and 3-day, respectively. That is, the

Modeling Correlation Structure of XCH 4
The spatio-temporal correlation structure of the XCH 4 data describes the correlation between any two points separated in space and time. The correlation structure can be characterized by the spatio-temporal variogram model [11]. The experimental variogram at spatial lag h s and temporal lag h t , 2γ ph s , h t q, is computed by Equation (3), where N ph s , h t q is the number of pairs in the space and time lags. Following Zeng et al. [19], we set the intervals for spatial lag and temporal lag to be 100 km and 3-day, respectively. That is, the tolerances for spatial and temporal lags, as defined in De Iaco et al. [29], in calculating the variogram, are set to be 50 km and 1.5 days, respectively. In this study, the distance between observations is defined by the great-circle distance based on the latitude and longitude of the observations. Half of a variogram quantity,γ ph s , h t q, has been called a semi-variance. A spatio-temporal variogram model, γ ph s , h t q, is then fitted to experimental variogram. The spatio-temporal variogram model adopted here is a combination of the product-sum model [29,38] and an extra global nugget N ST to capture the nugget effect [10], and is given by Equation (4), where γ S phq and γ T puq are the exponential marginal variograms in space and time, and κ is a constant to be estimated. For modeling the spatio-temporal variogram in this study, all parameters are estimated simultaneously as in Zeng et al. [19] instead of sequentially as in De Iaco et al. [29]. As shown in Zeng et al. [19], this simultaneous estimation method is more precise for GOSAT data than the conventional sequential estimation method. The iterative nonlinear weighted least squared technique [10] has been used to estimate the model parameters. The calculated experimental spatio-temporal variogram and the corresponding modeled spatio-temporal variogram are shown in Figure 3. In the variogram, the semi-variance values when spatial and temporal lag are closest to 0 and infinity are called nugget and sill, respectively. The ratio of nugget to sill, expressed as a percentage, can be used as an indicator to classify data dependence [39]. In general, a ratio less than 0.25 indicates a strong spatial or temporal dependence, between 0.25 and 0.75, a moderate spatial or temporal dependence, and larger than 0.75 a weak spatial or temporal dependence. In this study, the ratio of nugget to spatial sill, temporal sill and global sill are 0.20, 0.48 and 0.16, respectively, indicating a strong dependence between XCH 4 retrievals in space, a moderate dependence between XCH 4 retrievals in time, and a strong overall spatio-temporal dependence.
Remote Sens. 2016, 82016, 8, 361 6 tolerances for spatial and temporal lags, as defined in De Iaco et al. [29], in calculating the variogram, are set to be 50 km and 1.5 days, respectively. In this study, the distance between observations is defined by the great-circle distance based on the latitude and longitude of the observations. Half of a variogram quantity, (ℎ , ℎ ), has been called a semi-variance. A spatio-temporal variogram model, (ℎ , ℎ ), is then fitted to experimental variogram. The spatio-temporal variogram model adopted here is a combination of the product-sum model [29,38] and an extra global nugget to capture the nugget effect [10], and is given by Equation (4), where (ℎ) and ( ) are the exponential marginal variograms in space and time, and κ is a constant to be estimated. For modeling the spatio-temporal variogram in this study, all parameters are estimated simultaneously as in Zeng et al. [19] instead of sequentially as in De Iaco et al. [29]. As shown in Zeng et al. [19], this simultaneous estimation method is more precise for GOSAT data than the conventional sequential estimation method. The iterative nonlinear weighted least squared technique [10] has been used to estimate the model parameters. The calculated experimental spatiotemporal variogram and the corresponding modeled spatio-temporal variogram are shown in Figure  3. In the variogram, the semi-variance values when spatial and temporal lag are closest to 0 and infinity are called nugget and sill, respectively. The ratio of nugget to sill, expressed as a percentage, can be used as an indicator to classify data dependence [39]. In general, a ratio less than 0.25 indicates a strong spatial or temporal dependence, between 0.25 and 0.75, a moderate spatial or temporal dependence, and larger than 0.75 a weak spatial or temporal dependence. In this study, the ratio of nugget to spatial sill, temporal sill and global sill are 0.20, 0.48 and 0.16, respectively, indicating a strong dependence between XCH4 retrievals in space, a moderate dependence between XCH4 retrievals in time, and a strong overall spatio-temporal dependence.

Space-Time Kriging Prediction of XCH4
Based on the modeled spatio-temporal variogram, space-time kriging can be implemented to predict the XCH4 value at unobserved locations using surrounding observations and produce the mapping XCH4 over Monsoon Asia. The kriging makes an optimal prediction ( , ) as the linear weighted sum of the surrounding GOSAT XCH4 observations that minimizes the mean squared prediction error, as shown in Equation (5), where λ is the weight assigned to a known sample ( , ). The weights for the observations are determined by the geometry of the observations and the modeled spatio-temporal variogram. The kriging prediction variance, a measurement of prediction uncertainty, is given by Equation (6),

Space-Time Kriging Prediction of XCH 4
Based on the modeled spatio-temporal variogram, space-time kriging can be implemented to predict the XCH 4 value at unobserved locations using surrounding observations and produce the mapping XCH 4 over Monsoon Asia. The kriging makes an optimal prediction R ps 0 , t 0 q as the linear weighted sum of the surrounding GOSAT XCH 4 observations that minimizes the mean squared prediction error, as shown in Equation (5), where λ i is the weight assigned to a known sample R ps i , t i q. The weights for the observations are determined by the geometry of the observations and the modeled spatio-temporal variogram. The kriging prediction variance, a measurement of prediction uncertainty, is given by Equation (6), where Γ pi, jq " γ`ˇˇs i´sj |, | t i´tjˇ˘, γ 0 pi, 1q " γ p|s i´s0 | , |t i´t0 |q, and 1 is nˆ1 unit vector.
In each kriging prediction, theoretically, the optimal prediction is achieved when using all available observations. However, the total number of N = 51742 in this study makes it not practically feasible for the kriging calculation because, to solve the inversion of such N by N matrix for each prediction, is almost computationally impossible. Following Zeng et al. [19], we used the kriging neighborhood, which is a cylinder in space and time, to define the surrounding data. The radii of the neighborhood vary from 200 to 300 km in space and 15 to 45 days in time to make sure the number of observations in the cylinder neighborhood is no less than 20. In order to verify the validity of the used kriging neighborhood in representing local data variability, two key parameters [40], weight of the mean (WM) and slope of the regression (SR) are used in this study. WM indicates the dependency of kriging prediction on the data in the neighborhood, while SR shows whether the neighborhood size is suitable for a valid kriging. The more WM is closer to zero, and the more SR is closer to one will indicate a better kriging neighborhood. Our kriging prediction results in an averaged WM = 0.040 and SR = 0.964 for the used cylinder neighborhood. The results show a low WM closer to zero and high SR closer to one, which indicate that the chosen kriging neighborhood is suitable in representing local variability, and the prediction would not be much improved by searching further. The mapping XCH 4 are described in detail in Section 4, and a discussion on prediction uncertainties of the mapping XCH 4 is in Section 5.

Precision Evaluation Using Cross-Validation
Cross-validation is a widely used method for assessing prediction accuracy of statistical models [40]. In this study, we used the leave-one-out cross-validation to assess the prediction accuracy of the spatio-temporal geostatistical approach for mapping the GOSAT XCH 4 observations. A detailed description of the method can be referred to in Cressie [10]. The cross-validation is carried out by first removing one observation datum Z`s j , t j˘a nd then making its predictionẐ`s j , t j˘u sing the remaining dataset. This process is repeated for each observation in Z ps, tq. As a result, two datasets, the predicted dataset {Ẑ`s j , t j˘j = 1 . . . N} and the corresponding original dataset {Z`s j , t j˘j = 1 . . . N} can be obtained. The following three summary statistics from the two datasets were derived to assess prediction precision. The correlation coefficient (R) between Z ps, tq andẐ ps, tq is 0.91, the mean absolute prediction error (MAE; ř N j"1ˇZ`sj , t j˘´Ẑ`sj , t j˘ˇ{ N) is 8.77 ppb, and the percentage of prediction error less than 15 ppb is 82%. These cross-validation results show a significant high correlation coefficient and small prediction errors between the original XCH 4 data and the predicted XCH 4 data. The histogram of the prediction errors are shown in Figure 4. It indicates that the spatio-temporal geostatistical prediction method developed for GOSAT XCH 4 data is effective, and we can therefore generate precise maps of XCH 4 over Monsoon Asia using the method from the five years of GOSAT XCH 4 observations. Moreover, we compare the prediction accuracy between spatio-temporal and spatial-only method based on the cross-validation statistics. The spatial-only method implemented here is similar to that for generating monthly GOSAT Level 3 data [13]. The whole XCH 4 dataset is firstly grouped into monthly datasets (five years = 60 months), and then the variogram for each month is calculated and modeled by the exponential variogram model. Finally, based on the monthly modeled variograms, spatial-only kriging is applied for predictions of XCH 4 in each month. As a result, R and MAE for the spatial-only method are 0.90 and 8.97 ppb, respectively. Compared with this spatial-only method, we can see that the spatio-temporal method shows a slightly higher correlation coefficient and lower prediction error.  Figure 5 shows the overall variations of mapping XCH4 in space and time. Spatial variation of XCH4 from south to north and strong seasonal variation can be observed from Figure 5a,b. Spatially, the XCH4 shows an obvious change with latitude, as shown in Figure 5a, in which higher averaged values can be seen in area between 18°N to 32°N. This latitude contrast shows stronger emissions of CH4 in southern Asia than northern Asia, which is further shown in Figure 6. Temporally, the averaged XCH4 is increasing from year to year, and the annual mean increase is 5.97 ppb over the five-year period. This annual mean increase is consistent with the annual mean increase of 5.60 ppb from surface observations from 2007 to 2013 [41]. Seasonally, the averaged seasonal cycle amplitude is 22.47 ppb, with maximum in August and minimum around February and March, as shown in Figure 5b. However, the seasonal cycle peak generally varies with latitude, as shown in Figure 5a. In areas south of 25°N, the peak appears around September, while in areas north of 30°N, it appears around July and August. The pattern of seasonal cycle peak is consistent with the result from SCIAMACHY as described in Figure 6a of Hayashida et al. [25]. They found that the Monsoon Asia region includes different phases of seasonal variation of XCH4, which also corresponds to a wide range of climate zones in this region.   Figure 5 shows the overall variations of mapping XCH 4 in space and time. Spatial variation of XCH 4 from south to north and strong seasonal variation can be observed from Figure 5a,b. Spatially, the XCH 4 shows an obvious change with latitude, as shown in Figure 5a, in which higher averaged values can be seen in area between 18˝N to 32˝N. This latitude contrast shows stronger emissions of CH 4 in southern Asia than northern Asia, which is further shown in Figure 6. Temporally, the averaged XCH 4 is increasing from year to year, and the annual mean increase is 5.97 ppb over the five-year period. This annual mean increase is consistent with the annual mean increase of 5.60 ppb from surface observations from 2007 to 2013 [41]. Seasonally, the averaged seasonal cycle amplitude is 22.47 ppb, with maximum in August and minimum around February and March, as shown in Figure 5b. However, the seasonal cycle peak generally varies with latitude, as shown in Figure 5a. In areas south of 25˝N, the peak appears around September, while in areas north of 30˝N, it appears around July and August. The pattern of seasonal cycle peak is consistent with the result from SCIAMACHY as described in Figure 6a of Hayashida et al. [25]. They found that the Monsoon Asia region includes different phases of seasonal variation of XCH 4 , which also corresponds to a wide range of climate zones in this region.  Figure 5 shows the overall variations of mapping XCH4 in space and time. Spatial variation of XCH4 from south to north and strong seasonal variation can be observed from Figure 5a,b. Spatially, the XCH4 shows an obvious change with latitude, as shown in Figure 5a, in which higher averaged values can be seen in area between 18°N to 32°N. This latitude contrast shows stronger emissions of CH4 in southern Asia than northern Asia, which is further shown in Figure 6. Temporally, the averaged XCH4 is increasing from year to year, and the annual mean increase is 5.97 ppb over the five-year period. This annual mean increase is consistent with the annual mean increase of 5.60 ppb from surface observations from 2007 to 2013 [41]. Seasonally, the averaged seasonal cycle amplitude is 22.47 ppb, with maximum in August and minimum around February and March, as shown in Figure 5b. However, the seasonal cycle peak generally varies with latitude, as shown in Figure 5a. In areas south of 25°N, the peak appears around September, while in areas north of 30°N, it appears around July and August. The pattern of seasonal cycle peak is consistent with the result from SCIAMACHY as described in Figure 6a of Hayashida et al. [25]. They found that the Monsoon Asia region includes different phases of seasonal variation of XCH4, which also corresponds to a wide range of climate zones in this region.   Figure 6 shows the seasonal means of XCH 4 over the five-year period. The spatial patterns of XCH 4 distribution in the four seasons are similar. Wetlands and rice paddy fields are among the primary sources of atmospheric CH 4 , which are responsible for the higher XCH 4, which can be observed in all seasons in southern Asia and south-eastern China. This spatial pattern is also well consistent with that from a six-year average (2003 through 2008) of SCIAMACHY XCH 4 retrievals as shown in Figure 14 of Frankenberg et al. [7]. Moreover, obvious seasonal variation can be observed. XCH 4 between summer and autumn is close, as well as between winter and spring, and XCH 4 of the first two seasons are generally higher than that of the latter two seasons. This is because, in this region, summer and autumn are warmer and wetter seasons, as well as the growing seasons of rice, which result in stronger emissions from rice paddies and wetlands compared to the dry season in winter and spring [25,42]. Several high value areas stand out, including the Pearl River Delta region in south China probably due to strong human activity related emissions, the Ganges Delta in south Asia probably due to strong wetland emissions, and the Sichuan Basin in west China. Sichuan Basin shows a consistent high anomalous value in all seasons, which is related to its strong local source as well as the stagnant effect due to its special basin terrain topography [27].

Spatio-Temporal Variations of XCH4
Remote Sens. 2016, 82016, 8, 361 9 Figure 6 shows the seasonal means of XCH4 over the five-year period. The spatial patterns of XCH4 distribution in the four seasons are similar. Wetlands and rice paddy fields are among the primary sources of atmospheric CH4, which are responsible for the higher XCH4, which can be observed in all seasons in southern Asia and south-eastern China. This spatial pattern is also well consistent with that from a six-year average (2003 through 2008) of SCIAMACHY XCH4 retrievals as shown in Figure 14 of Frankenberg et al. [7]. Moreover, obvious seasonal variation can be observed. XCH4 between summer and autumn is close, as well as between winter and spring, and XCH4 of the first two seasons are generally higher than that of the latter two seasons. This is because, in this region, summer and autumn are warmer and wetter seasons, as well as the growing seasons of rice, which result in stronger emissions from rice paddies and wetlands compared to the dry season in winter and spring [25,42]. Several high value areas stand out, including the Pearl River Delta region in south China probably due to strong human activity related emissions, the Ganges Delta in south Asia probably due to strong wetland emissions, and the Sichuan Basin in west China. Sichuan Basin shows a consistent high anomalous value in all seasons, which is related to its strong local source as well as the stagnant effect due to its special basin terrain topography [27].

Comparison with Model Simulations of XCH4
Atmospheric modeling of greenhouse gases, such as CarbonTracker, can reproduce large scale features of the atmospheric CH4 distribution well. Figure 7 shows the annual mean of mapping XCH4 data in 2010 and the corresponding model output of XCH4 in 2010 from CarbonTracker-CH4. We can see that the spatial patterns of mapping data and the modeling well agree with each other, with high XCH4 in southern and south-eastern Asia and low values in northern and north-western Asia, especially the high altitude Tibet area. However, the model data shows a lower XCH4 in the low latitude area than the mapping data, especially in two key regions, including the Sichuan Basin and the Ganges Basin, where CH4 emissions from rice paddy fields and wetlands dominate. In high latitude areas, the model simulations are higher than mapping satellite observations, especially in Tibet, Mongolia and north-eastern China, where XCH4 values are low. The satellite data, on the other

Comparison with Model Simulations of XCH 4
Atmospheric modeling of greenhouse gases, such as CarbonTracker, can reproduce large scale features of the atmospheric CH 4 distribution well. Figure 7 shows the annual mean of mapping XCH 4 data in 2010 and the corresponding model output of XCH 4 in 2010 from CarbonTracker-CH 4 . We can see that the spatial patterns of mapping data and the modeling well agree with each other, with high XCH 4 in southern and south-eastern Asia and low values in northern and north-western Asia, especially the high altitude Tibet area. However, the model data shows a lower XCH 4 in the low latitude area than the mapping data, especially in two key regions, including the Sichuan Basin and the Ganges Basin, where CH 4 emissions from rice paddy fields and wetlands dominate. In high latitude areas, the model simulations are higher than mapping satellite observations, especially in Tibet, Mongolia and north-eastern China, where XCH 4 values are low. The satellite data, on the other hand, show direct observations of XCH 4 and can therefore capture the local variability and present a more detailed characteristic of XCH 4 variation compared to model simulations. More discussions are presented in Section 5.
Remote Sens. 2016, 82016, 8, 361 10 hand, show direct observations of XCH4 and can therefore capture the local variability and present a more detailed characteristic of XCH4 variation compared to model simulations. More discussions are presented in Section 5.

Correlation with Surface Emission Inventory
Surface emission of CH4 is the main driver of the atmospheric XCH4 distributions. Figure 8a shows the annual surface CH4 emission in 2010 derived from EDGAR emission inventory. The spatial distribution pattern of high CH4 emissions is determined by both anthropogenic emissions (such as fossil fuel combustion, biomass burning and rice paddies) and natural emissions (such as natural wetlands). By comparing Figure 8a and Figure 7a, we can see their spatial patterns agree with each other well. It indicates the variation of multi-year averaged XCH4 can well represent the variation of annual averaged CH4 surface emissions. Figure 8b shows the relationship between CH4 surface emissions, which has been rearranged into 1° by 1° grids, and the annual averaged XCH4 value from the mapping dataset as shown in Figure 7a. A significant linear relationship can be observed, with significantly high correlation coefficient of 0.80 and the coefficient of determination (R 2 ) of 0.64 and pvalue less than 0.01. This significant linear relationship indicates the dominant role of surface emissions in determining the distribution of XCH4 concentration over such regions.

Correlation with Surface Emission Inventory
Surface emission of CH 4 is the main driver of the atmospheric XCH 4 distributions. Figure 8a shows the annual surface CH 4 emission in 2010 derived from EDGAR emission inventory. The spatial distribution pattern of high CH 4 emissions is determined by both anthropogenic emissions (such as fossil fuel combustion, biomass burning and rice paddies) and natural emissions (such as natural wetlands). By comparing Figures 8a and 7a, we can see their spatial patterns agree with each other well. It indicates the variation of multi-year averaged XCH 4 can well represent the variation of annual averaged CH 4 surface emissions. Figure 8b shows the relationship between CH 4 surface emissions, which has been rearranged into 1˝by 1˝grids, and the annual averaged XCH 4 value from the mapping dataset as shown in Figure 7a

Correlation with Surface Emission Inventory
Surface emission of CH4 is the main driver of the atmospheric XCH4 distributions. Figure 8a shows the annual surface CH4 emission in 2010 derived from EDGAR emission inventory. The spatial distribution pattern of high CH4 emissions is determined by both anthropogenic emissions (such as fossil fuel combustion, biomass burning and rice paddies) and natural emissions (such as natural wetlands). By comparing Figure 8a and Figure 7a, we can see their spatial patterns agree with each other well. It indicates the variation of multi-year averaged XCH4 can well represent the variation of annual averaged CH4 surface emissions. Figure 8b shows the relationship between CH4 surface emissions, which has been rearranged into 1° by 1° grids, and the annual averaged XCH4 value from the mapping dataset as shown in Figure 7a

Prediction Uncertainty of Mapping XCH 4 Data
In geostatistics, for each prediction, we can get a corresponding kriging variance at the same time, as shown in Equation (6). The variance is determined by both the density of GOSAT XCH 4 observations and the data homogeneity in the neighborhood of the prediction position. Theoretically, in conditions of denser observations and more homogeneous XCH 4 variation surrounding the prediction location there will be lower prediction uncertainty [14]. Therefore, the kriging variance indicates how uncertain each XCH 4 prediction is given the available satellite observations. Figure 9 shows the spatial variation of the kriging standard deviation (root of kriging variance) from the mapping XCH 4 data in Monsoon Asia averaged over a five-year period. The values change from about 11 ppb to 16 ppb with mean of 13.70 ppb. As expected, in areas with denser observations as shown in Figure 1a, including northern China and India, the prediction uncertainty is lower. In some parts of south China and west China, the uncertainty is relatively high due to the relatively sparseness of the satellite observations, and, therefore, more verifications are required for the mapping dataset in these areas. The spatio-temporal geostatistical approach developed in this study has the advantage of utilizing a larger dataset both in space and time to support stable parameter estimation and prediction. Therefore, as more observation data will become available in the coming future, we can anticipate that a more robust mapping dataset with lower uncertainty using spatio-temporal geostatistics can be produced.

Prediction Uncertainty of Mapping XCH4 Data
In geostatistics, for each prediction, we can get a corresponding kriging variance at the same time, as shown in Equation (6). The variance is determined by both the density of GOSAT XCH4 observations and the data homogeneity in the neighborhood of the prediction position. Theoretically, in conditions of denser observations and more homogeneous XCH4 variation surrounding the prediction location there will be lower prediction uncertainty [14]. Therefore, the kriging variance indicates how uncertain each XCH4 prediction is given the available satellite observations. Figure 9 shows the spatial variation of the kriging standard deviation (root of kriging variance) from the mapping XCH4 data in Monsoon Asia averaged over a five-year period. The values change from about 11 ppb to 16 ppb with mean of 13.70 ppb. As expected, in areas with denser observations as shown in Figure 1a, including northern China and India, the prediction uncertainty is lower. In some parts of south China and west China, the uncertainty is relatively high due to the relatively sparseness of the satellite observations, and, therefore, more verifications are required for the mapping dataset in these areas. The spatio-temporal geostatistical approach developed in this study has the advantage of utilizing a larger dataset both in space and time to support stable parameter estimation and prediction. Therefore, as more observation data will become available in the coming future, we can anticipate that a more robust mapping dataset with lower uncertainty using spatio-temporal geostatistics can be produced. Figure 9. Spatial distribution of kriging standard deviations, a measurement of prediction uncertainty, is obtained by averaging the kriging standard deviations over five years of mapping XCH4 dataset in Monsoon Asia. Blank pixels are grids where less than three of the five annual means are available, in which the annual mean is obtained by averaging at least 11 monthly mean data.

Discrepancies in Sichuan Basin between Observation and Modeling
As suggested by Hammerling et al. [43], mapping greenhouse gas concentration data with uncertainty estimation can be used for comparison with simulation data from the atmospheric transport model coupled with carbon flux estimates. As shown in Figure 7, the largest difference between satellite observations and model simulations of XCH4 is located in the Sichuan Basin of west China, where a large anomaly is shown in GOSAT data while not in model data. As suggested by Qin et al. [27], CH4 emission from rice paddy fields and wetlands and the stagnant effect of the basin terrain have the biggest contribution to this anomaly. This same feature can also be seen in satellite observations of CO columns and NO2 columns [44]. However, the model simulations with ground observations assimilated fail to reproduce this local variability over this basin. This may be due to the ineffectiveness of CarbonTracker [23], which assimilates only the sparse ground based observations in representing the local features of XCH4 in this area, where, unfortunately, ground-based observations are not available to constrain the CH4 fluxes. Figure 9. Spatial distribution of kriging standard deviations, a measurement of prediction uncertainty, is obtained by averaging the kriging standard deviations over five years of mapping XCH 4 dataset in Monsoon Asia. Blank pixels are grids where less than three of the five annual means are available, in which the annual mean is obtained by averaging at least 11 monthly mean data.

Discrepancies in Sichuan Basin between Observation and Modeling
As suggested by Hammerling et al. [43], mapping greenhouse gas concentration data with uncertainty estimation can be used for comparison with simulation data from the atmospheric transport model coupled with carbon flux estimates. As shown in Figure 7, the largest difference between satellite observations and model simulations of XCH 4 is located in the Sichuan Basin of west China, where a large anomaly is shown in GOSAT data while not in model data. As suggested by Qin et al. [27], CH 4 emission from rice paddy fields and wetlands and the stagnant effect of the basin terrain have the biggest contribution to this anomaly. This same feature can also be seen in satellite observations of CO columns and NO 2 columns [44]. However, the model simulations with ground observations assimilated fail to reproduce this local variability over this basin. This may be due to the ineffectiveness of CarbonTracker [23], which assimilates only the sparse ground based observations in representing the local features of XCH 4 in this area, where, unfortunately, ground-based observations are not available to constrain the CH 4 fluxes.

Long Term XCH 4 Observations in Constraining Surface Emissions
The distribution and variation of column-averaged greenhouse gas can be attributed to spatial distribution of surface emissions, sinks of the gas, and large-scale eddy-driven transport [45]. For XCH 4 over the Monsoon region, emissions from human-related activities including paddy fields, and natural wetlands are the dominant emissions, while the reaction of CH 4 with hydroxyl radicals (OH), which removes almost 90% of CH 4 [46], is the main sink of atmospheric CH 4 . For a specific CH 4 column, the number of CH 4 molecules going out (loss) and coming in (gain) the column due to transport varies from time to time. However, on a yearly basis, the averaged transport component of CH 4 leads to net gain or loss and can be approximately quantified as an invariable constant. For the CH 4 sinks due to chemical reaction over multi-year periods, the amount of sinks is determined by the total amount and can therefore be quantified as an averaged ratio to the total amount of CH 4 molecules in the atmosphere. Therefore, the multi-year averaged XCH 4 data observed by GOSAT is approximately a remaining ratio of CH 4 emissions from surface after excluding the averaged transport component and chemical sink component. As a result, the distribution of observed XCH 4 averaged over a long term period will be indicating the surface sources of CH 4 , as we see in Figure 8b. This significant correlation between XCH 4 observations and surface emissions provides a promising statistical way, which is simple and easy to implement, of constraining surface CH 4 sources using satellite observations over a long term period.

Conclusions
In this study, we propose the use of a data-driven mapping approach based on spatio-temporal geostatistics to generate a mapping dataset of XCH 4 and evaluate the five years of GOSAT XCH 4 retrievals over Monsoon Asia from June 2009 to May 2014. The prediction precision using cross-validation results in a significantly high correlation of 0.91 and a small mean absolute prediction error of 8.77 ppb between the original dataset and prediction dataset. Spatio-temporal distribution and variation of XCH 4 mapping data agree with results from SCIAMACHY and ground-based observations well. This XCH 4 mapping data also provide a new way to assess the model simulations of CH 4 coupled with atmospheric transport model. Comparison between XCH 4 and model simulations from CarbonTracker-CH 4 shows consistent spatial patterns, but the satellite observations are more able to resolve the local variability of XCH 4 , such as the constant anomaly shown in the Sichuan Basin. Finally, by comparing the mapping data with surface emission inventory, we found that the geographical distribution of high CH 4 values well corresponds to strong emissions as indicated in the inventory map. Over the five-year period, the two datasets show a significantly high correlation coefficient (0.80), indicating the dominant role of surface emissions in determining the distribution of averaged XCH 4 concentrations over such regions and a promising statistical way of constraining surface CH 4 sources using satellite observations over a long term period. In conclusion, this study provides new mapping datasets of XCH 4 over Monsoon Asia derived from satellite observations, and suggests a simple statistical way to constrain the surface CH 4 emissions using long term satellite XCH 4 retrievals. Moreover, this study provides a complementary work to previous studies using SCIAMACHY, and AIRS over this Monsoon Asia region.
However, some assumptions in this study need further verifications. Firstly, spatio-temporal correlation structure is assumed to be similar over the entire Monsoon Asia area, but this is not always true since the spatio-temporal variations in difference locations are different. However, the irregular distribution of the data do not enable us to resolve this problem in this study. Secondly, in geostatistical mapping, we adopted the kriging neighborhood to reduce computational complexity and preserve local variability. Theoretically, the minimum mean square error in kriging is achieved by using a global neighborhood which includes all points. However, a global neighborhood is not practically feasible for kriging calculations because it may lead to a kriging matrix that is too large to be numerically inverted. As a result, the use of a kriging neighborhood might introduce some errors to our results. Gaps in observations are common in XCH 4 retrievals from greenhouse gas observation satellites, such as GOSAT and SCIAMACHY. Therefore, application of the mapping method to SCIAMACHY and comparison between the two data sources will provide us a new way to investigate their differences in this developed geostatistical way. In comparison of prediction accuracy between spatio-temporal and spatial-only approaches using cross-validation, leaving out larger areas of data would conceivably better highlight the advantage of a spatio-temporal approach because the temporal correlation considered in spatio-temporal approach can provide large benefits given the lack of nearby data. This will be our future work. In addition, although cross-validation gives us a comprehensive statistical assessment of prediction accuracy, it is essential to carry out comparison with independent observations. As some TCCON stations are being set up over this Monsoon Asia region, a comparison with TCCON will also be our work in the coming future.