XCO2 Fusion Algorithm Based on Multi-Source Greenhouse Gas Satellites and CarbonTracker

In view of the urgent need for high-coverage, high-resolution atmospheric CO2 data in research on carbon neutrality and global CO2 change, this study combines Kriging interpolation and the Triple Collocation (TC) algorithm to fuse three XCO2 datasets, OCO-2, GOSAT, and CarbonTracker, into a 1° × 1° half-monthly average XCO2 dataset. Through subdivided Kriging interpolation, the average coverages of the interpolated OCO-2 and GOSAT XCO2 datasets are increased by 53.65 and 48.5 percentage points, respectively. To evaluate the accuracy of the TC fusion dataset, this study used a reliable reference dataset, TCCON data, for verification. In this comparison, the MAE of the fusion dataset is 0.6273 ppm, the RMSE is 0.7683 ppm, and R2 is 0.8279. The combination of Kriging interpolation and the TC algorithm can therefore effectively improve the coverage and accuracy of the XCO2 dataset.


Introduction
Since the industrial revolution, the concentration of carbon dioxide (CO2) in the atmosphere has continued to rise, which is an important factor causing the greenhouse effect [1,2]. In recent years, in order to control the trend of global warming and slow the gradual increase in atmospheric CO2 concentration, various countries have formulated emission reduction measures, but the atmospheric CO2 concentration is still rising [3]. The sixth assessment report of the IPCC in August 2021 revealed that since the 1970s, the rate of global warming has been higher than in any other 50-year period in the past 2000 years [4]. To realize the goals of peaking CO2 emissions by 2030 and achieving carbon neutrality by 2060, monitoring of the global atmospheric CO2 concentration is essential.
Traditional ground-based CO2 observation networks can obtain relatively high-precision data, but their sites are few and unevenly distributed, and it is difficult to establish sites over the ocean and in other remote regions, so they cannot provide sufficient observation information. Obtaining the distribution of global carbon sources and sinks is therefore a significant challenge. With the development of remote sensing technology, greenhouse gas observation satellites have gradually become an important means of obtaining greenhouse gas data [5][6][7][8][9][10]. Compared with ground-based observation networks, satellite remote sensing can cover the Earth's surface to a great extent, but it also has some deficiencies [11,12]. For example, due to the observation limitations of passive satellites, the column-averaged atmospheric carbon dioxide concentration (XCO2) can only be obtained under cloud-free conditions or at low aerosol concentrations [13,14]. Due to the restriction of the solar altitude angle, it is also difficult for satellites to obtain valid data at high latitudes. Furthermore, because of errors in the inversion algorithm and the limitations imposed by the Earth's surface and atmospheric conditions, the retrieved CO2 concentration data have a certain deviation.
To meet the needs of climate research, the inversion accuracy of satellite-observed XCO2 needs to be better than 1% [15]. Yang Dongxu proposed a spectral correction method for TANSAT based on the UOL FP algorithm and verified its XCO2 inversion product with TCCON data, obtaining an RMSE of 1.59 ppm [16]. Ye Hanhan used the optimal estimation method to retrieve XCO2 and verified the inversion accuracy of the GF-5 satellite using data from TCCON ground sites [17]. The results showed that the inversion accuracy of XCO2 was 0.67%, but the inversion algorithm had insufficient ability to correct long-range complex atmospheric interference. Matthias Katzfuss and Noel Cressie proposed using the Bayesian hierarchical spatio-temporal random effects (STRE) model to process AIRS data and evaluated it in a simulation study and on a real dataset of global satellite CO2 measurements, but the model only improved the calculation speed [18]. Reuter used seven CO2 inversion algorithms for the GOSAT and SCIAMACHY satellites. Assuming that there were few outliers, they chose the median as the value in each grid cell to generate a global high-precision CO2 fusion dataset, EMMA [19]. However, they did not improve the temporal and spatial resolution of the CO2 data. Nguyen used the CO2 data of GOSAT and AIRS, but the method only obtained the CO2 concentration in the lower troposphere [20]. Lili Zhang presented a HASM XCO2 surface obtained by fusing TCCON measurements with GEOS-Chem model results [21]. David Leslie used a neural network method to estimate XCO2 data from OCO-2 measurements, but training the model was very time-consuming in practical application [22]. Stefan Noël used the Fast atmOspheric traCe gAs retrievaL (FOCAL) algorithm to obtain XCO2 from GOSAT and GOSAT-2 radiances, but the resulting CO2 concentration data still did not form a data product with high precision, high coverage, and high temporal and spatial resolution [23].
In this study, data fusion was selected to combine multiple satellite datasets, and the uncertainty of the data was the key factor to be considered in the fusion process. It was necessary to assign a reasonable weight to each dataset according to its random error. When multi-source data are fused, the random error is reduced by choosing the best combination scheme, so as to generate a single dataset with better accuracy. The weighted arithmetic average is a common method for combining multiple datasets. Ideally, the uncertainty evaluation of data needs an absolute "truth" as a reference. However, it is difficult to obtain the global true XCO2 at the scale of a satellite footprint, and previous verification has been limited to a few TCCON ground sites. In 1998, Stoffelen [24] developed a data quality evaluation method, triple collocation (TC), which does not require the "truth value". This method can objectively estimate the random errors of three parallel datasets and is applicable at regional and global scales. It has been widely used in various disciplines. In view of this, this study used the Kriging interpolation method to fill the no-data regions of the XCO2 data products from GOSAT, OCO-2, and the CarbonTracker model, respectively, measured their uncertainty, and integrated the multi-source products through the TC algorithm to improve accuracy, coverage, and spatio-temporal resolution.

OCO-2 Satellite
OCO (Orbiting Carbon Observatory) [25] was originally scheduled to launch in 2009, but failed to reach orbit due to a carrier rocket failure. Two weeks later, NASA started its backup plan, OCO-2 (Orbiting Carbon Observatory-2). OCO-2 carries a single instrument, a three-band imaging grating hyperspectral CO2 detector. OCO-2 operates in glint, nadir, or target mode. In nadir mode, the instrument achieves high spatial resolution by observing the Earth directly below and collecting data along the ground track. Glint mode provides a higher signal-to-noise ratio over the ocean, while target mode is mainly used for calibration over specific ground sites. The dataset OCO2_L2_Lite_FP_10r used in this paper is from the Goddard Earth Science Data and Information Services Center (GES DISC) (https://disc.gsfc.nasa.gov/datasets/OCO2_L2_Lite_FP_10r/, accessed on 1 December 2021).

GOSAT Satellite
Launched in January 2009, GOSAT is the first satellite with high spectral resolution and wide spectral coverage dedicated to monitoring the concentrations of the major atmospheric greenhouse gases (i.e., CO2 and CH4) [26]. At present, there are mainly two sets of GOSAT data: one produced with the algorithm developed by the Japanese GOSAT team, and the other with the ACOS algorithm developed by NASA. The dataset ACOS_L2_Lite_FP_9r used in this paper is from the Goddard Earth Science Data and Information Services Center (GES DISC) (https://disc.gsfc.nasa.gov/datasets/ACOS_L2_Lite_FP_9r/, accessed on 1 December 2021).

CarbonTracker
NOAA built the world's first operational global carbon assimilation system, CT (CarbonTracker), in 2007. CT is an atmospheric assimilation model used to study the temporal and spatial variation of carbon flux [27,28]. It is used to estimate the variation in surface CO2 uptake and release over time. CT includes an ocean module, a fire module, a fossil fuel module, a land surface ecosystem module, and an atmospheric transport module. It obtains the best state of each module through data assimilation. CT provides an effective scientific and decision-making tool for estimating carbon flux, but it relies heavily on surface observation data. In areas with few surface observations, the accuracy of CT still needs to be improved. In recent years, with the support of the EUROCOM (European atmospheric transport inversion comparison) framework, scientists have been working toward accurate and consistent high-precision regional carbon flux estimates, so as to provide strong support for the assessment of climate impact. The CT XCO2 datasets used are from NOAA CT2019B (https://gml.noaa.gov/ccgg/carbontracker/CT2019B/, accessed on 1 December 2021).
An overview of the data from OCO-2, GOSAT, and CarbonTracker is given in Table 1.

TCCON Data
TCCON (Total Carbon Column Observing Network) sites use ground-based Fourier Transform Spectrometers (FTS) and can record solar absorption spectra in the range of 4000~9000 cm⁻¹ [29]. Within this range, there are two weak CO2 absorption bands, at 6220 cm⁻¹ and 6339 cm⁻¹. Compared with satellite data, the column concentration data of CO2, CH4, N2O, HF, CO, and H2O from TCCON are more accurate, because the retrieval method can effectively avoid the errors caused by aerosols and cirrus clouds. Therefore, the XCO2 data products measured by TCCON have been established as the verification standard for satellite remote sensing data [30][31][32][33][34][35]. TCCON sites are still concentrated in North America and Europe, with few sites in East Asia and Oceania. The TCCON data used in this paper are the GGG2014 version, obtained from the TCCON Data Archive (https://TCCON.ornl.gov, accessed on 1 December 2021). When the solar zenith angle is below 82°, the XCO2 error is less than 0.25%. Information on the TCCON sites used in this paper is given in Table 2.
The XCO2 products used in this paper are OCO2_L2_Lite_FP.10R retrieved by the ACOS algorithm, GOSAT ACOS_L2_Lite_FP_9r, and CT2019B, covering the period from 2017 to 2018. Due to cloud cover, scattering and absorption by atmospheric aerosols, the spacing of satellite orbits, and the limitations of the inversion algorithm, many observations are flagged as bad-quality data. Therefore, data screening was carried out according to the ACOS user guide, and only the data with "xco2_quality_flag" equal to 1 were used, so as to eliminate OCO-2 and GOSAT data whose quality does not meet the requirements [30].

Interpolation Algorithm
OCO-2 and GOSAT have many areas without XCO2 observations, especially at high latitudes and over the oceans. Therefore, based on the Kriging interpolation method [44], this paper filled these blank areas by exploiting the inherent temporal and spatial trends and other relevant characteristics of the XCO2 data. The Kriging interpolation method, also known as the spatial autocovariance optimal interpolation method, determines the distance range that influences an interpolated point by analyzing the spatial variation of the attribute. The attribute value at the interpolated point is then estimated from the sampling points within this range. This paper adopted ordinary Kriging interpolation.
The value at the point to be estimated is obtained from the measured values at the sample points, and the estimate is:
$$\hat{z}_0 = \sum_{i=1}^{n} \lambda_i z_i \tag{1}$$
where z_i is the measured value at the ith sample point and λ_i (i = 1, 2, 3, ..., n) is the weight coefficient, a set of optimal coefficients that minimizes the difference between the estimated value and the real value at the point. The weight coefficients need to meet the following two conditions: (1) the estimation is unbiased, that is, the mathematical expectation of the deviation is zero; (2) the estimation is optimal, that is, the variance of the difference between the estimated value and the actual value is the smallest, as shown in (2):
$$E(\hat{z}_0 - z_0) = 0, \qquad \mathrm{Var}(\hat{z}_0 - z_0) = \min \tag{2}$$
where ẑ_0 is the estimated value and z_0 is the real value. The weights satisfy the Kriging equations, as shown in (3):
$$\begin{cases} \sum_{j=1}^{n} \lambda_j \gamma(x_i, x_j) + \mu = \gamma(x_i, x_0), & i = 1, \dots, n \\ \sum_{i=1}^{n} \lambda_i = 1 \end{cases} \tag{3}$$
where γ is the variogram, x_i and x_0 represent the positions of the ith sample and the unknown point, and µ is the Lagrange constant. The weight coefficients are obtained by solving the Kriging equations, and the estimated value is then obtained through (1).
The variogram plays an important role in Kriging interpolation; it describes the relationship and spatial structure of regionalized variables [45]. In this paper, Kriging interpolation was carried out on the screened data. By analyzing the spatial characteristics of the XCO2 data and the accuracy obtained with different Kriging variograms, the exponential model was selected. The variogram model used is as follows:
$$\gamma(h) = c_0 + c\left(1 - e^{-h/\alpha}\right) \tag{4}$$
where h is the lag distance, and c_0, c, and α are the interpolation model parameters to be estimated.
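As a minimal sketch of how ordinary Kriging with an exponential variogram can be evaluated numerically, the following Python example builds and solves the Kriging system for a single target point. It is an illustration under stated assumptions, not the implementation used in this study: the function names, the planar Euclidean distance (a real global application would more likely use great-circle distance), and the parameter values c0, c, and alpha are all hypothetical.

```python
import numpy as np


def exponential_variogram(h, c0, c, alpha):
    """Exponential model gamma(h) = c0 + c * (1 - exp(-h / alpha)) for h > 0, and 0 at h = 0."""
    h = np.asarray(h, dtype=float)
    gamma = c0 + c * (1.0 - np.exp(-h / alpha))
    return np.where(h > 0.0, gamma, 0.0)


def ordinary_kriging(points, values, target, c0, c, alpha):
    """Estimate the value at `target` from samples at `points` (n x 2) with `values` (n)."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between samples, and from each sample to the target.
    # Assumption: planar Euclidean distance in the coordinate units of `points`.
    d_ij = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d_i0 = np.linalg.norm(points - target, axis=-1)

    # Kriging system: variogram block bordered by ones (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exponential_variogram(d_ij, c0, c, alpha)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exponential_variogram(d_i0, c0, c, alpha)

    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # lambda_i; the last entry is the Lagrange constant mu
    return float(weights @ values), weights


# Hypothetical example: four samples on the corners of a unit square.
corner_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
corner_values = [410.0, 411.0, 412.0, 413.0]
center_estimate, center_weights = ordinary_kriging(
    corner_points, corner_values, [0.5, 0.5], c0=0.1, c=1.0, alpha=2.0
)
```

For the symmetric layout in this example the four weights are equal, so the estimate at the center is the plain average of the corner values; the weights always sum to one because of the unbiasedness constraint.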
Because CO2 moves slowly in the atmosphere, we assume that the distribution of CO2 does not differ significantly between directions. Therefore, the Kriging interpolation used in this study is configured as isotropic.
In addition, following the interpolation algorithm of the GOSAT XCO2 L3 products, the world is divided into six latitude zones. With latitude zone and season as the division criteria, different search ranges are set for finding observation points (Table 3; see "Algorithm theoretical basis document for GOSAT TANSO-FTS L3 (NIES-GOSAT-PO-017)").

TC Fusion Algorithm
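The Triple Collocation (TC) algorithm can evaluate the random errors of three independent datasets [46]. Assuming that the errors are uncorrelated, each observation in a dataset can be expressed as:
$$\theta_i^N = \theta^N + \varepsilon_i^N \tag{5}$$
where θ_i^N and ε_i^N respectively represent the Nth observation of the ith dataset and its corresponding random error, i ∈ {1, 2, 3}, and θ^N is the true value corresponding to the Nth observation. There may be different degrees of systematic deviation between an observed dataset and the true value. Assuming a linear relationship between the observation and the true value, (5) can be written as:
$$\theta_i^N = a_i + b_i \theta^N + \varepsilon_i^N \tag{6}$$
where a_i and b_i are the regression coefficients between the observation and the true value.
The TC algorithm obtains the variance of the error term ε_i in (6) by calculating the covariances between the datasets:
$$\mathrm{Cov}(\theta_i, \theta_j) = b_i b_j \sigma_\theta^2 + b_i\,\mathrm{Cov}(\theta, \varepsilon_j) + b_j\,\mathrm{Cov}(\theta, \varepsilon_i) + \mathrm{Cov}(\varepsilon_i, \varepsilon_j) \tag{7}$$
where σ_θ² is the variance of the true dataset θ. The TC algorithm needs to meet the following assumptions: (1) the errors are uncorrelated with the true value, Cov(θ, ε_i) = 0; (2) the errors of different datasets are uncorrelated with each other, Cov(ε_i, ε_j) = 0 for i ≠ j. Under these assumptions, (7) can be rewritten as:
$$Q_{ij} = \mathrm{Cov}(\theta_i, \theta_j) = \begin{cases} b_i b_j \sigma_\theta^2, & i \neq j \\ b_i^2 \sigma_\theta^2 + \sigma_{\varepsilon_i}^2, & i = j \end{cases} \tag{8}$$
According to (8), the estimate of the random error variance of each dataset is:
$$\sigma_{\varepsilon_1}^2 = Q_{11} - \frac{Q_{12} Q_{13}}{Q_{23}}, \qquad \sigma_{\varepsilon_2}^2 = Q_{22} - \frac{Q_{12} Q_{23}}{Q_{13}}, \qquad \sigma_{\varepsilon_3}^2 = Q_{33} - \frac{Q_{13} Q_{23}}{Q_{12}} \tag{9}$$
GOSAT, OCO-2, and CT XCO2 have different prior profiles and averaging kernels, which describe the sensitivity of the retrieval to the true state of the atmosphere. Recent studies have shown that the influence of the column averaging kernel on satellite XCO2 is relatively small compared to the measurement accuracy, so the influence of the different averaging kernels and the differences in the prior CO2 profiles are not considered in this paper. The three XCO2 datasets used in this study have their own characteristics and uncertainties due to different observation frequencies or inversion algorithms. On this basis, the uncertainties were quantified using the TC algorithm, and the three datasets of GOSAT, OCO-2, and CT were then fused by linear combination:
$$\theta_{new} = w_1 \theta_1 + w_2 \theta_2 + w_3 \theta_3 \tag{10}$$
where θ_new is the fused dataset, w_i are the weight coefficients, and θ_1, θ_2, and θ_3 are the OCO-2, GOSAT, and CT datasets, respectively. To ensure that the fused XCO2 value is unbiased, the weights must satisfy the condition:
$$w_1 + w_2 + w_3 = 1$$
where w_i can be determined from the random errors estimated by TC:
$$w_1 = \frac{\sigma_{\varepsilon_2}^2 \sigma_{\varepsilon_3}^2}{\sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_2}^2 + \sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_3}^2 + \sigma_{\varepsilon_2}^2 \sigma_{\varepsilon_3}^2}, \quad w_2 = \frac{\sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_3}^2}{\sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_2}^2 + \sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_3}^2 + \sigma_{\varepsilon_2}^2 \sigma_{\varepsilon_3}^2}, \quad w_3 = \frac{\sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_2}^2}{\sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_2}^2 + \sigma_{\varepsilon_1}^2 \sigma_{\varepsilon_3}^2 + \sigma_{\varepsilon_2}^2 \sigma_{\varepsilon_3}^2} \tag{11}$$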
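The covariance-based TC estimate and the resulting least-squares weights can be sketched in Python as follows. This is a hedged illustration on synthetic data, not the processing chain of this study: the function names, the synthetic "truth" series, and the noise levels are hypothetical, and the weights are taken as normalized inverse error variances, which is the standard least-squares choice for three unbiased inputs with uncorrelated errors.

```python
import numpy as np


def tc_error_variances(x1, x2, x3):
    """Covariance-based triple collocation estimate of each dataset's random-error variance."""
    q = np.cov(np.vstack([x1, x2, x3]))  # 3 x 3 sample covariance matrix
    var1 = q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2]
    var2 = q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2]
    var3 = q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1]
    return np.array([var1, var2, var3])


def least_squares_weights(error_variances):
    """Weights proportional to 1 / error variance, normalized so that they sum to one."""
    inverse = 1.0 / np.asarray(error_variances, dtype=float)
    return inverse / inverse.sum()


# Synthetic demonstration (all numbers hypothetical).
rng = np.random.default_rng(0)
truth = 405.0 + 3.0 * rng.standard_normal(500_000)
sigmas = np.array([0.5, 0.6, 0.8])  # true random-error standard deviations
x1 = truth + sigmas[0] * rng.standard_normal(truth.size)
x2 = truth + sigmas[1] * rng.standard_normal(truth.size)
x3 = truth + sigmas[2] * rng.standard_normal(truth.size)

est_var = tc_error_variances(x1, x2, x3)   # should be close to sigmas ** 2
w = least_squares_weights(est_var)
fused = w[0] * x1 + w[1] * x2 + w[2] * x3  # linear combination of the three inputs
```

With independent errors, the fused series has a smaller error variance than any single input, which is the motivation for weighting by the TC-estimated errors rather than averaging with equal weights.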

Accuracy Evaluation Index
After Kriging interpolation and fusion of XCO2 with the TC algorithm, the accuracy of the fusion result was evaluated by comparing the fused XCO2 values with the TCCON measurements. The smaller the difference, the better the fusion result. Three indexes, R2, mean absolute error (MAE), and root mean square error (RMSE), were selected to evaluate the accuracy of the result. The coefficient of determination R2 evaluates the consistency between the fusion result and the measured value. R2 ranges from 0 to 1. The larger the R2 value, the better the fit between the XCO2 fusion result and the TCCON measurement. Its formula is given in (12).
The MAE is the average of the absolute deviations between the estimated value and the real value, as shown in (13):
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{13}$$
All individual differences carry equal weight in the average, so the MAE reflects the actual difference between the estimated value and the real value. The smaller the MAE, the smaller the deviation between the estimated value and the TCCON measurement, and the higher the fusion accuracy.
The RMSE is the square root of the ratio of the sum of squared deviations between the estimated value and the real value to the number of observations n, as shown in (14):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{14}$$
RMSE measures the deviation between the estimated value and the real value. The smaller the RMSE, the smaller the deviation between the estimated value and the TCCON measurement, and the higher the fusion accuracy.
In (12)-(14), y_i is the fused XCO2 value of the multi-source datasets, ŷ_i is the TCCON measurement, and n is the number of paired observations.
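The three indexes can be computed with a few lines of Python, sketched below. MAE and RMSE follow the verbal definitions above directly. The exact form of R2 is not recoverable from the text, so the squared Pearson correlation coefficient is used here as an assumption; it is bounded between 0 and 1, in line with the stated value range. The function names and sample values are hypothetical.

```python
import numpy as np


def mae(fused, reference):
    """Mean absolute error between fused values and reference (e.g. TCCON) values."""
    fused, reference = np.asarray(fused, float), np.asarray(reference, float)
    return float(np.mean(np.abs(fused - reference)))


def rmse(fused, reference):
    """Root mean square error between fused values and reference values."""
    fused, reference = np.asarray(fused, float), np.asarray(reference, float)
    return float(np.sqrt(np.mean((fused - reference) ** 2)))


def r_squared(fused, reference):
    """Assumed form of R2: the squared Pearson correlation coefficient."""
    r = np.corrcoef(np.asarray(fused, float), np.asarray(reference, float))[0, 1]
    return float(r ** 2)


# Hypothetical half-monthly values in ppm.
fused_xco2 = [410.0, 411.0, 413.0]
tccon_xco2 = [410.5, 411.0, 412.0]
print(mae(fused_xco2, tccon_xco2), rmse(fused_xco2, tccon_xco2), r_squared(fused_xco2, tccon_xco2))
```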

Comparison of Kriging Interpolation Coverage of Three Products
First, we resampled the XCO2 data of OCO-2, GOSAT, and the CT model to obtain 1° × 1° half-monthly and monthly global XCO2 datasets. To examine the effect of the Kriging model used for interpolating XCO2 in more detail, Figure 1 shows the spatial data coverage of the OCO-2 and GOSAT datasets before and after Kriging interpolation. Each 1° × 1° grid cell that contains an XCO2 value meeting the quality requirements is filled with a color (dark blue for data before interpolation, light blue for data after interpolation). The results show that the coverage changes of the half-monthly and monthly XCO2 datasets before and after interpolation are basically the same. The coverage of the half-monthly XCO2 products in 2018 before and after interpolation is shown in Figure 2. In July, August, and December 2018, the coverage of the GOSAT data does not increase much after interpolation, because the coverage of the GOSAT raw data in these months is low. Overall, before interpolation, the average coverages of the OCO-2, GOSAT, and CT products are 18.01%, 3.95%, and 100%, respectively. After interpolation, the average coverages of the OCO-2 and GOSAT datasets are 71.66% and 52.45%, increases of 53.65 and 48.5 percentage points, respectively. It can be seen that this method can effectively improve the temporal and spatial resolution of XCO2 products. We believe that the interpolation results of OCO-2 and GOSAT XCO2 are mainly affected by the spatial distribution, total number, and sampling interval distribution of the observation points. If every 1° × 1° grid cell contained an XCO2 observation, or the observations were evenly distributed in space, this method could obtain XCO2 data with wider coverage. The coverage of CT reaches 100% because the CT model data have a temporal resolution of 1 day and a spatial resolution of 3° × 2°, cover −180°~180° longitude and −90°~90° latitude, and are uniformly distributed. Within the search range of the Kriging interpolation, there are therefore always data that influence the points to be interpolated. This also paves the way for the later generation of a globally seamless 1° × 1° XCO2 dataset.
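The resampling and coverage statistics described above can be sketched as follows. This is a minimal illustration, assuming that coverage is the unweighted percentage of 1° × 1° cells (out of 180 × 360) that hold at least one valid value; the paper does not state whether cells are area-weighted. The function names and sample soundings are hypothetical.

```python
import numpy as np


def grid_mean_1deg(lon, lat, xco2):
    """Average soundings into a 180 x 360 grid of 1-degree cells; empty cells are NaN."""
    lon, lat, xco2 = (np.asarray(a, dtype=float) for a in (lon, lat, xco2))
    col = np.clip(np.floor(lon + 180.0).astype(int), 0, 359)
    row = np.clip(np.floor(lat + 90.0).astype(int), 0, 179)

    total = np.zeros((180, 360))
    count = np.zeros((180, 360))
    np.add.at(total, (row, col), xco2)  # unbuffered add handles repeated cells
    np.add.at(count, (row, col), 1.0)

    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)


def coverage_percent(grid):
    """Percentage of grid cells that contain a valid (non-NaN) value."""
    return 100.0 * np.count_nonzero(~np.isnan(grid)) / grid.size


# Three hypothetical soundings: two fall in the same cell, one near the South Pole.
demo_grid = grid_mean_1deg([0.5, 0.7, -179.5], [0.5, 0.2, -89.5], [410.0, 412.0, 400.0])
demo_coverage = coverage_percent(demo_grid)
```

Applying `coverage_percent` to a grid before and after interpolation gives the two percentages whose difference is the coverage gain in percentage points.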

Fusion of XCO2 Datasets Based on the TC Algorithm
Due to errors in the satellite inversion algorithm and the limitations imposed by the Earth's surface and atmospheric conditions, the CO2 concentrations obtained by satellite have a certain deviation. The CT model obtains CO2 concentrations through data assimilation, which makes it heavily dependent on surface observation data. The CO2 concentrations obtained at high latitudes and in other areas where observation data are difficult to obtain have larger deviations, so there are random errors in the CO2 concentrations from both the satellites and the assimilation model. In this study, the TC algorithm was used to evaluate the uncertainty of the three XCO2 products: interpolated OCO-2, interpolated GOSAT, and CT. To reveal the influence of the length of the input time series on the estimated random error, this study compared the random errors obtained from three time series of different lengths: one year, two years, and one month. As shown in Table 4, the differences in random errors between the two-year and one-year time series of OCO-2, CT, and GOSAT are 0.0082, 0.0081, and 0.0349, while the differences between the two-year and one-month time series are 0.0646, 0.0632, and 0.1505. This shows that the length of the time series has an important impact on the stability of the TC algorithm. When the input time series is short, the robustness of the TC algorithm is poor. In this study, the data from 2017 to 2018 were selected as the input dataset. In ascending order, the random errors obtained are 0.5273 ppm for CT, 0.6346 ppm for OCO-2, and 0.7995 ppm for GOSAT. The traditional arithmetic average cannot minimize the random error of the combined result, and equal weighting coefficients would introduce large random errors from the GOSAT data, which are particularly obvious at high latitudes. Therefore, the fusion weights should be determined according to the characteristics of the data. Based on the least squares principle and the random errors estimated by TC, the weights of the three datasets were calculated by Equation (11). The weights of CT, OCO-2, and GOSAT are 0.4015, 0.3337, and 0.2648, respectively, as shown in Table 5. The CT dataset, which has the smallest random error, has the largest weight, followed by OCO-2 and GOSAT. This means that the fused XCO2 dataset depends more on the CT data. When a 1° × 1° grid cell does not have all three XCO2 values simultaneously, it is filled with the CT value. Otherwise, it is filled with a new XCO2 value computed with the obtained weights. The results of the TC algorithm fusion in 2018 are shown in Figure 3.
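The weighting and gap-filling rule described above can be illustrated with a short Python sketch. One observation from the reported numbers, offered as an assumption rather than as a statement of Equation (11): normalizing the reciprocals of the three reported random-error values reproduces the reported weights to within rounding, which is what inverse weighting yields if the reported values are the error measures that enter the weight formula. The function name `fuse_grids` and the sample grid values are hypothetical.

```python
import numpy as np

# Random-error values reported in the text (ppm).
reported_errors = {"CT": 0.5273, "OCO-2": 0.6346, "GOSAT": 0.7995}

# Assumption: weights proportional to the reciprocal of the reported error value.
names = list(reported_errors)
reciprocals = np.array([1.0 / reported_errors[name] for name in names])
weights = dict(zip(names, reciprocals / reciprocals.sum()))


def fuse_grids(oco2, gosat, ct, weights):
    """Weighted fusion where all three inputs exist; otherwise fall back to the CT value."""
    oco2, gosat, ct = (np.asarray(a, dtype=float) for a in (oco2, gosat, ct))
    all_present = ~np.isnan(oco2) & ~np.isnan(gosat) & ~np.isnan(ct)
    combined = weights["OCO-2"] * oco2 + weights["GOSAT"] * gosat + weights["CT"] * ct
    fused = ct.copy()  # default: CT fills cells lacking one of the satellite inputs
    fused[all_present] = combined[all_present]
    return fused


# Two hypothetical grid cells: the second has no OCO-2 value and is filled with CT.
demo_fused = fuse_grids([410.0, np.nan], [411.0, 412.0], [409.0, 408.0], weights)
```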
Atmosphere 2023, 14, x FOR PEER REVIEW
To verify the XCO2 accuracy of the fusion dataset, this study selected the data of eight TCCON sites and compared them with the OCO-2, GOSAT, CT, and TC fusion data. Figure 4 shows the time series scatter diagram of half-monthly average XCO2 data from 2017 to 2018 at the eight TCCON sites. We also calculated the difference between the concentrations measured by TCCON and the four datasets. Figures 5-7 show the MAE, RMSE, and R2 indicators of the verification results at each TCCON site. In terms of MAE (Figure 5), the average MAE at the eight sites is 0.6273 ppm for the TC fusion dataset, 0.7628 ppm for OCO-2, 1.0029 ppm for GOSAT, and 0.7306 ppm for CT. In general, the fused XCO2 values obtained by the TC algorithm are more consistent with those measured at the TCCON ground sites. The average MAE of GOSAT is the largest, which is also consistent with the weight allocation to some extent. In terms of RMSE (Table 6), the average RMSE is 0.7683 ppm for the TC fusion dataset, 0.9898 ppm for OCO-2, 1.3557 ppm for GOSAT, and 0.8601 ppm for CT. At site OC, the RMSEs of the three input datasets are the largest, indicating that the deviations between the input values and the TCCON measurements are largest at this site, which results in the high RMSE of the fused dataset. In terms of R2 (Table 7 and Figure 6), the R2 values of the XCO2 fusion dataset are mostly greater than those of the three input datasets. The average R2 at the eight sites is 0.8279 for the TC fusion dataset, 0.7533 for OCO-2, 0.6118 for GOSAT, and 0.7812 for CT. At site OC, the R2 of the TC fusion dataset is the smallest because it is affected by the accuracy of the CT and GOSAT XCO2 data. Except at sites JS and OC, the R2 of the TC fusion dataset is the highest among all datasets, and it exceeds 0.9 at sites PA, Br, WG, and TK. At sites JS and OC, the R2 of the fusion dataset is slightly smaller than that of CT and OCO-2. This shows that the XCO2 fusion dataset based on the TC algorithm fits the TCCON observations well, which supports the feasibility of this method.

Comparison between Multi-Instrument Fused XCO2 and TC Fusion Dataset
The multi-instrument fused XCO2 product is derived from the Kriging interpolation fusion of OCO-2 and GOSAT Level 2 products, which takes an approach similar to the 10-s averages, using a variation of Kriging. To understand the effect of the TC algorithm in fusing XCO2 in more detail, this paper compared the multi-instrument fused XCO2 product with the XCO2 obtained by the TC algorithm and quantified the differences. To show the differences between the two XCO2 datasets in different regions, Figure 7 displays positive differences in red and negative differences in blue, which allows an intuitive comparison of the XCO2 differences under different seasonal conditions. Comparing the MAE and RMSE for January, April, July, and October 2018 on a 15-day scale (Table 8) shows that the difference is small in January, April, and October, with RMSEs of 0.6736 ppm, 0.7012 ppm, and 0.7395 ppm, respectively, while it is large in July, with an RMSE of 1.1405 ppm. The MAE shows the same pattern, with a maximum of 0.8143 ppm in July. This feature is particularly obvious in Asia and Europe, and compared with the multi-instrument fused XCO2 dataset, the XCO2 obtained by the TC fusion in this paper is higher, which is more obvious in October. Over the oceans, there is little difference. Overall, the products obtained in this paper are in good agreement with the multi-instrument fused XCO2. In January, April, July, and October, the differences are within ±2 ppm over 99.41%, 98.34%, 90.97%, and 98.07% of the area, respectively.

Conclusions
OCO-2 and GOSAT satellites are easily restricted by observation conditions, such as aerosols and clouds, resulting in limited coverage of their effective measurements.In order to form XCO 2 data with global coverage, Kriging interpolation was used to improve the coverage and spatial-temporal resolution of OCO-2 and GOSAT data, and CT model data was used as a supplement in this paper.After interpolation, the average coverages of the OCO-2, GOSAT, and CT datasets are 71.66%,52.45%, and 100%, respectively, and the coverages of OCO-2 and GOSAT are increased by 53.65% and 48.5%, respectively.Furthermore, this paper used the TC algorithm to fuse the CT model dataset with two satellite-interpolated datasets.This algorithm can analyze the uncertainty of datasets by comparing the random errors of three independent datasets, and then determine the weights of three datasets according to the random errors to carry out data fusion.Through cross-validation with TCCON data, the MAE of the fused dataset was 0.6273 ppm, RMSE was 0.7683 ppm, and R 2 was 0.8279.The results indicate that this method can significantly improve the coverage and accuracy of the XCO 2 dataset.
In this paper, we have obtained a 1° × 1° half-monthly average XCO 2 dataset with good accuracy. Because of the limited coverage of the original satellite observation data, we cannot further improve the spatial resolution for the time being. However, we still hope to further improve the spatiotemporal resolution of the data to better support future carbon neutrality research. With the continuous development of satellite remote sensing technology, more and more carbon observation satellites have been launched or are planned. For example, China has launched DQ1, the world's first active lidar CO 2 detection satellite. Compared with passive remote sensing, active lidar observation has the advantage of all-time and all-weather observation, and can detect global CO 2 concentration with higher resolution and greater coverage [47,48]. At the same time, to realize the quantitative monitoring of man-made carbon emissions, which is the core goal of future carbon satellite observation plans, future carbon monitoring will develop multi-directional satellite network observation of greenhouse gas concentrations, air pollution components, and biomass combustion emissions, as well as a global carbon monitoring constellation that combines active and passive observation [49]. Therefore, in subsequent studies, multi-source greenhouse gas satellite data and carbon assimilation systems can be combined for interpolation and fusion. In addition, as the amount of ground-based observation data increases, we will consider taking the ground-based observation data as the optimal control condition and the multi-satellite interpolation results and carbon assimilation data as the input data, and using the TC algorithm to obtain a fused XCO 2 dataset with higher spatial-temporal resolution, higher accuracy, and larger coverage, so as to support the estimation of CO 2 emissions from regional sources and contribute to carbon neutrality research.

Figure 1. Distribution of coverage of OCO-2 and GOSAT XCO 2 for the first half of the month (left graphs) and a full month (right graphs) in January 2018. The satellite observation data are displayed in dark blue, and the interpolated data are displayed in light blue.


Figure 4. Scatter diagrams of time series of OCO-2, GOSAT, CT, and TC fusion XCO 2 data at the eight TCCON sites from 2017 to 2018. The black solid line represents the observations of the TCCON site, the purple points represent CT, the red points represent OCO-2, the blue points represent GOSAT, and the green points represent the fused XCO 2 based on the TC algorithm.


Figure 7. Comparison of the difference between the fusion products and multi-instrument fused XCO 2 .


Table 2. Information of TCCON sites used in this study.

Table 3. Search scope of Kriging variogram.

Table 8. Comparison of MAE and RMSE between the fusion dataset and multi-instrument fused XCO 2 .