1. Introduction
Since the Industrial Revolution, the concentration of carbon dioxide (CO2) in the atmosphere has risen continuously, and this rise is an important driver of the greenhouse effect [1,2]. In recent years, countries have adopted emission reduction measures to curb global warming and slow the increase in atmospheric CO2 concentration, but the concentration is still rising [3]. The Sixth Assessment Report of the IPCC, released in August 2021, found that since the 1970s the rate of global warming has been higher than in any other 50-year period in the past 2000 years [4]. To realize the goals of peak CO2 emissions by 2030 and carbon neutrality by 2060, monitoring the global atmospheric CO2 concentration is essential.
Traditional ground-based CO2 observation networks provide relatively high-precision data, but their sites are few and unevenly distributed, and sites are difficult to establish over the ocean and in other remote regions. As a result, ground-based observations alone do not provide sufficient information, and obtaining the global distribution of carbon sources and sinks remains a significant challenge. With the development of remote sensing technology, greenhouse gas observation satellites have become an important means of obtaining greenhouse gas data [5,6,7,8,9,10]. Compared with ground-based observation networks, satellite remote sensing covers far more of the Earth’s surface, but it also has deficiencies [11,12]. For example, owing to the observation limitations of passive satellites, the atmospheric CO2 column concentration (XCO2) can only be obtained under cloud-free or low-aerosol conditions [13,14]. The solar altitude angle also makes it difficult for satellites to obtain valid data at high latitudes. Furthermore, errors in the inversion algorithm and the influence of surface and atmospheric conditions introduce a certain bias into the retrieved CO2 concentrations.
To meet the needs of climate research, the inversion accuracy of satellite-observed XCO2 needs to be better than 1% [15]. Yang Dongxu proposed a spectral correction method for TanSat based on the UoL-FP algorithm and validated the resulting XCO2 product against TCCON data, obtaining an RMSE of 1.59 ppm [16]. Ye Hanhan used the optimal estimation method to retrieve XCO2 and verified the inversion accuracy for the GF-5 satellite using data from TCCON ground sites [17]. The results showed an XCO2 inversion accuracy of 0.67%, but the algorithm could not adequately correct for long-range complex atmospheric interference. Matthias Katzfuss and Noel Cressie proposed a Bayesian hierarchical spatio-temporal random effects (STRE) model for processing AIRS data and evaluated it in a simulation study and on a real dataset of global satellite CO2 measurements, but the model only improved the calculation speed [18]. Reuter et al. combined seven CO2 inversion algorithms for GOSAT and SCIAMACHY. Assuming that outliers were few, they took the median within each grid cell as the cell value to generate EMMA, a global high-precision fused CO2 dataset [19]. However, they did not improve the temporal and spatial resolution of the CO2 data. Nguyen et al. used CO2 data from GOSAT and AIRS, but their method only obtained the CO2 concentration in the lower troposphere [20]. Lili Zhang presented a HASM XCO2 surface obtained by fusing TCCON measurements with GEOS-Chem model results [21]. David Leslie used a neural network method to estimate XCO2 from OCO-2 measurements, but training the model was very time-consuming in practical applications [22]. Stefan Noël et al. used the Fast atmOspheric traCe gAs retrievaL (FOCAL) algorithm to retrieve XCO2 from GOSAT and GOSAT-2 radiances, but the resulting data product still did not combine high precision, high coverage, and high temporal and spatial resolution [23].
In this study, data fusion was used to combine multiple satellite datasets, and the uncertainty of each dataset was the key factor considered in the fusion process. Each dataset must be assigned a reasonable weight according to its random error. When multi-source data are fused under an optimal combination scheme, the random error is reduced, which yields a single merged dataset with better accuracy. The weighted arithmetic average is a common method for combining multiple datasets. Ideally, evaluating data uncertainty requires an absolute “truth” as a reference. However, the true global XCO2 at the scale of a satellite footprint is difficult to obtain, and previous validation has been limited to a few TCCON ground sites. In 1998, Stoffelen [24] developed a data quality evaluation method, triple collocation (TC), which does not require a “truth value”. This method objectively estimates the random errors of three collocated datasets and is applicable at regional and global scales. It has been widely used in various disciplines. In view of this, this study used Kriging interpolation to fill the no-data regions of the XCO2 data products from GOSAT, OCO-2, and the CarbonTracker model, quantified their uncertainties, and fused the multi-source products with the TC algorithm to improve accuracy, coverage, and spatio-temporal resolution.
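The TC-based weighting described above can be illustrated with a minimal sketch. It assumes the widely used covariance form of triple collocation, in which the random-error variance of dataset x is estimated as Cov(x,x) − Cov(x,y)·Cov(x,z)/Cov(y,z), and it assumes that the fusion weights are proportional to the inverse of these error variances. Both are assumptions made for illustration; the exact formulation used in this study may differ. The function names and the synthetic data are hypothetical, and the sketch further assumes that the three datasets are on the same scale and have mutually uncorrelated errors.

```python
import numpy as np


def tc_error_variances(x, y, z):
    """Estimate random-error variances of three collocated datasets.

    Covariance form of triple collocation (assumed here): errors are taken
    to be mutually uncorrelated and uncorrelated with the unknown truth.
    """
    c = np.cov(np.vstack([x, y, z]))  # 3 x 3 sample covariance matrix
    var_x = c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]
    var_y = c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2]
    var_z = c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1]
    return np.array([var_x, var_y, var_z])


def inverse_variance_weights(err_var):
    """Weights proportional to 1 / error variance, normalized to sum to 1."""
    err_var = np.asarray(err_var, dtype=float)
    if np.any(err_var <= 0):
        # Sampling noise can produce non-positive TC estimates.
        raise ValueError("error variances must be positive")
    inv = 1.0 / err_var
    return inv / inv.sum()


def fuse(datasets, weights):
    """Weighted arithmetic average of datasets stacked along axis 0."""
    return np.tensordot(weights, np.asarray(datasets, dtype=float), axes=1)


if __name__ == "__main__":
    # Synthetic example: one hidden "truth" plus independent noise per dataset.
    rng = np.random.default_rng(0)
    n = 100_000
    truth = 410.0 + 3.0 * rng.standard_normal(n)    # hypothetical XCO2, ppm
    sat_a = truth + 0.5 * rng.standard_normal(n)    # stand-in for one satellite
    sat_b = truth + 1.0 * rng.standard_normal(n)    # stand-in for another satellite
    model = truth + 1.5 * rng.standard_normal(n)    # stand-in for a model product

    err_var = tc_error_variances(sat_a, sat_b, model)
    weights = inverse_variance_weights(err_var)
    fused = fuse([sat_a, sat_b, model], weights)

    print("estimated error variances:", err_var)
    print("fusion weights:", weights)
    print("fused RMSE vs. truth:", np.sqrt(np.mean((fused - truth) ** 2)))
```

In this synthetic setting the truth is known, so the TC estimates can be checked against the noise levels that were injected. With real XCO2 products no such check is available, which is why validation falls back on independent reference data such as TCCON.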
5. Conclusions
The OCO-2 and GOSAT satellites are easily restricted by observation conditions such as aerosols and clouds, which limits the coverage of their valid measurements. To produce XCO2 data with global coverage, this paper used Kriging interpolation to improve the coverage and spatio-temporal resolution of the OCO-2 and GOSAT data and used CT model data as a supplement. After interpolation, the average coverages of the OCO-2, GOSAT, and CT datasets are 71.66%, 52.45%, and 100%, respectively, with the coverages of OCO-2 and GOSAT increased by 53.65% and 48.5%, respectively. Furthermore, this paper used the TC algorithm to fuse the CT model dataset with the two interpolated satellite datasets. The algorithm estimates the random error of each of three independent datasets by comparing them with one another, and then assigns fusion weights to the datasets according to those random errors. In cross-validation against TCCON data, the fused dataset had an MAE of 0.6273 ppm, an RMSE of 0.7683 ppm, and an R² of 0.8279. The results indicate that this method can significantly improve the coverage and accuracy of the XCO2 dataset.
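The validation statistics quoted above can be computed as in the following minimal sketch. It assumes that the fused values and the TCCON reference values have already been matched in space and time, and it takes R² to be the coefficient of determination; this section does not state whether R² was defined that way or as the squared correlation coefficient, so that choice is an assumption. The function name and the sample values are hypothetical.

```python
import numpy as np


def validation_metrics(fused, reference):
    """Return (MAE, RMSE, R2) for matched fused and reference values.

    R2 is computed as the coefficient of determination,
    1 - SS_res / SS_tot (an assumed definition). Pairs in which either
    value is NaN are ignored.
    """
    fused = np.asarray(fused, dtype=float)
    reference = np.asarray(reference, dtype=float)
    valid = ~(np.isnan(fused) | np.isnan(reference))
    f, r = fused[valid], reference[valid]

    diff = f - r
    mae = np.mean(np.abs(diff))
    rmse = np.sqrt(np.mean(diff ** 2))
    ss_res = np.sum(diff ** 2)
    ss_tot = np.sum((r - r.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


if __name__ == "__main__":
    # Hypothetical matched XCO2 values in ppm, for illustration only.
    fused_xco2 = [410.0, 411.0, 413.0, 414.0, np.nan]
    tccon_xco2 = [410.5, 410.5, 413.5, 413.5, 412.0]
    mae, rmse, r2 = validation_metrics(fused_xco2, tccon_xco2)
    print(f"MAE = {mae:.4f} ppm, RMSE = {rmse:.4f} ppm, R2 = {r2:.4f}")
```

MAE and RMSE are reported in the same unit as the data (ppm), and RMSE is never smaller than MAE, which is consistent with the values reported above.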
In this paper, we obtained a half-monthly average XCO2 dataset at 1° × 1° resolution with good accuracy. Because of the limited coverage of the original satellite observations, we cannot further improve the spatial resolution for the time being. However, we still hope to further improve the spatio-temporal resolution of the data to better support future carbon neutrality research. With the continuous development of satellite remote sensing technology, more and more carbon observation satellites have been launched or are planned. For example, China has launched DQ1, the world’s first active lidar CO2 detection satellite. Compared with passive remote sensing, active lidar offers round-the-clock, all-weather observation and can detect global CO2 concentration with higher resolution and greater coverage [47,48]. At the same time, quantitative monitoring of anthropogenic carbon emissions is the core goal of future carbon satellite observation plans. To achieve it, future carbon monitoring will develop networked satellite observation of greenhouse gas concentrations, air pollutants, and biomass burning emissions, as well as a global carbon monitoring constellation that combines active and passive sensors [49]. Therefore, in subsequent studies, multi-source greenhouse gas satellite data and carbon assimilation systems can be combined for interpolation and fusion. In addition, as ground-based observation data become more plentiful, we will consider using the ground-based observations as the optimal control condition and the multi-satellite interpolation results and carbon assimilation data as inputs to produce, based on the TC algorithm, a fused XCO2 dataset with higher spatio-temporal resolution, higher accuracy, and larger coverage, so as to support the estimation of CO2 emissions from regional sources and contribute to carbon neutrality research.