Validation of FY-3D MERSI-2 Precipitable Water Vapor (PWV) Datasets Using Ground-Based PWV Data from AERONET

The Medium Resolution Spectral Imager-2 (MERSI-2) is one of the most important sensors onboard China's latest polar-orbiting meteorological satellite, Fengyun-3D (FY-3D). The National Satellite Meteorological Center (NSMC) of the China Meteorological Administration has developed four precipitable water vapor (PWV) datasets from five near-infrared bands of MERSI-2: the P905, P936 and P940 datasets and a dataset that fuses these three. For the convenience of users, we comprehensively evaluate the quality of these PWV datasets against ground-based PWV data derived from the Aerosol Robotic Network (AERONET). The validation results show that the P905, P936 and fused PWV datasets have relatively large systematic errors (−0.10, −0.11 and −0.07 g/cm², respectively), whereas the systematic error of the P940 dataset (−0.02 g/cm²) is very small. Ranked by overall accuracy in descending order, the four datasets are the P940, fused, P936 and P905 datasets. The root mean square error (RMSE), relative error (RE) and percentage of retrieval results with error within ±(0.05 + 0.10 × PWV_AERONET) (PER10) of the P940 PWV dataset are 0.24 g/cm², 0.10 and 76.36%, respectively. The RMSE, RE and PER10 of the P905 PWV dataset are 0.38 g/cm², 0.15 and 57.72%, respectively. To put these accuracies in context, we compare the four MERSI-2 PWV datasets with the widely used MODIS and AIRS PWV datasets. The comparison shows that the MODIS PWV dataset is less accurate than all four MERSI-2 PWV datasets because of its serious overestimation (0.40 g/cm²), and that the AIRS PWV dataset is less accurate than the P940 and fused MERSI-2 PWV datasets. In addition, we analyze the error distribution of the four PWV datasets across locations, seasons and water vapor content. Finally, we discuss why the fused PWV dataset is not the most accurate of the four PWV datasets.


Introduction
Water vapor is one of the most important sources of the greenhouse effect [1], as well as a key factor affecting precipitation, severe weather and the global energy cycle [2-4]. Although water vapor only accounts for a small part of the total atmosphere, it plays an important role in the Earth's weather system and climate change. In addition, due to the emission and absorption of radiation in specific spectral regions by water vapor, it can significantly affect the accuracy of quantitative remote sensing, such as land surface temperature inversion based on thermal infrared data [5,6] and aerosol retrieval [7,8]. The precipitable water vapor (PWV) is the total atmospheric water vapor contained in a vertical

FY-3D MERSI-2 PWV Data
FY-3D, China's latest polar-orbiting meteorological satellite, was launched on 15 November 2017 and has been in operation since 1 January 2019. It has an orbital height of 836 km and an orbital period of 101.5 min. MERSI-2 is one of the most important sensors onboard FY-3D and is mainly used for atmospheric monitoring. Thanks to its very large imaging width, MERSI-2 can obtain seamless global images every day. MERSI-2 has five NIR bands for water vapor monitoring: three water vapor absorption bands (905, 936 and 940 nm) and two atmospheric window bands (865 and 1030 nm). The spectral response functions (SRF) of these five bands are shown in Figure 1, and their spatial resolutions and signal-to-noise ratios (SNR) are shown in Table 1 [32]. The MERSI-2 PWV products used in this work are developed using the data of these five bands, and these data can be downloaded for free from the website of China's National Satellite Meteorological Center (NSMC) (http://www.nsmc.org.cn/en/NSMC/Home/Index.html, accessed on 6 April 2021).

Figure 1. The spectral response functions of the five NIR bands of MERSI-2 used for water vapor retrieval. The first line in the legend (black line) represents the total transmittance of the atmosphere (TTA) when the satellite view zenith angle and solar zenith angle are equal to 0 and PWV is equal to 1 g/cm². The second line (yellow line) represents the TTA when the satellite view zenith angle and solar zenith angle are equal to 0 and PWV is equal to 3 g/cm². The next five lines are the spectral response functions of the five MERSI-2 bands used for water vapor retrieval.

The principle of the retrieval algorithm used to develop the MERSI-2 PWV products is as follows. Because of their relatively long wavelengths, the atmospheric path reflectance of these five bands can be ignored, and the top-of-atmosphere (TOA) reflectance observed by the satellite can be calculated using Equation (1).

$$TOA_{\lambda} = T_{\lambda} \times REF_{\lambda} \quad (1)$$

where $TOA_{\lambda}$, $T_{\lambda}$ and $REF_{\lambda}$ are the TOA reflectance, total atmospheric transmittance and surface reflectance of the band with the center wavelength of λ, respectively. Since the surface reflectance can be regarded as varying linearly with wavelength in the spectral range of 850-1250 nm [15], we can use linear interpolation to calculate the TOA reflectance that the water vapor absorption bands (905, 936 and 940 nm) would have if the water vapor content were zero, based on the TOA reflectance of the nearby atmospheric window bands (865 and 1030 nm). The specific method of calculation is shown in Equation (2).

$$TOA^{*}_{\lambda} = TOA_{\lambda_1} + \frac{\lambda - \lambda_1}{\lambda_2 - \lambda_1} \left( TOA_{\lambda_2} - TOA_{\lambda_1} \right) \quad (2)$$
where $TOA^{*}_{\lambda}$ is the TOA reflectance of the band at λ obtained by linear interpolation from the TOA reflectance of the band at $\lambda_1$ and the band at $\lambda_2$.
Next, we can calculate the total atmospheric transmittance of the water vapor absorption bands according to Equation (3) [22].

$$T_{\lambda} = \frac{TOA_{\lambda}}{TOA^{*}_{\lambda}} \quad (3)$$

After that, we can use the MODerate resolution atmospheric TRANsmission (MODTRAN) code [33] to calculate the total atmospheric transmittance under different PWV conditions and obtain the PWV retrieval result by interpolation.
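The retrieval steps described above (interpolating the window bands, forming the band transmittance and inverting a transmittance-PWV table) can be sketched in Python as follows. This is a minimal illustration and not the NSMC implementation: the function names are hypothetical, the input reflectances are invented, and the lookup table is a made-up monotonic curve standing in for MODTRAN simulations, which in practice also depend on the viewing and solar geometry.

```python
import numpy as np

# Centre wavelengths (nm) of the two atmospheric window bands named in the text.
LAMBDA_1, LAMBDA_2 = 865.0, 1030.0

def interpolate_window_toa(toa_865, toa_1030, lam):
    """TOA reflectance expected at wavelength lam without water vapor absorption,
    obtained by linear interpolation between the two window bands."""
    frac = (lam - LAMBDA_1) / (LAMBDA_2 - LAMBDA_1)
    return toa_865 + frac * (toa_1030 - toa_865)

def band_transmittance(toa_abs, toa_865, toa_1030, lam):
    """Ratio of the observed absorption-band TOA reflectance to the interpolated one."""
    return toa_abs / interpolate_window_toa(toa_865, toa_1030, lam)

def pwv_from_lut(transmittance, lut_pwv, lut_trans):
    """Invert a transmittance-vs-PWV table by linear interpolation.
    np.interp needs increasing x values; transmittance falls as PWV rises,
    so both table columns are reversed."""
    return np.interp(transmittance, lut_trans[::-1], lut_pwv[::-1])

# Hypothetical lookup table: NOT MODTRAN output, only a monotonic stand-in.
lut_pwv = np.linspace(0.0, 6.0, 61)            # g/cm^2
lut_trans = np.exp(-0.6 * np.sqrt(lut_pwv))    # illustrative shape only

# Invented reflectances for one pixel, using the 940 nm absorption band.
t = band_transmittance(toa_abs=0.20, toa_865=0.30, toa_1030=0.32, lam=940.0)
pwv = pwv_from_lut(t, lut_pwv, lut_trans)
print(f"transmittance = {t:.3f}, PWV = {pwv:.2f} g/cm^2")
```

Running the same functions with lam set to 905.0 or 936.0 and the corresponding absorption-band reflectance would give the other two single-band retrievals.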
According to the PWV retrieval algorithm described above, we can develop three PWV datasets, derived from the combination of bands 15, 16 and 19, the combination of bands 15, 17 and 19 and the combination of bands 15, 18 and 19, respectively (listed in Table 1). For convenience, we refer to these three PWV datasets as the P905, P936 and P940 datasets, according to the water vapor absorption channel used to develop each of them. However, the three PWV results differ because the three water vapor absorption channels differ in their sensitivity to water vapor, which can confuse data users. To solve this problem and improve the accuracy of the PWV retrieval results, Wang et al. [22] fused the three PWV results according to the sensitivity of the different channels (905, 936 and 940 nm) to water vapor. The specific fusion method is shown in Equation (4).

$$P = W_1 P_1 + W_2 P_2 + W_3 P_3 \quad (4)$$

where $P_1$, $P_2$ and $P_3$ are the PWV retrieved using the combination of bands 15, 16 and 19, the combination of bands 15, 17 and 19 and the combination of bands 15, 18 and 19, respectively; $W_1$, $W_2$ and $W_3$ are the weights corresponding to $P_1$, $P_2$ and $P_3$, respectively; and $P$ is the fused PWV retrieval result.
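As a rough illustration of the fusion step, the weighted average can be written in a few lines of Python. The weights and PWV values below are invented for the example, and normalising the weights so that they sum to one is an assumption of this sketch; how Wang et al. [22] derive the weights from channel sensitivity is not reproduced here.

```python
import numpy as np

def fuse_pwv(p, w):
    """Weighted average of the single-band PWV retrievals along the first axis.
    p, w: array-likes of shape (3, ...) holding P1..P3 and W1..W3.
    The weights are normalised to sum to one (an assumption of this sketch)."""
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum(axis=0)
    return (w * p).sum(axis=0)

# Hypothetical retrievals (g/cm^2) and weights for a single pixel.
p = [1.90, 1.85, 2.00]   # P905, P936, P940
w = [0.2, 0.5, 0.3]      # illustrative weights only
fused = fuse_pwv(p, w)
print(f"fused PWV = {fused:.3f} g/cm^2")
```

Because the result is a weighted average, it always lies between the smallest and largest of the three inputs.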

Ground-Based PWV Data
AERONET is a global ground-based aerosol observation network, which consists of hundreds of sun photometers [34]. It has been in continuous operation for more than 25 years. Its main task is to monitor aerosols, but it can also monitor water vapor. The sun photometer has multiple spectral channels, including channels with wavelengths of 340, 380, 440, 500, 670, 870, 940 and 1020 nm. Among them, the channel at 940 nm is a water vapor absorption channel, and the PWV data provided by AERONET are retrieved from the atmospheric transmittance of this channel [35,36]. Because of their high accuracy, their long-term continuous observation and the wide distribution of AERONET sites around the world, AERONET PWV data are widely used as true values for the validation of remote sensing PWV data [31,37,38]. The latest version of AERONET data is version 3 [39]. The level 1.5 AERONET PWV data of this version (cloud-screened and quality controlled), which are released in near real time, are selected for the validation of the remote sensing PWV datasets. In this paper, ground-based PWV data from 369 AERONET sites are used to validate the MERSI-2 PWV data. The spatial distribution of these AERONET sites is shown in Figure 2.
Figure 2. Each dot represents an AERONET site. The color of these dots indicates the number of matching results between AERONET data and each MERSI-2 PWV product.

Validation of Remote Sensing Data Using Ground-Based Data
MERSI-2 PWV products are raster data with a spatial resolution of 1 km × 1 km, whereas AERONET PWV data are point data, so the two cannot be compared directly. In this paper, we match the temporal mean of the AERONET PWV data within half an hour of the transit time of FY-3D with the spatial average of the 9 × 9 pixels of the MERSI-2 PWV products centered on the AERONET site [31]. This method of matching the spatial mean of remote sensing data with the temporal mean of ground-based data has been widely used in the validation of atmospheric remote sensing datasets [40,41].
In the statistical parameters used for validation, namely the mean bias (MB), mean absolute error (MAE), RMSE, RE, PER10 and PER15, $N$ is the number of effective matching results between the MERSI-2 PWV data and AERONET PWV data, $P_i^{MERSI}$ is the value of the MERSI-2 PWV data of the i-th matching result, and $P_i^{AERONET}$ is the value of the AERONET PWV data corresponding to $P_i^{MERSI}$.
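The matching procedure and the error statistics can be sketched in Python as follows. This is a hedged illustration: the function names and the sample values are hypothetical, "within half an hour" is read as a window of ±30 min around the overpass, and RE is taken to be the mean of the absolute error divided by the AERONET value, which is an assumption because the defining equations are not reproduced in this text. MB, MAE, RMSE and the PER thresholds follow their standard definitions and the ±(0.05 + 0.10 × PWV_AERONET) envelope quoted in the paper.

```python
import numpy as np

def window_mean(grid, row, col, half=4):
    """Mean of the valid pixels in the (2*half+1) x (2*half+1) window around (row, col)."""
    r0, c0 = max(row - half, 0), max(col - half, 0)
    return np.nanmean(grid[r0:row + half + 1, c0:col + half + 1])

def temporal_mean(times_min, values, overpass_min, half_width=30.0):
    """Mean of the ground observations within +/- half_width minutes of the overpass."""
    times_min = np.asarray(times_min, dtype=float)
    values = np.asarray(values, dtype=float)
    keep = np.abs(times_min - overpass_min) <= half_width
    return values[keep].mean() if keep.any() else np.nan

def validation_stats(sat, ref):
    """Error statistics for matched satellite (sat) and ground-based (ref) PWV."""
    sat, ref = np.asarray(sat, dtype=float), np.asarray(ref, dtype=float)
    err = sat - ref
    return {
        "MB": err.mean(),
        "MAE": np.abs(err).mean(),
        "RMSE": np.sqrt((err ** 2).mean()),
        "RE": (np.abs(err) / ref).mean(),   # assumed definition of RE
        "PER10": 100.0 * (np.abs(err) <= 0.05 + 0.10 * ref).mean(),
        "PER15": 100.0 * (np.abs(err) <= 0.05 + 0.15 * ref).mean(),
    }

# Invented example: a 20 x 20 PWV grid with one cloudy (NaN) pixel at the site.
grid = np.full((20, 20), 2.0)
grid[10, 10] = np.nan
sat_value = window_mean(grid, 10, 10)

# Invented ground observations, in minutes relative to the overpass time.
ref_value = temporal_mean([-40, -10, 5, 20], [1.0, 1.9, 2.0, 2.1], overpass_min=0.0)

stats = validation_stats([1.0, 2.0, 3.0], [1.0, 2.5, 2.9])
print(sat_value, ref_value, stats)
```

Using `np.nanmean` lets cloudy or otherwise missing pixels inside the 9 × 9 window be skipped instead of discarding the whole match; whether the operational validation does this is not stated in the text.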

Overall Accuracy Assessment and Comparison of Four MERSI-2 PWV Datasets
The overall accuracy of the four MERSI-2 PWV datasets released by China's NSMC (P905, P936, P940 and fused PWV datasets) is evaluated in this section. The scatter plots of the matching results between the AERONET PWV data and the four MERSI-2 PWV datasets are shown in Figure 3. To make the statistical parameters of the different PWV datasets easier to compare, we summarize them in Table 2. As shown in Figure 3, most of the matching results of the MERSI-2 PWV datasets and AERONET PWV data range from 0 to 4 g/cm². All four MERSI-2 PWV datasets correlate very strongly with the ground-based observations, with correlation coefficients of at least 0.95. The slopes of the linear fitting equations indicate that all four MERSI-2 PWV datasets underestimate PWV. The slopes of the P905, P936 and fused PWV datasets are all less than 0.9, so their underestimation is relatively obvious, whereas the slope of the P940 dataset is 0.96, so its underestimation is small. This is consistent with the MBs of the four MERSI-2 PWV datasets. In ascending order of slope and MB, the four datasets are the P936 (slope: 0.79, MB: −0.11 g/cm²), P905 (0.86, −0.10 g/cm²), fused (0.88, −0.07 g/cm²) and P940 (0.96, −0.02 g/cm²) PWV datasets.

For these four MERSI-2 PWV datasets, the ascending order of RMSE is the same as that of MAE: the P940, fused, P936 and P905 PWV datasets. The MAEs and RMSEs of the P905 and P936 PWV datasets are significantly larger than those of the P940 and fused PWV datasets, which indicates that their absolute errors are significantly larger. The relatively large RMSE and MAE of the P936 PWV dataset are caused by a large number of PWV retrieval results that are obviously smaller than the actual PWV values when the water vapor content is high. Similar to the P936 dataset, the P905 dataset also underestimates PWV for values greater than 4 g/cm². In addition, a large number of abnormal retrieval results also contribute to the relatively large RMSE and MAE of the P905 dataset.
The ascending order of PER10 is the same as that of PER15: the P905, P936, fused and P940 PWV datasets. This means that the P940 PWV dataset has the highest percentage of retrieval results with errors within ±(0.05 + 0.15 × PWV_AERONET) and ±(0.05 + 0.10 × PWV_AERONET), whereas the P905 PWV dataset has the smallest percentage. Since the PER10s of the P940 and fused PWV datasets are both greater than 68.27% (the fraction of a normal distribution that lies within one standard deviation), we can consider their expected error (EE) to be less than ±(0.05 + 0.10 × PWV_AERONET). The REs of the P905, P936, P940 and fused MERSI-2 PWV datasets are 0.15, 0.13, 0.10 and 0.11, respectively. This is consistent with the above validation results; that is, the dataset with the larger MAE and RMSE also has the larger RE, and the dataset with the smaller PER10 and PER15 has the larger RE.
In summary, all four MERSI-2 PWV datasets have some underestimation. The underestimation of the P940 PWV dataset is slight, whereas the other three PWV datasets have relatively obvious underestimation. In terms of the overall accuracy of the four MERSI-2 PWV datasets, they can be sorted in descending order as follows: P940 dataset, fused dataset, P936 dataset and P905 dataset. Among them, the accuracy of the P940 and fused PWV datasets is obviously better than that of the P936 and P905 datasets.

Error Analysis of MERSI-2 PWV Products under Different Water Vapor Content
Since existing studies have shown that the errors of remote sensing PWV datasets are usually related to the water vapor content [42,43], this section analyzes the errors of the four MERSI-2 PWV datasets under different PWV conditions. The error distributions of these four MERSI-2 PWV datasets are shown in Figure 4. The errors are calculated by subtracting the ground-based observations from the remote sensing retrieval results, so a positive error indicates that the remote sensing data overestimate the actual PWV, and a negative error indicates that they underestimate it. More statistical parameters for these four MERSI-2 PWV datasets in different PWV ranges are summarized in Table 3.

In Figure 4, we present the MBs and standard deviations (STD) of the errors of the four MERSI-2 PWV datasets for different PWV intervals. The MB and STD can be regarded as indicators of systematic error and random error, respectively [44]. The MBs of all four MERSI-2 PWV datasets decrease as PWV increases. However, the MB of the P940 dataset differs little between PWV intervals, whereas the MBs of the other three PWV datasets change considerably and decrease significantly as PWV increases. This indicates that the P905, P936 and fused PWV datasets have relatively large systematic errors, whereas the systematic error of the P940 dataset is very small. In terms of STD, the STDs of the P936, P940 and fused PWV datasets in the same PWV interval differ little, and they are obviously smaller than that of the P905 PWV dataset. This means that the random errors of the P936, P940 and fused PWV datasets are approximately equal and obviously smaller than that of the P905 PWV dataset. In other words, the P905 PWV dataset has the greatest uncertainty of the four MERSI-2 PWV datasets, and there is no significant difference in the uncertainty of the other three PWV datasets.

The correlation coefficients between the errors of the P905, P936, P940 and fused PWV datasets and the AERONET PWV data are 0.41, 0.70, 0.18 and 0.51, respectively. Therefore, we can conclude that the errors of the P905, P936 and fused datasets are related to the AERONET PWV data, whereas the errors of the P940 dataset are basically unrelated to the ground-based PWV data. This is consistent with the above conclusion that the systematic error of the P940 dataset does not change significantly as PWV increases. The statistical parameters in Table 3 show that the following phenomenon is present in all four MERSI-2 PWV datasets: the RMSE and MAE, which characterize the absolute error, gradually increase with increasing water vapor content. This is consistent with the above conclusion that the systematic errors and random errors of the four MERSI-2 PWV datasets increase as PWV increases. Since the trend of RE is determined by both the rate of increase of MAE and the rate of increase of PWV, the trend of RE is obviously different from that of MAE and RMSE. For example, the MAE of the P940 dataset increases more slowly than PWV, so the RE of the P940 dataset gradually decreases from one PWV interval to the next.
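The interval-wise analysis above can be sketched in Python as follows. The data are synthetic, generated with a slope below one plus Gaussian noise purely to mimic an underestimation that grows with PWV; the function names and the interval edges are hypothetical. The sketch returns the signed Pearson coefficient, so with errors defined as retrieval minus observation a growing underestimation gives a negative value.

```python
import numpy as np

def binned_error_stats(sat, ref, edges):
    """MB and STD of (sat - ref) in each reference-PWV interval [lo, hi).
    Returns a list of (lo, hi, count, MB, STD) tuples for non-empty intervals."""
    sat, ref = np.asarray(sat, dtype=float), np.asarray(ref, dtype=float)
    err = sat - ref
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (ref >= lo) & (ref < hi)
        if sel.any():
            rows.append((lo, hi, int(sel.sum()), err[sel].mean(), err[sel].std()))
    return rows

def error_pwv_correlation(sat, ref):
    """Pearson correlation between the retrieval error and the reference PWV."""
    sat, ref = np.asarray(sat, dtype=float), np.asarray(ref, dtype=float)
    return np.corrcoef(sat - ref, ref)[0, 1]

# Synthetic matches (NOT the paper's data): slope 0.9, small offset, random noise.
rng = np.random.default_rng(0)
ref = rng.uniform(0.2, 5.0, 2000)
sat = 0.9 * ref + 0.05 + rng.normal(0.0, 0.1, ref.size)

rows = binned_error_stats(sat, ref, edges=[0, 1, 2, 3, 4, 5])
corr = error_pwv_correlation(sat, ref)
for lo, hi, n, mb, std in rows:
    print(f"{lo}-{hi} g/cm^2: n={n}, MB={mb:+.3f}, STD={std:.3f}")
print(f"correlation between error and reference PWV: {corr:.2f}")
```

In this synthetic case the MB falls steadily from the driest to the wettest interval while the STD stays roughly constant, which is the pattern of a mainly systematic, PWV-dependent error.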

Validation Results of MERSI-2 PWV Datasets in Different Seasons
In this section, the accuracy of the four MERSI-2 PWV datasets in different seasons is analyzed and compared. The errors of these four MERSI-2 PWV datasets in different seasons are shown in Figure 5. The mean values of the ground-based PWV data in different seasons are shown in Figure 5a. Since most of the AERONET sites are located in the Northern Hemisphere, the mean of the ground-based measurements from June to August (that is, summer in the Northern Hemisphere) is the largest of the four seasons, whereas the mean for December, January and February is the smallest. Because the RMSEs and MAEs of all four MERSI-2 PWV datasets are positively correlated with water vapor content, the RMSEs and MAEs of these four datasets should be smallest in winter and largest in summer, and this is what the validation results show. Although the MAEs and RMSEs of these four MERSI-2 PWV datasets show the same trend across seasons, their values differ significantly. In descending order of MAE and RMSE, the four PWV datasets are the P905, P936, fused and P940 datasets. Unlike RMSE and MAE, the REs of these four PWV datasets show no significant seasonal trend, and the REs of the same PWV dataset do not differ significantly from season to season. However, the descending order of REs, RMSEs and MAEs of these four MERSI-2 PWV datasets is the same. Whether the accuracy of these four MERSI-2 PWV datasets is evaluated by absolute error or by relative error, the P940 PWV dataset has the best accuracy and the P905 PWV dataset has the lowest accuracy in all four seasons.

The PER10s and PER15s of these four PWV datasets in different seasons are shown in Figure 5e,f. The PER10s of the P905 dataset in spring, summer, autumn and winter are 56.64%, 50.50%, 63.01% and 61.52%, respectively. The corresponding PER10s of the P936 dataset are 69.67%, 60.90%, 67.28% and 68.45%. The corresponding PER10s of the P940 dataset are 79.59%, 75.66%, 73.25% and 77.26%. The corresponding PER10s of the fused dataset are 76.67%, 68.78%, 73.00% and 73.78%. The PER10s of the P905 dataset in all four seasons are obviously smaller than the corresponding PER10s of the other three PWV datasets, whereas the PER10s of the P940 dataset are the largest. Among these four MERSI-2 PWV datasets, the PER10s of the P940 and fused PWV datasets are greater than 68.27% in all seasons, the PER10s of the P936 PWV dataset are greater than 68.27% in some seasons, and the PER10s of the P905 PWV dataset are less than 68.27% in all seasons. This means that only the EE of the P940 and fused PWV datasets is better than ±(0.05 + 0.10 × PWV_AERONET) in all seasons, the EE of the P936 PWV dataset is better than ±(0.05 + 0.10 × PWV_AERONET) in some seasons and the EE of the P905 PWV dataset is worse than ±(0.05 + 0.10 × PWV_AERONET) in all seasons. In ascending order of PER10, the four PWV datasets are the P905, P936, fused and P940 datasets, which is the same as the ascending order of PER15.
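The seasonal grouping and the 68.27% check can be sketched in Python as follows. The grouping by three-month blocks (labelled DJF, MAM, JJA, SON) and the function names are assumptions of this sketch, and the small input arrays are invented; only the seasonal PER10 values in the usage example are taken from the text above.

```python
import numpy as np

# Three-month blocks; DJF is December-February, i.e. Northern Hemisphere winter.
SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}

def seasonal_per10(months, sat, ref):
    """PER10 (%) per season: share of matches with |sat - ref| <= 0.05 + 0.10 * ref."""
    sat, ref = np.asarray(sat, dtype=float), np.asarray(ref, dtype=float)
    within = np.abs(sat - ref) <= 0.05 + 0.10 * ref
    seasons = np.array([SEASON_OF_MONTH[int(m)] for m in months])
    return {str(s): 100.0 * within[seasons == s].mean() for s in np.unique(seasons)}

def meets_expected_error(per10, one_sigma=68.27):
    """True if more than the one-sigma share of matches falls inside the envelope."""
    return per10 > one_sigma

# Invented matches: two in January, two in July.
per10 = seasonal_per10([1, 1, 7, 7], sat=[1.0, 1.5, 3.0, 3.0], ref=[1.0, 1.0, 3.0, 4.0])

# Seasonal PER10s of the P936 dataset as reported in the text.
p936 = {"spring": 69.67, "summer": 60.90, "autumn": 67.28, "winter": 68.45}
flags = {season: meets_expected_error(value) for season, value in p936.items()}
print(per10, flags)
```

Applied to the reported P936 values, the check passes in spring and winter and fails in summer and autumn, matching the statement that the EE of the P936 dataset is met only in some seasons.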

Validation Results of MERSI-2 PWV Datasets in Different Locations
The spatial distribution of RMSE and MAE for the four MERSI-2 PWV datasets is shown in Figure 6. Since both RMSE and MAE characterize absolute error, their spatial distributions for the same PWV dataset are similar. The spatial distributions show that most points with MAE and RMSE greater than 0.3 g/cm² are located in tropical or coastal areas with very high water vapor content. This is consistent with the validation results of other remote sensing PWV datasets; that is, the error of PWV retrieval results becomes larger as the water vapor content increases [31,42]. In general, the spatial distributions of RMSE and MAE for the P940 and fused datasets are similar, and the RMSEs and MAEs of these two PWV datasets at most AERONET sites are within 0.2 g/cm². The spatial distributions of RMSE and MAE for the P905 and P936 datasets are quite different from those of the P940 and fused PWV datasets. In addition, the proportions of validation results with RMSE and MAE greater than 0.3 g/cm² are significantly higher for the P905 and P936 PWV datasets than for the other two PWV datasets, especially for the P905 PWV dataset.
The spatial distribution of RE and PER10 for these four MERSI-2 PWV datasets is shown in Figure 7. Theoretically, RE is negatively correlated with PER10, and Figure 7 confirms this. The spatial distributions of RE of the P940 and fused PWV datasets are similar, whereas those of the P905 and P936 datasets are obviously different from them. The RE of the P940 and fused PWV datasets at most AERONET sites is less than 0.15, and the PER10 of these two PWV datasets at most sites is greater than 70%. The proportions of validation results of the P905 and P936 PWV datasets that meet these requirements are significantly lower than those of the P940 and fused PWV datasets. As shown in Figure 7c,d, the validation results of the P936 dataset with an RE greater than 0.15 and a PER10 less than 60% are mainly distributed in regions with relatively high water vapor content. This is because the retrieval results of the P936 PWV dataset with a PWV greater than 3 g/cm² have significant systematic errors. Compared with the P936 PWV dataset, the P905 PWV dataset has a higher proportion of validation results with an RE greater than 0.15 and a PER10 less than 60%. This is because the P905 PWV dataset not only has obvious systematic errors, but also has larger random errors than the other three PWV datasets.

Accuracy Comparison between Four MERSI-2 PWV Datasets, AIRS PWV Dataset and MODIS PWV Dataset
The accuracy of the four MERSI-2 PWV datasets has been analyzed in detail, but a comparison between their accuracy and that of other remote sensing PWV datasets is still lacking. Therefore, this section focuses on comparing the accuracy of the MERSI-2 PWV datasets with that of other PWV datasets. The Atmospheric Infrared Sounder (AIRS) is a sensor aboard Aqua. The AIRS L2 PWV dataset is the official standard product released by NASA. MOD05 and MYD05 are water vapor products released by NASA that are derived from MODIS aboard the Terra and Aqua satellites, based on the difference in atmospheric transmittance of different NIR channels [15]. These PWV datasets have been widely used and fully validated [45-48]. For example, existing global validation results show that the RMSE and correlation coefficient of the MODIS PWV dataset are approximately 0.51 g/cm² and 0.86, respectively [49]. These values are worse than those of the four MERSI-2 PWV datasets. However, since the ground-based data used to validate the MODIS PWV dataset are different from those used to validate the MERSI-2 PWV datasets, and the selected time periods are also different, we cannot determine the order of their accuracy from these earlier results. Therefore, we directly compare the accuracy of the MERSI-2 PWV datasets with that of the MODIS PWV dataset and AIRS PWV dataset.
The scatter plots of the matching results between the MODIS PWV dataset, the AIRS PWV dataset and the AERONET data are shown in Figure 8. Figure 8 shows that the MODIS PWV dataset has significant systematic errors, which is consistent with the previous validation results [49]. The RMSE, MAE, MB, RE, PER10 and PER15 of the MODIS PWV dataset are 0.54 g/cm², 0.42 g/cm², 0.40 g/cm², 0.29, 21.07% and 30.25%, respectively. The RMSE, MAE, MB, RE, PER10 and PER15 of the AIRS PWV dataset are 0.35 g/cm², 0.25 g/cm², 0.14 g/cm², 0.14, 52.39% and 65.97%, respectively. The values of the corresponding statistical parameters for the four MERSI-2 PWV datasets are shown in Table 2. Comparing these statistical parameters shows that, whether absolute error or relative error is used to evaluate the remote sensing PWV datasets, the accuracy of the MODIS PWV dataset is worse than that of all four MERSI-2 PWV datasets. In fact, there is a strong correlation between the MODIS PWV dataset and the AERONET PWV data. The relatively large absolute and relative errors of the MODIS PWV dataset are caused by systematic overestimation. Comparing the AIRS PWV dataset with the P905 and P936 MERSI-2 PWV datasets shows that these three datasets have similar values of the statistical parameters. However, whether absolute error or relative error is used, the accuracy of the AIRS PWV dataset is worse than that of the P940 and fused MERSI-2 PWV datasets.

Discussion
In this paper, we used the ground-based PWV data derived from AERONET in order to validate four MERSI-2 PWV datasets released by China's National Satellite Meteorological Center. To the best of our knowledge, this is the first comprehensive validation of these four MERSI-2 PWV datasets on a global scale. The order of these four MERSI-2 PWV datasets in descending order of accuracy is as follows: P940 dataset, fused dataset, P936 dataset and P905 dataset. Although the fused MERSI-2 PWV dataset also has a high accuracy, it is not the dataset with the highest accuracy in these four datasets. This means that the fusion algorithm mentioned in Section 2.1 cannot effectively improve the accuracy of three source MERSI-2 PWV datasets participating in fusion. Compared with the other three MERSI-2 PWV datasets, does the fused MERSI-2 PWV dataset have any advantages? What can be done to obtain a fusion dataset with higher accuracy? We will discuss these problems below.
The fusion of remote sensing datasets generally has two objectives: (1) expanding the spatial coverage of remote sensing datasets and (2) improving the accuracy of remote sensing datasets [44,50]. Theoretically, a successful data fusion should achieve at least one of these two objectives. Because the three MERSI-2 PWV datasets participating in data fusion are developed using the same algorithm principles, and all of them are applicable to all cloudless daytime MERSI-2 level 1 data, their spatial coverage is the same. The fused MERSI-2 PWV dataset is obtained by fusing the three original PWV datasets using a weighted average method, so its spatial coverage is the same as that of these three datasets. In terms of accuracy, the fused MERSI-2 PWV dataset is better overall than the P905 and P936 datasets, but worse than the P940 dataset. Overall, compared with the three original MERSI-2 PWV datasets, the fused MERSI-2 PWV dataset neither expands the spatial coverage nor improves on the accuracy of the best original dataset.
The reason why the weighted average algorithm used to fuse the three original MERSI-2 PWV datasets cannot improve the spatial coverage of the MERSI-2 PWV datasets has been discussed above. Next, we discuss why the accuracy of the fused MERSI-2 PWV dataset is not optimal. Theoretically, in order to obtain the optimal fusion result, the uncertainty of the source datasets should be fully considered when fusing multiple source datasets [44]. In other words, the weight of each PWV dataset participating in fusion should be determined according to its uncertainty. The weight of each dataset can be calculated using Equation (9) [51,52].
where i and k are the serial numbers of the MERSI-2 PWV datasets, N is the total number of original PWV datasets participating in fusion, UN_i is the uncertainty of the i-th PWV dataset and W_i is the weight of the i-th PWV dataset participating in fusion.
Since the uncertainty of a PWV dataset can be characterized by the standard deviation of its error [44], we can calculate the uncertainty of these three PWV datasets in different PWV situations based on their validation results, and then obtain the weights of the different PWV datasets participating in fusion from these uncertainties. The weights of the three MERSI-2 PWV datasets calculated according to the uncertainty of each dataset are shown in Figure 9. However, the weights of the three MERSI-2 PWV datasets used by Wang et al. [22] are not calculated according to the uncertainty of each PWV dataset, but determined according to the sensitivity of the TTA of the corresponding water vapor absorption channel to water vapor. The weights of the three original MERSI-2 PWV datasets under different water vapor content used by Wang et al. [22] are shown in Figure 10.
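The uncertainty-based weighting described above can be sketched in Python as follows. Equation (9) is not reproduced in this text, so the sketch assumes the common inverse-variance form, W_i = (1/UN_i^2) / Σ_k (1/UN_k^2) with k running from 1 to N, which is consistent with the symbol definitions given above but may differ in detail from the paper's equation. The function name `uncertainty_weights` is hypothetical and the error samples are invented. In practice the calculation would be repeated for each PWV range to obtain weights that vary with water vapor content.

```python
import statistics


def uncertainty_weights(errors_by_dataset):
    """Map dataset name -> fusion weight derived from its error spread.

    Assumption: inverse-variance weighting, with the uncertainty UN_i taken
    as the standard deviation of the validation errors of dataset i.
    """
    inverse_variance = {}
    for name, errors in errors_by_dataset.items():
        un = statistics.pstdev(errors)  # uncertainty UN_i
        if un == 0:
            raise ValueError(f"zero uncertainty for {name}")
        inverse_variance[name] = 1.0 / (un * un)
    total = sum(inverse_variance.values())
    # Normalise so that the weights sum to one.
    return {name: value / total for name, value in inverse_variance.items()}


# Invented validation errors (retrieval minus AERONET) for a single PWV range.
errors = {
    "P905": [-0.5, 0.3, -0.1, 0.4],
    "P936": [-0.3, 0.1, -0.2, 0.2],
    "P940": [-0.1, 0.1, -0.05, 0.05],
}
weights = uncertainty_weights(errors)
print(weights)
```

Because the standard deviation is computed about the mean error, these weights reflect only the random error of each dataset; a constant bias does not change them.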
In addition to the sensitivity of the TTA of the water vapor absorption channel to water vapor, many other variables can affect the uncertainty of PWV retrieval results, such as the calibration accuracy of the different MERSI-2 channels, the observation error of MERSI-2, the accuracy of the TOA reflectance of the water vapor absorption channels obtained by interpolation and the accuracy of the radiative transfer model. By comparing Figures 9 and 10, it can be seen that the weights calculated based on the sensitivity of the TTA of water vapor absorption channels to water vapor are different from the weights calculated using uncertainty. Therefore, it is unreasonable to determine the weight of each MERSI-2 PWV dataset participating in fusion based only on the sensitivity of the TTA of the corresponding water vapor absorption channel to water vapor. In order to improve the accuracy of the fused MERSI-2 PWV dataset, it is necessary to consider the influence of other variables on the uncertainty of retrieval results.
In addition to the uncertainty discussed above, which characterizes the random error, the systematic error can also affect the fusion result. In theory, if a systematic error exists in the original MERSI-2 PWV datasets, it needs to be removed before fusion; otherwise, it will be carried into the fusion result [44]. As shown in Figure 4, among the three MERSI-2 PWV datasets participating in fusion, only the P940 dataset has no significant systematic error. However, the systematic error is not considered when fusing the MERSI-2 PWV datasets, which directly leads to an obvious systematic error in the fused MERSI-2 PWV dataset. It is therefore essential to remove the systematic error of the source MERSI-2 PWV datasets in order to improve the accuracy of the fusion result.
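To illustrate why a systematic error in the source datasets carries through a weighted average, the following minimal Python sketch fuses three retrievals with and without prior bias removal. The mean biases are the MB values reported in this paper. The equal weights, the "true" PWV of 2.0 g/cm2 and the function name `fuse` are assumptions made purely for illustration, and the retrievals are constructed with bias only (no random error) so that the effect is isolated.

```python
def fuse(values, weights, biases=None):
    """Weighted average of co-located PWV retrievals (g/cm2).

    If `biases` is given, each dataset's mean bias is subtracted first.
    """
    if biases is None:
        biases = {name: 0.0 for name in values}
    total_weight = sum(weights[name] for name in values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    corrected = {name: values[name] - biases[name] for name in values}
    return sum(weights[name] * corrected[name] for name in values) / total_weight


# Mean biases (MB) of the three source datasets, as reported in this paper.
mean_bias = {"P905": -0.10, "P936": -0.11, "P940": -0.02}

# Hypothetical setup: an assumed true PWV and equal fusion weights.
true_pwv = 2.0
retrievals = {name: true_pwv + bias for name, bias in mean_bias.items()}
equal_weights = {name: 1.0 for name in retrievals}

fused_raw = fuse(retrievals, equal_weights)
fused_corrected = fuse(retrievals, equal_weights, mean_bias)
print(fused_raw - true_pwv, fused_corrected - true_pwv)
```

With any set of non-negative weights, the bias of the uncorrected fused value is the same weighted average of the source biases, so it cannot vanish unless the source biases cancel or are removed beforehand.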

Figure 10. The weights of three original MERSI-2 PWV datasets participating in fusion used by Wang et al. [22].

Conclusions
In this work, we used ground-based PWV data derived from 369 AERONET sites to evaluate the accuracy of four MERSI-2 PWV datasets released by the China Meteorological Administration. The validation results show that all four MERSI-2 PWV datasets are highly correlated with AERONET data. However, the accuracy of the P905 and P936 PWV datasets is significantly worse than that of the P940 and fused datasets. The RMSE, MAE, MB, RE, PER10 and PER15 of the P905 PWV dataset, which has the lowest accuracy of the four PWV datasets, are 0.38 g/cm2, 0.24 g/cm2, −0.10 g/cm2, 0.15, 57.72% and 72.27%, respectively. The RMSE, MAE, MB, RE, PER10 and PER15 of the P936 PWV dataset are 0.35 g/cm2, 0.21 g/cm2, −0.11 g/cm2, 0.13, 66.48% and 78.64%, respectively. The RMSE, MAE, MB, RE, PER10 and PER15 of the P940 PWV dataset, which has the highest accuracy of the four PWV datasets, are 0.24 g/cm2, 0.15 g/cm2, −0.02 g/cm2, 0.10, 76.36% and 86.27%, respectively. The RMSE, MAE, MB, RE, PER10 and PER15 of the fused PWV dataset are 0.28 g/cm2, 0.17 g/cm2, −0.07 g/cm2, 0.11, 73.04% and 83.60%, respectively. According to their overall accuracy, the four MERSI-2 PWV datasets can be ranked in descending order as P940 dataset, fused dataset, P936 dataset and P905 dataset. Because the systematic error of the original PWV datasets is not considered when developing the fused MERSI-2 PWV dataset, and the weights assigned to the original PWV datasets are based only on channel sensitivity rather than on dataset uncertainty, the fused PWV dataset has a significant systematic error, and its accuracy is not as good as that of the most accurate of the three original PWV datasets.
An error analysis of the four MERSI-2 PWV datasets under different water vapor content shows that the random error of all four datasets increases with increasing PWV, and that the random error of the P905 dataset is clearly larger than that of the other PWV datasets. In addition, the P905, P936 and fused PWV datasets have relatively large systematic errors (−0.10, −0.11 and −0.07 g/cm2), whereas the systematic error of the P940 dataset (−0.02 g/cm2) is very small. The validation results of the MERSI-2 PWV datasets in different seasons indicate that the absolute errors of the four PWV datasets have obvious seasonal trends: they are largest in the Northern Hemisphere summer and smallest in the Northern Hemisphere winter. This is because most AERONET sites are located in the Northern Hemisphere, and the absolute errors of the four MERSI-2 PWV datasets are positively correlated with water vapor content. However, the relative errors of these four PWV datasets have no obvious seasonal trend. The validation results of the MERSI-2 PWV data in different locations show that most validation results with relatively large absolute errors for these four PWV datasets are distributed in tropical or coastal areas with very high water vapor content. This is consistent with previous validation results; that is, the error of PWV retrieval results becomes larger as the water vapor content increases. In order to obtain a clearer understanding of the accuracy of the four MERSI-2 PWV datasets, we compared their accuracy with that of the widely used MODIS and AIRS PWV datasets. The comparison shows that the accuracy of the MODIS PWV dataset is not as good as that of all four MERSI-2 PWV datasets, due to the serious systematic error of the MODIS PWV dataset, and that the accuracy of the AIRS PWV dataset is worse than that of the P940 and fused MERSI-2 PWV datasets.