Evaluation of Four Satellite Precipitation Products over Mainland China Using Spatial Correlation Analysis

: The accuracy and reliability of satellite precipitation products (SPPs) are important for their applications. In this study, four recently presented SPPs, namely, GSMaP_Gauge, GSMaP_NRT, IMERG, and MSWEP, were evaluated against daily observations from 2344 gauges of mainland China from 2001 to 2018. Bivariate Moran’s I (BMI), a method that has demonstrated high applicability in characterizing spatial correlation and dependence, was ﬁrst used in research to assess their spatial correlations with gauge observations. Results from four conventional indices indicate that MSWEP exhibited the best performance, with a correlation coefﬁcient of 0.78, an absolute deviation of 1.6, a relative bias of − 5%, and a root mean square error of 5. Six precipitation indices were selected to further evaluate the spatial correlation between the SPPs and gauge observations. MSWEP demonstrated the best spatial correlation in annual total precipitation, annual precipitation days, continuous wet days, continuous dry days, and very wet day precipitation with global BMI of 0.95, 0.78, 0.78, 0.78, and 0.87, respectively. Meanwhile, IMERG showed superiority in terms of maximum daily precipitation with a global BMI value of 0.91. IMERG also exhibited superior performance in quantifying the annual count days that experience precipitation events exceeding 25 mm and 50 mm, with a global BMI of 0.96, 0.92. In four sub-regions, these products exhibited signiﬁcant regional characteristics. MSWEP demonstrated the highest spatial correlation with gauge observations in terms of total and persistent indices in the four sub-regions, while IMERG had the highest global BMI for extreme indices. In general, global BMI can quantitatively compare the spatial correlation between SPPs and gauge observations. The Local Indicator of Spatial Association (LISA) cluster map provides clear visual representation of areas that are signiﬁcantly overestimated or underestimated. These advantages make BMI a suitable method for SPPs assessment.


Introduction
The territory of China is characterized by a wide range of geographical conditions and climatic zones, which result in a complex and varied distribution of precipitation across the region. The eastern areas of China are influenced by a monsoonal climate, while the northwestern regions have a temperate continental climate, and the Tibet Plateau have an alpine climate [1,2]. These complexities in precipitation patterns influence the frequency of drought and flood events significantly, which pose a threat to human safety and well-being, food security, and the stability of the ecosystem [3]. Thus, reliable and highresolution quantitative precipitation estimation is of the utmost importance for effective risk management and climate change adaption.
Precipitation observations can be achieved with three main methods: gauge observations, weather radars, and satellite sensors [4]. As a traditional precipitation observation method, the advantages of the gauge observations include accurate point data and long The specific objectives include: (1) to assess the performance of the four SPPs in terms of absolute deviation, relative bias, root mean square error, and correlation coefficient against the gauge observations; (2) to use the BMI to identify the spatial correlation between six precipitation indices obtained from SPPs and those from gauge observations in mainland China and its sub-regions, and to use the LISA cluster map to analyze the local clustering characteristics of their differences; and (3) to compare the advantages of the BMI in spatial assessment with the universal assessment method. The study provides a comprehensive evaluation of the overall spatial correlation and local clustering characteristics of SPPs over mainland China and expands the spatial evaluation methods.

Study Area
Due to the large differences in natural geographical conditions and contrasts in climate, precipitation in mainland China is highly variable and characterized by complex spatial and temporal distributions [47]. Therefore, we divide mainland China into subregions to capture the regional characteristics of precipitation [48]. According to the altitude, annual precipitation distribution, and the existence of mountains, it can be divided into three regions: the eastern monsoon region, the northwest region (NWC), and the Tibet Plateau region (TP). The eastern monsoon region is further divided into the northern region (NC) and the southern region (SC) according to the latitude difference, separated by the Qinling Mountains Huaihe River ( Figure 1).

Four SPPs
IMERG is a level 3 product of the GPM mission, which utilizes data from multiple satellite sensors onboard the GPM platform, including information from previous missions such as TRMM. IMERG leverages data from a multitude of Low Earth Orbit (LEO) satellites and complements it with geostationary Earth orbit (GEO) infrared estimates to overcome the limited sampling of individual LEO satellites [13]. IMERG has proven useful in various meteorological and precipitation assessments [49][50][51]. This study utilized the IMERG v06 Final-Run precipitation dataset, which has a spatial resolution of 0.1 • and a temporal resolution of 1 day.
GSMaP is another project under the Japan Precipitation Measurement Mission (PMM) scientific team. The GSMaP algorithm utilizes various Passive Microwave/Infrared Remote Sens. 2023, 15, 1823 4 of 25 (PMW/IR) sensors, such as the GPM Microwave Imager (GMI) [52]. GSMaP_MVK is obtained by a joint passive microwave and infrared inversion algorithm based on the Kalman filter moving vector method [53]. Another near-real-time product of GSMaP_NRT was developed and attracted many data users because of its short latency (about 3 h after observation). GSMaP_Gauge is then obtained by correcting the GSMaP_MVK using CPC rainfall station data. GSMaP_NRT and GSMaP_Gauge products are used in this study with resolutions of 0.1 • and 1 d.
MSWEP is a recently developed global precipitation dataset by Beck et al. [54]. It incorporates global site data, multiple satellite observations, and reanalysis data, and it is revised with some runoff and potential evapotranspiration data. MSWEP has attracted extensive international attention since its release due to its relatively high spatial resolution (0.1 • ), long time series , and strong data integrity [55]. In this study, MSWEP V2 is adopted in the study, and the daily precipitation was obtained by accumulating precipitation observations for 3 h.

Rain Gauge Data
In this study, daily precipitation data from 2344 meteorological gauges in China from 2001 to 2018 were used. The dataset was compiled by the China Meteorological Administration (CMA). The quality of the data set was strictly controlled before release. We preprocessed the missing daily precipitation data using multi-year daily averages at a given point in time. The distribution of rain gauges is shown in Figure 1.

Conventional Indices
In order to compare the performance of four SPPs generally, we used the point-to-pixel method to calculate AD, RB, RMSE, and Corr between each gauge's observation data and satellite precipitation data, and carried out the spatial average of the index values of different sub-regions. The calculation methods are as follows [56,57]: where N is the number of gauges; S and G are the SPPs data and the gauge observations, respectively; S is the average of the SPPs data; and G is the average of the gauge observations.

Spatial Correlation Analysis BMI
The BMI, which encompasses both global and local variations, has been widely acknowledged as a suitable tool to compare the spatial correlation between two geographic elements [58]. In this research, we employed the BMI method to investigate the spatial correlation between SPPs and gauge observations. The calculation equations of global BMI and local BMI can be expressed as follows: where I B , I B i refer to the global and local BMI, respectively; N is the total number of gauges; Z G i and Z S j refer to the standardized value of gauge observations for i site and the standardized value of SPPs for the j site, respectively; and W ij is the Euclidean distance weight between i and j sites.
The values of I B , I B i range from −1 to 1. A positive value indicates a positive spatial correlation between SPPs and gauge observations, while a negative value indicates a negative spatial correlation [59]. I B can also be expressed as the slope of the linear fit to the bivariate Moran scatter plot, which consists of a plot with the spatially lagged standardized satellite precipitation data on the y-axis and the standardized gauge observations on the x-axis.
The LISA cluster map is derived from local BMI, and identifies the spatial correlation clusters as: High-High (H-H), Low-Low (L-L), High-Low (H-L), and Low-High (L-H). H-H and L-L clusters indicate that there is a positive correlation between SPPs and satellite gauge observations in this region, while H-L and L-H clusters indicate that there is a negative correlation between them. In this study, the significance of the local BMI was assessed by a permutation test [58], and the significance level was set to 0.05.

Selected Precipitation Indices
The spatial correlation between gauge observations and SPPs was achieved by calculating the BMI between precipitation indices computed using gauge observations and those obtained from SPPs. Eight widely used precipitation indicators defined by the Expert Group on Climate Change Detection and Indices (ETCCDI) [60,61] were selected. The six indicators were mainly divided into three categories, and their definitions are shown in Table 1.

Conventional Indices
The conventional indices of the satellite precipitation data in China and its subregions were presented in Table 2. The results showed that MSWEP had the best performance, with a Corr of 0.78, an AD of 1.6, a RB of −5%, and a RMSE of 5. IMERG was ranked second, while GSMaP_NRT performed the poorest due to the lack of merged rain gauge data. Among the four regions, the highest correlation between SPPs and gauge observations was observed in SC, whereas the lowest correlation was in NWC. MSWEP show the highest correlation in all four regions, with a Corr of 0.80, 0.78, 0.70, and 0.74 respectively. GSMaP_Gauge and GSMaP_NRT largely underestimated the daily precipitation in SC, but overestimated it in NC, NWC, and TP, particularly in NWC, with an RB of 53% and 209%, respectively. IMERG overestimated the daily precipitation in all four regions, with a RB of 10%, 6%, 22%, and 13%, respectively. MSWEP mainly underestimated daily precipitation in NC, SC, and NWC by −8%, −6%, and −3%, respectively, but showed a positive deviation  The high performance of the MSWEP product can be attributed to its integration of precipitation estimates from multiple sources, including satellite-based estimates, gaugebased observations, and reanalysis data. Notably, the incorporation of a large number of gauge observations has significantly enhanced its accuracy, which may explain why it has the best performance in the study. Moreover, SC has the highest density of rain gauges on the Chinese mainland. This provides ample data sources for the SPPs to improve their estimates, especially for the MSWEP and GSMaP_Gauge. In contrast, NWC has a sparse rain gauge network, which restricts the improvement of SPPs in this region. This difference in rain gauge density may be one of the reasons why SC has a higher correlation with gauge observations compared to NWC.

Total Indices BMI in China
The global BMI of ATP and ATD was calculated and their bivariate Moran scatter plots were shown in Figures 2 and 3. As illustrated in Figure 2, the BMI of ATP for GSMaP_Gauge, GSMaP_NRT, IMERG, and MSWEP are 0.93, 0.74, 0.96, and 0.95, respectively. Products such as GSMaP_Gauge, IMERG, and MSWEP demonstrate good spatial correlation with the gauge observations and effectively capture the spatial pattern of ATP. Meanwhile, GSMaP_NRT exhibits a significantly lower BMI compared to the other products. In Figure 3, the BMI of ATD for the four products were found to be 0.77, 0.71, 0.68, and 0.89, respectively. Despite having the best spatial correlation for annual precipitation, IMERG demonstrates poor performance in capturing annual precipitation days. However, the scatter plot in the figure indicates that MSWEP can effectively capture not only the spatial distribution characteristics of annual precipitation but also the number of annual precipitation days.
The LISA cluster maps of ATP and ATD are presented in Figures 4 and 5. Figure 4 indicates that the ATP produced by GSMaP_Gauge, IMERG, MSWEP, and the gauge observations demonstrate a strong positive spatial correlation with L-L clusters in NWC, NC and TP, and H-H clusters in central and southern SC. Meanwhile, there is a notable negative spatial correlation between the gauge observations and GSMaP_NRT in the TP, which is evidenced by the L-H clusters in the eastern region of the plateau, suggesting that GSMaP_NRT significantly overestimates ATP.
Meanwhile, GSMaP_NRT exhibits a significantly lower BMI compared to the other products. In Figure 3, the BMI of ATD for the four products were found to be 0.77, 0.71, 0.68, and 0.89, respectively. Despite having the best spatial correlation for annual precipitation, IMERG demonstrates poor performance in capturing annual precipitation days. However, the scatter plot in the figure indicates that MSWEP can effectively capture not only the spatial distribution characteristics of annual precipitation but also the number of annual precipitation days.     In Figure 5, the ATD obtained by the four SPPs and the gauge observations display significant positive spatial correlation in NWC and NC with L-L clusters. GSMaP_NRT and GSMaP_Gauge show a strong positive correlation in the northeast and northwest regions of SC and the eastern region of the TP, as indicated by H-H clusters. IMERG demonstrates strong positive correlation with H-H clusters in the eastern SC, but it also shows a negative spatial correlation with L-H clusters in the junction between NC and SC, suggesting that IMERG significantly overestimates ATD. MSWEP demonstrates a higher proportion of H-H clusters in the SC and TP compared to the other precipitation products.

BMI in Four Sub-Regions
The BMI for the total indices of four sub-regions were calculated and presented in Table 3. Results indicate that IMERG has the highest spatial correlation with the gauge observations for ATP across all sub-regions. Among the ATD, MSWEP performed the best in both NC and SC. In summary, MSWEP excels at capturing the spatial distribution of total indices in NC and SC, IMERG performs well in NWC, and both IMERG and GSMaP_Gauge perform well in Tibet Plateau TP.
(c) (d) In Figure 5, the ATD obtained by the four SPPs and the gauge observations display significant positive spatial correlation in NWC and NC with L-L clusters. GSMaP_NRT and GSMaP_Gauge show a strong positive correlation in the northeast and northwest regions of SC and the eastern region of the TP, as indicated by H-H clusters. IMERG demonstrates strong positive correlation with H-H clusters in the eastern SC, but it also shows a negative spatial correlation with L-H clusters in the junction between NC and SC, suggesting that IMERG significantly overestimates ATD. MSWEP demonstrates a higher proportion of H-H clusters in the SC and TP compared to the other precipitation products.

BMI in Four Sub-Regions
The BMI for the total indices of four sub-regions were calculated and presented in Table 3. Results indicate that IMERG has the highest spatial correlation with the gauge observations for ATP across all sub-regions. Among the ATD, MSWEP performed the best in both NC and SC. In summary, MSWEP excels at capturing the spatial distribution of total indices in NC and SC, IMERG performs well in NWC, and both IMERG and GSMaP_Gauge perform well in Tibet Plateau TP.

Persistent Indices BMI in China
The results of the global BMI and its scatter plots for CDD and CWD between SPPs and gauge observations are presented in Figures 6 and 7, respectively. As can be seen from the scatter plots, the global BMI for the CDD of GSMaP_Gauge, GSMaP_NRT, IMERG, and MSWEP are 0.66, 0.65, 0.67, and 0.78, respectively. The global BMI for the CWD of the four products were 0.73, 0.65, 0.70, and 0.78, respectively. Among the four products, MSWEP has the highest spatial correlation with the gauge observations, indicating that it has the best performance in capturing CDD and CWD. Conversely, GSMaP_NRT, which did not incorporate gauge observations, performed the worst. and gauge observations are presented in Figures 6 and 7, respectively. As can be seen from the scatter plots, the global BMI for the CDD of GSMaP_Gauge, GSMaP_NRT, IMERG, and MSWEP are 0.66, 0.65, 0.67, and 0.78, respectively. The global BMI for the CWD of the four products were 0.73, 0.65, 0.70, and 0.78, respectively. Among the four products, MSWEP has the highest spatial correlation with the gauge observations, indicating that it has the best performance in capturing CDD and CWD. Conversely, GSMaP_NRT, which did not incorporate gauge observations, performed the worst.   The LISA cluster maps for CDD and CWD are depicted in Figures 8 and 9, respectively. It is evident from Figure 8 that the CDD values obtained from the four SPPs exhibit a substantial positive spatial correlation with the gauge observations, as demonstrated by the L-L clusters in the northern region of SC and the H-H clusters in NWC and TP. However, a significant negative correlation between the GSMaP_NRT and GSMaP_Gauge and the gauge observations is observed in the northeast of NC, indicated by the L-H clusters. IMERG exhibits a negative spatial correlation with H-L clusters in the southeast of NC and L-H clusters in the west and south of SC. MSWEP displays a negative spatial correlation with H-L clusters in the border region between SC and TP. The LISA cluster maps for CDD and CWD are depicted in Figures 8 and 9, respectively. It is evident from Figure 8 that the CDD values obtained from the four SPPs exhibit a substantial positive spatial correlation with the gauge observations, as demonstrated by the L-L clusters in the northern region of SC and the H-H clusters in NWC and TP. However, a significant negative correlation between the GSMaP_NRT and GSMaP_Gauge and the gauge observations is observed in the northeast of NC, indicated by the L-H clusters. IMERG exhibits a negative spatial correlation with H-L clusters in the southeast of NC and L-H clusters in the west and south of SC. MSWEP displays a negative spatial correlation with H-L clusters in the border region between SC and TP. The LISA cluster maps for CDD and CWD are depicted in Figures 8 and 9, respectively. It is evident from Figure 8 that the CDD values obtained from the four SPPs exhibit a substantial positive spatial correlation with the gauge observations, as demonstrated by the L-L clusters in the northern region of SC and the H-H clusters in NWC and TP. However, a significant negative correlation between the GSMaP_NRT and GSMaP_Gauge and the gauge observations is observed in the northeast of NC, indicated by the L-H clusters. IMERG exhibits a negative spatial correlation with H-L clusters in the southeast of NC and L-H clusters in the west and south of SC. MSWEP displays a negative spatial correlation with H-L clusters in the border region between SC and TP. In Figure 9, the CWD values obtained from the four SPPs exhibit a positive spatial correlation with the gauge observations, as demonstrated by the L-L clusters in NWC and NC as well as the H-H clusters in TP and the southern region of SC. The L-H and H-L clusters, which are present in the results of GSMaP_NRT, GSMaP_Gauge, and IMERG, are much smaller compared to those in the CDD values. This indicates that the spatial correlations for CWD are much better than those for CDD.

BMI in Four Sub-Regions
The results of the global BMI analysis of the persistent indices in four regions are presented in Table 4. The analysis reveals that MSWEP has the highest spatial correlation with CDD and CWD in all four regions. In comparison, the CDD values obtained from GSMaP_Gauge, GSMaP_NRT, and IMERG have significantly lower spatial correlations with the gauge observations in NC and SC when compared to MSWEP. Furthermore, the CWD values obtained from the four SPPs exhibit the worst spatial correlation in NWC. Among the four SPPs, IMERG shows the weakest performance in NC and SC.

BMI in Four Sub-Regions
The results of the global BMI analysis of the persistent indices in four regions are presented in Table 4. The analysis reveals that MSWEP has the highest spatial correlation with CDD and CWD in all four regions. In comparison, the CDD values obtained from GSMaP_Gauge, GSMaP_NRT, and IMERG have significantly lower spatial correlations with the gauge observations in NC and SC when compared to MSWEP. Furthermore, the CWD values obtained from the four SPPs exhibit the worst spatial correlation in NWC. Among the four SPPs, IMERG shows the weakest performance in NC and SC.

Extreme Indices BMI in China
The results of the global BMI and its scatter plots for R95 and Rmax between SPPs and gauge observations are presented in Figures 10 and 11   The results indicate that the GSMaP_Gauge, IMERG, and MSWEP accurately represent the spatial distribution of extreme events. Among these, MSWEP shows the best performance in R95, while IMERG is best in Rmax. In contrast, GSMaP_NRT demonstrates the weakest performance among the four SPPs.
The LISA cluster maps of R95 and Rmax are shown in Figures 12 and 13. The results indicate that the R95 and Rmax obtained from the four SPPs display significant positive spatial correlations with the gauge observations, as evidenced by the presence of L-L clusters in the NWC and TP. Furthermore, GSMaP_Gauge, IMERG, and MSWEP exhibit significant H-H clusters in the central and southern SC. However, GSMaP_NRT only exhibits significant H-H clusters in the coastal areas of SC. The worst performing product, GSMaP_NRT, shows significant negative spatial correlation with H-L clusters in central SC and L-H clusters in the western NC and eastern NWC. The results indicate that the GSMaP_Gauge, IMERG, and MSWEP accurately sent the spatial distribution of extreme events. Among these, MSWEP shows the bes formance in R95, while IMERG is best in Rmax. In contrast, GSMaP_NRT demons the weakest performance among the four SPPs.
The LISA cluster maps of R95 and Rmax are shown in Figures 12 and 13. The r indicate that the R95 and Rmax obtained from the four SPPs display significant po spatial correlations with the gauge observations, as evidenced by the presence of L-L ters in the NWC and TP. Furthermore, GSMaP_Gauge, IMERG, and MSWEP exhib nificant H-H clusters in the central and southern SC. However, GSMaP_NRT only ex significant H-H clusters in the coastal areas of SC. The worst performing pro GSMaP_NRT, shows significant negative spatial correlation with H-L clusters in c SC and L-H clusters in the western NC and eastern NWC.

BMI in Four Sub-Regions
The results of the BMI of extreme indices in the four sub-regions are shown in Table 5. The analysis shows that IMERG has the highest positive spatial correlation with gauge observations in all four sub-regions, as reflected in the highest values of the global BMI. Meanwhile, the global BMI of GSMaP_NRT was significantly lower in comparison to the other precipitation products, particularly in NWC, where the values of global BMI were only 0.12 and −0.12 for R95 and Rmax, respectively. The results also indicate that the spatial correlation of extreme indices is weaker in TP compared to other regions.

Frequency Indices BMI in China
The results of the global BMI and its scatter plots for R25 and R50 between SPPs and gauge observations are presented in Figures 14 and 15 The results suggest that IMERG performs the best among the four selected products, indicating a robust ability to detect extreme precipitation events. GSMaP_Gauge and MSWEP also demonstrate a strong ability in detecting extreme precipitation events. However, GSMaP_NRT shows the weakest performance among the four products. It is important to note that R50 is temporally non-stationary in some regions due to high thresholds, which will lead to the uncertainties in BMI results.
The LISA cluster maps of R25 and R50 are shown in Figures 16 and 17. The results suggest that the R25 and R50 derived from the four SPPs exhibit considerable positive spatial correlations with the gauge observations, with the H-H cluster located in the eastern SC and L-L clusters located in the NWC. Conversely, the GSMaP_NRT results show noteworthy negative spatial correlations with L-H clusters in the eastern TP and H-L clusters in southwest SC. The GSMaP_Gauge also indicates a slight presence of L-H clusters in eastern TP.

BMI in Four Sub-Regions
The results of the BMI of extreme indices in the four sub-regions are shown in Table  5. The analysis shows that IMERG has the highest positive spatial correlation with gauge observations in all four sub-regions, as reflected in the highest values of the global BMI. Meanwhile, the global BMI of GSMaP_NRT was significantly lower in comparison to the other precipitation products, particularly in NWC, where the values of global BMI were only 0.12 and −0.12 for R95 and Rmax, respectively. The results also indicate that the spatial correlation of extreme indices is weaker in TP compared to other regions.    The results suggest that IMERG performs the best among the four selected products, indicating a robust ability to detect extreme precipitation events. GSMaP_Gauge and MSWEP also demonstrate a strong ability in detecting extreme precipitation events. However, GSMaP_NRT shows the weakest performance among the four products. It is important to note that R50 is temporally non-stationary in some regions due to high thresholds, which will lead to the uncertainties in BMI results.
The LISA cluster maps of R25 and R50 are shown in Figures 16 and 17. The results suggest that the R25 and R50 derived from the four SPPs exhibit considerable positive spatial correlations with the gauge observations, with the H-H cluster located in the eastern SC and L-L clusters located in the NWC. Conversely, the GSMaP_NRT results show noteworthy negative spatial correlations with L-H clusters in the eastern TP and H-L clusters in southwest SC. The GSMaP_Gauge also indicates a slight presence of L-H clusters in eastern TP. The results suggest that IMERG performs the best among the four selected products, indicating a robust ability to detect extreme precipitation events. GSMaP_Gauge and MSWEP also demonstrate a strong ability in detecting extreme precipitation events. However, GSMaP_NRT shows the weakest performance among the four products. It is important to note that R50 is temporally non-stationary in some regions due to high thresholds, which will lead to the uncertainties in BMI results.
The LISA cluster maps of R25 and R50 are shown in Figures 16 and 17. The results suggest that the R25 and R50 derived from the four SPPs exhibit considerable positive spatial correlations with the gauge observations, with the H-H cluster located in the eastern SC and L-L clusters located in the NWC. Conversely, the GSMaP_NRT results show noteworthy negative spatial correlations with L-H clusters in the eastern TP and H-L clusters in southwest SC. The GSMaP_Gauge also indicates a slight presence of L-H clusters in eastern TP.

BMI in Four Sub-Regions
The results of the BMI of frequency indices in four sub-regions are shown in Table 6. The results demonstrate that IMERG exhibits the strongest positive spatial correlation with gauge observations across all four sub-regions, as indicated by its highest global BMI values. In contrast, GSMaP_NRT yields significantly lower global BMI values compared to the other precipitation products, particularly in NWC where the R50 global BMI values are a mere −0.12. Additionally, the analysis reveals weaker spatial correlation of frequency indices in NWC relative to the other regions.

Discussion
Spatial scatter plots of absolute and relative bias are commonly used to evaluate the spatial characteristics of SPPs [61,62]. These plots provide preliminary spatial correlation information. However, BMI has several advantages over spatial scatter plots.
Firstly, BMI provides a value that quantifies the spatial correlation between SPPs and gauge observations. Unlike the correlation coefficient, BMI is not site to site, but accounts for the distribution of neighbor observations by considering the special weights illustrated in Equation (5).
Secondly, the LISA cluster map not only displays the bias, but also provides information on its significance. By employing a permutation test, the significance of local BMI values can be determined, thereby indicating areas where there is a high degree of correlation or discrepancies, including underestimation or overestimation. Figure 14 shows the spatial scatter plots of the relative bias of CDD for four SPPs and gauge observations. When compared to Figure 18, the LISA cluster map (Figure 8) provides a clearer picture of exceptional sites and their special correlation relations, which is difficult to discern from the spatial scatter plots of relative bias.

Discussion
Spatial scatter plots of absolute and relative bias are commonly used to evaluate the spatial characteristics of SPPs [61,62]. These plots provide preliminary spatial correlation information. However, BMI has several advantages over spatial scatter plots.
Firstly, BMI provides a value that quantifies the spatial correlation between SPPs and gauge observations. Unlike the correlation coefficient, BMI is not site to site, but accounts for the distribution of neighbor observations by considering the special weights illustrated in Equation (5).
Secondly, the LISA cluster map not only displays the bias, but also provides information on its significance. By employing a permutation test, the significance of local BMI values can be determined, thereby indicating areas where there is a high degree of correlation or discrepancies, including underestimation or overestimation. Figure 14 shows the spatial scatter plots of the relative bias of CDD for four SPPs and gauge observations. When compared to Figure 18, the LISA cluster map (Figure 8) provides a clearer picture of exceptional sites and their special correlation relations, which is difficult to discern from the spatial scatter plots of relative bias. Finally, the LISA cluster map is capable of identifying the spatial correlation relationships between SPP products and gauge observations at different scales based on their regional spatial distributions. Figure 19 illustrates the LISA cluster map of CDD between four SPP products and gauge observations in SC. Compared to Figure 14, the LISA cluster map highlights significant correlation clusters based on the regional spatial distributions of SPP products and gauge observations. However, the spatial scatter plot of absolute bias will remain unchanged across different scales. It is important to note that BMI is similar to the correlation coefficient in that it only describes the spatial correlation between two spatial variables. However, BMI can't take into account the absolute value difference between the variables. Therefore, it is recommended to use BMI in conjunction with conventional indices that describe the absolute value difference, such as absolute deviation or Finally, the LISA cluster map is capable of identifying the spatial correlation relationships between SPP products and gauge observations at different scales based on their regional spatial distributions. Figure 19 illustrates the LISA cluster map of CDD between four SPP products and gauge observations in SC. Compared to Figure 14, the LISA cluster map highlights significant correlation clusters based on the regional spatial distributions of SPP products and gauge observations. However, the spatial scatter plot of absolute bias will remain unchanged across different scales. It is important to note that BMI is similar to the correlation coefficient in that it only describes the spatial correlation between two spatial variables. However, BMI can't take into account the absolute value difference between the variables. Therefore, it is recommended to use BMI in conjunction with conventional indices that describe the absolute value difference, such as absolute deviation or relative deviation.

Limitations
The significance test of BMI is based on the normal distribution assumption. If the data clearly deviates from the normal distribution, it may affect the results of the significance test. Given the uneven distribution of precipitation across the Chinese mainland, the precipitation index distribution can be impacted by outliers, making it difficult to follow a normal distribution. Although some researchers also state that the test results are robust to the nonnormality of data [62], we conducted an experiment to test whether nonnormality significantly affects the test results. We applied the Box-Cox method to transform the precipitation index into a normal distribution and calculated the BMI of the transformed index, and then compared the BMI of the transformed index with the original index. All precipitation indices were examined with the method, and the results for ATP are presented in Figures 20 and 21.

Limitations
The significance test of BMI is based on the normal distribution assumption. If the data clearly deviates from the normal distribution, it may affect the results of the significance test. Given the uneven distribution of precipitation across the Chinese mainland, the precipitation index distribution can be impacted by outliers, making it difficult to follow a normal distribution. Although some researchers also state that the test results are robust to the nonnormality of data [62], we conducted an experiment to test whether non-normality significantly affects the test results. We applied the Box-Cox method to transform the precipitation index into a normal distribution and calculated the BMI of the transformed index, and then compared the BMI of the transformed index with the original index. All precipitation indices were examined with the method, and the results for ATP are presented in Figures 20 and 21.
Compared with Figure 2 to 20, we observed that there were only slight differences in the global BMI between the Box-cox transferred precipitation indices and the original precipitation indices, which will not influence the comparative results. Compared with Figure 4 to 21, the LISA cluster map also remained unaltered. Therefore, the non-normality of the data did not affect the conclusion of our research. However, in future applications of BMI in SPPs evaluation, it will be important to examine data normality, and the influence of data non-normality should be addressed before conducting significant tests.   It should also be noted that the LISA cluster map is highly sensitive to spatial scale. As shown in Figures 8 and 19, spatial correlation clusters in SC change significantly between the two figures due to differences in spatial scale, which could lead to confusion among the public and the decision makers. Additionally, the distance threshold is a critical consideration in Bivariate Moran's I. It represents the maximum distance at which spatial relationships are expected to occur and is necessary in some circumstances to exclude nonspatially correlated factors from analysis. However, the selection of the distance threshold can significantly affect the analysis results, making it important to carefully consider an appropriate threshold. Bivariate Moran's I is also sensitive to outliers, which may lead to misjudging spatial autocorrelation when extreme values are present in the data.

Conclusions
This research aimed to evaluate the performance of four SPPs, including GSMaP_Gauge, GSMaP_NRT, IMERG V06, and MSWEP V2, over mainland China by comparing their results with daily observations from 2344 gauges. A novel evaluation method, BMI, was employed to assess the spatial correlation between precipitation indices achieved by SPPs and those obtained from gauge observations, and the results were further analyzed using LISA cluster maps. The main findings are as follows: (1) Conventional index evaluations showed that MSWEP performed the best among the four products, with the highest correlation coefficient (0.78) and the lowest absolute deviation (1.6), relative bias (−5%), and root mean square error (5). IMERG was ranked second, while GSMaP_NRT performed the worst. In terms of different sub-regions, the performance of MSWEP and IMERG also performed better, especially in the TP and NWC. Notably, IMERG showed positive deviations in all four regions, while MSWEP showed negative deviations in NC, SC, and NWC, and a positive deviation in the TP. (2) The spatial correlation of the four SPP products with gauge observations was evaluated using BMI for total, persistent, extreme, and frequency indices. MSWEP showed the best spatial correlation relationship with the gauge observations in terms of total and persistent indices, with BMI values of 0.95, 0.89, 0.78, and 0.78, respectively. IMERG and MSWEP also showed the best spatial correlation among the extreme indices, with R95 and Rmax having BMI values of 0.84 and 0.91 for IMERG, and 0.87 and 0.88 for MSWEP, respectively. IMERG show the best performance in frequency indices, with BMI values of 0.96 and 0.92. Conversely, GSMaP_NRT had the worst spatial correlation in extreme and frequency indices. (3) The BMI between the four SPP products and gauge observations in different regions was also calculated. The spatial correlation characteristics of SPP products differed in different regions. Generally, MSWEP showed the highest spatial correlation with gauge observations in terms of total and persistent indices in the four regions, while IMERG had the highest BMI for extreme and frequency indices. Among the four regions, the four SPPs performed high spatial correlation in NC and SC and low in TP and NWC.
In conclusion, BMI was found to be an effective tool for evaluating SPPs as it can quantitatively describe their spatial correlation with gauge observations and provide insight into the spatial trend characteristics of precipitation indices values. The LISA cluster maps were particularly useful in identifying significant overestimation or underestimation areas. These findings have the potential to advance the application of SPP products. However, the limitations of BMI should also be mentioned, such as the distribution assumption, scale effects, etc.