Projection of Future Extreme Precipitation in China Based on the CMIP6 from a Machine Learning Perspective

Abstract: In recent years, China has suffered from frequent extreme precipitation events, and predicting their future trends has become an essential part of current research on this issue. Because of the inevitable uncertainties associated with individual models for climate prediction, this study uses a machine [...] In general, all the indices showed an overall increasing trend in the future period, with the PRCPTOT, Rx5day, and SDII95 showing the most significant overall increasing trends.


Introduction
In recent years, the impacts of extreme weather events such as continuous heat waves, extreme precipitation events and droughts [1,2] have become more prominent and more [...] greater advantages in solving nonlinear and high-dimensional problems, it allows for the extraction of dynamics and physical processes present in the climate, among other reliable information [28,29].
Based on this, we used ML to integrate the simulation results of multiple models from the Coupled Model Intercomparison Project Phase 6 (CMIP6), establishing the nonlinear relationship between them with real observational data as the target. To better characterize the effect of this integration, we compared ML with the multi-model ensemble median to explore whether ML can effectively improve the simulation of observed data, and then built future climate projections on that basis. The remainder of this study is organized as follows: Section 2 introduces the research area, research data, evaluation indices, and research methods. Section 3 presents the main findings of the assessments of current and future climate prediction methodology. Finally, Section 4 summarizes and discusses the spatial distribution and temporal variation trend of extreme precipitation in China in the future.

Study Area
China is located in eastern Asia; precipitation differs markedly between winter and summer, with most of it falling in summer [30,31]. China's geography is varied, with strong regional differences in topography. Under the influence of the East Asian monsoon, eastern China receives abundant summer rainfall, whereas northwest China has a dry climate dominated by the westerlies. Because of China's vast size, complex topography, strong monsoon characteristics [32], and many different climate types, precipitation varies among regions, decreasing from southeast to northwest [33].
In order to further study the future trends of precipitation and to quantify the regional differences, in this study, we divided the Chinese regions into eight climatic zones [34], specifically, the western arid

Data
The daily precipitation observations used in the study, CN05.1, are from the gridded dataset provided by the China Meteorological Administration. This dataset was constructed by Wu et al. [35] from 2416 national meteorological stations of the National Meteorological Information Center for 1961 to 2014; daily precipitation at a spatial resolution of 0.25° × 0.25° was obtained by interpolating with thin-plate smoothing splines (ANUSPLIN) and the angular distance weighting method (ADW), respectively, and superimposing the results. These data have been used extensively in climate change studies in China [36,37]. In this study, the dataset served as the "true value" for training and for verifying and evaluating the established model.
We used daily precipitation simulations from 27 CMIP6 global climate models. The numerical experiments available in CMIP6 consist of two main parts: (1) historical climate simulation experiments (Historical); and (2) 23 Model Intercomparison Projects (MIPs) for future projection. Both provide a variety of meteorological variables, including precipitation. The historical experiments are driven by external forcing derived from a large number of ground-based and remote sensing observations, and they serve as a reference benchmark, mainly to assess a model's ability to simulate climate change. Among the MIPs, we selected the Scenario Model Intercomparison Project (ScenarioMIP), newly included in CMIP6, whose experimental results were used as the initial data for the future prediction part of this study. The study uses historical simulation data from 1961-2014 and future projection data for 2023-2100 under SSP2-4.5, one of the Shared Socioeconomic Pathway scenarios. SSP2-4.5 updates the RCP4.5 scenario and represents a medium socio-economic development path with medium forcing, simulating medium land use and aerosol pathways [38,39]; it reflects a scenario under normal development conditions. Before the analysis, all the data were unified to a 1° × 1° resolution using a bilinear interpolation scheme [19].
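As an illustration of the regridding step, the following is a minimal sketch of bilinear interpolation from one regular latitude-longitude grid to another. It is not the authors' implementation: the function name `bilinear_regrid` and its arguments are hypothetical, destination points outside the source grid are simply clamped to the edge, and longitude wrap-around is ignored. In practice this step is usually delegated to a dedicated regridding tool.

```python
from bisect import bisect_right


def bilinear_regrid(src_lats, src_lons, field, dst_lats, dst_lons):
    """Bilinearly interpolate field[lat][lon] onto a destination grid.

    src_lats and src_lons must be strictly increasing. Destination points
    outside the source grid are clamped to the nearest edge (an assumption
    made for this sketch).
    """

    def bracket(axis, x):
        # Find i with axis[i] <= x <= axis[i + 1] and the weight of axis[i + 1].
        x = min(max(x, axis[0]), axis[-1])
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        w = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, w

    out = []
    for lat in dst_lats:
        i, wy = bracket(src_lats, lat)
        row = []
        for lon in dst_lons:
            j, wx = bracket(src_lons, lon)
            # Interpolate along longitude on the two bracketing latitude rows,
            # then along latitude between those two results.
            south = (1 - wx) * field[i][j] + wx * field[i][j + 1]
            north = (1 - wx) * field[i + 1][j] + wx * field[i + 1][j + 1]
            row.append((1 - wy) * south + wy * north)
        out.append(row)
    return out
```

Because bilinear interpolation only blends the four surrounding cells, it smooths local extremes; this is one reason the indices in this kind of workflow are sensitive to the order in which regridding and index calculation are applied.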

Climate Indices
The six precipitation indices used in the study were defined by the Expert Team on Climate Change Detection and Indices (ETCCDI, Table 1) [40]. They were used to quantify the characteristics of future precipitation [33,41]. The metrics included the following precipitation indices: total precipitation on rainy days (daily precipitation >1 mm) (PRCPTOT), annual extreme precipitation total above the 95th percentile threshold (R95pTOT), and 5-day maximum precipitation (Rx5day); and the following precipitation intensity indices: precipitation intensity (SDII), extreme precipitation intensity above the 95th percentile threshold (SDII95), and the number of heavy rainfall days with daily precipitation >20 mm (R20mm). These indices are widely used to identify and monitor climate extremes [42]. In this study, we first calculated the indices for all models and observations, then performed the machine learning multi-model integration and, on the same basis, calculated the median of the multi-model ensemble.
The back propagation (BP) neural network is a multilayer feedforward artificial neural network trained by error back propagation. Using gradient descent, training searches for the minimum of the squared error between the actual output and the target value, continually adjusting the thresholds and connection weights of the network so that the adjusted network produces the desired output for each set of inputs [43].
The operation of the BP neural network consists of three main parts: the forward propagation process from the input layer through the hidden layer to the output layer; the backward correction process from the output layer to the input layer, which is driven by the error between the predicted output of the network and the actual observed data; and the training process, in which the forward and backward processes alternate. The BP neural network topology includes an input layer, a hidden layer, and an output layer; x_1, x_2, ..., x_m are the input data; y_1, y_2, ..., y_n are the output data; and w denotes the network weights, as shown in Figure 2.
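The six index definitions given at the start of this section can be made concrete with a short sketch that computes them for one grid point and one year of daily precipitation. The function names are hypothetical. Two assumptions are made here: thresholds follow the ETCCDI convention of "at least 1 mm" and "at least 20 mm" (the text writes them with a strict inequality), and SDII95 is taken as the mean precipitation on days above the 95th percentile, which is our reading of "extreme precipitation intensity". The 95th percentile itself is assumed to be computed beforehand from wet days in a base period.

```python
WET_DAY_MM = 1.0      # wet-day threshold (ETCCDI convention, assumed here)
HEAVY_DAY_MM = 20.0   # heavy-rainfall threshold for R20mm


def percentile(values, q):
    """Percentile with linear interpolation between closest ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def extreme_indices(daily_mm, p95):
    """Compute the six indices for one year of daily precipitation (mm).

    p95 is the 95th percentile of wet-day precipitation in the base period.
    """
    wet = [p for p in daily_mm if p >= WET_DAY_MM]
    very_wet = [p for p in wet if p > p95]
    prcptot = sum(wet)
    r95ptot = sum(very_wet)
    # Largest total over any window of five consecutive days.
    rx5day = max(
        (sum(daily_mm[i:i + 5]) for i in range(len(daily_mm) - 4)),
        default=0.0,
    )
    return {
        "PRCPTOT": prcptot,
        "R95pTOT": r95ptot,
        "Rx5day": rx5day,
        "SDII": prcptot / len(wet) if wet else 0.0,
        "SDII95": r95ptot / len(very_wet) if very_wet else 0.0,
        "R20mm": sum(1 for p in daily_mm if p >= HEAVY_DAY_MM),
    }
```

A typical use would first pool the wet days of the base period at a grid point, call `percentile(wet_days, 95)` once, and then apply `extreme_indices` year by year for observations and for each model.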
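The alternation of forward propagation and backward error correction described above can be sketched with a very small network. This is an illustrative example only, not the configuration used in the study: the single hidden layer of four tanh units, the linear output, the learning rate, and the function name `train_bp` are all assumptions. Each input vector would hold the values simulated by the individual models for one sample, and the target would be the corresponding observed value.

```python
import math
import random


def train_bp(samples, targets, n_hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer network by stochastic gradient descent.

    Returns a function that maps an input vector to the network output.
    """
    rng = random.Random(seed)
    m = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        # Forward pass: input layer -> hidden layer (tanh) -> linear output.
        h = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)
        ]
        y = sum(w * hi for w, hi in zip(w2, h)) + b2
        return h, y

    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for k in range(n_hidden):
                # Error propagated back through the tanh unit k.
                grad_h = err * w2[k] * (1.0 - h[k] ** 2)
                w2[k] -= lr * err * h[k]
                for i in range(m):
                    w1[k][i] -= lr * grad_h * x[i]
                b1[k] -= lr * grad_h
            b2 -= lr * err

    return lambda x: forward(x)[1]
```

Inputs and targets are assumed to be scaled to a comparable range (for example 0 to 1) before training, since unscaled precipitation totals would make plain gradient descent with a fixed learning rate unstable.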

Multi-Model Integration
The overall flow chart of the multi-model integration method is shown in Figure 3. The model data and observed data for the historical period (1961-2014) were divided into two parts: a training set covering 1961-1998 (38 years) and a test set covering 1999-2014 (16 years). The test set was used to assess whether the results after multi-model integration were closer to the observed data. We integrated the six indices separately, with the overall goal of combining the strengths of the individual models' simulation capabilities in different regions as far as possible, so as to obtain the best simulation of the actual observed data over China. The scheme was then applied to future periods to obtain the best possible precipitation projection. Many previous works have used the multi-model ensemble median for future precipitation predictions [26,44]. The CMIP6 multi-model ensemble median has generally improved the simulation of precipitation and of trends in precipitation extremes on a global scale compared to its predecessor (CMIP5) [27], so it was used as a reference for the machine learning fitting results.
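The chronological split and the ensemble-median reference described above can be sketched as follows. The helper names are hypothetical, and the data layout (one row of per-model index values per year) is an assumption made for illustration.

```python
from statistics import median


def split_by_year(years, rows, last_train_year=1998):
    """Split yearly records chronologically into training and test sets."""
    train = [(y, r) for y, r in zip(years, rows) if y <= last_train_year]
    test = [(y, r) for y, r in zip(years, rows) if y > last_train_year]
    return train, test


def ensemble_median(model_values):
    """Reference prediction: the median across models for one sample."""
    return median(model_values)
```

Splitting by year rather than at random keeps the test period (1999-2014) strictly after the training period, so the evaluation mimics the way the fitted model is later applied to future years.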

Evaluation Method
In this study, the Taylor diagram was used to evaluate the simulation results of the 27 CMIP6 climate models, the multi-model ensemble median, and the ML multi-model integration, so that the spatial consistency of each index with the observed data, before and after processing, can be compared more intuitively. Taylor diagrams present multiple evaluation metrics at the same time, indicating how closely a model matches the observed data in terms of standard deviation, central root mean square error (RMSE), and correlation coefficient [45].
The standard deviations of the observed data and the model data are calculated as follows:

$$\sigma_{obs} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_{obs,i}-\bar{X}_{obs}\right)^{2}}, \qquad \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_{i}-\bar{X}\right)^{2}}$$

where $X_{obs,i}$ and $X_{i}$ are the observed and model values at grid point $i$, $N$ is the number of grid points, and $\bar{X}_{obs}$ and $\bar{X}$ are the averages of the observed data and model data, respectively.
The central root mean square error, RMSE, is defined as:

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\left(X_{i}-\bar{X}\right)-\left(X_{obs,i}-\bar{X}_{obs}\right)\right]^{2}}$$

The correlation coefficient r of the observed data and the model data is defined as:

$$r = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(X_{i}-\bar{X}\right)\left(X_{obs,i}-\bar{X}_{obs}\right)}{\sigma\,\sigma_{obs}}$$

The key to the Taylor diagram is that the standard deviation, correlation coefficient, and central root mean square error of the simulated and observed data satisfy the following relationship:

$$RMSE^{2} = \sigma^{2} + \sigma_{obs}^{2} - 2\,\sigma\,\sigma_{obs}\,r$$

Because of the large uncertainty in future precipitation predictions, we also applied the relative root mean square error (RMSE') for a more accurate quantitative assessment of the individual models, the ML-integrated model, and the multi-model ensemble median, re-validating their agreement with the observed data [46,47].
The RMSE' for the assessment of the climate simulation capabilities of the CMIP6 models is defined as follows [48]: first, the RMSE of each model relative to the observed data is calculated as in Equation (6); second, using the calculated RMSE of each model in turn, the RMSE' of each model is calculated as:

$$RMSE' = \frac{RMSE - RMSE_{Median}}{RMSE_{Median}}$$

where $RMSE_{Median}$ is the median of the RMSE over all the models. A negative (positive) value of RMSE' indicates that the model performs better (worse) than half (50%) of all the models.
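The statistics named in this subsection can be computed directly from paired model and observed values. The sketch below is a plain-Python illustration under the assumption that all statistics use the 1/N (population) normalization, which is what makes the Taylor relationship hold exactly; the function names are hypothetical.

```python
import math
from statistics import median


def taylor_stats(model, obs):
    """Return (std_model, std_obs, correlation, central RMSE)."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    dm = [x - mean_m for x in model]   # model anomalies
    do = [x - mean_o for x in obs]     # observed anomalies
    std_m = math.sqrt(sum(d * d for d in dm) / n)
    std_o = math.sqrt(sum(d * d for d in do) / n)
    r = sum(a * b for a, b in zip(dm, do)) / (n * std_m * std_o)
    crmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, do)) / n)
    return std_m, std_o, r, crmse


def rmse(model, obs):
    """Root mean square error of the model relative to the observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def relative_rmse(rmse_by_model):
    """Relative RMSE of each model with respect to the ensemble median RMSE."""
    med = median(rmse_by_model.values())
    return {name: (v - med) / med for name, v in rmse_by_model.items()}
```

For example, three models with RMSE values of 1.0, 2.0, and 4.0 have a median of 2.0, so their relative values are -0.5, 0.0, and 1.0: the first is better than the typical model and the last is worse.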

Evaluations
We obtained the PRCPTOT, R95pTOT, Rx5day, SDII, SDII95, and R20mm for each grid point from the actual observations, the 27 model datasets, the multi-model ensemble median, and the ML output for the validation period (1999-2014), respectively. On this basis, the central root mean square error (RMSE), the spatial correlation coefficient (R), and the ratio of the standard deviation (STD) of each dataset to that of the observations were calculated and summarized in Taylor diagrams. The results are shown in Figure 4. By construction of these three statistics, the higher the spatial correlation coefficient, the closer the standard deviation ratio is to 1, and the closer the central root mean square error is to 0, the better the simulation (that is, the smaller the distance between the simulated point and the observation point in the Taylor diagram). The Taylor diagram of each index shows large differences among the 27 models. For the three precipitation indices, PRCPTOT, R95pTOT, and Rx5day, most of the models had correlation coefficients between 0.6 and 0.85, while the RMSE values were mostly between 0.5 and 1.0. The correlation coefficient of the precipitation frequency index R20mm was between 0.3 and 0.85, while the RMSE values were between 0.75 and 1.5. The two intensity indices, SDII and SDII95, performed better, with correlation coefficients between 0.7 and 0.9 and RMSE values between 0.5 and 1.0. The multi-model ensemble median performed slightly better than the 27 individual models on all the indices. The ML performed much better than both the 27 models and the multi-model ensemble median, which reflects the ability of machine learning methods to better integrate the advantages of different models.
A further analysis showed that the correlation coefficients of the PRCPTOT, R20mm, and SDII95 all reached about 0.95, while those of the R95pTOT, Rx5day, and SDII were slightly lower, at 0.86, 0.85, and 0.92, respectively. The central root mean square errors were all around 0.5, and the standard deviation ratio was between 0.75 and 1.0 for all the evaluations. The best performance was obtained for the PRCPTOT, R20mm, and SDII95. For every index, ML had the smallest distance to the observed values, indicating that its simulation is closest to the observations.
Figure 4. Taylor diagrams of the six precipitation indices (panels a-f, ending with R20mm (f)). In the figure, the angular coordinate corresponds to the spatial correlation coefficient; the black arc represents the standard deviation ratio; the green arc represents the central root mean square error; and the purple dot represents the observed data.
Figure 5 visualizes the distribution of the relative RMSE (RMSE') of each model in simulating the extreme precipitation indices against the observations. A larger value indicates worse simulation performance, and a lower value indicates better performance. The results show that the ability of the models to simulate each precipitation index varies greatly. EC-Earth3, EC-Earth3-Veg-LR, and EC-Earth3-Veg performed better: their RMSE' values for the different indices were mainly negative, indicating that these three models simulate the study area well. CAS-FGOALS-g3, CMCC-ESM2, and CMCC-CM2-SR5 performed poorly. EC-Earth3, EC-Earth3-Veg-LR, and EC-Earth3-Veg have also been found to better simulate the interannual variability of extreme precipitation events in the Asian region [49]. Since the models exhibited large uncertainties across the different metrics, this reaffirms the importance of model integration through machine learning when studying climate simulations and predictions.
The last two columns of Figure 5 show the ensemble median approach and the machine learning model fit, respectively. For each precipitation index, the multi-model ensemble median performs better overall than the 27 models simulated individually, but the ML in the last column performs better still, far outperforming all the models, including the multi-model ensemble median, on the six extreme precipitation metrics. For all indices, the RMSE of ML was the minimum, and the results were better than those of any individual model. This largely reduces the structural uncertainty of the individual models, and thus the ML model can be considered reasonably well established for predictive simulations of future climate change.
Based on the above evaluation results, namely that the ML fit performs significantly better than any single model, we went on to evaluate the spatial simulation capability of ML for each of the six precipitation indices in order to better understand the regional climate bias of the model. Figure 6 shows the distribution of the deviation and relative deviation of the ML fitted values from the observed precipitation indices, where a relative deviation greater than 100% indicates that the ML model overestimates the index by more than the observed value itself. The figure shows that the six precipitation indices were underestimated in most areas of central and southern China and the Qinghai-Tibet Plateau, and overestimated in some areas of the western and eastern arid regions, which may be due to the limitations of the climate models themselves and of their simulation of complex terrain [50]. Some scholars have previously found that the CMIP6 model output for western China agreed poorly with observed precipitation [51], so the use of ML is effective in reducing the error in this region. Spatially, ML overestimated the three extreme precipitation indices by over 50% in southern Xinjiang and parts of Inner Mongolia; underestimated them by 10%-40% in parts of Tibet and of southwest, central, and southern China; and underestimated them by over 40% in parts of Qinghai and Xinjiang. ML simulates the two precipitation intensity indices relatively well: for SDII, the relative deviation did not exceed 40%, and for SDII95, deviations of more than 40% were mainly distributed in parts of Xinjiang. ML overestimated the frequency index R20mm by more than 40% in Xinjiang, Tibet, Qinghai, and parts of Inner Mongolia, and underestimated it by more than 40% in parts of Xinjiang and Tibet.
In general, the ML simulation is relatively accurate for most parts of China. The largest deviations were mainly around Xinjiang and the Qinghai-Tibet Plateau, where the resolution of the climate models is relatively low and the local circulation over complex terrain is difficult to reproduce fully. In addition, these regions have few climate stations, which also limits the gridded precipitation data derived from station records, making it hard to accurately simulate precipitation features in complex terrain [52]. The simulated precipitation was only slightly underestimated (<10%) in the vast majority of South China compared to the observed data; comparable results were obtained by Sun et al. [53], and this is also consistent with the reported underestimation of extreme precipitation in southern China [54]. The overall distribution of the deviations was approximately the same as that previously reported for the multi-model ensemble median, but the simulation was greatly improved [55]. The results of this study also differ from some previous studies, possibly in part because this study compares the ML-integrated climate model output with gridded precipitation data rather than with the station data typically used in the past. Most regions in China were found to have better performance with the ML model. The uncertainty was smallest in northwest China, likely because it is a particularly arid region; it was greater in northeast China than in northwest China, and small in south China.

Future Changes in Spatial Distribution and Boxplot
Because the projection period is long, and to facilitate analysis, we divided it into three periods starting from 2023: 2023-2050 (early 21st century), 2051-2075 (mid-21st century), and 2076-2100 (late 21st century). The precipitation observations from 1990 to 2014 were used as the reference (base) period, against which we analyzed the distribution and change trends of the six precipitation indices in the eight climate zones of China in the different future periods.
Figures 7 and 8 show the spatial distribution of the relative changes of the three precipitation indices (PRCPTOT, R95pTOT, and Rx5day) and the box plots of their absolute changes under the SSP2-4.5 scenario. Overall, precipitation showed an increasing trend over this century, and each precipitation index increased with time. Relative to the base period, the changes in the three indices are most pronounced at the end of the 21st century, with change rates of 5.69%, 7.47%, and 7.89%, respectively. Therefore, in the context of future climate change, precipitation in China will not increase significantly. Within this overall picture, 30.85% to 37.68% of the study area is expected to experience a significant increase (>15%) in R95pTOT by the end of the century, mainly in northern China and the southern part of the western arid zone, and the maximum rate of change will reach 40% in some parts of northern China. The largest absolute changes also occur in north China, with weaker, secondary increases in northeast China and southwest China. For example, the absolute change in R95pTOT is expected to average about 100 mm in north China by the end of the century, while the increase in the southern part of the western arid zone is expected to be about 1 mm. The relative rates of change, however, are distributed inversely, with larger percentage changes in the southern part of the western arid zone and in the eastern arid zone. This is because total precipitation in arid and semi-arid regions is quite low, so small changes in precipitation produce large relative differences there [56]. In contrast, in central China and southern China, rainfall is abundant and the absolute changes are large in some areas, but the relative changes are not significant.
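The contrast between absolute and relative change can be illustrated with a short calculation. The function name and the baseline values used in the usage note are hypothetical and chosen only to show the arithmetic; they are not values from this study.

```python
def changes(baseline_mean, future_mean):
    """Return (absolute change, relative change in percent) against a baseline."""
    absolute = future_mean - baseline_mean
    relative_pct = 100.0 * absolute / baseline_mean
    return absolute, relative_pct
```

With a hypothetical humid-region baseline of 500 mm rising to 600 mm, `changes(500.0, 600.0)` gives an absolute change of 100 mm and a relative change of 20%. With a hypothetical arid-region baseline of 4 mm rising to 5 mm, `changes(4.0, 5.0)` gives only 1 mm but 25%, which is why arid zones can dominate maps of relative change while contributing little to absolute change.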
In general, the box plots of the absolute changes and the distributions of the relative changes of PRCPTOT, R95pTOT, and Rx5day are similar, while the relative change rates of PRCPTOT and Rx5day in south China are more significant than those of R95pTOT.
Figure 8d,e show the two precipitation intensity indices SDII and SDII95. Their distributions differ somewhat, but the rate of change across the three periods increased gradually with time, as did the amount of precipitation. By the end of the century, SDII and SDII95 are expected to change by about 3.78% and 7.30%, respectively, compared to the base period; the extreme index SDII95 shows the more significant increasing trend. Precipitation intensity is projected to change significantly (>15%) over about 14.9-17.5% of China during the forecast period, while areas with small changes (−5% to 5%) account for 37.06% of the total area. This indicates that the increase in extreme precipitation intensity is concentrated in local areas of eastern China, with a maximum change rate of about 30%, while the remaining areas are not expected to change much. As shown in Figure 8d,e, the regions that contribute most to these values are central China and north China. The largest absolute changes occur in central China, where SDII and SDII95 change by 3.521 mm/day and 13.31 mm/day, respectively, followed by southwest China and south China, while the western arid zone and the Qinghai-Tibet Plateau, which receive little precipitation year-round, show no obvious relative or absolute changes.
Figure 7(f1-f3) and Figure 8f show the distribution and box plots of the annual number of days of intense rainfall (R20mm). Since torrential rain is uncommon in western China and its probability of occurrence is very low, small changes there produce high relative rates of change and would distort the analysis of other regional characteristics. After weighing these considerations, we chose to show the distribution of absolute changes. R20mm mainly showed a continuous increase in north China and central China, and most regions showed an increase compared with the base period. In contrast, parts of south China experienced a significant decrease in the middle of the 21st century, with the number of rainstorm days even falling below that of the base period before increasing again at the end of the century.
In conclusion, all the indices were found to increase with time during the forecast period. By the end of the 21st century, PRCPTOT is expected to increase steadily in most parts of the country, with changes in heavy precipitation concentrated in north China and northeast China. In particular, the three extreme precipitation indices, R95pTOT, Rx5day, and SDII95, all showed more prominent and significant changes. The projected percentage increase in northern China was higher than that in southern China, with the largest increases mainly in north China and northeast China. The increase of PRCPTOT in north China is greater than that in south China, but this did not change the overall pattern, in which precipitation in southern China remains higher than in northern China [57]. Owing to the perennially dry climate of the northern and northwestern regions, the largest relative increase is expected in the western arid zone, a phenomenon consistent with CMIP5 projections [58].
Figure 9 shows the spatial distribution of the annual trends of the six indices in the early, middle, and late future periods. All the indices passed the Mann-Kendall (MK) trend significance test at the 95% confidence level.
Figure 9. Spatial distribution of trend changes in each phase for PRCPTOT (a1-a3), R95pTOT (b1-b3), Rx5day (c1-c3), SDII (d1-d3), SDII95 (e1-e3), and R20mm (f1-f3). The columns from left to right are the early, middle, and late projection periods, respectively. The gray dots mark the grid points where the trend passed the MK significance test at the 95% confidence level.

Future Trend Distribution
Figure 9(a1-c3) shows the trend distribution of the three precipitation indices PRCPTOT, R95pTOT, and Rx5day. Most grid points nationwide were significant, except in south China and some parts of central China. For PRCPTOT, the largest trend in most regions occurred in the early 21st century, except for a few western regions and south China. Southwest China and north China reached trends of 0.512 mm/year and 0.491 mm/year in the early stage, respectively; the trend slowed in the middle and late stages but remained upward. South China showed a decreasing trend of −0.219 mm/year in the early 21st century and then a significant increasing trend in some areas in the middle and late stages, but its overall trend was still a slight decline of 0.056 mm/year. Most of the western regions of China showed an upward trend at all stages, notably the Qinghai-Tibet Plateau, where the overall trend reached 0.354 mm/year. The two extreme precipitation indices R95pTOT and Rx5day differed from PRCPTOT in that the upward trend was most significant in the middle or late stages across China. In the mid-century, R95pTOT in central China reached the nationwide maximum trend of 0.803 mm/year, while Rx5day in north China reached its maximum trend of 0.116 mm/year; northeast China and southwest China followed, with trends less pronounced than in the abovementioned regions but still strongly upward. R95pTOT in south China had a brief upward trend in the middle period, but its overall trend did not fluctuate much.
The precipitation intensity indices SDII and SDII95 (Figure 9(d1-e3)) have different spatial trend distributions. The trend of SDII was more drastic at all stages, but the number of grid points passing the 95% confidence test was much larger for SDII95 than for SDII. Except for south China, most areas are expected to show an upward trend in general, especially north China, where the trend is relatively pronounced and the two indices peaked at 0.018 mm/day/year and 0.014 mm/day/year, respectively. In contrast, SDII in north China, the eastern arid zone, and parts of western China showed almost no significant fluctuations. SDII in southwest China showed a slight downward trend in the middle and late 21st century, whereas SDII95 showed a continuous upward trend. The trend in the frequency of R20mm is shown in Figure 9(f1-f3). Considering the grid points whose trends are significant at the 95% MK confidence level, the maximum increasing trend occurs in southwest China in all three periods, while the maximum decreasing trend is concentrated in parts of central China and south China. The area with an increasing trend in the southwest region expanded in the middle and late 21st century, and only a few areas in the region showed a decreasing trend in the later period. The same phenomenon was observed in north China, where most areas were unchanged or decreasing at the beginning and most areas with increasing trends appeared at the end. This indicates that R20mm in southwest China and north China will increase significantly with time, and that precipitation in these regions will also increase.
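The MK significance screening applied to each grid point can be sketched as follows. This is a simplified illustration: it omits the correction for tied values, uses the two-sided normal threshold of 1.96 for the 95% level, and pairs the test with Sen's slope as the trend magnitude, which is a common companion estimator but an assumption here, since the text does not state how the slopes in mm/year were computed. The function names are hypothetical.

```python
import math
from statistics import median


def mann_kendall(series):
    """Return (S, Z, significant_at_95) for a time series, without tie correction."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)   # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z, abs(z) > 1.96


def sen_slope(series):
    """Median of all pairwise slopes, in units of the series per time step."""
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(len(series) - 1)
        for j in range(i + 1, len(series))
    ]
    return median(slopes)
```

Applied to an annual index series at one grid point, `mann_kendall` decides whether the point would be marked as significant, and `sen_slope` gives a trend per year that is robust to individual outlier years.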
In general, the trend distribution of these indices is not consistent: PRCPTOT is expected to continue to rise in most regions of China in the future, but a decreasing trend will be observed in some parts of south China.
Figure 10 shows the evolution over time of the six precipitation indices in the forecast period of 2023-2100 relative to the reference period of 1990-2014. Table 2 gives the statistics on the temporal trends of the indices in Figure 10. In the grid point statistics, we kept only the grid points whose trends passed the MK significance test at the 95% confidence level.
Compared with the baseline period, PRCPTOT, R95pTOT, Rx5day, SDII, SDII95, and R20mm all show an overall increasing trend in 2023-2100, and the rate of change of most indices increases with time in the middle and late century. Figure 10a-c show the temporal evolution of the three precipitation indices. PRCPTOT (Figure 10a) showed the flattest trend of the three, reaching a maximum trend of 0.17%/year in the early period and then maintaining a relatively stable trend of around 0.06%/year. Over the whole study period, the average trend of PRCPTOT relative to the base period was 0.11%/year. As shown in Figure 10b, the rate of change of R95pTOT in 2023-2100 showed a clear upward trend that increased gradually over the period, reaching an average trend of 0.39%/year at the end of this century. In Figure 10c, the curve of Rx5day increases steadily, and its rate of change is relatively stable, with an overall average trend of 0.182%/year. Figure 10d,e show the temporal changes in the rate of change of the two intensity indices. SDII and SDII95 are both expected to rise in this century, with SDII showing the more pronounced trend.
As shown in Figure 10d, the average trend of SDII reached a maximum of 0.20%/year in the middle of the century and decreased towards the end of the century while remaining positive, with an average trend of 0.11%/year throughout the study period. The change of SDII95 was the most gentle, showing almost no obvious trend, with an average of 0.09%/year; this indicates that China's future extreme precipitation intensity will not change much as a whole and, as seen above, that SDII changes are mostly confined to local areas. Figure 10f shows the evolution of the annual change rate of heavy rainfall frequency (R20mm). R20mm continued to increase through the three periods of the 21st century: the frequency of extreme precipitation maintained an increasing trend that gradually strengthened, reaching 0.177%/year in the late 21st century.

Discussion
The total amount and intensity of heavy rainfall are expected to increase across the country in the future, which implies that persistent flooding will be a major problem for China's development. Besides the general humidification of the atmosphere, future changes in precipitation in China are mainly influenced by temperature. Many studies have shown that precipitation in China is highly sensitive to climate warming and that rising temperatures lead to changes in precipitation levels in China [59]. On the other hand, potential changes in future large-scale monsoon circulation systems also play a key role in the frequency and intensity of precipitation in China, with some studies suggesting a strong influence of the East Asian summer monsoon in eastern China and of the westerly circulation in northwestern China and the northern Qinghai-Tibet Plateau. Precipitation in eastern China is more frequent in summer and is easily influenced by the East Asian summer monsoon, which leads to changes in the amount and spatial distribution of future precipitation [60,61]. Studies have also shown that the expected future intensification of the East Asian summer monsoon [29,62] may lead to larger increases in precipitation in north China and smaller increases (or even decreases) in southeastern China, especially in the middle and lower reaches of the Yangtze River Basin, and may also affect the timing of precipitation [63]. In addition, the winter northwesterly winds are also expected to strengthen in the future [64,65], which contributes to the projected increase in precipitation in northwest China. The discussion above has focused on two kinds of factors, thermal and dynamic, to explore possible causes of future increases in precipitation intensity.
We used ML to integrate the CMIP6 multi-model simulations, which effectively reduced their uncertainty, improved their predictive ability, and revealed the future temporal and spatial changes of precipitation in China. This improves our knowledge of the future evolution of precipitation in China and, more broadly, the world. In future work, we will explore more advanced machine learning techniques to retrieve more information from multiple perspectives and from large amounts of data, thereby improving the reliability of future predictions. The significant increases in severe extreme precipitation events may also have a strong impact on human society and ecosystems, so we will further explore the contribution of extreme precipitation events to future precipitation and their impact on it. The results of this study provide guidance for the development and design of future remote sensing equipment, as well as for the planning and layout of future multi-source observation networks.

Conclusions
In this study, we used a machine learning approach to integrate the CMIP6 multi-model simulations. Compared with the traditional ensemble median approach, this approach better captured the nonlinear and complex relationships among climate models and achieved a more accurate model-based prediction of future precipitation. On this basis, the temporal changes and spatial distribution of the indices in eight climate zones in China during three periods of the 21st century, from 2023 to 2100, were analyzed. The study findings are as follows:
(a) In the validation assessment period (1999-2014), the ML integration performed well, and the correlation coefficients of all the indices improved relative to the multi-model ensemble median, generally to above 0.9, with some reaching 0.95. In general, ML improved the accuracy of the spatial pattern of precipitation for all the indices in most regions of China. The improvement was more significant in areas with complex topography, such as around the Qinghai-Tibet Plateau. Due to the uncertainty of the climate models, some errors remained in some areas: precipitation was overestimated in the western arid zone and the eastern arid zone, while the negative deviations were mainly concentrated in some regions of southern China. The prediction performance of ML for the precipitation intensity indices was better than that for the precipitation indices, especially for SDII95.
(b) In the SSP2-4.5 scenario, the PRCPTOT, R95pTOT, Rx5day, SDII, SDII95, and R20mm precipitation indices continue to increase in mainland China and, by the end of the 21st century, are projected to increase by 8.77%, 13.73%, 9.43%, 6.84%, 9.34%, and 4.02%, respectively. Changes in the extreme precipitation indices were the most prominent, with extreme precipitation accounting for more than half of the total precipitation at the end of the period. China may experience more frequent and intense extreme precipitation events in the future, and the risk of flooding throughout China will be greater.
(c) The increase in precipitation in southern China is not apparent; in the middle of the 21st century, precipitation in central and south China is even projected to decrease. In contrast, precipitation is likely to increase more in northern China, with the most significant change occurring at the end of the 21st century, when PRCPTOT, R95pTOT, and Rx5day increase significantly in northern China. Additionally, the frequency and intensity of precipitation are expected to increase more sharply in north China and central China, and only slightly in western and southern China. The increase in precipitation intensity is expected to be greater in north China than in south China. This change may alleviate droughts and water shortages in some parts of northern China to some extent, but the overall pattern of more rainfall in the south and less rainfall in the north will not change [19].
(d) The temporal evolution of all the indices showed an increasing trend, with precipitation increasing more rapidly at the end of this century and SDII having a greater rate of increase in the early and middle parts of the century. The contribution of R95pTOT to PRCPTOT also increased with time, reaching 0.387%/year at the end of the century, while Rx5day increased at a rate of 0.182%/year.