Uncertainty Quantiﬁcation of WRF Model for Rainfall Prediction over the Sichuan Basin, China

: The mesoscale Weather Research and Forecasting (WRF) model has been widely employed to forecast day-ahead rainfalls. However, the deterministic predictions from the WRF model in-corporate relatively large errors due to numerical discretization, inaccuracies in initial/boundary conditions and parameterizations, etc. Among them, the uncertainties in parameterization schemes have a huge impact on the forecasting skill of rainfalls, especially over the Sichuan Basin which is located east of the Tibetan Plateau in southwestern China. To ﬁgure out the impact of various parameterization schemes and their interactions on rainfall predictions over the Sichuan Basin, the Global Forecast System data are chosen as the initial/boundary conditions for the WRF model and 48 ensemble tests have been conducted based on different combinations of four microphysical (MP) schemes, four planetary boundary layer (PBL) schemes, and three cumulus (CU) schemes, for four rainfall cases in summer. Compared to the observations obtained from the Chinese ground-based and encrypted stations, it is found that the Goddard MP scheme together with the asymmetric convective model version 2 PBL scheme outperforms other combinations. Next, as the ﬁrst step to explore further improvement of the WRF physical schemes, the polynomial chaos expansion (PCE) approach is then adopted to quantify the impacts of several empirical parameters with uncertainties in the WRF Single Moment 6-class (WSM6) MP scheme, the Yonsei University (YSU) PBL scheme and the Kain-Fritsch CU scheme on WRF rainfall predictions. The PCE statistics show that the uncertainty of the scaling factor applied to ice fall velocity in the WSM6 scheme and the proﬁle shape exponent in the YSU scheme affects more dominantly the rainfall predictions in comparison with other parameters, which sheds a light on the importance of these schemes for the rainfall predictions over the Sichuan Basin and suggests the next step to further improve the physical schemes.


Introduction
The frequency of natural disasters that occur due to extreme rainfalls is increasing these days [1][2][3]. Extreme rainfalls can cause widespread devastation, such as floods and landslides, resulting in loss of life and personal property damage. Therefore, accurate rainfall prediction is of great importance to take preventive measures.
Short-term rainfall prediction relies on mesoscale numerical weather prediction (NWP) models. Although a number of works have been done to develop NWP models, there exists a significant amount of uncertainty associated with initial/boundary conditions, discretizations and physical parameterizations, etc., which still limits the accuracy of NWP models. Through assimilating quality-controlled observations [4,5] and adopting advanced numerical discretization methods [6], the prediction of the NWP model can be highly improved. In addition, the study of physical parameterization also has significant impact on the NWP model's forecasts. For example, a multi-physics ensemble method using different parameterization schemes was employed to study a Mediterranean cyclone [7]. Evans et al. [8] simulated four East Coast Lows events using the weather research and forecasting (WRF) model with 36 model configurations and found that the combination of the Mellor-Yamada-Janjic (MYJ) planetary boundary layer (PBL) and the Betts-Miller-Janjic (BMJ) cumulus (CU) scheme can substantially improve the simulation of heavy rainfall. Another study [9] showed that the scheme combination of the WRF Single Moment 3-class (WSM3) and the Grell-Devenyi (GD) ensemble had a robust performance in leveraging the prediction ability and computational efficiency of the prediction of heavy precipitation. Liu et al. [10] found that the best model configuration for extreme rainfall events over Egypt consisted of the WRF Single Moment 6-class (WSM6) microphysical (MP), the MYJ PBL and the Grell-Freitas CU schemes. Whereas, the combination of WSM6 and Kain-Fritsch (KF) schemes performed optimally in the rainfall forecasts over the Hanjiang River Basin in China [11]. These studies proved that physical parameterization schemes have different performances in different study regions and no identical model configuration for all areas. The multi-physics ensemble can be a solution for determining optimal physics configuration to some extent in a study area of interest.
The focus of this paper is on the Sichuan Basin (shown in Figure 1), which is an expansive 229,500 km 2 low-land region surrounded by the highlands of the Tibetan Plateau on the west, the Daba Mountains on the north, the Wu Mountains on the east, and the Yunnan-Guizhou Plateau on the south. Its unique characteristics of geographical location and complex terrain, especially on the western side of the basin, lead to more significant difficulties in simulating rainfall events than other areas with simple topography. Several prior studies [12,13] have focused on revealing the impact of basin terrains on PBL due to its vital role in the interaction of the earth-atmosphere system. The station observation and reanalysis data have been analyzed to show the atmosphere's PBL processes and vertical structure over basin terrains [12]. Besides, WRF model was employed to examine the performance of different PBL and MP schemes in the rainfall prediction over the Sichuan Basin. The literature also reported that the impact of cumulus schemes on rainfall forecast of WRF model in Sichuan is of great importance [14]. Nonetheless, we consider that the literature regarding uncertainties brought by parameterization schemes to rainfall prediction over the Sichuan Basin is still limited, and it deserves further exploration. Most studies have not focused on the interaction of various physical schemes. Therefore, as this study's first objective, we run 48 ensemble tests to comprehensively show the difference of multiple combinations of physical schemes in the WRF model on summer rainfall over the Sichuan Basin.
The other purpose of this study is to quantify the parametric uncertainty in the physical schemes related to rainfall forecasting. As known, one of the biggest challenges in sub-grid physics modeling is identifying the parameters [15,16]. The traditional approach assigns "most likely" values to those parameters from the corresponding ranges; however, such an approach can not be adequate in any cases as the complex physical processes may be sensitive to some of the parameters. For example, the different values of the critical Richardson number in the asymmetric convective model version 2 (ACM2) PBL scheme [17] and the coefficient for solving the ice crystal fall velocity in the WSM6 MP scheme [18] have large impact on the calculation of the boundary layer height and the accretion amount from ice to water, respectively. Nevertheless, quantifying the uncertainties originating from the empirical parameters in each physical package of the WRF model has not yet been paid much attention.
In the atmosphere field, one of the most common sampling methods used to implement sensitivity analysis is the Monte Carlo (MC) [19] approach or one of its variants [20,21]. Although MC sampling is straightforward to implement because it only requires repetitive executions of deterministic simulations, a very large number of resultant executions are needed to obtain reliable statistics such as mean value and standard deviation (STD) due to the relatively slow rate of convergence. More recently, other sampling methods such as full factorial [22] and Latin hypercube [23,24] have been employed for generating a nearrandom sample of parameter values. The Morris one-at-a-time (MOAT) [25] method was also applied to the NWP field [26] and it proved to be an effective method that can screen out the most sensitive parameter. As mentioned before, many of these approaches need a large number of realizations for accurate results, especially for the systems like an NWP model with large parameter counts, which can incur an excessive computational burden.
Many efforts have been made to reduce the computational burden in the field of uncertainty quantification (UQ) [27,28]. Polynomial approximation plays a vital role in the context of UQ. In recent years, the state-of-the-art polynomial chaos expansion (PCE) method [29] has been widely used to analyze and quantify the uncertainty caused by uncertain parameters in a system of interest [30,31]. It is a non-sampling method which represents the uncertain quantities as an expansion of orthogonal polynomials in terms of random variables. In contrast to most sampling methods (e.g., MC simulations), the computational costs of PCE are often significantly lower [32]. More detailed information of PCE can be found in Section 3.
Given these considerations, we present a framework based on the PCE method and WRF model to quantitatively analyze the impact of uncertain parameters in WRF parameterization schemes on rainfall prediction in the Sichuan Basin, as the second part of this work. Knowledge of the parameters with large uncertainties is of crucial importance to improve the forecasting skill of rainfall over the Sichuan Basin and shows the direction to further improve the physical schemes in the WRF model.
The main contributions of this paper are summarized as follows: • The difference of multiple combinations of physical parameterization schemes of WRF model in summer rainfall prediction over the Sichuan Basin has been comprehensively examined.

•
To the authors' best knowledge, it is the first time that employing the advanced PCE method to study the parametric uncertainty in WRF rainfall prediction. • We show that the PCE-WRF framework is able to identify the parameters with significant uncertainties in specific physical schemes. The results might show the direction to improve these physical schemes further.
The rest of this paper is organized as follows. The datasets, WRF model configuration, and multiple evaluation metrics are described in Section 2. Section 3 presents a brief introduction of PCE approach employed in this study. Section 4 presents and analyzes the main results and Section 5 contains the concluding remarks.

Data
Global-scale atmospheric datasets, such as Global Forecast System (GFS) data and ECMWF Reanalysis v5 (ERA5), are always used to drive the mesoscale models, refs. [33][34][35][36] have evaluated those existing large-scale datasets in terms of wind, rainfall, solar radiance, etc. In this study, we choose the GFS data as the initial/boundary conditions for the regional high-resolution WRF model. The data periods are consistent with the WRF model simulation periods with a 3-h interval, and a horizontal resolution of 0.25 × 0.25 degree for all variables. The topographic data is fetched from the U.S. Geological Survey (USGS) global 30 arc-s elevation (GTOPO30) dataset for nested domain.
Rainfall observations from approximately 9800 Chinese ground-based and encrypted stations over the Sichuan Basin (shown in Figure 2) are interpolated to the WRF model grid points to evaluate the WRF model's output. The time interval is one hour. Besides, ERA5 hourly reanalysis data are adopted to show the weather systems and vapor condition of the rainfall cases selected in this study.

WRF Model Configuration
The high-resolution mesoscale meteorological model chosen in the study is the Weather Research and Forecasting (WRF) model with Advanced Research WRF (ARW) version 3.6.1. The forecasting domain is configured by following the recommendations in [37,38]. As shown in Figure 1, one parent and a nested domain (D02) with a horizontal resolution of 9 km (190 × 166 grid points) and 3 km (291 × 291 grid points), respectively, are chosen. There are 35 vertically stretched eta levels for the two domains, and the time step is 36 s. In this study, we run the WRF model on "cold-start" mode at 06:00 UTC each day, and each run lasts thirty hours. The first six hours are regarded as the spin-up period [39]. Model output has a constant interval of one hour.
The physical parameterization schemes adopted include the Dudhia shortwave radiation scheme [40], the Rapid Radiative Transfer Model long-wave radiation scheme [41], and the Noah land surface model [42]. The configuration of MP, PBL and CU schemes can be found in Section 2.2.2.

Multi-Physics Ensemble
For the sake of quantifying the uncertainty and improving the accuracy of the WRF model on precipitation simulations over the Sichuan Basin, we evaluate the performance of different physical options. The rainfall prediction highly related three parameterization schemes, i.e., MP, PBL, and CU schemes, are employed to generate ensemble members (as summarized in Table 1). The MP scheme mainly involves the processes of explicitly resolved water vapor, cloud, and precipitation. In this paper, we use four different MP schemes including the Lin scheme [43], WSM6 scheme [44], Goddard scheme [45], and Thompson scheme [46]. Despite the differences in ice phase processes, these schemes all include snow, graupel, and ice processes and are appropriate for high-resolution simulations. For the PBL process which is closely related to the spatial distribution of precipitation, the Yonsei University (YSU) PBL scheme [47], the MYJ PBL scheme [48], the Quasi-Normal Scale Elimination (QNSE) PBL scheme [49], and the ACM2 PBL scheme [50] are chosen. We also choose three CU schemes, i.e., the Kain-Fritsch (KF) scheme [51], the BMJ scheme [48] and the GD scheme [52], which can detail the convective component of rainfall and predominate the estimate of precipitation rate. It should be noted that CU parameterization schemes are not used in the nested domain with a resolution of 3 km because CU parameterization is theoretically only valid for a coarser grid size (e.g., greater than 5 km) [53].

Evaluation Metrics
The global metrics used for comparison between ensembles are the root mean square error (RMSE), the mean absolute error (MAE) and the correlation coefficient (CC), defined as: where i is the grid point and N is the total number of verification points. R f and R o denote the rainfall forecasts and the corresponding observations, respectively. Besides, we utilize the structure-amplitude-location (SAL) [54] index to quantitatively and comprehensively evaluate WRF model's rainfall forecasts. The three components of SAL are defined as follows, where is the scaled precipitation volume of object n, and M denotes the total number of precipitation objects in the domain. The domain average of rainfall . In Equation (6), x(R) denotes the center of mass of the precipitation field, r(R) = (∑ M n=1 R n |x − x n |)/ ∑ M n=1 R n , and d is the largest distance between two boundary points of the domain. Therefore, L 1 is the normalized distance between the two centers of mass of the predicted and observed precipitation bodies while L 2 is the average distance of the center of mass between the total precipitation fields and each precipitation individual objects. Perfect rainfall forecasts are characterized by zero values for all three components of SAL.
Finally, the equitable threat score (ETS) being a metric that measures the skill of a precipitation forecast relative to chance is also used in the present study to evaluate the ability of an NWP model for predicting precipitation that falls into different threshold ranges (Table 2) , defined as: where A, B, C, and D stand for hits (events forecast to occur, and did occur), false alarms (events forecast to occur, but did not occur), missing (events forecast not to occur, but did occur), and correct negatives (event forecast not to occur, and did not occur), respectively. Generally, the range of ETS is from − 1 3 to 1, and 1 is the perfect score. Table 2. Grading of precipitation intensity.

Grade Precipitation in 12 h (mm) Precipitation in 24 h (mm)
Light rain

Uncertainty Quantification Method
As mentioned earlier, there are many uncertainties in the WRF models' physical parameterization schemes. Some of the parameters with uncertainties in these schemes might affect the rainfall forecasts significantly. We therefore employ the PCE approach to quantitatively assess those uncertainties. In this section, we give a brief introduction of PCE method used in this study.

PCE Approach
Generally, the PCE approach formulates the relationship between the system output y(ξ) and each random input ξ using a series of orthogonal polynomial basis functions, which can be written as where α i is the PCE coefficient: and {ψ i (ξ)} ∞ i=0 represents the orthogonal polynomial series having the following property where stands for the inner product, i and j are the orders of polynomials, δ ij is a positive constant, Ω is the domain of the basis functions, and p(ξ) is the probability density function (PDF) of the random variable ξ, which determines the type of polynomial sequences (Table 3). We choose the Legendre polynomial basis functions to analyze the effect of uncertainty with a uniform probability distribution in [−1, 1] (i.e., p(ξ) = 1/2) in this study. For practical simulation, the series in Equation (8) needs to be truncated to a finite number of terms, hereafter denoted by M, Using the stochastic collocation method which is a non-intrusive implementation of the PCE, the unknown deterministic coefficients in Equation (11), α i , can be determined. The key of the stochastic collocation method is to find suitable quadrature points. Following [31], the Clenshaw-Curtis quadrature is adopted for computational efficiency in this work. We determine the orthogonal points ξ j and associated weights w j of the uncertain variables in the uncertain space according to their range of values. Then using N quadrature points, we run the WRF model to obtain the forecasts y D ξ (j) for the physical field y(ξ) with uncertainty, and the Equation (9) can be re-written as detailed information on stochastic collocation approach can be found in [31,55].
Once the coefficient is determined, we can calculate the statistical characteristics (e.g., mean and STD) of the uncertainty system output by the truncated PCE model, the larger the values of STD, the more uncertain the parameters.

Experiment Design
The present study focuses on quantifying the uncertainties in KF CU, WSM6 MP, and YSU PBL schemes. As shown in Table 4, ten parameters that are considered to be uncertain are chosen. The default values of these parameters are commonly assigned based on an optimal theory or experience. We set the uncertainty of each parameter as a uniform distribution with bounded values shown in Table 4, which are almost identical to those in Table S1 of Ref. [56] that focuses on the parametric sensitivity of rainfall forecasts in the greater Beijing area. The UQ results over the Sichuan Basin in this work are roughly compared with the conclusion in [56].

Case Selection
Four rainfall cases (shown in Table 5) occurred at the Sichuan Basin in August 2020 and 2021 are chosen, all of which are typically heavy summer rainfalls in the study area. The maximum of 24-h cumulative precipitation for all cases exceeds 100 mm. The rainfall spatial distributions are presented in Figure 3a-d for four cases, respectively. It can be observed that all cases had a northeast-southwest distribution of rainbands, and the heavy precipitation occurred in the northwest of the basin. Generally, a rainfall event is triggered by many factors, including the development of closed isobars, share lines, low-level vortexes, lowlevel jets, topography, etc. [57][58][59].  Although the comprehensive rainfall analysis is not the primary concern of this work, we show a brief introduction of the synoptic patterns for all cases that we selected to study the uncertainty quantification. From Figure 4, we can find that all the rainfall events occurred with the upper-level (500 hPa) troughs and the Western Pacific Subtropical High (WPSH); however, the synoptic patterns over the lower troposphere (850 hPa) of the four cases are of difference. Specifically, for Case1, the trough east of the Tibetan Plateau at 500 hPa level moved eastward, leading the cold flow to the Sichuan Basin; at the same time, a warm and moist southwesterly flow gradually moved northward under the influence of the WPSH and the high-altitude trough in southern Sichuan. The convergence of these two different airflows provided a favourable weather background for this event. From 850 hPa, it can be found that a southwest vortex induced by the high-altitude trough formed a strong vertical upward movement in the western part of the basin, which provided dynamic conditions for the heavy precipitation. Meanwhile, the southeasterly airflows along the west edge of the WPSH transported the warm and moist air parcels from the ocean to the Sichuan Basin, which helped the establishment of the unstable atmospheric stratification and provided the moisture for the rainfall event.
Case2's weather background was basically the same as that of Case1; however, the WPSH was strengthened (500 hPa) and the low-level jet stream in the southwest was also strong. Besides, the southeasterly flow on the right side of the vortex enhanced the heavy precipitation under the topographic uplift in northwestern Sichuan.
Unlike Case1 and Case2, the low-level weather system over the Sichuan Basin for Case3 and Case4 was a shear line instead of a southwest vortex. The convergence along the shear line can intensify the vertical upward motion, which was favorable for the rainfall occurrence. Meanwhile, there was a Tropical Cyclone over the adjacent sea of China during each of the two cases. As a result, the WPSH was weaker with the south edge located at 30 • N for both cases. Such circulation patterns over the lower troposphere helped establish and strengthen the easterly moisture channels for the precipitating processes during these events.

Analysis of WRF Physical Schemes
In this section, we first verify the overall forecasting skills of WRF model for predicting four rainfall events over the Sichuan Basin during summer. Then the differences or uncertainties of employing a different combination of physics schemes are quantitatively studied. Finally, we identify the well-performing physics scheme combinations, which might show the direction of choosing the parameterization schemes for the WRF model properly to forecast rainfall over the Sichuan Basin without wasting many resources.

The Performance of Multi-Physics Ensemble
Prior to analyze the uncertainty in the different combinations of physical schemes, verification of the WRF model's ability of forecasting rainfall over the Sichuan Basin is necessary. To this end, the ensemble mean (48 members) for all cases is calculated, and the results are plotted in Figure 5a-d, respectively. Compared to the corresponding observations (shown in Figure 3a-d), all cases capture the horizontal structure of rainfall well, though the forecasting skill for Case1 is lower than others. However, the center of the predicted heavy precipitation of Case1 is shifting north while a bit south for Case2, Case3, and Case4. We also show the bias of the rainfall forecasts in Figure 5e-h. Note that the significant bias is generally located in the areas of heavy precipitation because of the shift of the center of heavy rainfalls. Except for that, the bias ranges from −15 mm to 15 mm in most areas. These results substantiate that the WRF model ensemble has reasonably good forecasting skills in predicting rainfall events over the Sichuan Basin. However, the performance of individual members still needs to be explored.

Difference between Physical Schemes
In the present study, we mainly evaluate the impact of individual members on the prediction of 24-h cumulative precipitation. Figures A1 and A2  Considering that rainfalls may have a diurnal cycle that is sensitive to different physical schemes, except for evaluating the 24-h accumulated precipitation, we also split the 24 h forecasts into two parts, i.e., 12:00-23:00 UTC (nighttime in local time) and 00:00-12:00 UTC (daytime in local time), to calculate the evaluation metrics, respectively. The RMSE between WRF rainfall predictions of different combinations of physical schemes and observations for all cases in three periods is illustrated in Figure 6. In general, it can be observed that Case1 has the largest RMSE while Case4 has the smallest value during all three periods. More importantly, we can find that the variation of RMSE between multi-physics members is significant for four cases. Specifically, for the Case1, the variation of RMSE between multi-physics members ranges in 19.788-26.592 mm, 13.670-22.456 mm, and 21.619-36.990 mm for the period of 12:00-23:00 UTC, 00:00-11:00 UTC and 12:00-11:00 UTC, respectively. Moreover, the STD of RMSE for Case1 during different periods has been calculated and summarized in the second column of Table 6. Those results indicate that the WRF model indeed has a relatively significant difference in the forecasting ability of rainfall when employing different parameterization schemes in the Sichuan Basin. Similar conclusions can be obtained from other cases. Furthermore, we can find that most members associated with WSM6 and Goddard MP schemes (M13-M36) have smaller RMSEs. Note that the mean of ensemble members performs much better than any individuals for all cases.  Figure 7 illustrates the four-case averaged MAEs in the three periods. The results indicate that most experimental runs capture the rainfall intensity and spatial distribution well. However, the oblivious variation among individual members can be observed. Relatively large MAE values occur in members associated with the Lin MP scheme, which indicates that the Lin MP scheme may be not appropriate for the rainfall prediction over the Sichuan Basin.  We then calculate the four-case averaged ETS (Figure 8) to show the performance of each member in five classes of precipitation (shown in Table 2). Consistent with the RMSE, the ETS of ensemble mean shows better skills than most individuals in 12-h and 24-h cumulative precipitation, especially for the light rain, moderate rain, and storm. It can be found that the variation in ETS values of storm and heavy storm is larger than the variation in other intensities of precipitation during all three periods. Besides, for the storm and heavy storm, better forecasting skills are presented when using the WSM6 MP scheme in combination with the ACM2 PBL scheme (M22-M24), the Goddard MP scheme in combination with the YSU PBL scheme (M25-M27), and the Goddard MP scheme in combination with the ACM2 PBL scheme (M34-M36) during three periods, except for heavy storm during 00:00-11:00 UTC. It should be noted that although the members do not display consistent performance in each magnitude of precipitation, the configuration of parameterization schemes is more sensitive to storm and heavy storm.
The Taylor diagram displayed in Figure 9 also shows the difference in the performance of the various parametrization schemes. It can be observed that the uncertainty in the rainfall predictions employing multi-physics combinations is large, especially during 00:00-11:00 UTC. From Figure 9c, we can find that the Goddard MP scheme in combination with the ACM2 PBL scheme outperforms others, and the sensitivity of the MP scheme is also strong, with a large spread between the statistics of Lin and Goddard schemes.

Identifying the Well-Performing Physics Scheme Combination
As mentioned above, different combinations of physical schemes perform differently, some of which may have much worse forecasting skills compared to others. Ensemble forecasting is a common way to reduce such uncertainties (as displayed in Figure 6); however, the increasing computational burden may make the operational forecasts impossible.
Although the main purpose of this work is not to find the best combinations of multiphysics but to quantify the uncertainty, we also try to identify the most adaptable physics schemes that can be used in the Sichuan Basin, which could be a reference for others' work. Similar to [8], three normalized metrics, i.e., RMSE, MAE, and reciprocal CC are combined to comprehensively evaluate the prediction ability of WRF model with different parameterizations. We first calculate the three metrics of the 48 WRF runs separately for four cases during different periods and then calculate the mean of those three metrics. Lastly, we obtain the four-case averaged 48 values in three periods, which can be used to calculate the interquartile range (IQR).
Next, we employ a step-wise approach based on IQR to conduct the evaluation. That is, if the value of one scheme for a certain physical scheme is relatively small and at the same time there is less overlap with other options whose values are large, this physical option is deemed as a robust choice that should be retained. When the overlap between different options of a physical scheme is relatively large, there is not necessary to make a choice and they all progress to the next step. After keeping some options of one physical scheme for the first time, we continue to calculate the IQR again by the remaining members. Repeat in turn until the best performing member is selected.
The results for three periods are summarized in Figures 10-12 respectively, each row of which represents an evaluation. The size of boxes provides a visual indication of the sensitivity of these schemes. The first row of box plots presents the mean metric for different options of three physical parameterization schemes. From the result of 12-h cumulative precipitation in nighttime (shown in Figure 10), we can find a large overlap of options in MP and CU schemes, which suggests that it is unnecessary to choose them. On the contrary, the YSU and ACM2 PBL schemes have smaller values and less overlap with the other two PBL options; therefore, the members in the second row contain only these two options in the PBL scheme. Although the values of the remaining members vary slightly (Figure 10d), it still can be found that M35 outperforms others. The scenario of 12-h cumulative precipitation in daytime is shown in Figure 11 and it is consistent with the results of 24-h cumulative precipitation ( Figure 12). It can be found that Lin and Thompson MP schemes have larger values and less overlap with the other two MP options, and thus both of them are eliminated. Besides, all of the members in the second row contain only the WSM6 and Goddard MP schemes and the MYJ and QNSE PBL schemes are not robust choices ( Figures 11d and 12d). Therefore, the third row of Figures 11 and 12 keeps all the remaining well-simulated members due to the lack of sufficient ensemble members for the next exclusion step. The results indicate that M35 performs best, followed by M34, M36, M23, M22, and M26. There is no significant difference between the nighttime and daytime in choosing the optimal multi-physics ensemble member.   Figure 10, but for the period of 00:00 UTC to 11:00 UTC. The second row retains only those members with WSM6 and Goddard MP schemes, and the third row additionally excludes those members containing MYJ and QNSE PBL schemes. Finally, to guarantee the rationality of the conclusions, we further apply a spatial test to evaluate the simulations from multiple perspectives. Figure 13 depicts the results of SAL metric for the 12 members filtered from the box plots. The S values of all these schemes during three periods are negative, mainly because the simulated maximum precipitation amounts are larger than the corresponding observations. The value of S during three periods for M35 and M36 is closer to 0 compared to others. The A value for three periods ranges from 0.057 to 0.157, 0.122 to 0.312, and 0.101 to 0.192, respectively. During the whole rainfall period, M22, M23 M26, and M34-M36 have better scores than others. The overall change in L value is small. It is worthwhile to mention that the well-performing combinations recognized by the step-wise approach also have a very acceptable skill in the evaluation of the rainfall structure, amplitude, and location. Based on the above results, it may be concluded that there is an apparent uncertainty in adopting different configurations of physics schemes to predict rainfall over the Sichuan Basin. It is necessary to quantify the uncertainties in those combinations to obtain the optimal rainfall forecasts in the study area. Most metrics show that M35 has the best performance, followed by M34, M36, M23, M22 and M26, in the summer rainfall prediction over the Sichuan Basin.

Results of Parametric Uncertainty Quantification
Although the uncertainties in the WRF rainfall prediction in terms of the configuration of different physics schemes can be determined to some extent by employing an ensemble run, many uncertainties inside of each scheme are usually neglected, for example, the uncertainty resulting from a lack of knowledge about an empirical parameter whose actual value exhibits no substantial variability. In contrast, the uncertainty in such parameters has a massive impact on the performance and state of an NWP model sometimes. This motivates research into the use of UQ approaches to quantify those uncertainties. The present study employs a non-intrusive UQ approach, i.e., PCE, to propagate the parametric uncertainties to the WRF output of precipitation.
Ten parameters (shown in Table 4), and Case1 and Case2 are chosen to study the parametric uncertainty in the WSM6 MP, YSU PBL, and KF CU schemes which may have a consequential impact on rainfall prediction in the Sichuan Basin.

Impact of the Parametric Uncertainty on 24-h Cumulative Precipitation
The STD of the 24-h cumulative precipitation for the tested parameters over the whole domain are computed by the stochastic collocation method described in Section 3.1, and shown in Figures 14 and 15 for Case1 and Case2, respectively. It is observed that STD has a large variation in the study domain and the prominent STDs distribute mainly in the areas of heavy rain for both cases. The maximum of STD is around 9 mm. Table 7 summarizes the whole domain averaged STDs for ten uncertain parameters (P1-P10) and a relatively significant difference among ten parameters can be found. The largest value of STD is found for P8 (2.015 mm), reflecting that the uncertainty in the profile shape exponent in the YSU PBL scheme has the most significant impact on the WRF rainfall prediction over the Sichuan Basin. In addition, the impacts of P3 and P5 in the WSM6 MP scheme and P9 in the YSU PBL scheme are also very significant. We also consider the scenario only in the heavy rainfall area (103°E-106°E, 29°N-33°N) and the results are summarized in Table 8. The largest STD of Case1 reaches 4.139 mm for P8. Two-case averaged STD also indicates that P8 is the most significant one, followed by P9, P3 and P5. It should be noted that P3, P5, and P8 are also the sensitive parameters concluded by Ref. [56] which used the MOAT method to study the sensitivity of parameters in summer rainfall over the Greater Beijing Area. This might indicate that most of the uncertain parameters in physics schemes may have similar impacts in different areas.

Impact of the Parametric Uncertainty on Hourly Precipitation
To further investigate the uncertain parameters' impact on precipitation forecasts over the Sichuan Basin, we choose the small region (103°E-106°E, 29°-33°N) with relatively large 24-h cumulative precipitation for the following study. Figure 16 shows the impact of uncertain parameters on hourly precipitation in two rainfall cases using the IQR boxes, and also shows mean of the predicted hourly precipitation calculated by PCE approach (blue dot) as well as the corresponding observation (black dot).
The boxes are drawn based on hourly precipitation normalized by their mean value. It can be seen that although the mean values of the predicted hourly precipitation for each parameter vary slightly, at most moments there are large discrepancies between them and observation. Furthermore, the variation of boxes shows that the effect of each parameter on hourly precipitation is significant, especially for P3, P5, P8, and P9. It is worth noting that almost every parameter is less sensitive from 12:00 UTC to 23:00 UTC than 0:00 UTC to 11:00 UTC, and the last four hours have the greatest variability in hourly precipitation, indicating that the impact of uncertainty in parameters becomes significant with forecasting lead time.
Therefore, we split the 24 h into two periods, i.e., 12:00 UTC to 23:00 UTC and 0:00 UTC to 11:00 UTC, to show the impact of uncertain parameters on rainfall prediction at different forecasting lead times. The 12-h mean STD of tested parameters during each period is plotted in Figure 17. Consistent with the results in Figure 16, the STD values for all parameters from 0:00 UTC to 11:00 UTC (solid light gray bar) are much larger than that from 12:00 UTC to 23:00 UTC (solid black bar), especially for parameters P1-P3 and P6-P8. The difference between two periods for P9 is minor.

Impact of the Parametric Uncertainty on Other Variables
In this section, we examine the impact of uncertain parameters on two additional variables, i.e., 2-m temperature and 2-m dew point temperature. Figures A3-A6 show the validation results of the WRF prediction of these two variables with default values for parameters P1-P10. It is found that the predicted 2-m temperature and dew point temperature fit well with the station observation during different periods in general, although there are relatively large biases on the east of the Tibetan Plateau.
The results of spatial distributions of the STD values for 12-h averaged 2-m temperature are depicted in Figures 18 and 19. It can be observed that the STD of 2-m temperature has a relatively large variation in the study domain, indicating the significant impact caused by the uncertain parameters, among which P3, P5 and P8 influence the forecasts more dominantly than others in both periods. Besides, the STD values for each parameter are significantly larger during daytime than nighttime.
Consistent with the results of 2-m temperature, the STD values of 24-h averaged 2-m dew point temperature ( Figure 20) for each uncertain parameter vary obviously in the area of interest. It is also found that the impact on dew point temperature forecasts of the uncertain parameter of P5 and P8 is more significant than other parameters. These results illustrate that the uncertainties in the parameters in the WRF physics schemes not only can affect the forecasting skill of rainfall itself but also have an impact on other rainfall-related variables.

Conclusions
This work focused on the challenges presented when forecasting rainfall in regions of complex geographical features, for example, the Sichuan Basin, where the behavior of uncertainties in the parameterization schemes of the WRF model may be more complex than usual. Intending to quantify those uncertainties to some extent, we have conducted a multi-physics ensemble run and employed the PCE approach to examine the difference among multi-combination of physics schemes and the impact of uncertain parameters inside specific schemes, respectively, based on four heavy rainfall processes happened in the Sichuan Basin in summer. We mainly focused on the three kinds of parameterization schemes, i.e., MP, PBL, CU, and verified the results using a couple of metrics. It was observed that most ensemble members have the ability to reproduce the horizontal structure of four rainfall events, however, a significant difference can be found, for example, the variation of the RMSEs between multi-physics ensemble for Case1 during three periods ranges in 19.788-26.592 mm, 13.670-22.456 mm, and 21.619-33.990 mm, respectively. Combining normalized RMSE, MAE and CC, it was found that M22 (WSM6 + ACM2 + KF), M23 (WSM6 + ACM2 + BMJ), M26 (Goddard + YSU + BMJ), M34 (Goddard + ACM2 + KF), M35 (Goddard + ACM2 + BMJ), and M36 (Goddard + ACM2 + GD) have the optimal forecast performance in 12-h and 24-h cumulative precipitation. This result indicated that the Goddard MP scheme together with the ACM2 PBL scheme outperforms other combinations. The possible reason is that the ACM2 scheme can simulate both transport processes generated by large-scale turbulent vortices and capture turbulent mixing processes at small sub-grid scales under convective conditions. PCE-WRF results indicated that uncertainty in ten empirical parameters has significant impact on the WRF rainfall prediction, especially the scaling factor applied to ice fall velocity (P3), the limited maximum value for the cloud-ice diameter (P5), the profile shape exponent (P8), and the coefficient for Prandtl number (P9). Moreover, it was found that the impact of uncertainty of most empirical parameters becomes more significant with forecasting lead time from the analysis of hourly rainfall prediction. In addition, those parametric uncertainties also influenced the forecasts of 2-m temperature and dew point temperature.
Although we have mainly presented the uncertainty quantification in the region of the Sichuan Basin, we can say without a doubt that this can be applied for other areas of interest. The present study sheds light on the importance of these schemes for the rainfall predictions and other related variables and suggests the next step to improve the physical scheme further. Further work under-way includes repeating this analysis on additional events during different seasons and then developing an operational forecasting system based on the multi-physics ensemble. In addition, developing a framework for improving the uncertain parameters using available observations is also necessary. Figure A2. Same as Figure A1, but for the members from M25 to M48.

Appendix B. Spatial Distribution of Predicted 2-m Temperature and Dew Point Temperature
The comparisons of averaged 2-m temperature during 12:00 UTC-23:00 UTC and 00:00 UTC-11:00 UTC between observation and WRF model's output are displayed in Figures A3 and A4. Besides, the validation result of dew point is summarized in Figure A5. These results indicate that the WRF model can predict the 2-m temperature and dew point with relatively high skill over the Sichuan Basin. Moreover, the time series ( Figure A6) of a small area (103°E-107°E, 29°N-33°N) of those two variables show that WRF model can also reproduce the trend of 2-m temperature and dew point well in the Sichuan Basin.