Future Projections and Uncertainty Assessment of Precipitation Extremes in Iran from the CMIP6 Ensemble

: Scientists who want to know future climate can use multimodel ensemble (MME) methods that combine projections from individual simulation models. To predict the future changes of extreme rainfall in Iran, we examined the observations and 24 models of the Coupled Model InterComparison Project Phase 6 (CMIP6) over the Middle East. We applied generalized extreme value (GEV) distribution to series of annual maximum daily precipitation (AMP1) data obtained from both of models and the observations. We also employed multivariate bias-correction under three shared socioeconomic pathway (SSP) scenarios (namely, SSP2-4.5, SSP3-7.0, and SSP5-8.5). We used a model averaging method that takes both performance and independence of model into account, which is called PI-weighting. Return levels for 20 and 50 years, as well as the return periods of the AMP1 relative to the reference years (1971–2014), were estimated for three future periods. These are period 1 (2021–2050), period 2 (2046–2075), and period 3 (2071–2100). From this study, we predict that over Iran the relative increases of 20-year return level of the AMP1 in the spatial median from the past observations to the year 2100 will be approximately 15.6% in the SSP2-4.5, 23.2% in the SSP3-7.0, and 28.7% in the SSP5-8.5 scenarios, respectively. We also realized that a 1-in-20 year (or 1-in-50 year) AMP1 observed in the reference years in Iran will likely become a 1-in-12 (1-in-26) year, a 1-in-10 (1-in-22) year, and a 1-in-9 (1-in-20) year event by 2100 under the SSP2-4.5, SSP3-7.0, and SSP5-8.5 scenarios, respectively. We project that heavy rainfall will be more prominent in the western and southwestern parts of Iran.


Introduction
Extreme rain events can result in landslides and floods, accompanied with a loss of life and the deterioration of infrastructure. Thus, understanding and projecting heavy rainfall is of significant importance to climate change impact, adaptation, and vulnerability assessments.
Numerous studies have reported that extreme precipitation events have become more frequent during the last century, and are occurring even more often over the 21st century ([1-7], for example). A simplified and major reason for more frequent extreme rainfall is the following: when the temperature increases, the saturation specific humidity of the air is higher and therefore the air can contain a higher amount of water vapor, as dictated by the Clausius-Clapeyron relationship [8]. When rain-triggering conditions are developed, more saturated air leads to heavier rainfall [9,10]. This has been the case across some areas of the world during the last century [11]. Seemingly paradoxically, as written by Mann and Kump [12], "While many regions are likely to become drier, scientists predict that even in those regions individual rainfall or snowfall events will become more intense, although longer dry spells will separate them." When our interest is in predicting extreme climatic events, the generalized extreme value (GEV) distribution is typically used for example [13][14][15][16][17][18][19]. The GEV distribution encompasses three asymptotic extreme value distributions by large sample theory [20]. In this study, we used the GEV distribution to predict extreme quantiles in the future.
Over the past decades, many researches have focused on trends in extreme precipitation across Iran. Modarres and Sarhadi [21] reported that there is a decreasing trend in annual total rainfall at 67% of the stations, while an increasing trend was observed in the 24-hour maximum rainfall at 50% of the stations in Iran. Rahimi and Fatemi [22] studied recent trends in average and extreme rainfall events in Iran during a 58-year period . The results indicated that most areas have gone through a significant increasing trend in extreme precipitation values, frequency, and intensity, especially in the coasts of the Persian Gulf and in the southwestern regions of Iran. Shokouhi et al. [23] evaluated the performance of CMIP5 (Coupled Model Inter-Comparison Project Phase 5) climate models in simulating temperature and precipitation in major rain-fed wheatproduction areas in Iran. The results showed that precipitation will increase in northern areas toward the end of this century and a higher reduction in precipitation is anticipated in the southern areas. Darand [24] reported, by using an ensemble of CMIP5 models, that annual total rainfall and rainy days were predicted to decrease. However, the frequency and intensity of downpour were projected to increase significantly, particularly in northern and southwestern parts of Iran.
See [25][26][27][28], for example, for more studies on the impact of climate change on heavy rainfall and flood frequency in Iran.
A few studies including [29] have analyzed the future projections of extreme rainfall over Iran using an ensemble of CMIP6 models. Zarrin and Dadashi-Roudbari [29] projected the intensity of extreme precipitation based on an ensemble of bias-corrected five CMIP6 models. In this study, we update the previous studies based on 24 CMIP6 models under the three shared socioeconomic pathway (SSP) scenarios: SSP2-4.5, SSP3-7.0, and SSP5-8.5 [30]. We predict the amount of changes in the extreme rainfall using different methodology from the previous works. The weights informing the performance and the dependence of each model are provided in building a multi-model ensemble to predict the future extreme rainfall in Iran.
Studies on the projection of future climate change have used ensembles of multiple climate simulations. Multi-model ensemble (MME) methods of climatic projection have been shown to improve on the systematic bias and to have fewer of the general limitations that are normally associated with individual simulation models. Among the many ensemble techniques, model weighting or averaging is typically employed ( [31][32][33], for example).
One typical unequal weighting scheme involves assigning more weights to those models that are more skilled and realistic for a specific process or application. This performancebased weighting method and its variants, including Bayesian model averaging (BMA), have been employed in many different studies [17,19,33]. It has improved the accuracy of the projections and reduced the prediction uncertainty. However, it has been reported that a few models often exhibit extremely high weights, but most others have very low weights [19,34]. Here, the model performance is the systematic determination of model bias and uncertainty, measured by comparing statistically the similarity of a modeled and observed time series [35].
In addition to the model performance, some researchers have considered other criteria, such as model convergence [36], model independency [37][38][39][40], and a semi-performance measure [41]. Model convergence is a good agreement across models which is measured by the distance of the i-th model from the ensemble average. Model independency is a counter measure of inter-dependency or similarity between models. Model semi-performance is a modified model performance to reduce the impacts of very high performed models to the weights distribution. A weighting scheme that accounts for both the independence and performance simultaneously is called the PI-weighting. In this study, we employ PI-weighting to robustly quantify uncertainty in an MME. In calculating the PI-weights, considering only one or two climate variables over a relatively small area can lead to an overfitting problem [40,42]. To avoid this problem, we thus consider five climate variables as in Table 1 over Middle East, while our focus is the annual maximum daily precipitation (AMP1) over Iran as in [43] did. The AMP1 (or annual Rx1day [44]) is defined as the annual maximum precipitation (in mm) among the daily rainfall amount in the year.

Data and Climatology
Much of Iran is affected by subtropical high pressure (STHP) and is therefore located in the desert belt of the Northern Hemisphere; however, climatic diversity in Iran is very high [45]. In the north and on the southern shores of the Caspian Sea, the climate is temperate and humid. In the western part of the country it is Mediterranean climate and in the southern regions, the hot semi-arid climate prevails. Due to the existence of Alborz mountains in the north and Zagros mountain ranges in the west of Iran, the interior of the Iranian plateau also has an arid and desert climate. The mean annual temperature increases from northwest to southeast, from about 10°C in Azerbaijan region to 25-30°C in the south and southeast [46]. Due to its geographical location, Iran's average precipitation is much lower than the global average. More than two-thirds of Iran's area has an average annual precipitation of less than 300 mm, but there are rainfall nuclei above 1000 mm in the Zagros region and the northern slopes of Alborz [47]. Results by Darand and Sohrabi [26] indicated significant changes in daily precipitation over more than 50% of Iran during the last two decades. This increasing trend is observed in the western and southwestern Iran.
We consider five climate variables over Middle East to avoid overfitting in calculating the PI-weights, while our focus is the annual maximum daily rainfall (AMP1) over Iran in this study. Table 1 lists five climate variables. Consecutive wet days and dry days are defined as consecutive days with daily precipitation of ≥1 and <1 mm, respectively [48,49]. These are recommended by the Expert Team on Climate Change Detection and Indices [44,50]. These five variables including four auxiliary variables are used to compute robustly the weights of independence of the CMIP6 models [40,42]. Table S1 in the accompanying Supplementary Materials lists the 24 CMIP6 climate models used in this study. The considered scenarios are shared socioeconomic pathways SSP2-4.5, SSP3-7.0, and SSP5-8.5 [30]. Three overlapping periods are considered for future data, namely, period 1 (2021-2050), period 2 (2046-2075), and period 3 (2071-2100), abbreviated by P1, P2, and P3 in this study. The reason why the periods overlap is to make each period at least 30 years long by the end of this century.  A sample size 44 is better than 30, both in estimating stably the parameters of GEV distribution and in applying the bias correction method. To re-grid and to construct a rain field on grid points of 2 • × 2 • , the Barnes scheme [51] was used for the observed time series and simulation data from the 24 CMIP6 models for each of five climate variables. In total, 598 grid points were used to cover Iran and its surrounding areas. The Barnes interpolation method was used to re-grid and to build a rainfall field over 2 • × 2 • grids in Middle East. (b) Map of the Iran from 44 • to 62 • longitude and 26 • to 40 • latitude, including the sea and land, with 47 grid points of 2 • × 2 • for this study. This map was drawn by using a R package 'ggplot2' [52].
The remainder of this paper is structured as follows. The statistical methods are described in Section 2. The results of model weights and projected future changes in Iran are presented in Section 3. Discussions are then given in Section 4, followed by a conclusion in Section 5. Details including technical specifics, tables, and figures are provided in the accompanying Supplementary Materials.

Generalized Extreme Value Distribution
The generalized extreme value (GEV) distribution has been widely used to analyze extreme univariate values. The cumulative distribution function of the GEV distribution is as follows: where µ, σ, and ξ are the location, scale, and shape parameters, respectively, [20].
The changes in extremes is usually described in terms of the changes in extreme quantiles. These are computed by inverting the following (1): Here, z p is called as the return level associated with the return period 1/p. The value z p is expected to be exceeded once every 1/p years on average [20]. These quantities are defined for all values, but applied to extreme precipitation in this study. Conversely to the above, the return period T(z) = 1/p for the given value z is obtained by calculating p(z) = 1 − G(z). For the given value z, T(z) is sometimes called the expected waiting time, and the value p(z) = 1 − G(z) is referred to as the exceedance probability of z. The exceedance probability is often used as an alternative to the return period [53,54]. Another quantity that we can obtain is the expected number of reoccurrences during a certain period. By multiplying 30 years (for example) by the exceedance probability p(z), we can estimate the expected frequency of such years over a period of 30 years in which we have more than z amount of AMP1 for a year.
The relative change of 20-year return level in the period P i for i = 1, 2, 3 relative to the reference period P 0 is defined as: where R 20 (P) is the 20-year return level in the period P. The parameters in GEV distribution are estimated by the maximum likelihood method [20,55] or the L-moment method [13]. In this study, we employ the latter, which is more efficient in small samples than the maximum likelihood estimator [13]. We used the "lmom" package in R [56].

Bias Correction
The simulated data are often associated with systematic biases of the simulation model. Such biases sometimes produce the historical data which are considerably different from the distribution of the observations. Bias-correction (BC) methods are sometimes employed to address this problem. A BC method transforms the simulated data into new data which have fewer biases with respect to an observed time series. To correct the model outputs more efficiently by taking account of the dependency among variables or nearby grids, several multivariate BC methods have been proposed. In this study, we chose the multivariate bias correction (MBC) method by Cannon [57] among the many available BC techniques [24,58]; it is a multivariate extension of quantile delta mapping (QDM). QDM has an advantage of preserving the approximate trends of the model data. The MBC is applied to the five climate variables provided in Table 1 to take into account the dependency among these variables. The method is not applied to the data of each period separately but to the whole data of all periods (2021-2100) at once. More details are provided in the Supplementary Materials.

Performance and Independence Weighting for Ensembles
Model averaging is a statistical method in which unequal or equal weights are assigned to those models. Despite some arguments, the equal weighting or "model democracy" [32] has been criticized because it does not consider the performance, uncertainty, and independency of each model in building an multi-model ensemble (MME) ([37,59,60], for example).
As the basic idea of PI-weighting, models that agree poorly with observations for a selected set of diagnostics receive less weight, as do models that largely duplicate existing models [39]. Weights are computed for each model based on a combination of the distance D i (apprising the performance) and the model similarity S ij (apprising the dependence): with the total number of models M and the shape parameters σ D and σ S . The weights are normalized so that their sum equals 1. The numerator represents the modeling skill when using a Gaussian weighting, where the weight decreases exponentially the farther away a model is from the observed data. The denominator is the "effective repetition of a model" [37] and is intended to account for the model interdependency [39].
To compute the performance of each model, T-year return levels are compared based on the GEV fitting on the historical data and the observations. To calculate the model similarity S ij , we follow a technique proposed by Sanderson et al. [61] that is based on the principle component analysis (PCA) applied to the simulation data only. This PCA is executed separately for each of five climate variables. Thus, we have five distances between models i and j. Then, the final distance are computed by averaging those five distances for models i and j.
The parameters σ D and σ S tune the strength of weighting and the relative significance of the performance and independence [42]. Large values will result in to an almost equal weights, whereas small values will cause aggressive (or one-sided) weights, providing most of the weight to a few models. The shape parameters are often determined through a perfect model test (or a model-as-truth experiment) using the continuous rank probability score [40,42]. This leave-one-out procedure needs a huge computing time. To overcome this computational burden, we follow a relatively simple method proposed by [43] to determine the shape parameters. More details in computing the distance S ij and in selecting the shape parameters are given in the Supplementary Material. Table S2 provides the similarity values S ij for certain models. Figure 2 shows the intermodel distance matrix for the 24 CMIP6 models considered in this study. The distances are obtained from the five climate variables listed in Table 1 for the Middle East, using the historical data and the future simulation data. Each box shows a pairwise combination, where red color indicates a greater distance. According to Figure 2, models FGOALS-g3, GFDL-ESM4, BCC-CSM2-MR, INM-CM5-0, MPI-ESM1-2-HR, and NorESM2-MM were found to be the most independent, whereas models CanESM5, EC-Earth3-Veg, IPSL-CM6A-LR, and CMCC-CM2-SR5 were found to be the most dependent. The details on these models are available in Table S1 in the Supplementary Material.

PI-Weights
The normalized PI-weights are obtained using Equation (3) with σ S = 0.4 and σ D = 0.16. Figure 3 demonstrates the distributions of P-, I-, and PI-weights. The variability of I-weights is smaller than that of P-weights.
The high P-weight of the KACE-1-0-G decrease in the PI-weight owing to the low I-weight. The PI-weights of the GFDL-ESM4, NorESM2-MM, andBCC-CSM2-MR models increase owing to a relatively high independency. The PI-weight is not located in the middle of the P-and I-weights, but is close to the P-weight, except in a few cases. When the P-weight (I-weight) is almost the same as the equal weight, as in the FGOALS-g3 (ACCESS-CM2 and ACESS-ESM1-5) model, the PI-weight is wholly influenced by the I-weight (P-weight). The performances for some of the models, such as INM-CM4-8, CMCC-ESM2, MPI-ESM1-2-LR, CanESM5, TaiESM1, and CMCC-CM2-SR5 are so low that their (even relatively high) I-weights do not affect the final weights. Based on this view, the performance is more influential to the PI-weights than the independency. Some of these observations may be changed if different σ S and σ D are used.

Future Projection of Extreme Precipitation
Using the PI-weights obtained in the above section, the future extreme precipitations are projected by the MME. Note that the future climate data are used after the bias correction with the MBC method [57]. Figure 4 illustrates the time series plots of the 9-year moving averages of AMP1 in Tehran with a 90% confidence band, from the observations, from the PI-weighted MME of the historical data, and from the PI-weighted averages for the future simulation data under the three SSP scenarios. Lower 5% and upper 5% prediction values among 24 models are lined to construct 90% confidence band. In Figure 4, the line for the observations shows more variation than the lines of bias-corrected historical data by the PI-weighted MME. The variance of 9-year moving averages of observed AMP1 is 4.98, whereas that value for the PI-weighted ensemble of bias-corrected historical data is 1.93. The boxplot of the historical data after BC is similar to that from the observations, whereas the boxplot before BC is much smaller than that from the observations. The increasing trends from P1 to P3 are evident in every scenario. This result is consistent with that by Darand [24]. Summary statistics of the corresponding values of these boxplots are provided in Table S3.

Exceedance Probability and Waiting Time
The spatially averaged estimates of the exceedance probability are presented in Figure S5 and Table S5. There are relatively large differences in the exceedance probability of a rainfall of 50 to 80 mm compared with that for over 80 mm, as shown in Figure S6. The differences between the past and future scenarios are distinct during the period P3.
The expected waiting time (T(z)) until the reoccurrence of a specific AMP1 value (z) are listed in Table 3. For z = 50 mm of rainfall, for example, the expected waiting times until a reoccurrence are 28 years in the past, 13.0 years in the future period P1, 10.8 years in P2, and 7.9 years in P3 based on the SSP5-8.5 scenario.

Expected Number of Reoccurring Years
The expected number of years of occurrence are given in Table S6. For z = 50 mm, as a specific example, during the last 30 years, we have experienced 0.9 year in which AMP1 was greater than 50 mm. In addition, we are likely to have an expected number of years of 2.3 for the future period P1, 2.8 for P2, and 3.8 for P3 under the SSP5-8.5 scenario.
From this comparison, particularly for AMP1 of 50 mm, the expected number of years of occurrence for the future periods under the SSP5-8.5 scenario increases by approximately 2.5 (3.1) times that over the last 30 years for P1 (P2), and 4.2 times that for P3 by the end of 21st century. These results are based on the spatially averaged values. When the exceedance probability is considered for each grid, different local results will be obtained.

Discussion
It is accepted in general that increasing greenhouse gases induce atmospheric temperature warming, which leads to increasing equivalent humidity, according to the Clausius-Clapeyron relationship [8]. The increase in atmospheric water vapor is the main factor in generating convective instability.
Zarrin and Dadashi-Roudbari [29] concluded that the trend and slope of the intensity of extreme precipitation are increasing by the year 2100 in all zones in Iran except for some areas (BWh, BWk, and Cfa zones) for SSP1-2.6 and SSP3-7.0 scenarios. Our result shows that the increasing trend of the AMP1 is projected in the western and southwestern parts of Iran. Based on Figure 6 of this study, there seems no change from P1 to P3 in Cfa zone, which leads to the same result by two studies. In Bwh and BWk zones, our result shows that there are no changes for SSP2-4.5 scenario, but increasing trends for SSP3-7.0 and SSP5-8.5 scenarios. Thus, it seems hard to compare two results directly. However, an agreement in general between two studies is that the increasing trend of extreme precipitation is projected over Iran except for some area.
The daily precipitation data consist of measurements from 00:00 to 24:00 throughout the day. In daily observations, the rainfall does not accumulate between 22:00 and 02:00, for example. In the data used in this study, such precipitation is divided and recorded in two separate days. The actual serious daily risk due to heavy rainfall does not exactly depend on the precipitation over a time duration from 00:00 to 24:00 exclusively. It is therefore recommended to consider the AMP1 data based on the maximum precipitation during the 24 h movement. In this sense, the results presented in this study underestimate the actual intensity and frequency of AMP1. More realistic daily data, such those as obtained after moving for 24 h and the annual maximum of two (and several) days of precipitation, should be used in a future study for assessment of risk owing to extreme rainfall.
Brunner et al. [42,62] considered multiple observational (or reanalysis) datasets to include an estimate of the observational uncertainty. They proposed a novel approach to account for the observational spread and uncertainty in a multi-model weighting study, which can lead to robust result and a more precise uncertainty quantification. In addition, considering multiple observational datasets may address the problem in which the BC and performance-based weighting scheme utilize an excessive number of observations. We believe that using the observations twice in the BC and weight calculation is unadvisable. Xu et al. [34] considered a Bayesian weighting method that removes observations during the initial phase of the downscaling and adds them in the estimation of posterior distribution. However, if the series of observations is sufficiently long to divide into two parts, we may use one part for the BC and the other part for weights calculation. Although we did not apply these methods, this would be a good approach in a future study.

Conclusions
We estimated the future changes in precipitation extremes within Iran using observations, 24 multiple CMIP6 models, generalized extreme value distribution, the multivariate bias correction technique, and the model weighting method (PI-weighting), which account for both the performance and independence of the models. To avoid overfitting in the PI-weighting, we considered five climate variables over Middle East.
In applying the PI-weighting method, we follow ways of selecting two shape parameters [43], based on the p-value of the chi-square statistic and entropy. The methods are simple and intuitively appealing, although they may need more justification to use.
From 20-year and 50-year return levels of the annual maximum daily precipitation (AMP1) averaged over 47 grids in Iran for three future periods under three SSP scenarios, the increasing trends from P1 (2021-2050) to P3 (2071-2100) are evident in SSP3-7.0 and SSP5-8.5 scenario. This result is somehow consistent with that of Darand [24]. We predict that the relative increases of 20-year return level of the AMP1 in the spatial median from the past observations to the year 2100 will be approximately 15.6% in the SSP2-4.5, 23.2% in the SSP3-7.0, and 28.7% in the SSP5-8.5 scenarios, respectively.
From the analysis described in this study, we realized that a 1-in-20 year (1-in-50 year) AMP1 within Iran will likely become a 1-in-12 (1-in-26) year, a 1-in-10 (1-in-22) year, and a 1-in-9 (1-in -20) year event in terms of the median by the year 2100 under the SSP2-4.5, SSP3-7.0, and SSP5-8.5 scenarios, respectively, as compared to the observed data from 1971 through 2014. The increasing trend is projected in the western and southwestern parts of Iran.
The expected frequency of the reoccurring years, particularly for AMP1 about 50 mm under the SSP5-8.5 scenario, is predicted to increase by approximately 2.5 times that of the past 30 years for period 1 (2021-2050), about 3.1 times that for period 2 (2046-2075), and approximately 4.2 times that for period 3 (2071-2100).
Heavy rainfall can have a significant effect on human life, communities, infrastructure, agriculture, and natural ecosystems. Thus, in addressing the impact of climate change due to more frequent extreme precipitation events, governments and communities should prepare the proper infrastructure and systems more carefully and securely to prevent critical damage, such as loss of life from landslides and flooding.
Supplementary Materials: The following are available at https://www.mdpi.com/article/10.3390/ atmos12081052/s1. Table S1: The list of 24 CMIP6 (Coupled Model Intercomparison Project Phase 6) models analyzed in this study. Table S2: The similarity distance metric S ij between model i and model j. Table S3: Statistics of 20-year and 50-year return levels of the annual largest daily rainfall (unit: mm) averaged over 47 grids in Iran for the observations (OBS) and the future periods under the three SSP scenarios. Table S4: Relative change (unit: %) in 20-year and 50-year return levels of the annual largest daily rainfall averaged over Iran relative to 1971-2014. Table S5: Spatially averaged the exceedance probability over Iran for the annual maximum daily precipitation (AMP1) from 20mm to 100mm, obtained from the observations (OBS) and the CMIP6 models. Table S6: The expected frequency of reoccurring years during 30 years for specific the annual maximum daily precipitation (AMP1) values from 20mm to 100mm in Iran, obtained from the observations (OBS) and the CMIP6 models. Figure S1: Plot of the entropy as σ S changes from 0.1 to 1.0. Figure S2: Examples of time series plots of the observations (blue line), CMIP6 data (green line), and the bias-corrected data (red line) in Iran. Figure S3: Arrangement of data and 7-year moving averages composed of the historical data from 1850 to 2014 and the future data for computing the Spearman correlation coefficient between models. Figure S4: Box-plots of 50-year return levels of the annual largest daily rainfall (unit: mm) averaged over 47 grids in Iran for the future periods under the three SSP scenarios. Figure S5: The exceedance probability plots for the annual maximum daily precipitation (AMP1) from 20 mm to 100 mm in Iran, obtained from the observations (OBS) and the CMIP6 models. Acknowledgments: The authors would like to thank the reviewers and the editors of the special issue for their comments and suggestions, which have importantly improved this paper. We acknowledge the World Climate Research Program Working Group on Coupled Modeling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model outputs. For CMIP, the U.S. DoE Program for Climate Model Diagnosis and Intercomparison provided coordinating support and led the development of the software infrastructure in partnership with the Global Organization for Earth System Science Portals. We are grateful to all contributors to the R packages, which were important for this work. The authors thank Yire Shin who provided valuable help for this paper.

Conflicts of Interest:
The authors declare that they have no conflict of interest.