Synthetic Impacts of Internal Climate Variability and Anthropogenic Change on Future Meteorological Droughts over China

The climate change impacts on droughts have received widespread attention in many recent studies. However, previous studies mainly attribute the changes in future droughts to human-induced climate change, while the impacts of internal climate variability (ICV) have not been addressed adequately. In order to specifically consider the ICV in drought impacts, this study investigates the changes in meteorological drought conditions for two future periods (2021–2050 and 2071–2100) relative to a historical period (1971–2000) in China, using two multi-member ensembles (MMEs). These two MMEs include a 40-member ensemble of the Community Earth System Model version 1 (CESM1) and a 10-member ensemble of the Commonwealth Scientific and Industrial Research Organization Mark, version 3.6.0 (CSIRO-Mlk3.6.0). The use of MMEs significantly increases the sample size, which makes it possible to apply an empirical distribution to drought frequency analysis. The results show that in the near future period (2021–2050), the overall drought conditions represented by drought frequency of 30and 50-year return periods of drought duration and drought severity in China will deteriorate. More frequent droughts will occur in western China and southwestern China with longer drought duration and higher drought severity. In the far future period (2071–2100), the nationwide drought conditions will be alleviated, but model uncertainty will also become significant. Deteriorating drought conditions will continue in southwestern China over this time period. Thus, future droughts in southwestern China should be given more attention and mitigation measures need to be carefully conceived in these regions. Overall, this study proposed a method of taking into account internal climate variability in drought assessment, which is of significant importance in climate change impact studies.


Introduction
Drought, usually rooted in the deficiency of expected precipitation over a period of time, has received widespread attention all over the world. In contrast with aridity, which is a long-term normal climate feature in areas with low rainfall, drought is a temporary abnormal phenomenon, which is often related to human activities and can linger for years, and often causes catastrophic socio-economic and environmental consequences. Large scale drought has been observed on all continents, including Europe [1], Africa [2,3], Asia [4,5], Australia [6][7][8], and North America [9,10]. For example, from 1975 to 1976, a notorious drought caused the lowest flow on record in the majority of British rivers, loss of over 50,000 trees, and around £500 million in crop losses [11]. Therefore, an in-depth drought assessment is important for making adaption strategies. possibilities of climate trajectories, while the real climate condition is expected to be a single realization among these possibilities [46]. In general, the inter-member variability, usually in the form of the range or variance of specific statistics (e.g., mean, standard variation), can manifestly represent ICV in the virtual world.
Therefore, the primary objective of this study is to investigate the synthetic impacts of ICV and anthropogenic climate change on droughts in China by using the state-of-the-art large multi-member ensembles of climate models. The synthetic impacts are specifically considered by pooling all ensemble members to estimate different levels of return periods for droughts. In addition, pooling all members of an ensemble enables using an empirical distribution to calculate the return periods for drought duration/severity, which can avoid the potential drawbacks in fitting probability distributions (e.g., great uncertainty in parameter estimation and poor extension in tails).

Dataset
This study used both observed and climate model simulated monthly precipitation data. The observed precipitation data (OBS) were collected from 659 stations in mainland China ( Figure 1) with a 30-year length for the reference baseline  period and provided by the China Meteorological Data Sharing Service System (http://www.cma.gov.cn). The mean, standard deviation, and skewness and kurtosis coefficients of 30-year (1971-2000) annual precipitation are presented for China. The maps of these statistics are interpolated using 659 stations. Generally, southeastern China has much more precipitation with larger variability than northwestern China. The skewness and kurtosis of annual precipitation in China tend to be scattered with higher values mainly located in northwestern China (Figure 1). possibilities of climate trajectories, while the real climate condition is expected to be a single realization among these possibilities [46]. In general, the inter-member variability, usually in the form of the range or variance of specific statistics (e.g., mean, standard variation), can manifestly represent ICV in the virtual world. Therefore, the primary objective of this study is to investigate the synthetic impacts of ICV and anthropogenic climate change on droughts in China by using the state-of-the-art large multi-member ensembles of climate models. The synthetic impacts are specifically considered by pooling all ensemble members to estimate different levels of return periods for droughts. In addition, pooling all members of an ensemble enables using an empirical distribution to calculate the return periods for drought duration/severity, which can avoid the potential drawbacks in fitting probability distributions (e.g., great uncertainty in parameter estimation and poor extension in tails).

Dataset
This study used both observed and climate model simulated monthly precipitation data. The observed precipitation data (OBS) were collected from 659 stations in mainland China ( Figure 1) with a 30-year length for the reference baseline  period and provided by the China Meteorological Data Sharing Service System (http://www.cma.gov.cn). The mean, standard deviation, and skewness and kurtosis coefficients of 30-year (1971-2000) annual precipitation are presented for China. The maps of these statistics are interpolated using 659 stations. Generally, southeastern China has much more precipitation with larger variability than northwestern China. The skewness and kurtosis of annual precipitation in China tend to be scattered with higher values mainly located in northwestern China ( Figure 1). The MMEs of precipitation were extracted from two climate models: the Community Earth System Model version 1 (CESM1) (http://www.cesm.ucar.edu/projects/communityprojects/LENS/data-sets.html) and the Commonwealth Scientific and Industrial Research Organization Mark, version 3.6.0 (CSIRO) (http://cmip-pcmdi.llnl.gov/cmip5). These two MMEs were selected according to their differences in model structures and the number of ensemble members. CESM1 was designed for specifically studying the role of ICV in climate change impact The MMEs of precipitation were extracted from two climate models: the Community Earth System Model version 1 (CESM1) (http://www.cesm.ucar.edu/projects/community-projects/LENS/datasets.html) and the Commonwealth Scientific and Industrial Research Organization Mark, version 3.6.0 (CSIRO) (http://cmip-pcmdi.llnl.gov/cmip5). These two MMEs were selected according to their Water 2018, 10, 1702 4 of 20 differences in model structures and the number of ensemble members. CESM1 was designed for specifically studying the role of ICV in climate change impact studies and has been used in several studies [47][48][49]. It is an Earth System Model (ESM), which contains a large number of ensemble members (40 member) for the period of 1920-2100. In order to investigate the uncertainty related to multi-member ensembles, CSIRO was also selected. CSIRO is an Atmosphere-Ocean General Circulation Model (AOGCM) containing 10 members for the same period [50][51][52][53][54]. In fact, it is ideal to include more multi-member ensembles; however, large member ensembles are usually not available on line. The general information of both MMEs is given in Table 1. Consistent to observed data, the 1971-2000 period of MMEs was used as the reference period. The MME precipitation is compared to observed data in terms of mean and standard deviation of annual precipitation for two climate models ( Figure 2). The results show that both MMEs are considerably biased in simulating mean and variability of annual precipitation, bias correction may be necessary before using these MMEs for drought assessments. studies and has been used in several studies [47][48][49]. It is an Earth System Model (ESM), which contains a large number of ensemble members (40 member) for the period of 1920-2100. In order to investigate the uncertainty related to multi-member ensembles, CSIRO was also selected. CSIRO is an Atmosphere-Ocean General Circulation Model (AOGCM) containing 10 members for the same period [50][51][52][53][54]. In fact, it is ideal to include more multi-member ensembles; however, large member ensembles are usually not available on line. The general information of both MMEs is given in Table  1. Consistent to observed data, the 1971-2000 period of MMEs was used as the reference period. The MME precipitation is compared to observed data in terms of mean and standard deviation of annual precipitation for two climate models ( Figure 2). The results show that both MMEs are considerably biased in simulating mean and variability of annual precipitation, bias correction may be necessary before using these MMEs for drought assessments. The relative error (RE) of the mean (MEAN) and standard deviation (SD) for July precipitation for two climate model ensembles (CESM1; CSIRO) with raw (RAW) simulated data for the reference period . (a) The RE of the mean July precipitation for CESM1 with raw data; (b) The RE of the SD of July precipitation for CESM1 with raw data; (c) The RE of the mean July precipitation for CSIRO with raw data; (d) The RE of the SD of July precipitation for CSIRO with raw data.
Two future periods (2021-2050 and 2071-2100) were extracted from two climate models under the Representative Concentration Pathways (RCP) 8.5 forcing. These two periods, respectively, represent near and far future periods, which were commonly used in other studies [55][56][57][58]. The RCP8.5 scenario was selected, because it represents the upper limit of greenhouse gas emission The relative error (RE) of the mean (MEAN) and standard deviation (SD) for July precipitation for two climate model ensembles (CESM1; CSIRO) with raw (RAW) simulated data for the reference period . (a) The RE of the mean July precipitation for CESM1 with raw data; (b) The RE of the SD of July precipitation for CESM1 with raw data; (c) The RE of the mean July precipitation for CSIRO with raw data; (d) The RE of the SD of July precipitation for CSIRO with raw data. Two future periods (2021-2050 and 2071-2100) were extracted from two climate models under the Representative Concentration Pathways (RCP) 8.5 forcing. These two periods, respectively, represent near and far future periods, which were commonly used in other studies [55][56][57][58]. The RCP8.5 scenario was selected, because it represents the upper limit of greenhouse gas emission scenario [59][60][61]. In order to make a conservative decision to counter climate change impacts, only one extreme greenhouse gas emission scenario was selected. The grid data of MMEs were interpolated to each station by the inverse distance weighting (IDW) method [62].

Bias Correction
Since precipitation outputs of climate models are considerably biased, the application of bias correction is often necessary for future drought evaluation. An empirical quantile-mapping-based bias correction method [63,64] was used to correct biases for the precipitation in MMEs of CESM1 and CSIRO. Specifically, the bias of a MME simulation is determined as the difference between these simulations and the observed data at the reference period in terms of 30 (predetermined) quantiles ranging from 0 and 100 with an interval of around 3.45. The biases are then removed from the MMEs for each future period.
This bias correction method was traditionally used for a single climate simulation [65][66][67][68]. When using it for MMEs, all ensemble members of a climate model for the reference period are pooled together and compared to observations to estimate the correction factors. In other words, the bias correction factors are calculated for the assembled ensemble members, rather than any single member. The calculated bias correction factors are then removed for each individual member. Since all members are simulated by the same model with the same forcing, they are expected to have identical biases. A similar method has been used in other studies [44,46] and its good performance has been proved.

Calculation of Drought Indices
Droughts usually have significant impacts on time scales greater than 1 month [27], so drought indices are usually calculated on the timescale of multiple months. In this study, the three-month SPI is adopted to reflect the seasonal droughts [69][70][71]. It is calculated with the following three steps.
(1) The s (s = 3 here) months' accumulated precipitation (P_sum m ) is calculated for each specific month m (m = 1, 2, . . . , 12, m ≥ s): (2) A gamma distribution is used to fit the time series P_sum m for each month (1-12 months individually): where g(x) is the probability density function (PDF) of the gamma distribution, G(x) is the corresponding cumulative distribution function (CDF), x (or the integral variable t in Equation (3)) is the 3-months' accumulated precipitation (P_sum m in Equation (1)), and a and b are two parameters estimated by the maximum likelihood method (MLE). Considering that non-precipitation may exist while it is excluded when using continuous gamma distribution, Equation (4) serves as a supplement.
where q is the probability of zero precipitation and H(x) is the revised cumulative probability.  (5) in which Φ is the inverse of the normal CDF. Positive SPI values reflect wet conditions, while negative SPI values reflect dry conditions. The drought/wet conditions of different SPI values are presented in Table 2.

Definitions of Droughts
In this study, a drought event is defined as a time period when the drought index (SPI 3 ) is continuously below −1.0 using "the theory of run" [72]; the drought event terminates when the drought index becomes above −1.0. According to the theory of run, two important characteristics-drought duration and severity-are used to characterize a drought event. Drought duration is the length of a time period for which the index (SPI 3 ) remains below the threshold (e.g., −1.0), and drought severity is the cumulative index value during this period. They are calculated using Equations (6) and (7), respectively. To facilitate analysis, the absolute drought severity (the accumulated SPI 3 value) was used here.
where D d and D s are drought duration and severity, respectively. t i is the initiation time and t e is the termination time for a drought event.

Calculation of Drought Return Periods
The duration/severity (D d and D s ) of a drought are further analyzed using a frequency analysis approach. The return periods of D d and D s are calculated using Equation (8) [73,74].
where T represents the return period of a drought duration or severity, F(x) represents the CDF for a probability distribution used to describe a series of droughts' duration or severity, and E(L) is the expected inter-arrival time of a drought event, which is equal to the ratio of the whole length of the time period to the total drought events over that period. F(x) is usually represented by some theoretical probability distributions. The most commonly used distributions include the Exponential, Weibull, Gamma, Lognormal, Kernel and extreme value distributions [14]. The use of these theoretical probability distributions is a tradeoff between a limited sample size and a long return period. However, when the sample size is large enough to approximately represent the population, empirical distribution might be a good choice to conduct frequency analysis. In this study, the application of MMEs, which multiply the sample size, enables the utilization of empirical distribution, even though only 30 years of data are employed. In detail, all samples from multiple members can be assembled in the frequency analysis, as each member represents a possible realization in the real world. Therefore, the F(x) here is estimated using the empirical cumulative frequency with the mathematical expectation (estimated) formula (Equation (9)) [75]: whereF k (x) is the k th cumulative distribution probability and n is the length of series X (x 1 , x 2 , . . . , x n , sorted in an ascending order).
In order to verify the performance of empirical distributions based on a large sample size, this study examined the goodness-of-fit based on the ordinary least square (OLS) [63]. In addition, conventional methods of fitting parametric theoretical distributions [14] (the Exponential, Weibull, Gamma and Lognormal distributions) were employed to serve as a comparison here. Parameters of these theoretical distributions were estimated using the maximum likelihood estimation method (MLE) based on a 30-year single member of the CESM1. Drought duration and drought severity calculated from the 30-year observed data were used as the baseline and corresponding results were presented in Figure 2.
The X-axis and Y-axis in Figure 3 present the sum of squared errors (SSEs) in the OLS criterion for drought duration and drought severity for 10 stations across China, respectively. A smaller SSE value represents a better performance [76]. As shown in the figure, empirical distributions based on the two MMEs show robust results for both drought duration and drought severity over the 10 stations. This proves the superiority of utilizing the MMEs in frequency analysis for extreme climate events and the rationality of employing an empirical distribution.
Water 2018, 10, x FOR PEER REVIEW 7 of 21 empirical cumulative frequency with the mathematical expectation (estimated) formula (Equation (9)) [75]: where ˆ( ) k F x is the kth cumulative distribution probability and n is the length of series X (x1, x2,…, xn, sorted in an ascending order).
In order to verify the performance of empirical distributions based on a large sample size, this study examined the goodness-of-fit based on the ordinary least square (OLS) [63]. In addition, conventional methods of fitting parametric theoretical distributions [14] (the Exponential, Weibull, Gamma and Lognormal distributions) were employed to serve as a comparison here. Parameters of these theoretical distributions were estimated using the maximum likelihood estimation method (MLE) based on a 30-year single member of the CESM1. Drought duration and drought severity calculated from the 30-year observed data were used as the baseline and corresponding results were presented in Figure 2.
The X-axis and Y-axis in Figure 3 present the sum of squared errors (SSEs) in the OLS criterion for drought duration and drought severity for 10 stations across China, respectively. A smaller SSE value represents a better performance [76]. As shown in the figure, empirical distributions based on the two MMEs show robust results for both drought duration and drought severity over the 10 stations. This proves the superiority of utilizing the MMEs in frequency analysis for extreme climate events and the rationality of employing an empirical distribution.  To further examine the superiority of empirical distributions based on large sample sizes provided by the MMEs, cumulative distributions of the drought severity using the different number of simulations (one-member, 10-member and 40-member simulations) of CESM1 for one station in China were analyzed in Figure 4. Here, the gamma distribution was used to serve as a comparison. As represented by the confidence intervals (CI) in Figure 4, the fitting uncertainty, which is mainly derived from parameter estimation, will significantly decline with increasing sample sizes. This shows that frequency analysis, which is based on a large sample size, will be more reliable. However, the corresponding fitting error will also increase with increase in sample sizes, especially for the tail, as indicated by pronounced disparities between the empirical distributions and the gamma distributions. This increasing fitting error will thus partially weaken the superiority of large sample sizes. On the other hand, empirical distributions characterize authentic features of a sample when the sample size is large enough. Therefore, it is reliable to employ empirical distributions combined with MMEs in frequency analysis. To further examine the superiority of empirical distributions based on large sample sizes provided by the MMEs, cumulative distributions of the drought severity using the different number of simulations (one-member, 10-member and 40-member simulations) of CESM1 for one station in China were analyzed in Figure 4. Here, the gamma distribution was used to serve as a comparison. As represented by the confidence intervals (CI) in Figure 4, the fitting uncertainty, which is mainly derived from parameter estimation, will significantly decline with increasing sample sizes. This shows that frequency analysis, which is based on a large sample size, will be more reliable. However, the corresponding fitting error will also increase with increase in sample sizes, especially for the tail, as indicated by pronounced disparities between the empirical distributions and the gamma distributions. This increasing fitting error will thus partially weaken the superiority of large sample sizes. On the other hand, empirical distributions characterize authentic features of a sample when the sample size is large enough. Therefore, it is reliable to employ empirical distributions combined with MMEs in frequency analysis.

Data Analysis
In order to quantify the effects of bias correction and future changes in drought conditions, the relative errors (RE) between MMEs and observed precipitation in the reference period-Equation (10)-and relative changes (RC) of drought variables between future and reference periods-Equation (11)-are calculated:

RE 100%
sim obs where RE represents the relative error, Vsim is the statistic (e.g., mean and SD) in MMEs precipitation and Vobs is the corresponding statistic in observed precipitation. As for RC, Vfut stands for the future drought variables (e.g., drought frequency, maximum drought duration and severity) and Vref stands for the corresponding variables in the reference period. Additionally, the statistical significance of the changes in drought frequency, drought duration, and drought severity in future periods relative to the reference period is tested by using the "Welch's t test" [77,78]. When using the "Welch's t test", all values corresponding to drought frequency, drought duration, or drought severity from all members within the same period are pooled together to construct a sample. All stations with a significant climate change signal at the p = 0.05 significance level will be highlighted in figures.

Data Analysis
In order to quantify the effects of bias correction and future changes in drought conditions, the relative errors (RE) between MMEs and observed precipitation in the reference period-Equation (10)-and relative changes (RC) of drought variables between future and reference periods-Equation (11)-are calculated: where RE represents the relative error, V sim is the statistic (e.g., mean and SD) in MMEs precipitation and V obs is the corresponding statistic in observed precipitation. As for RC, V fut stands for the future drought variables (e.g., drought frequency, maximum drought duration and severity) and V ref stands for the corresponding variables in the reference period. Additionally, the statistical significance of the changes in drought frequency, drought duration, and drought severity in future periods relative to the reference period is tested by using the "Welch's t test" [77,78]. When using the "Welch's t test", all values corresponding to drought frequency, drought duration, or drought severity from all members within the same period are pooled together to construct a sample. All stations with a significant climate change signal at the p = 0.05 significance level will be highlighted in figures. Figure 2 presents the relative errors of the mean and standard deviation for precipitation in July simulated by two climate model large ensembles without bias correction for the reference period. The relative error was calculated based on ensemble mean, rather than each individual member. Since similar correction performances are observed for all months, only results from July are presented in Figure 2 for illustration. Generally, both MMEs without post-processing are considerably biased in simulating mean July precipitation in China, with the relative errors ranging from −10% to 50% over most areas. Both ensembles overestimate mean precipitation in the Qinghai-Tibetan plateau. The CESM1 also overestimates mean precipitation in northern China and the CSIRO overestimates the mean values in southern China. On the other hand, the CESM1 underestimates mean precipitation for southeastern China, whereas the CSIRO underestimates the mean values for central and northeastern China.

Validation of Downscaling Methods
The relative errors of the standard deviation (SD) range from −10% to 30% for CESM1-and CSIRO-simulated July precipitation over most stations. Both climate model ensembles (especially the CESM1) overestimate the SD in the arid Qinghai-Tibetan plateau. In addition, for all other regions, the SD of precipitation is generally underestimated by both ensembles, with the exception of some southwestern regions, where the SD of July precipitation is overestimated by the CSIRO.
For the bias-corrected precipitation, both the mean and the SD are almost identical to that of observations all over China, as indicated by the white background in Figure 5. This proves the reasonable performance of the bias correction method for multi-member climate ensembles.  Figure 2 presents the relative errors of the mean and standard deviation for precipitation in July simulated by two climate model large ensembles without bias correction for the reference period. The relative error was calculated based on ensemble mean, rather than each individual member. Since similar correction performances are observed for all months, only results from July are presented in Figure 2 for illustration. Generally, both MMEs without post-processing are considerably biased in simulating mean July precipitation in China, with the relative errors ranging from −10% to 50% over most areas. Both ensembles overestimate mean precipitation in the Qinghai-Tibetan plateau. The CESM1 also overestimates mean precipitation in northern China and the CSIRO overestimates the mean values in southern China. On the other hand, the CESM1 underestimates mean precipitation for southeastern China, whereas the CSIRO underestimates the mean values for central and northeastern China.

Validation of Downscaling Methods
The relative errors of the standard deviation (SD) range from −10% to 30% for CESM1-and CSIRO-simulated July precipitation over most stations. Both climate model ensembles (especially the CESM1) overestimate the SD in the arid Qinghai-Tibetan plateau. In addition, for all other regions, the SD of precipitation is generally underestimated by both ensembles, with the exception of some southwestern regions, where the SD of July precipitation is overestimated by the CSIRO.
For the bias-corrected precipitation, both the mean and the SD are almost identical to that of observations all over China, as indicated by the white background in Figure 5. This proves the reasonable performance of the bias correction method for multi-member climate ensembles.  (1971-2000). (a) The RE of the mean July precipitation for CESM1 with corrected data; (b) The RE of the SD of July precipitation for CESM1 with corrected data; (c) The RE of the mean July precipitation for CSIRO with corrected data; (d) The RE of the SD of July precipitation for CSIRO with corrected data. Figure 6 presents frequencies of droughts (defined by the threshold value of −1 for SPI) simulated by two MMEs for the reference period and their relative changes for the near and far future periods. In the reference period, the drought frequency ranges between 12 and 34 times for CESM1  (1971-2000). (a) The RE of the mean July precipitation for CESM1 with corrected data; (b) The RE of the SD of July precipitation for CESM1 with corrected data; (c) The RE of the mean July precipitation for CSIRO with corrected data; (d) The RE of the SD of July precipitation for CSIRO with corrected data. Figure 6 presents frequencies of droughts (defined by the threshold value of −1 for SPI) simulated by two MMEs for the reference period and their relative changes for the near and far future periods. In the reference period, the drought frequency ranges between 12 and 34 times for CESM1 and between 12 and 35 times for CSIRO, across all stations in China. Even though the range of drought frequency is similar between two climate models, the uncertainty related to climate models cannot be neglected, as indicated by the higher drought frequency simulated by CESM1 for most regions in China. The low drought frequency in the reference period tends to appear in northwestern China, whereas the high drought frequency is mainly located in northeastern and central China. Additionally, moderate drought frequencies can be seen in southern China. northern regions.

Future Drought Frequencies
Changes of drought frequencies in the far future can exceed that of the near future, in spite of considerable inconsistency between the two multi-member ensembles (Figure 6e,f). The ratio of stations that show significant changes can reach up to 99% for CESM1 (and 64% for CSIRO). Compared with the near future period, projected changes in the end of century are less certain, especially in southwestern China where the CESM1 projects a decline in drought frequency while the CSIRO projects an increasing change. Parts of northwestern China present rising frequency ranging between 20 and 50% projected by CSIRO, in line with the case of CESM1. On the other hand, negative frequency changes are projected by CESM1 and CSIRO in northeastern China. Considerably decreasing frequency of droughts is projected to appear in northern China with relative changes approaching about −50%. In contrast, southeastern China will experience a slight decrease in the frequency of droughts with relative changes varying from −10 to −20%. In the near future, changes in drought frequencies vary among different regions. A half of stations show statistically significant changes at the p = 0.05 level, when using CESM1. The percentile of stations is 20% when using CSIRO. Generally, the frequency is projected to increase in western China, with relative changes ranging between 10 and 30%. On the other hand, a slight decrease of drought frequency can be seen in northern and northeastern China, approximately ranging between −10 and −20% for the CESM1 and CSIRO. The climate model uncertainty can also be inspected by the two climate model ensembles. For example, CESM1 projects a decrease in drought frequency in northern China, while CSIRO projects a much weaker decrease or even an opposite sign in some northern regions.
Changes of drought frequencies in the far future can exceed that of the near future, in spite of considerable inconsistency between the two multi-member ensembles (Figure 6e,f). The ratio of stations that show significant changes can reach up to 99% for CESM1 (and 64% for CSIRO). Compared with the near future period, projected changes in the end of century are less certain, especially in southwestern China where the CESM1 projects a decline in drought frequency while the CSIRO projects an increasing change. Parts of northwestern China present rising frequency ranging between 20 and 50% projected by CSIRO, in line with the case of CESM1. On the other hand, negative frequency changes are projected by CESM1 and CSIRO in northeastern China. Considerably decreasing frequency of droughts is projected to appear in northern China with relative changes approaching about −50%. In contrast, southeastern China will experience a slight decrease in the frequency of droughts with relative changes varying from −10 to −20%.

Future Drought Hazards
The future drought hazards represented by 30-and 50-year return periods of drought duration and drought severity are analyzed based on the empirical distribution. Figures 7 and 8 present the 30-year return periods of drought duration and drought severity in China for the reference period and their relative changes for two future periods. In the reference period, there exists distinctive spatial variability for the 30-year return period in terms of drought duration and drought severity across China. The 30-year return period of drought duration simulated by CESM1 ranges between 4 and 7 months and the 30-year return period of drought severity ranges between 5 and 13. A larger spatial variability is projected by CSIRO with the 30-year return period of drought duration ranging from 4 to 11 months and drought severity ranging from 7 to 21. Additionally, the spatial patterns of the 50-year return levels (Figures 9 and 10) are quite similar with that of the 30-year return levels for both drought duration and drought severity.

Future Drought Hazards
The future drought hazards represented by 30-and 50-year return periods of drought duration and drought severity are analyzed based on the empirical distribution. Figures 7 and 8 present the 30-year return periods of drought duration and drought severity in China for the reference period and their relative changes for two future periods. In the reference period, there exists distinctive spatial variability for the 30-year return period in terms of drought duration and drought severity across China. The 30-year return period of drought duration simulated by CESM1 ranges between 4 and 7 months and the 30-year return period of drought severity ranges between 5 and 13. A larger spatial variability is projected by CSIRO with the 30-year return period of drought duration ranging from 4 to 11 months and drought severity ranging from 7 to 21. Additionally, the spatial patterns of the 50-year return levels (Figures 9 and 10) are quite similar with that of the 30-year return levels for both drought duration and drought severity.   In the near future, deteriorating drought hazards represented by 30-year return periods of drought duration and drought severity can be observed in most regions over China. When using CESM1, more stations show statistically significant changes for both drought duration and drought severity than using CSIRO. Generally, the relative changes of 30-year return period of drought severity are projected to be more remarkable than that of 30-year return period of drought duration, with more pronounced changes projected by CSIRO than that by CESM1. Specifically, CSIRO projects a relative change in 30-year return period of drought duration ranging between −33% and 60% across China and a relative change in 30-year return period of drought severity ranging between −34% and 83%. The values range between −27% and 45% (drought severity) and between −33% and 25% (drought duration) for CESM1. For relative changes in 50-year return levels, their spatial patterns are similar to that in 30-year return levels for both drought duration and drought severity. However, CSIRO projects a slightly more prominent spatial variability than CESM1 for both drought duration and drought severity.  For the far future period, the amplitudes of changes in drought hazards represented by 30-and 50-return periods of drought duration and drought severity are further extended. The number of stations that show statistically significant changes for both drought duration and drought severity will also increase during this period for both models. In addition, a lesser agreement can be observed between the two climate models. Generally, in contrast to the near future period, drought hazards are projected to be alleviated for the far future period with more significant and widespread negative changes across China. In specific, the negative change is more pronounced for the 30-year return period of drought severity compared with that for the 30-year return period of drought duration. In addition, CESM1 projects more negative changes in both 30-year return period of drought duration and drought severity than CSIRO. With an exception of a few stations in the northwestern regions, the drought hazards projected by CESM1 will be relieved across China with the lowest value up to −60% for the 30-year return period of drought duration and −67% for the 30-year return period of drought severity. However, a larger spatial variability is projected by CSIRO. Similar magnitudes and spatial patterns of drought duration and drought severity are observed for 50-year return periods.

Discussion
Internal climate variability (ICV) is one of the major components contributing to climate change, especially for the historical period with less external forcing from greenhouse gas emissions [79]. However, it is usually not considered in climate change impact studies, as long time series used to represent the ICV are usually not available. The development of MMEs provides an opportunity to consider ICV in climate change impact studies. The MME is achieved by running a climate model several times with the same climate forcing, but different initial conditions. Remarkably different climate scenarios may be predicted in a MME, even though a very minor change had been made in the initial conditions of a climate model. Thus, each member contains information of intrinsic variability in the climate system and its responses under human-induced climate change. Any member can be considered as a possible realization of real climate if model biases are not considered. Thus, the inter-member difference can represent the ICV in a climate system, and all members can be assembled to investigate the role of ICV in climate change impacts. Until now, MMEs of several climate models have been developed and widely used in climate change studies [41,42]. These MMEs perform reasonably well with respect to reproducing the long-term observed ICV of mean precipitation and temperature, especially at multi-decadal scales [54].

Discussion
Internal climate variability (ICV) is one of the major components contributing to climate change, especially for the historical period with less external forcing from greenhouse gas emissions [79]. However, it is usually not considered in climate change impact studies, as long time series used to represent the ICV are usually not available. The development of MMEs provides an opportunity to consider ICV in climate change impact studies. The MME is achieved by running a climate model several times with the same climate forcing, but different initial conditions. Remarkably different climate scenarios may be predicted in a MME, even though a very minor change had been made in the initial conditions of a climate model. Thus, each member contains information of intrinsic variability in the climate system and its responses under human-induced climate change. Any member can be considered as a possible realization of real climate if model biases are not considered. Thus, the inter-member difference can represent the ICV in a climate system, and all members can be assembled to investigate the role of ICV in climate change impacts. Until now, MMEs of several climate models have been developed and widely used in climate change studies [41,42]. These MMEs perform reasonably well with respect to reproducing the long-term observed ICV of mean precipitation and temperature, especially at multi-decadal scales [54].
It is well known that the climate model itself is one of the largest uncertainty contributors to climate change impacts [79]. Due to data availability, only two climate models with large ensemble members were used in this study. Other climate models in the Coupled Model Inter-comparison Project Phase 5 (CMIP5) database usually have fewer ensemble members, which could be inadequate to represent the ICV. Even though it would be better to include more models, the two models applied here reflect the model uncertainty to a certain extent, especially for the far future period. For example, the disparities in spatial distributions of changes in drought frequencies, 30-and 50-year return periods of drought duration, and drought severity between CESM1 and CSIRO all become more remarkable in the far future period than in the near future. Besides, the uncertainty is pronounced in southwestern China where two models suggest different directions of drought changes for the far future period. In addition, though the CSIRO contains fewer runs than the CESM1, it often presents more significant changes, as indicated by larger amplitudes in drought frequencies, 30-and 50-year return periods of drought duration, and drought severity for the two future periods. Overall, different model structures combined with different initial conditions may lead to divergent inter-member variability and subsequently result in divergent spatial patterns and corresponding absolute values.
To better reveal the role of ICV in future drought projections, the results of climate change impacts with and without the consideration of ICV are demonstrated as follows by comparing the multi-member-ensemble (CSIRO or CESM1) with the multi-model-ensemble (single simulations from 29 climate models) in estimating the relative changes of 30-year return period of drought severity between the near future and the reference period in China ( Figure 11). When using the multi-member-ensemble, an empirical distribution was fitted based on samples of all members to calculate return periods of drought severity. When using the multi-model-ensemble, the average of 29 climate models with a single simulation was used. Specifically, a single simulation was used to calculate the 30-year return period of drought severity based on the gamma distribution. The mean value was then calculated over all 29 climate models. As shown in Figure 11, without consideration of ICV, more stations in China present deteriorating drought hazards, as indicated by the increase in the 30-year return period of drought severity. Even though it is unable to determine which method is more reliable in estimating the climate change impacts on drought, the use of multi-member ensemble specifically considers the effects of ICV, which is an irreducible uncertainty source for impact studies. It is well known that the climate model itself is one of the largest uncertainty contributors to climate change impacts [79]. Due to data availability, only two climate models with large ensemble members were used in this study. Other climate models in the Coupled Model Inter-comparison Project Phase 5 (CMIP5) database usually have fewer ensemble members, which could be inadequate to represent the ICV. Even though it would be better to include more models, the two models applied here reflect the model uncertainty to a certain extent, especially for the far future period. For example, the disparities in spatial distributions of changes in drought frequencies, 30-and 50-year return periods of drought duration, and drought severity between CESM1 and CSIRO all become more remarkable in the far future period than in the near future. Besides, the uncertainty is pronounced in southwestern China where two models suggest different directions of drought changes for the far future period. In addition, though the CSIRO contains fewer runs than the CESM1, it often presents more significant changes, as indicated by larger amplitudes in drought frequencies, 30-and 50-year return periods of drought duration, and drought severity for the two future periods. Overall, different model structures combined with different initial conditions may lead to divergent intermember variability and subsequently result in divergent spatial patterns and corresponding absolute values.
To better reveal the role of ICV in future drought projections, the results of climate change impacts with and without the consideration of ICV are demonstrated as follows by comparing the multi-member-ensemble (CSIRO or CESM1) with the multi-model-ensemble (single simulations from 29 climate models) in estimating the relative changes of 30-year return period of drought severity between the near future and the reference period in China ( Figure 11). When using the multimember-ensemble, an empirical distribution was fitted based on samples of all members to calculate return periods of drought severity. When using the multi-model-ensemble, the average of 29 climate models with a single simulation was used. Specifically, a single simulation was used to calculate the 30-year return period of drought severity based on the gamma distribution. The mean value was then calculated over all 29 climate models. As shown in Figure 11, without consideration of ICV, more stations in China present deteriorating drought hazards, as indicated by the increase in the 30-year return period of drought severity. Even though it is unable to determine which method is more reliable in estimating the climate change impacts on drought, the use of multi-member ensemble specifically considers the effects of ICV, which is an irreducible uncertainty source for impact studies. Moreover, the signal to uncertainty ratio (SNR, [80]) of changes in drought frequency for the near future period relative to the reference period was also calculated ( Figure 12). The signal is defined as the mean of future changes in drought frequency over all 29 climate models. The uncertainty related to the choice of a climate model was compared to that related to the ICV in estimating the changes of drought frequency. Specifically, the uncertainty related to climate models was defined as the standard deviation of changes in climate change signals projected by 29 climate model simulations. The uncertainty related to ICV was defined as the inter-member variability (standard deviation) of the multi-member ensemble (CSIRO or CESM1). As shown in Figure 12, the uncertainty related to ICV is comparable to that related to climate models. In addition, the ICV plays a significant role in drought impact studies, especially for southern China, where the SNR is smaller Moreover, the signal to uncertainty ratio (SNR, [80]) of changes in drought frequency for the near future period relative to the reference period was also calculated ( Figure 12). The signal is defined as the mean of future changes in drought frequency over all 29 climate models. The uncertainty related to the choice of a climate model was compared to that related to the ICV in estimating the changes of drought frequency. Specifically, the uncertainty related to climate models was defined as the standard deviation of changes in climate change signals projected by 29 climate model simulations. The uncertainty related to ICV was defined as the inter-member variability (standard deviation) of the multi-member ensemble (CSIRO or CESM1). As shown in Figure 12, the uncertainty related to ICV is comparable to that related to climate models. In addition, the ICV plays a significant role in drought impact studies, especially for southern China, where the SNR is smaller than 1. This further indicates the necessity of taking into account the ICV in climate change impact studies.  Overall, the results presented in Figures 11 and 12 indicate that the ICV contributes large uncertainty in drought projections and plays a significant role in climate change impacts on droughts. Thus, the impacts of ICV should be specifically considered in drought projections. Since the observed time series is usually not long enough to manifest the ICV, especially at the multi-decadal scale, the use of multi-member ensemble is an alternative solution. In fact, if multi-member ensembles are available for all climate models, it is ideal to use all of them to adequately represent the overall uncertainty from climate models and ICV [81,82]. However, it is computationally too expensive to run all climate models with multiple members. Since previous studies [54] have shown that the multimember ensemble of a climate model can reasonably represent the observed ICV, especially at the multi-decadal scale, when the ensemble member is larger than 5, a compromise solution may combine multi-model ensembles with multi-member ensemble for impact studies. This may be achieved by adding the ICV estimated using a single multi-member ensemble on the top of the climate change signal estimated using multi-model ensembles. This can be an avenue for future studies.
As mentioned in the Introduction, droughts have several categories; this study only considered meteorological drought because other types of droughts are basically derived from meteorological drought and especially abnormal precipitation conditions. In fact, the assessment of other types of droughts follows a similar framework. The differences mainly lie in the selection of hydrometeorological variables and criteria. In order to describe drought conditions more systematically, more drought indices from various hydro-meteorological variables should be conducted in future studies.
Moreover, drought duration and drought severity may be dependent on each other [26], but the univariate return periods/levels calculated in this study ignored this dependence. However, this dependence can be induced by using joint distributions for drought duration and severity, such as copula functions or conditional distributions. This will also be an avenue for future studies.

Conclusions
In this paper, two MMEs are employed to investigate the joint impacts of ICV and anthropogenic climate change on future meteorological drought conditions. Based on the large sample size provided by MMEs, the empirical distributions are able to be used in frequency analysis to depict drought hazards. The following conclusions can be drawn.
(1) Precipitation predictions simulated by CESM1 and CSIRO ensembles have considerable biases, while these biases can be effectively reduced by the bias correction method. (2) Based on bias-corrected CESM1 and CSIRO ensembles, northwestern China experiences mild drought conditions in the reference period, as indicated by low drought frequencies and hazards. Remarkably frequent droughts with relatively short drought duration and small Overall, the results presented in Figures 11 and 12 indicate that the ICV contributes large uncertainty in drought projections and plays a significant role in climate change impacts on droughts. Thus, the impacts of ICV should be specifically considered in drought projections. Since the observed time series is usually not long enough to manifest the ICV, especially at the multi-decadal scale, the use of multi-member ensemble is an alternative solution. In fact, if multi-member ensembles are available for all climate models, it is ideal to use all of them to adequately represent the overall uncertainty from climate models and ICV [81,82]. However, it is computationally too expensive to run all climate models with multiple members. Since previous studies [54] have shown that the multi-member ensemble of a climate model can reasonably represent the observed ICV, especially at the multi-decadal scale, when the ensemble member is larger than 5, a compromise solution may combine multi-model ensembles with multi-member ensemble for impact studies. This may be achieved by adding the ICV estimated using a single multi-member ensemble on the top of the climate change signal estimated using multi-model ensembles. This can be an avenue for future studies.
As mentioned in the Introduction, droughts have several categories; this study only considered meteorological drought because other types of droughts are basically derived from meteorological drought and especially abnormal precipitation conditions. In fact, the assessment of other types of droughts follows a similar framework. The differences mainly lie in the selection of hydro-meteorological variables and criteria. In order to describe drought conditions more systematically, more drought indices from various hydro-meteorological variables should be conducted in future studies.
Moreover, drought duration and drought severity may be dependent on each other [26], but the univariate return periods/levels calculated in this study ignored this dependence. However, this dependence can be induced by using joint distributions for drought duration and severity, such as copula functions or conditional distributions. This will also be an avenue for future studies.

Conclusions
In this paper, two MMEs are employed to investigate the joint impacts of ICV and anthropogenic climate change on future meteorological drought conditions. Based on the large sample size provided by MMEs, the empirical distributions are able to be used in frequency analysis to depict drought hazards. The following conclusions can be drawn.
(1) Precipitation predictions simulated by CESM1 and CSIRO ensembles have considerable biases, while these biases can be effectively reduced by the bias correction method. (2) Based on bias-corrected CESM1 and CSIRO ensembles, northwestern China experiences mild drought conditions in the reference period, as indicated by low drought frequencies and hazards.
Remarkably frequent droughts with relatively short drought duration and small drought severity are found in northeastern China. Southern China is subject to moderate drought frequencies but with long drought duration and large drought severity. (3) In the middle of the 21st century, CESM1 and CSIRO ensembles project decreases in the drought frequency and drought duration in northern and northeastern China, and increases in drought severity, resulting in shorter but heavier droughts. In contrast, drought conditions will significantly deteriorate in southwestern China, while drought frequencies will go up and the 30-and 50-year drought duration and severity will increase. (4) At the end of the 21st century, drought frequencies and drought hazards are projected to decrease in northern and, especially, northeastern China for both climate models. However, a large uncertainty related to climate models is observed when simulating drought conditions. For example, drought conditions will continue to deteriorate in southwestern China under the CSIRO, while they will greatly alleviate under the CESM1.