Drought Frequency Analysis Based on the Development of a Two-Variate Standardized Index (Rainfall-Runoff)

: Drought is one of the most drastic events, which has imposed irreparable damages on human societies and may occur in any climate regime. To define drought, given its properties of multidimensionality and randomity, one cannot rely on a single variable/index (e.g., precipitation, soil moisture, and runoff). Accordingly, implementing a novel approach, this study investigated drought events in two basins with different climatic regimes, using multivariate frequency analyses of drought duration, severity, and severity peak, based on developing a Two-variate Standardized Index (TSI). The index was developed based on the concept of copula, by applying rainfall-runoff data (1974–2019) and comparing them with two popular drought indices, the Standardized Precipitation Index (SPI) and Standardized Stream Flow Index (SSFI), in terms of derived drought characteristics. The results show that TSI determined more severe drought conditions with fewer return periods than SPI and SSFI in a specific drought event. This implies that the disadvantages of SPI and SSFI might not be found in TSI. The developed index can be employed by policymakers and planners to protect water resources


Introduction
According to the World Disaster Report, overall damages originated by drought were over 79.3 billion dollars from 2008 to 2017, worldwide.In terms of affecting areas and people, drought was in second place among natural disasters [1].Drought has many facets in different regions [2], but it always starts with a lack of precipitation, which can affect soil moisture [3], streams, groundwater, ecosystems, and human beings [4].Additionally, there are different types of drought, such as meteorological, agricultural, hydrological, and socioeconomic.Thus, a drought definition based on a single variable may not be sufficient and is not appropriate for reliable risk assessment and decision making [5,6].
Furthermore, there are uncertainties associated with random hydrological phenomena [7], such as floods and drought, which are multivariate functions.Therefore, to model these phenomena, descriptions and assessments in the multivariate forms are necessary.To meet the requirements of multivariate evaluation, the copula concept was introduced in 1959 for the first time [8].Copulas are mathematical functions by which joint probability distributions can be extracted, based on the marginal probability distributions assigned to dependent variables [9,10].Recently, researchers have widely used copula methods to develop multivariate indices and frequency analyses in various fields.
Copula structures for drought assessment and management were first used in 2006, when a joint distribution of drought duration and severity was constructed, and the corresponding return periods of drought were calculated [11].In a 2009 study in Iran, Severity-Duration-Frequency (SDF) curves for two cities, Anzali with a wet climate and Abadan with a semi-arid climate, were presented considering two severity and duration characteristics of drought derived based on SPI [12].In addition to SPI, a Multivariate Standardized Drought Index (MSDI) and Standardized Soil moisture Index (SSI) were used to analyze drought conditions [13].The authors developed an MSDI based on copula concepts, as well as rainfall and soil moisture variables for California and North Carolina, USA [13].
Further, in 2019, a nonstationary Standardized Streamflow Index was proposed to fit the streamflow series at Yangtze station in China.Time and a modified reservoir index were introduced as covariates to assess the effect of reservoir regulation.The copula method was applied for bivariate modeling of drought duration and severity, in which joint and conditional return periods were considered for drought risk assessment.The study indicated that the monthly streamflow at the station had significantly changed, and the stationary assumption was no longer valid, thus the drought severity was more severe when considering nonstationary properties [14].
In another study, a new Copula-based Drought Index (COPDI) was used for monthly precipitation and evapotranspiration data of Aksehir station in Konya Province, Turkey [15].The authors investigated meteorological and agricultural droughts together.Additionally, drought duration, severity, and peak intensity were analyzed, with multivariate and conditional return periods estimated [15].In the same year, a Nonparametric Multivariate Standardized Drought Index (NMSDI) was developed based on soil moisture and precipitation data in conjunction with copula functions [16].To investigate the bivariate return period of the NMSDI, two typical drought characteristics (severity and duration) at ten stations in Konya Closed Basin in Turkey were used [16].
In 2020, research was conducted to develop a drought Severity-Area-Frequency (SAF) curve with copula theory and to evaluate uncertainty due to different copula functions, copula parameter estimations, and copula input data [17].For this purpose, the precipitation and temperature data from nine meteorological stations within the Heihe River Basin in China were collected, and drought characteristics based on a 12-month Standardized Precipitation Evapotranspiration Index (SPEI) were derived.The results show that the uncertainty ranges caused by copula input data were larger than those originated by copula parameter estimations and copula functions.The study also highlighted the importance of uncertainty analysis in the frequency study [17].
Up to the present date, the procedure of studying drought using copula functions has progressed from deriving joint probability distributions of drought characteristics extracted from a single variable index, such as SPI.While previous studies have presented valuable information for drought analysis, a more conclusive understanding of drought conditions under the evaluation of uncertainty effects on frequency analysis is still needed.Thus, to better assess the behavior of meteorological and hydrological drought together, this paper aims at: (1) developing a Two-variate Standardized Index (TSI) from rainfall and runoff variables based on the model developed by Hao and Aghakouchak [13]; and (2) performing trivariate frequency analyses based on the developed index without considering uncertainty effects.To the best of the authors' knowledge, this is the first paper that investigates the trivariate return periods of meteorological-hydrological droughts using the TSI, a copula-based bivariate standardized index based on rainfall and runoff variables, for two basins with different climatic regimes.

Materials and Methods
In the following sections, the location of the study areas with the data pre-analysis are briefly described.Then, after presenting the development of a two-variate drought index using copula framework and comparing it with two well-known univariate indices, three mean drought properties are defined using the indices over a one-month time scale.Return periods are then calculated with the applied indices in Section (2).Finally, all numerical results are summarized and discussed in Section (3), as demonstrated schematically in Figure 1.

Study Area
In this study, the two basins of Rasht and Urmia in Iran were selected as the case study sites.The main reason to select these basins as the study area was their significantly different climate conditions.The average annual rainfalls in the Rasht and Urmia Basins were 1276 and 394 mm, respectively, during 1974-2019.The Rasht Basin lies within 49°30′-49°40′ E and 37°0′-37°20′ N, covering an area of 77.11 km 2 with wet climate conditions; the climate conditions of the Urmia Basin, located within 44°40′-45°0′ E and 37°20′-37°40′ N and covering a drainage area of 117.68 km 2 , is classified into cold and semi-arid.The monthly rainfall data used in this study covered 45 years from 1974 to 2019 and were obtained from the Rasht and Urmia climatology stations.
Given that rainfall and runoff are point and vector variables, respectively, to evaluate the direct impact of rainfall on runoff measurement, the hydrometric station should be very close to the meteorological station.Otherwise, other factors than rainfall may affect runoff, which in turn reduces the dependency of these two variates, and thus makes it impossible to use copulas in multivariate studies.Therefore, monthly runoff data for the 45-year period were obtained from the Water Organization Bridge hydrometric station on the Siahrood River in the Rasht Basin, and the Urmia hydrometric station on the Shahre Chai River in the Urmia Basin.In this study, to investigate the direct effect of any precipitation event on runoff, the authors assumed that all precipitation was in the form of rainfall, and its amount was the same throughout the basin.After applying the Mann-Kendall test, the included hydrometeorological data showed no trend, and therefore they were considered as homogeneous data.Figure 2 shows the locations of the study areas, along with their climatology and hydrometric stations.The average monthly rainfall and runoff values for the two basins can be found in Figure 3.

Two-Variate Standardized Index
Generally, rainfall fluctuates more than other variables such as soil moisture, runoff, and water table; there is always a different lag time for rainfall impact on hydrological parameters.Because of the rainfall effect on hydrological variables, it can be taken into account to be correlated.Given fluctuations with rainfall, drought assessment based only on rainfall may not be sufficient.As an example, during a specific period, conditions may not be identified as a drought state by considering rainfall, whereas, based on a hydrological variable, a drought state might be determined.Moreover, drought assessment based on a hydrological variable may not be appropriate, owing to rainfall variable properties.For instance, the rainfall variable usually determines drought conditions earlier than other variables.Therefore, rainfall and another hydrological parameter must work together, by developing a two-variate index, to precisely assess drought.The steps toward developing the index for the rainfall and runoff data of the Rasht Basin area are presented in the following sections.

Assessment of Dependency
As a significant coefficient to find a statistical relationship between two variables, the correlation coefficient has been widely applied in the literature [18,19].In this regard, several methods such as Pearson's classical correlation coefficient, Spearman's correlation coefficient, Kendall's tau, chi-plot, k-plot, and rank scatter plot are commonly used to assess the dependency between two random variables.
In this study, the above-mentioned methods and graphs were used to investigate the dependency between rainfall and runoff data.According to Table 1, Pearson's classical and Spearman's correlation coefficients were 0.65, which indicates a good correlation between the variables.Additionally, Kendall's tau was 0.465, which is greater than the value of τα/r at a 5% significance level.Thus, the independence hypothesis of the variables is rejected.
As shown in Figure 4a, data are within the dependent zone (between the curve and the 45° line), implying a dependency between rainfall and runoff variables.Figure 4b shows the dependency between rainfall and runoff data, since the data are out of the independent zone (red dotted zone).In Figure 4c, the variables have a relatively good correlation, since the data are approximately plotted on the diagonal line (Ri = Si).Therefore, it can be concluded that precipitation and runoff were relatively dependent.

Marginal Probability Distribution and Goodness-of-Fit
In this study, different cumulative distribution functions (CDF) were fitted to monthly rainfall and runoff data separately.In addition, the best CDF was employed to fit the accumulated data at any time scale (e.g., 1, 3, 6, 12, and 24 months).Gamma and log-normal distributions were selected as the most suitable distributions for rainfall and runoff variables, respectively.The maximumlikelihood method (ML) was used to estimate parameters of Gamma and log-normal distributions, resulting in the estimated parameters α = 0.0114 (the shape parameter) and β = 1.2244 (the scale parameter) for Gamma distribution and μ = 3.9717 and σ = 0.6110 for log-normal distribution.Figure 5 shows that the selected univariate models revealed satisfactory fitting between the simulation results and the observed values.According to the figure, consistency between the theoretical and empirical probability distributions can be obtained.The selected distributions are nearly located on the 45° line.The empirical probabilities were calculated using the Gringorten formula.The main goal of evaluating the fit of computed probability distributions to sample data is to measure the distance between an empirical distribution function and a theoretical distribution model.For goodness-of-fit test statistics, Kolmogorov-Smirnov (K-S) Dn, Cramer-von Mises Wn 2 , Anderson-Darling (A-D) An 2 , and a modified weighted Watson Un 2 were used [10].The computed sample values of Dn, Wn 2 , An 2 , and Un 2 for the selected distributions (gamma and log-normal) to fit the rainfall and runoff variables are given in Tables 2 and 3, respectively.Since all of the computed statistics were less than the corresponding critical values at various significance levels, gamma and log-normal were accepted as marginal distributions.Copulas are mathematical models and functions that are capable of modeling correlated structures from non-independent random variables.In other words, these functions determine a joint probability distribution based on the distributions fitted to each variable [10].Assuming that the X and Y variables are the precipitation and runoff variables, respectively, their joint probability distribution is in the following form [13,20,21]: where C is the copula function and F(X) and F(Y) are the cumulative distribution functions of precipitation and runoff data, respectively.For developing TSI, the cumulative probability (p) resulting from Equation (1) should be converted to a standardized normal distribution with zero mean and unit standard deviation, as shown in Equation ( 2) [13]: where Φ −1 is the inverse normal distribution.Standardization makes the classification of the index similar to a standardized index such as SPI.
To select the most fitted copula for extracting the joint probability distribution of rainfall and runoff variables, accuracy metrics including the Akaike Information Criterion (AIC), Root Mean Square Error (RMSE), and Nash-Sutcliffe error (NSE) were used (Equations ( 3)-( 5)) [22].
where n is the sample size; Pc(i) and Po(i) are the theoretical and empirical probabilities, respectively; k is the number of fitted parameters; ml is the value of maximum likelihood function; is the average of empirical values; and ( ) and ( ) are values related to empirical and theoretical models, respectively.Table 4 shows that student t copula, from meta-elliptical copulas family with freedom degree value and a correlation matrix of variables with modeling each pair of variables by its own correlation, is the most fitted copula, having the lowest AIC and RMSE, as well as the highest NSE.Thus, the index (TSI) is developed at different time scales using student t copula, along with Gamma and log-normal distributions, as the most fitted marginal distributions to rainfall and runoff variables, respectively.

Drought Frequency Analysis
Multivariate frequency analyses are usually implemented based on the characteristics of a phenomenon such as a drought.In this section, the process of multivariate frequency analyses for the Rasht Basin is described, and then results are presented for both Rasht and Urmia basins.

Drought Characteristics Extraction
A drought definition based on a random variable xt is essential to determine a threshold level x0.Drought occurs when the amount of a random variable is less than the definite threshold level.One can fix the threshold level or variable, which is often proportional to the normal conditions.In this research, the index below the threshold value of zero implies a drought event.
A time series of three drought characteristics, namely duration, severity, and severity peak, is shown in Figure 6.Duration is the time interval in which the random variable is less than the threshold level.Severity implies the sum of the deficiency of the variable from the threshold level within a drought event.Severity peak is the maximum deviation of the variable from the threshold level in each drought event [11,[23][24][25].Table 5 presents extracted drought characteristics based on three indices, SPI, SSFI, and TSI, for both basins.Based on the information given in the table, the fluctuations of SPI are more those of SSFI and TSI.Thus, the numbers of recorded drought events in this index are more than those of the other two; however, the maximums of other characteristics, including duration, severity, and severity peak for this index, are fewer than those of SSFI and TSI.For the TSI, the duration is the longest, and it also has the highest severity and severity peak.The most appropriate distributions for the drought characteristics of duration, severity, and severity peak are exponential, Weibull, and normal distributions, respectively.Additionally, the Gaussian copula has been selected as the most-fitted copula to extract a multivariate probability distribution for analyzing drought frequency.The results of goodness-of-fit tests indicate good fitting of the Gaussian copula, as well as the distributions allocated to drought characteristics at all levels of confidence.The Gaussian copula goodness-of-fit is presented in Table 6 6)-( 8), after assigning the most fitted marginal and joint probability distributions to the drought characteristics studied in this paper [11,26,27].
where D, S, and P parameters are drought duration, severity, and severity peak, respectively.According to Equation ( 6), the probabilities of D ≤ d, S ≤ s, or P ≤ p are equal to the estimated probability of the selected copula.
The probabilities of D ≤ d, S ≤ s, and P ≤ p are calculated using Equation ( 8).The probability of D ≤ d, if S ≥ s and P ≥ p, is determined based on Equation (9).Equation ( 10) is used to calculate the probability of D ≤ d and S ≤ s if P ≥ p.
The return period of the X variable with the non-exceedance probability distribution FX(x) is calculated by Equation (11).
where ʋ is the average time interval between the onsets of drought events [11,28].In Table 7, single and multivariate return periods are presented, with the second, third, and fourth columns representing the duration, severity, and severity peak corresponding to the return periods of the first column, respectively.The return periods of fifth to eighth columns are related to the values of the second to fourth columns and are calculated using Equations ( 12)- (15).
where ∨ is the return period of an event in which D ≥ d or S ≥ s or P ≥ p.Additionally, ∧ belongs to an event in which D ≥ d, S ≥ s, and P ≥ p.

Two-Variate Standardized Index
After describing the steps to develop the TSI in the previous section, the results are provided and discussed here.While the time series of SPI, SSFI, and TSI at different time scales were drawn from 1974 to 2019, to make the behavior of the indices easier to visualize and more understandable, only the period 2001-2011 was taken into account, as shown in Figure 7.Given the comparison of the TSI with the two indices of SPI and SSFI, it can be seen that the behavior of the TSI is a combination of the SPI and SSFI, which means that the TSI exhibits the drought onset and persistence similar to the SPI and SSFI, respectively.Additionally, since the TSI combines the information of both rainfall and runoff, it will lead to even more severe drought than the SPI or SSFI, e.g., the drought period of 2008-2010.
These findings are supported by Hao and Aghakouchak [13], who reported that the MSDI captures the drought before the SSI shows the drought onset.Moreover, the MSDI describes drought development and termination based on the SPI and SSFI, and the MSDI shows a more severe drought than the two other indices.These results are confirmed by Evkaya et al. [15], who developed a twovariate drought index to analyze meteorological-agricultural drought frequency.They indicated that the new index, the COPDI, has an attractive property that describes the drought development based on the state of both rainfall and potential evapotranspiration (PET), so that any temporal change in one variable does not influence the COPDI directly.They also stated that additional use of the COPDI, rather than a univariate drought index, can improve the quality of drought monitoring.
Given the comparison between the index developed in this study and the ones developed by Hao and Aghakouchak [13] and Evkaya et al. [15], all three indices have the same behavior.Generally, in all three indices, the determination of drought onset is based on the precipitation variable, and the estimation of the drought duration is based on a hydrological or agricultural variable.

Drought Frequency Analysis
Figure 8 demonstrates the multivariate return periods ∨ for three different values of severity peaks.As the value of a severity peak increases, the corresponding return period graph becomes more elongated, and its return period value increases.Figures 9 and 10 show the return period curves achieved using Equations ( 12) and (13).According to these figures, increasing the severity peak leads to an increase in the return period.In previous studies, multivariate frequency analyses on drought characteristics from singlevariable indices such as precipitation, runoff, etc. were usually performed, and less attention was paid to the drought characteristics of combined and multivariate indicators.For this purpose, in this study, in addition to the drought characteristics obtained by the SPI and SSFI, those extracted from the TSI were used.To compare the performance of the SPI, SSFI, and TSI indices in drought frequency analysis, the return periods of a specific drought event with a duration, severity, and severity peak equal to 8, 4.6, and 1.65, respectively, were estimated for the two basins.The results are presented in Table 8.According to the table, given both meteorological and hydrological droughts, the estimated return periods of the TSI for the two studied basins are less than the two other indices, i.e., the SPI and SSFI.The findings of the multivariate frequency analysis are supported by Wong et al. [8], Song and Singh [10], and Ma et al. [11], who derived multivariate and conditional return periods using copulas based on two or more drought characteristics.For example, Ma et al. [11] found that all of the univariate return periods are strictly between the two corresponding types of trivariate return periods, ∨ and ∧ .In addition, they determined that these derived multivariate joint and conditional probabilities of droughts provide useful information for water resources planning and management, by which one can evaluate the risks of malfunction for a specific water resources system and estimate the possible return periods for a certain situation in which a water-supply system fails to provide sufficient water.
To compare the drought properties of the two basins studied, Figure 13 was created based on the TSI performance.The figure shows that the severity value can be obtained based on a specific recurrence interval, severity peak, and maximum possible duration.Figure 13 also reveals that, in an event with the severity peak, specific return period, and maximum possible duration determined, the severity for the Urmia Basin is more than that of the Rasht Basin.Therefore, the Urmia Basin, located in a semi-arid region, experiences more severe droughts than the Rasht Basin, located in a humid region.
It is worth mentioning that this finding contradicts the results of a study by Shiau and Modarres [12].They found that, for the same duration and recurrence interval, due to highly fluctuating rainfall in Anzali, a humid region located in the north of Iran, the drought severity was greater, compared with its value in Abadan, a semi-arid region in the southwest of Iran.It seems to the authors that their results could have been more precise if they would have used a multivariate drought index, instead of the SPI, to consider other types of droughts, such as agricultural and hydrological droughts.Generally, the results of this study are in line with the results presented by Evkaya et al. [15], who developed a two-variate drought index to analyze meteorological-agricultural drought frequency; however, they did not consider hydrological droughts.In agreement with Table 5 of this study, their results show that the average values for drought duration (Dd), severity (Sd), and interval time (Ld) were higher for the copula-based drought index over a nine-month time scale (COPDI9), compared to those values of the standardized precipitation index over a nine-month time scale (SPI9) or the standardized evapotranspiration index over a nine-month time scale (SPEI9).Although this study considered the maximum values of drought characteristics instead of the average values, both studies indicated that drought characteristics values for a multivariate index were higher than those values of univariate indices.Another result of the addressed study, which is consistent with Table 7, showed the estimated return periods of the COPDI were fewer than those of the other two indices, the SPI and SPEI.
It should be noted that the present study does not consider uncertainty effects due to copula function and their input data on drought frequency analysis, which could provide more useful probability information of drought investigations by evaluating uncertainty effects, as Kang and Jiang [14], Li et al. [17], and Kwon et al. [29] did.

Summary and Conclusions
Due to the multidimensionality and inherent uncertainty of droughts, this study employed the copula concept to develop a Two-variate Standardized Index (TSI), based on rainfall and runoff variables.Then, multivariate frequency analyses were performed based on the extracted drought characteristics of the developed index, Standardized Precipitation Index (SPI), and the Standardized Stream Flow Index (SSFI).The following conclusions are drawn from this study:  The TSI behaves in a way similar to the combination of the SPI and SSFI, so that, in determining drought events, it performs similar to the SPI index, and, in identifying the persistence of drought, it functions similar to the SSDI.Additionally, in a particular event, the TSI records more severe droughts than the SPI or SSFI. The drought characteristics extracted from the SPI, SSFI, and TSI indices showed that the number of drought events recorded by the SPI were more than those of the TSI or SSFI, which can be attributed to the SPI's high fluctuations; however, the values for drought duration, severity, and severity peak were higher for the TSI compared to the results of the SPI or SSFI. For a given event, the estimated return periods of the TSI were less than those of the SPI or SSFI, and their values were in the following order: SPI > SSFI > TSI. The drought severity in the Urmia Basin for a certain recurrence interval and the same drought characteristics was more than that in Rasht Basin.In other words, the Urmia Basin experienced more severe droughts than the Rasht Basin. The copula functions' performance in multivariate modeling was satisfactory, and the probabilities obtained from these functions were consistent with the Gringorten experimental probabilities.
The results of this study are useful for designing, planning, and making decisions in water resource management.Moreover, the evaluation of effective nonstationary distribution models on drought frequency analyses under the conditions of climate change, land-use change, and river regulations seem to be a crucial topic for future studies.

Figure 4 .
Figure 4. Dependence of monthly rainfall and runoff data for the Rasht Basin using: (a) K-plot; (b) chi-plot; and (c) rank scatter plot.

Figure 8 .
Figure8demonstrates the multivariate return periods ∨ for three different values of severity peaks.As the value of a severity peak increases, the corresponding return period graph becomes more elongated, and its return period value increases.Figures9 and 10show the return period curves achieved using Equations (12) and (13).According to these figures, increasing the severity peak leads to an increase in the return period.Figures 11 and 12 also, respectively, illustrate the conditional probability curves and location of points, which have the same joint probabilities.

Figure 13 .
Figure 13.Corresponding drought severity with ∨ for maximum possible duration and severity peak = 3.

Table 1 .
Rainfall-runoff correlation coefficients for the Rasht Basin.

Table 2 .
The gamma distribution goodness-of-fit test for monthly rainfall data in the Rasht Basin.

Table 3 .
The log-normal distribution goodness-of-fit test for monthly runoff data in the Rasht Basin.Figure 5. Comparison of theoretical and empirical probabilities; Gamma and log-normal distribution fitted to monthly rainfall and runoff data, respectively, for the Rasht Basin.2.2.3.Developing the Two-variate Standardized Index (TSI)

Table 4 .
Fitting evaluation of copulas for the Rasht Basin.

Table 5 .
Drought characteristics of two basins studied.

Table 6 .
using Rosenblatt transformation.Gaussian copula goodness-of-fit test for the Rasht Basin.

Table 7 .
Single and multivariate return periods of drought based on one-month TSI in the Rasht Basin.