Next Article in Journal
Impact of the Municipal Solid Waste Łubna Landfill on Environmental Pollution by Heavy Metals
Previous Article in Journal
Survey of Aquaponics in Europe

Water 2016, 8(10), 426; https://doi.org/10.3390/w8100426

Article
Conditional Copula-Based Spatial–Temporal Drought Characteristics Analysis—A Case Study over Turkey
1
Department of Civil Engineering, Middle East Technical University, Ankara 06800, Turkey
2
Department of Civil Engineering, Near East University, Mersin 99138, Turkish Republic of North Cyprus
*
Author to whom correspondence should be addressed.
Academic Editors: Y. Jun Xu and Miklas Scholz
Received: 1 July 2016 / Accepted: 15 September 2016 / Published: 21 October 2016

Abstract

:
In this study, commonly used copula functions belonging to Archimedean and Elliptical families are fitted to the univariate cumulative distribution functions (CDF) of the drought characteristics duration ( LD ), average severity ( S ¯ ), and average areal extent ( A ¯ ) of droughts obtained using standardized precipitation index (SPI) between 1960 and 2013 over Ankara, Turkey. Probabilistic modeling of drought characteristics with seven different fitted copula functions and their comparisons with independently estimated empirical joint distributions show normal copula links drought characteristics better than other copula functions. On average, droughts occur with an average LD of 6.9 months, S ¯ of 0.94, and A ¯ of 73%, while such a drought event happens on average once in every 6.65 years. Results also show a very strong and statistically significant relation between S ¯ and A ¯ , and drought return periods are more sensitive to the unconditioned drought characteristic, while return periods decrease by adding additional variables to the analysis (i.e., trivariate drought analysis compared to bivariate).
Keywords:
copula; SPI; drought; multivariate; spatial; temporal

1. Introduction

Droughts are a climatic phenomenon that may impact large and small regions alike for long or short time periods and influence almost all aspects of society [1]. Historically, droughts have had negative effects on domestic and agricultural water supply needs [2] and consequently altered the behavior of ecosystems [3,4]. Plans, designs, and operational applications for water resources systems, therefore, require accurate estimation of different drought characteristics, like duration ( LD ), average severity ( S ¯ ), average areal extent ( A ¯ ), and re-occurrence rate under various scenarios.
Drought characteristics have been analyzed using various methods that have evolved over time. Earlier studies investigated the analytical or stochastic nature of past droughts using runs theory. Based on this theory, observed time series of drought-related variables are divided into wet and dry periods (positive and negative runs, respectively; see Figure 1) depending on a given threshold or critical level [5,6]. Later studies focused more on probabilistic drought estimation using ground station-based datasets [7,8,9]. The scarcity of ground stations and short recording lengths motivated the following studies to use synthetic datasets to analyze the frequencies of drought events [10,11]. Since the spatial–temporal relationships among drought characteristics are complex, recent studies have diversified their efforts by analyzing the return periods of droughts using various S ¯ or A ¯ scenarios [12,13,14].
The copula approach [15] provides an ideal test bed in which to analyze multivariate probabilistic problems without making explicit assumptions about the marginal or joint distributions of the variables involved in the estimation problem. Perhaps this ability has motivated studies to implement a multivariate probabilistic approach via copula functions to solve many different problems in hydrologic science and water resources-related applications [16,17,18,19].
Given that drought return periods can be immediately formulated in a probabilistic framework, copula-related methods have the necessary utility required to investigate different drought characteristics. Shiau [13] used bivariate copula functions to investigate bivariate drought analysis, including joint probabilities and return periods. Song and Singh [20] expanded this drought frequency analysis from bivariate to trivariate via a new method that constructs trivariate copulas to describe the joint distribution function of the temporal characteristics of meteorological drought events. Recently, Xu et al. [14] used a trivariate drought identification method within the copula approach to calculate the joint probability and return period of different drought spatial–temporal characteristic pairs. Overall, these studies demonstrated the advantages of the copula approach on bivariate and trivariate modeling of drought characteristics.
Certain characteristics of drought may precede others. For example, droughts with longer durations may tend to have higher maximum severity, or it may take longer to dissipate severe droughts than droughts with moderate severity. Such dependencies between drought characteristics can be immediately studied in a conditional copula framework, particularly when the occurrence of certain characteristics signals the occurrence of other characteristic functions in multivariate problems. Even though joint copula functions have been implemented in many studies, the implementation of conditional copulas remained limited in the literature [14,21]; hence there is still more room for more conditional copula-based applications to fully comprehend the utility of these functions.
Drought characteristics could be profoundly linked both in time and space. Accordingly, knowledge (i.e., observations) of some drought characteristics in time or space may help us predict others. This temporal and/or spatial dependency between different drought characteristics may immediately be exploited using copula functions, particularly using conditional probabilities. Even though spatial–temporal dependencies of drought characteristics have been investigated before using copula functions [22,23,24], such investigations have not been carried out in a conditional framework before.
The main objective of this study is to implement a copula-based methodology to analyze the joint and conditional dependency among various spatial–temporal characteristics of droughts, including LD , S ¯ , and A ¯ . The analysis utilizes univariate cumulative distribution functions (CDF) of drought characteristics obtained over Ankara, Turkey to establish joint and conditional relations between them, using bivariate and trivariate copula functions to calculate expected return periods of droughts with different characteristics scenarios.

2. Materials and Methods

In the following section, copula functions and their application to probabilistic drought characterizations are explained; later the use of bivariate and trivariate functions (i.e., the use of return periods) in drought analysis is elaborated.

2.1. Copula Functions

Copula functions link the univariate CDFs of random variables with their joint CDFs [13,15]. Estimations of these univariate CDFs are trivial and the sample datasets are often sufficient, while the analytical solution for the joint CDFs is not immediately clear. At this point, copula functions link univariate CDFs and create their multivariate (e.g., bivariate or trivariate) CDFs.
Based on Sklar’s theorem, bivariate and trivariate joint CDFs denoted as F ( u , v ) and F ( u , v , w ) , respectively, can be obtained in the form:
F ( u , v ) = C ( F ( u ) , F ( v ) ) ,
F ( u , v , w ) = C ( F ( u ) , F ( v ) , F ( w ) ) ,
where u , v , and w are random variables (e.g., LD , S ¯ , and A ¯ ); C (   ) is the copula function; and univariate CDFs F ( u ) , F ( v ) , and F ( w ) are inputs to these function. Here F ( u ) , F ( v ) , and F ( w ) are the univariate CDFs of u , v , and w respectively, and they are defined as
F ( u ) = P ( U u ) ,
F ( v ) = P ( V v ) ,
F ( w ) = P ( W w ) ,
where P ( U u ) , P ( V v ) , and P ( W w ) are the probabilities that random variables U , V , and W (i.e., different drought characteristics) takes values smaller than u , v , and w.
In Equations (1) and (2), the copula functions link the univariate CDF to bivariate and trivariate joint CDFs ( F ( u , v ) and F ( u , v , w ) ). Using these joint CDFs, bivariate and trivariate conditional CDFs can be found by utilizing the multiplication rule:
F ( u | v ) = F ( u , v ) F ( v ) ,
F ( u , v | w ) = F ( u , v , w ) F ( w ) .
In Equation (7), F ( u , v | w ) is given as an example of the trivariate CDF, while many other trivariate CDF (e.g., F ( u | v , w ) , F ( v | u , w ) , etc.) can be found in a similar fashion.
In this study, the bivariate (Equation (1)) and trivariate (Equation (2)) joint CDF of LD , S ¯ , and A ¯ are obtained using seven widely used copula functions belonging to Archimedean and Elliptical copula families. These copula functions and their parameter spaces are shown in Table 1. For further details about these copula functions, see the studies of Joe [25], Shiau [13], and Nelsen [26].
The required copula dependence parameters ( θ and ρ ) for the seven copula functions are separately estimated for each copula function using the inference function for margins (IFM) method [25]. This method involves estimating F ( u ) and F ( v ) for the variables and then estimating the θ or ρ parameter that maximizes the Kendall’s tau correlation using these univariate CDFs. These obtained parameters are later used to make copula function-based joint CDF predictions, which are referred to as “theoretical joint CDF estimation” ( C ( u , v ) t ) in this study.
The above described copula function-based C ( u , v ) t values are later independently validated using “empirical joint CDF estimates” ( C ( u , v ) e ) obtained using rank-based joint distribution estimates obtained via the copula package [27] in the R programming language [28]. Datasets are split for parameter estimation and validation with the leave-one-out cross-validation method to avoid over-fitting. Validation efforts include calculation of root mean square error (RMSE) using C ( u , v ) t and C ( u , v ) e :
RMSE = i = 1 n ( C ( u , v ) t C ( u , v ) e ) 2 n ,
where n is the length of the drought characteristic datasets. Among the seven copula functions, the one that yields the lowest RMSE value is later selected for further estimation of bivariate and trivariate return periods.

2.2. Return Periods

The expected value of the drought inter-arrival time ( DIT ; the time between two successive drought events) is one of the most commonly estimated drought characteristics in hydrological and water resources systems applications. T may be estimated for univariate (e.g., LD 10 months) as well as multivariate (e.g., LD 10 & S ¯ 1.0 ; or LD 10 | S ¯ 1.0 ) scenarios. Shiau and Shen [29] theoretically derived T for drought events with univariate drought characteristics (e.g., LD , S ¯ , and A ¯ ) as a function of the expected DIT in the form
T U u = E ( DIT ) P ( U u ) ,
where E ( DIT ) is the expected value of DIT ; and T U u is the return period of a drought event, defined by the variable u . E ( DIT ) is calculated as the ratio of the duration of entire study to the total number of droughts. Following Shiau [13], it is also possible to expand this univariate T estimation analysis to a bivariate case by characterizing events with bivariate joint or conditional probabilities in the forms:
T ( U u , V v ) = E ( DIT ) P ( U u , V v )
T ( U u   or   V v ) = E ( DIT ) P ( U u   or   V v )
T ( U u | V v ) = E ( DIT ) P ( U u , | V v ) .
Similarly, the trivariate case can be obtained in the forms:
T ( U u , V v , W w ) = E ( DIT ) P ( U u , V v , W w )
T ( U u   or   V v   or   W w ) = E ( DIT ) P ( U u   or   V v   or   W w )
T ( U u | ( V v , W w ) ) = E ( DIT ) P ( U u | ( V v , W w ) ) ,
where the P ( ) values in the denominators of Equations (10)–(12) are the bivariate, and of Equations (13)–(15) are the multivariate joint probabilities of drought characteristics u , v , and w ; and T ( ) values on the left-hand side of Equations (10)–(15) are the corresponding return periods. By definition, the P ( ) terms in Equations (9)–(15) can be found using the CDFs of the related drought characteristics. Below are some examples of univariate, bivariate, and trivariate cases:
P ( U u ) = 1 F U ( u )
P ( U u | V v ) = 1 F U , V ( u | v )
P ( U u , V v , W w ) = 1 F U , V , W ( u , v , w ) .
Accordingly, the bivariate probability P ( ) terms in Equations (10)–(12) can be found using the univariate CDF estimates and the bivariate copula functions in the forms:
P ( U u , V v ) = 1 F U ( u ) F V ( v ) + C ( F U ( u ) , F V ( v ) )
P ( U u   or   V v ) = 1 C ( F U ( u ) , F V ( v ) )
P ( U u | V v ) = P ( U u , V v ) P ( V v ) = 1 F U ( u ) F V ( v ) + C ( F U ( u ) , F V ( v ) ) 1 F V ( v ) .
The trivariate probabilities in Equations (13)–(15) can be found using the univariate CDF estimates, and bivariate and trivariate copula functions. Below is an example of a trivariate conditional case, while many other forms could be obtained in a similar fashion:
P ( U u | ( V v , W w ) ) = P ( U u , V v , W w ) P ( V v , W w ) ,
where the bivariate and trivariate joint probability terms on the right-hand side can be expressed in terms of copula functions. Below are some examples of such expressions linking multivariate probabilistic terms with copula functions:
P ( V v , W w ) = C ( F V ( v ) , F W ( w ) )
P ( V v , W w ) = 1 F V ( v ) F W ( w ) + C ( F V ( v ) , F W ( w ) )
P ( U u , V v , W w ) = C ( F U ( u ) , F V ( v ) , F W ( w ) )
P ( U u , V v , W w ) = 1 F U ( u ) F V ( v ) F W ( w ) + C ( F U ( u ) , F V ( v ) ) + C ( F U ( u ) , F W ( w ) ) + C ( F V ( v ) , F W ( w ) ) C ( F U ( u ) , F V ( v ) , F W ( w ) )
P ( U u   or   V v   or   W w ) = 1 C ( F U ( u ) , F V ( v ) , F W ( w ) ) .
Equations (9)–(15) describe how T values are estimated for univariate (9), bivariate (10)–(12), and trivariate (13)–(15) cases. These equations are flexible enough to accommodate investigations of various spatial–temporal drought characteristics like LD , S ¯ , and A ¯ (e.g., T of a drought event with LD 10   |   S ¯ > 0.80   &   A ¯ > 0.85 can be calculated). Once the multivariate joint probabilities of the drought characteristics are determined through these copula functions, T for these drought characteristics can be determined by plugging these probability estimates into Equations (10)–(15).

2.3. Standardized Precipitation Index (SPI)

In this study, SPI [30] is selected to analyze drought events. It is worth noting that this selection is arbitrary and inconsequential, while other drought indices (e.g., Percent of Normal, Palmer Drought Severity Index, Crop Moisture Index, Surface Water Supply Index, Reclamation Drought Index, etc.) could have been used as well.
SPI can be calculated for different time scales (i.e., 3, 6, 12, etc.), while the obtained time series (SPI-3, SPI-6, SPI-12, etc.) are used as drought indicators in applications with different goals. From a drought management point of view, managers may prefer SPI-3 as an indicator to monitor short-term anomalies, SPI-12 to monitor persistent events, or SPI-6 to have a flavor of both short-term anomalies and persistent events [31]. Accordingly, in this study SPI-6 is selected for the drought analysis to quantify the deficits of precipitation. For this purpose, all of the monthly precipitation time series are stored in a matrix P s , m , where s refers to each station and m refers to each month. Later six-month moving-window precipitation sums ( PT s , m ) are obtained in the form:
PT s , m = Σ m 5 m P s , m .
These PT s , m values are spatially averaged to obtain representative values for the entire study area in the form:
PT m = s = 1 D P s , m ,
where D is the total number of stations. Following McKee et al. [30], these PT s , m and PT m time series are later fitted to gamma distribution, where the probability of any given precipitation value is obtained using this fitted gamma distribution. These probability estimates obtained using PT s , m and PT m are later converted into SPI s , m and SPI m values, respectively, using the inverse of the standard normal distribution. Here, SPI s , m values are calculated for each station and month separately, while SPI m is representative for the entire study area and calculated for each month.

2.4. Drought Characteristics

LD , S ¯ , and A ¯ are the three drought characteristics considered in this study. The definition of the dry and the non-dry periods during the study period is made using SPI m time series. Following McKee et al. [30], the dry periods are defined to have continuously negative SPI values and have at least one SPI value below −1.0 (e.g., the time series (−0.23, −0.76, −1.15, −0.52, −0.05) constitute a drought event, while the time series (−0.23, −0.76, −0.99, −0.52, −0.05) does not). Accordingly, the periods that are not considered as dry are considered “non-dry” periods. LD and the length of non-dry periods ( LnD ) values are calculated as:
LD = ( t de t ds ) + 1
LnD = ( t nde t nds ) + 1 ,
where t ds and t de are the months a particular drought period started and ended, respectively; and t nds and t nde are the months a particular non-dry period started and ended, respectively. DIT is obtained as
DIT = LD + LnD .
Even though SPI s , m and SPI m values are available for each month, drought characteristic values ( LD , S ¯ , and A ¯ ) are calculated for each drought period. Among these drought characteristics, LD values are calculated using SPI m time series. For each drought period, S ¯ is calculated as:
S ¯ = | 1 LD m = t ds t de SPI m | .
Separate drought definitions are made for the region and the stations using SPI m and SPI s , m values, respectively. If the given region is under drought conditions, then individual stations are assessed as to whether or not they individually satisfy the drought conditions using SPI s , m values. Then the number of months of each station that are under drought condition during the regional drought period is calculated. The fraction of station-based drought conditions compared to the duration of regional drought length is defined as A ¯ . If the SPI values for a particular month are only available for four stations, then A ¯ is calculated as the fraction of these four stations that are under drought considering the length of the particular drought event. Once drought characteristics LD , S ¯ , and A ¯ are obtained for each drought period separately, the univariate, bivariate, and trivariate T values of these characteristics are calculated using Equations (10)–(15).

2.5. Drought Frequency and Visual Analysis

The visual presentation of the multivariate CDFs of the drought characteristics is instructive for building knowledge about the drought events. However, such analysis requires long datasets for visually presentable plot generation. In the absence of a sufficiently long time series, it is trivial to synthetically generate sufficient datasets, while such simulations require the distribution of the particular drought characteristic ( LD , S ¯ , or A ¯ ) to be known in advance. In many cases, observed data do not follow a specific distribution, and their distributions likely vary depending on the variables and time series [21]. In this study, LD , S ¯ , and A ¯ are respectively fitted to various distributions (the seven advanced with variable support, 23 non-negative, and eight bounded available distributions; Table 2) in the EasyFit program [32] to find the distribution that resulted in the best fit to the CDF of the observed drought characteristics data. The best CDF fit is separately found for each drought characteristic using the Maximum Likelihood method [33] by optimizing the Kolmogrov–Smirnov index [34] within the EasyFit program. The distributions that resulted in the best CDF fits are later used to synthetically generate sufficiently long datasets. Here it is stressed that these synthetically generated datasets are only used for visual representations but not in other analysis.

2.6. Study Area and the Datasets

Analysis of drought events is very critical over arid and semi-arid regions, like Ankara, Turkey, where water management projects require knowledge of the water status during a particular drought period. Among the different types of droughts (meteorological, hydrological, agricultural, and socioeconomic), this study focuses on the analysis of the spatial–temporal characteristics of meteorological droughts using monthly precipitation records obtained over six meteorological stations (Beypazari, Esenboga, Kecioren, Kizilcahamam, Nallihan, and Polatli) distributed over Ankara, Turkey (Figure 2). These stations are maintained and the datasets are distributed by The General Directorate of the Turkish State Meteorological Service (TSMS). Precipitation datasets over these six stations between January 1960 and February 2014 are obtained from TSMS. Precipitation time series for the first 60 months and the last two months for Nallihan and Esenboga stations, respectively (out of a total of 650 months). These missing data are ignored for these missing periods, while they constitute only <2% of the total datasets. More information about these stations can be found in the study of Sonmez [35].

3. Results and Discussion

The drought characteristics LD , S ¯ , and A ¯ were calculated for each drought period separately (Table 3). Accordingly, higher S ¯ values indicate more severe drought conditions and values closer to 0 indicate near to normal conditions. For example, the drought event starting in July 1985 and ending in January 1986 (Table 3) has an LD of seven months, S ¯ of 1.05, and A ¯ of 0.74. Here S ¯ is calculated as the absolute value of the average of seven SPI values: | ( 0.28 + 1.04 + 1.50 + 1.42 + 1.60 + 1.33 + 0.17 ) / 7 | = 1.05 ; and A ¯ is calculated as the fraction of total months under drought condition (i.e., the regional average SPI time series show the drought duration is seven months, while the six separate SPI time series over each station show 31 months are under drought conditions out of a total of six stations × seven months = 42 months, implying A ¯ = 31 / 42 = 0.74 ). Overall results show that drought conditions influenced the study area during 39% of the total study period (summation of all LD /650 months), while these drought events, on average, have S ¯ of 0.94, last for 6.92 months, and impact 73% of the area covered by the stations (Table 3).
Droughts tend to start during drier months (July–September) and end during wetter winter months (November–February). This result is expected as SPI-6 considers six-monthly accumulated precipitation values (Equation (28)): SPI-6 values for July–September are calculated using relatively drier months and values for November–February are calculated using relatively wetter months, where the summer and the winter months are climatologically dry and wet, respectively. This systematic pattern is consistent with the drought definition of many water resources managers who are interested in the lack of water. Removal of this strong SPI-6 seasonality may result in reduced SPI-6 sensitivity to actual water deficit (i.e., anomalies obtained during drier months may not necessarily mean the same “water deficit” during wetter months), hence SPI-6 seasonality is retained in this study.
Strong linear dependence is found between S ¯ and A ¯ (correlations are statistically significant at 95% confidence level), while the relationships between other drought characteristics are much weaker (Table 4). This implies that predictability studies involving S ¯ and A ¯ may contain high predictive skill, while particularly LD prediction using LD S ¯ or LD A ¯ relations may not yield accurate results. Here, drought characteristics LD and A ¯ have boundaries (0 and 1.0, respectively); however, there are not many values close to these boundary values, hence the use of correlation statistics may not be problematic assuming these drought characteristics are approximately normal.
The univariate CDFs of drought characteristics are necessary to fit the copula functions [ C ( u , v ) t ], so the best fitting copula function can be used for further analysis. Here, the best-fitting copula is obtained after performing validation efforts utilizing C ( u , v ) e as the truth. For the purpose of fitting copulas, univariate empirical CDF values are used while the theoretical univariate CDF values are used in the synthetic simulations described below (Figure 3). Among copula functions, by definition, the nature of the AMH requires Kendall’s tau between variables to be bounded between −0.18 and 0.33 [27]. Since the S ¯ A ¯ pair does not satisfy this requirement (Table 4), an AMH function could not be fitted to this pair. Leave-one-out type validation of copula parameters show that the elliptical copula family results in better performance in bivariate and trivariate joint CDF estimation analysis than other copula families, while normal copula seems to have the best overall performance (Table 5).
Analysis of drought period counts show a drought event is expected on average every 18 months (i.e., E ( DIT ) = 650   months / 37   droughts ). Univariate drought T calculations using the univariate CDF of variables (Figure 3) show on average drought events with LD 6.92 are expected every 4.15 years, S ¯ 0.94 every 3.06 years, and A ¯ 0.73 every 2.53 years (Table 6). However, in many cases these univariate T estimates may not be sufficient for various drought investigations requiring bivariate or trivariate T information. These multivariate drought T values are calculated using the best fitting copula functions for the bivariate and the trivariate cases. As an example, T is calculated for the bivariate case as 7.13 years for the drought event with S ¯ 0.94 and LD 6.92 ; as 3.41 years for a drought event with A ¯ 0.73 and S ¯ 0.94 ; and for the trivariate case T is calculated as 6.65 years for a drought event with S ¯ 0.94 , A ¯ 0.73 , and LD 6.92 .
The examples given above could be reproduced for infinitely many drought characteristic pairs or triplets. However, plots showing their multivariate variations often are necessary to understand the exact nature of their multivariate relations. However, to generate such plots the lengths of the available 37 drought events are not sufficient. Instead, the drought characteristics are fitted to different distributions and the best fitting distributions are later used to generate synthetic drought characteristic datasets that are used to generate multivariate T plots (Figure 4 and Figure 5). Parameters result in the smallest Kolmogorov–Smirnov statistics (Table 7) are used to generate the relevant synthetic datasets. Results show log-logistic, normal, and logistic distributions result in the best fit (i.e., lowest errors) for the drought characteristics of LD , S ¯ , and A ¯ , respectively (Table 7). Hence, these distributions are used to create synthetic LD , S ¯ , and A ¯ data.
Three drought characteristics ( LD , S ¯ , and A ¯ ) are considered in this study, hence three bivariate drought characteristic pairs ( S ¯ LD , A ¯ LD , and A ¯ S ¯ ) are analyzed. Additionally, three bivariate drought operators (“and”, “or”, “conditional”) that are suited for various drought applications are also considered. Accordingly, a total of nine bivariate T maps were obtained for visual presentation efforts (Figure 4). In Figure 4, the left, middle, and right columns show “and”, “or”, and “conditional” scenarios, respectively; the top, middle, and bottom rows show the S ¯ LD , A ¯ LD , and A ¯ –S pairs, respectively.
The “and” scenario shows that the bivariate T has a higher sensitivity to LD than S ¯ or A ¯ : When LD is included in the bivariate T maps (left middle and top panels), T values are more impacted by LD than A ¯ or S ¯ , particularly for higher values of LD , while the sensitivity difference between A ¯ S ¯ pairs is marginal. If a line is drawn from the lower left corner to the upper right corner of the A ¯ S ¯ bivariate T figure (lower-left panel of Figure 4), then the area to the left of this line shows the region where T is more sensitive to changes in A ¯ than S ¯ while the area on the right side of the line shows the region where T is more sensitive to changes in S ¯ than A ¯ . These results suggest that the overall sensitivity of T to these drought characteristics can be compared via LD > A ¯ S ¯ . On average, bivariate T values are much higher, particularly for higher values of LD ( LD > 10 ). Similar to the “and” case, the “or” case also show pareto front type plots but the pareto front direction is reversed (i.e., the middle column): on average almost one in 2.33 years a mild drought event with “ A ¯ 0.73 or S ¯ 0.94 ” or “ LD 6.91   or   S ¯ 0.94 ” is observed. As expected, the average T value for the “or” case is much lower than for the “and” case as the probability of an “or” case is higher than an “and” case. In the case of “conditional” drought T plots, the T values are most sensitive to the unconditioned value rather than the given observation (i.e., the lines are vertical), while this sensitivity decreases as the drought impact increases (higher LD , S ¯ , or A ¯ ). On average, the dynamic range of T values for conditional case are more similar to the “or” case, particularly for less severe drought events (e.g., for the bivariate case they range between 1.5 and 42 years). These conditional T values could be particularly useful if certain observations could be obtained. For example, if S ¯ 0.94 , then T values range between 1.86 and 9.95 years for 5 LD 10 (Figure 4, top right panel).
Figure 5 shows that the trivariate T values are conditioned on both S ¯ and A ¯ . Given that it would be harder to digest 3D plots, trivariate cases are plotted for four separate A ¯ values (0.65, 0.75, 0.85, and 0.95). Similar to conditional bivariate plots, conditional trivariate plots also show that T values are sensitive to the unconditioned variables (i.e., the plots are perpendicular to the unconditioned variable). Additionally, the conditional trivariate contour plots show that when additional observations are available compared to the bivariate case, then T values decrease (i.e., if A ¯ 0.65 , it is more likely that drought events with longer duration will happen). For example, the bivariate conditional plot for LD | S ¯ 0.7 shows T values ranging between 1.95 and 42.04, while the trivariate conditional plot for LD | S ¯ 0.7 ,   A ¯ 0.95 shows T values ranging between 2.05 and 13.69. The conditional T values have much higher sensitivity to LD when A ¯ 0.75 than when A ¯ 0.95 (Figure 5); for greater A ¯ values, the return periods are much smaller for conditional LD compared to smaller A ¯ values.

4. Conclusions

Droughts have many different spatial–temporal characteristics; hence, a better way to describe them is to use joint probability-based functions, rather than building drought-related analyses on univariate datasets as joint distribution of these drought characteristics often do not follow a particular known distribution. The current study offers a methodology to analyze the spatial–temporal multivariate aspect of droughts via copula functions in a probabilistic approach.
The results show that, among the drought characteristics, S ¯ and A ¯ are found to have significant linear dependence, implying their predictions using each other may yield reasonable forecasts, while the same may not be true for other relationships including LD . On the other hand, bivariate T is more sensitive to variability in LD than S ¯ or A ¯ , implying that predictions of bivariate T could be more accurate with the presence of LD than S ¯ or A ¯ . Nevertheless, observations of S ¯ or A ¯ narrow the variability window of trivariate T predictions, particularly for higher values of S ¯ or A ¯ ; implying that S ¯ or A ¯ observations have utility in predictions of T even though such predictions are more sensitive to variability in LD .
The flexibility of T calculation for different multivariate drought scenarios clearly illustrates the power of the introduced methodology. Additionally, certain drought characteristics may precede others, hence such observed drought characteristics may prove very valuable when making inferences about non-observed drought characteristics, while such relations can be easily studied using the multivariate conditional copula methodology presented in this study. In particular, trivariate drought analysis scenarios could prove very helpful in estimating a particular drought characteristic given availability of other drought characteristic info.
Drought characteristics often have dataset-specific distributions. Even though elliptical family-based copula functions yielded better performance in this study, other studies found that different copula families perform better [13,20]. This implies that the optimality of the copula functions could be dataset- and case-specific; hence inter-comparisons of copula functions could be necessary before estimation of return periods in other studies. However, this result should be confirmed via independent studies before it can be generalized for copula function-related drought analyses.
In this study, ground station-based datasets are utilized in all of the analyses. While they provide the longest datasets over the most locations, these station-based datasets are spatially limited, often contain gaps in their time series, and may have user-dependent data quality. On the other hand, remote sensing-based datasets prove very powerful alternatives to station-based datasets as these spatially and temporally consistent datasets are available everywhere, including remote locations where installation of stations is not economically feasible. Therefore, remotely sensed datasets could be recommended as alternative reanalysis products to be used as such model datasets may provide consistent and sufficiently long time series for drought analysis.

Acknowledgments

The authors thank the anonymous reviewers for their constructive comments and the Turkish State Meteorological Service for precipitation datasets. This research was supported by the EU’s Marie Curie Seventh Framework Programme FP7-PEOPLE-2013-CIG project number 630110 and the Scientific and Technological Research Council of Turkey (TUBITAK) Grant 3501, project number 114Y676.

Author Contributions

Ali Unal Sorman designed the experiments, provided the data, and contributed to the analysis tools; Mahdi Hesami Afshar and Mustafa Tugrul Yilmaz designed the experiments, analyzed the data, contributed to the analysis tools, and wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tadesse, T.; Wilhite, D.A.; Harms, S.K.; Hayes, M.J.; Goddard, S. Drought monitoring using data mining techniques: A case study for nebraska, USA. Nat. Hazards 2004, 33, 137–159. [Google Scholar] [CrossRef]
  2. Zhong, S.; Shen, L.; Sha, J.; Okiyama, M.; Tokunaga, S.; Liu, L.; Yan, J. Assessing the water parallel pricing system against drought in China: A study based on a CGE model with multi-provincial irrigation water. Water 2015, 7, 3431–3465. [Google Scholar] [CrossRef]
  3. Avilés, A.; Célleri, R.; Solera, A.; Paredes, J. Probabilistic forecasting of drought events using markov chain- and bayesian network-based models: A case study of an andean regulated river basin. Water 2016, 8, 37. [Google Scholar] [CrossRef]
  4. Karavitis, C.A.; Alexandris, S.; Tsesmelis, D.E.; Athanasopoulos, G. Application of the standardized precipitation index (SPI) in Greece. Water 2011, 3, 787–805. [Google Scholar] [CrossRef]
  5. Guerrero-Salazar, P.; Yevjevich, V. Analysis of Drought Characteristics by the Theory of Runs; Colorado State University: Fort Collins, CO, USA, 1975. [Google Scholar]
  6. Yevjevich, V. An Objective Approach to Definitions and Investigations of Continental Hydrologic Droughts; Colorado State University: Fort Collins, CO, USA, 1967. [Google Scholar]
  7. Clausen, B.; Pearson, C. Regional frequency analysis of annual maximum streamflow drought. J. Hydrol. 1995, 173, 111–130. [Google Scholar] [CrossRef]
  8. Rossi, G.; Benedini, M.; Tsakiris, G.; Giakoumakis, S. On regional drought estimation and analysis. Water Resour. Manag. 1992, 6, 249–277. [Google Scholar] [CrossRef]
  9. Şen, Z. Regional drought and flood frequency analysis: Theoretical consideration. J. Hydrol. 1980, 46, 265–279. [Google Scholar] [CrossRef]
  10. Henriques, A.; Santos, M. Regional drought distribution model. Phys. Chem. Earth B Hydrol. Oceans Atmos. 1999, 24, 19–22. [Google Scholar] [CrossRef]
  11. Pelletier, J.D.; Turcotte, D.L. Long-range persistence in climatological and hydrological time series: Analysis, modeling and application to drought hazard assessment. J. Hydrol. 1997, 203, 198–208. [Google Scholar] [CrossRef]
  12. Serinaldi, F.; Bonaccorso, B.; Cancelliere, A.; Grimaldi, S. Probabilistic characterization of drought properties through copulas. Phys. Chem. Earth A/B/C 2009, 34, 596–605. [Google Scholar] [CrossRef]
  13. Shiau, J. Fitting drought duration and severity with two-dimensional copulas. Water Resour. Manag. 2006, 20, 795–815. [Google Scholar] [CrossRef]
  14. Xu, K.; Yang, D.; Xu, X.; Lei, H. Copula based drought frequency analysis considering the spatio–temporal variability in Southwest China. J. Hydrol. 2015, 527, 630–640. [Google Scholar] [CrossRef]
  15. Sklar, M. Fonctions de Répartition à n Dimensions et Leurs Marges; Université Paris: Paris, France, 1959; Volume 8. [Google Scholar]
  16. De Michele, C.; Salvadori, G. A generalized Pareto intensity-duration model of storm rainfall exploiting 2-copulas. J. Geophys. Res. Atmos. 2003, 108. [Google Scholar] [CrossRef]
  17. Ellouze-Gargouri, E.; Bargaoui, Z. Runoff estimation for an ungauged catchment using geomorphological instantaneous unit hydrograph (GIUH) and copulas. Water Resour. Manag. 2012, 26, 1615–1638. [Google Scholar] [CrossRef]
  18. Favre, A.C.; El Adlouni, S.; Perreault, L.; Thiémonge, N.; Bobée, B. Multivariate hydrological frequency analysis using copulas. Water Resour. Res. 2004, 40. [Google Scholar] [CrossRef]
  19. Grimaldi, S.; Serinaldi, F. Design hyetograph analysis with 3-copula function. Hydrol. Sci. J. 2006, 51, 223–238. [Google Scholar] [CrossRef]
  20. Song, S.; Singh, V.P. Frequency analysis of droughts using the Plackett copula and parameter estimation by genetic algorithm. Stoch. Environ. Res. Risk Assess. 2010, 24, 783–805. [Google Scholar] [CrossRef]
  21. Saghafian, B.; Mehdikhani, H. Drought characterization using a new copula-based trivariate approach. Nat. Hazards 2014, 72, 1391–1407. [Google Scholar] [CrossRef]
  22. Klein, B.; Meissner, D.; Kobialka, H.-U.; Reggiani, P. Predictive uncertainty estimation of hydrological multi-model ensembles using pair-copula construction. Water 2016, 8, 125. [Google Scholar] [CrossRef]
  23. Li, H.; Shao, D.; Xu, B.; Chen, S.; Gu, W.; Tan, X. Failure analysis of a new irrigation water allocation mode based on copula approaches in the Zhanghe Irrigation District, China. Water 2016, 8, 251. [Google Scholar] [CrossRef]
  24. Zhang, L.; Singh, V.P. Bivariate rainfall and runoff analysis using entropy and copula theories. Entropy 2012, 14, 1784–1812. [Google Scholar] [CrossRef]
  25. Joe, H. Multivariate Models and Multivariate Dependence Concepts; CRC Press: Boca Raton, FL, USA, 1997. [Google Scholar]
  26. Nelsen, R.B. An Introduction to Copulas; Springer: New York, NY, USA, 2013; Volume 139. [Google Scholar]
  27. Hofert, M.; Kojadinovic, I.; Maechler, M.; Yan, J. Copula: Multivariate dependence with copulas. R package version 0.999-5, 2012. Available online: http://CRAN.R-project.org/package=copula (accessed on 24 September 2016).
  28. The R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2015. [Google Scholar]
  29. Shiau, J.-T.; Shen, H.W. Recurrence analysis of hydrologic droughts of differing severity. J. Water Resour. Plan. Manag. 2001, 1, 30–40. [Google Scholar] [CrossRef]
  30. McKee, T.B.; Doesken, N.J.; Kleist, J. The Relationship of Drought Frequency and Duration to Time Scales. In Proceedings of the 8th Conference on Applied Climatology, Anaheim, CA, USA, 17–22 January 1993; American Meteorological Society: Boston, MA, USA, 1993; pp. 179–183. [Google Scholar]
  31. Wilhite, D.A. Drought and Water Crises: Science, Technology, and Management Issues; CRC Press: Boca Raton, FL, USA, 2005. [Google Scholar]
  32. Schittkowski, K. Data Fitting and Experimental Design in Dynamical Systems with EASY-FITModelDesign—User’s Guide; University of Bayreuth: Bayreuth, Germany, 2009. [Google Scholar]
  33. Richards, F.S.G. A method of maximum-likelihood estimation. J. R. Stat. Soc. Ser. B Methodol. 1961, 23, 469–475. [Google Scholar]
  34. D’Agostino, R.B. Goodness-of-Fit-Techniques; CRC Press: Boca Raton, FL, USA, 1986; Volume 68. [Google Scholar]
  35. Sönmez, İ. Quality control tests for western turkey Mesonet. Meteorol. Appl. 2013, 20, 330–337. [Google Scholar] [CrossRef]
Figure 1. Illustration of drought characteristics using the run theory for a given threshold level of X0 where the drought events are presented with the red bars over time.
Figure 1. Illustration of drought characteristics using the run theory for a given threshold level of X0 where the drought events are presented with the red bars over time.
Water 08 00426 g001
Figure 2. Locations of the six meteorological stations where the data used in this study were obtained.
Figure 2. Locations of the six meteorological stations where the data used in this study were obtained.
Water 08 00426 g002
Figure 3. Empirical and theoretical univariate CDF of drought characteristics LD , S ¯ , and A ¯ . The theoretical CDF values in the top ( LD ), middle ( S ¯ ), and bottom ( A ¯ ) panels are calculated using Wakeby, Pearson 5, and Johnson SB distributions, respectively.
Figure 3. Empirical and theoretical univariate CDF of drought characteristics LD , S ¯ , and A ¯ . The theoretical CDF values in the top ( LD ), middle ( S ¯ ), and bottom ( A ¯ ) panels are calculated using Wakeby, Pearson 5, and Johnson SB distributions, respectively.
Water 08 00426 g003
Figure 4. Bivariate T of drought characteristics obtained using synthetic datasets, where LD is given in months and S ¯ and A ¯ are shown as S and A , respectively.
Figure 4. Bivariate T of drought characteristics obtained using synthetic datasets, where LD is given in months and S ¯ and A ¯ are shown as S and A , respectively.
Water 08 00426 g004
Figure 5. T of trivariate drought characteristics (for the cases of A ¯ greater than 75, 85, and 95 percent), where LD is given in months and for brevity S ¯ and A ¯ are shown as S and A , respectively.
Figure 5. T of trivariate drought characteristics (for the cases of A ¯ greater than 75, 85, and 95 percent), where LD is given in months and for brevity S ¯ and A ¯ are shown as S and A , respectively.
Water 08 00426 g005
Table 1. Equations of Copula functions. Here u and ν are two dependent univariate variables, df is the degree of freedom, θ and ρ are Copula dependence parameters, is the CDF of standard univariate Gaussian distribution, and t df is the t–student distribution function. AMH and GH refer to the Ali–Mikhail–Haq and Gumbel–Hougaard distributions, respectively.
Table 1. Equations of Copula functions. Here u and ν are two dependent univariate variables, df is the degree of freedom, θ and ρ are Copula dependence parameters, is the CDF of standard univariate Gaussian distribution, and t df is the t–student distribution function. AMH and GH refer to the Ali–Mikhail–Haq and Gumbel–Hougaard distributions, respectively.
Function (Family)Joint Cumulative Distribution Function, C ( u , v ) Parameter Range
AMH (Archimedean) uv 1 θ ( 1 u ) ( 1 v ) 1 θ 1
Clayton (Archimedean) [ max ( 0 , u θ + v θ 1 ) ] 1 / θ 0 θ
Frank (Archimedean) 1 θ ln ( 1 + ( e θ u 1 ) ( e θ v 1 ) e θ 1 ) θ 0
GH (Archimedean) e [ ( ln u ) θ + ( ln v ) θ ] 1 θ 1 θ
Joe (Archimedean) 1 ( ( 1 u ) θ + ( 1 v ) θ ( 1 u ) θ ( 1 v ) θ ) 1 θ 1 θ
Normal (Elliptical) 1 ( u ) 1 ( v ) 1 2 π ( 1 ρ 2 ) 1 2 e u 2 2 ρ uv + v 2 2 ( 1 ρ 2 ) dudv 1 ρ 1
t–student (Elliptical) t df 1 ( u ) t df 1 ( v ) 1 2 π ( 1 ρ 2 ) 1 2 ( 1 + u 2 2 ρ uv + v 2 df ( 1 ρ 2 ) ) df + 2 2 dudv 1 ρ 1 df 1
Table 2. Fitted distribution functions (total of 38 types).
Table 2. Fitted distribution functions (total of 38 types).
Advanced with Variable SupportNon-NegativeNon-NegativeNon-NegativeBounded
Gen. extr. val.BurrGammaParetoBeta
Gen. logisticChi-squaredGen. gammaPareto 2Johnson SB
Gen. ParetoDagumInv. GaussianPearson 5Kumaraswamy
Log–Pearson 3ErlangLevyPearson 6Pert
Phased Bi–exp.Exp.Log-gammaRayleighPower func.
Phased bi–WeibullFLog-logisticRiceReciprocal
WakebyFatigue lifeLognormalWeibullTriangular
FrechetNakagami Uniform
Table 3. Drought characteristics defied by using SPI series values.
Table 3. Drought characteristics defied by using SPI series values.
DroughtStart of DroughtEnd of DroughtLD S ¯ A ¯
YearMonthYearMonth
1196091961150.90.8
2196181962160.680.67
31962719621150.940.76
41965819651251.160.92
5196691967260.820.78
6196791968151.050.87
71969919691241.280.96
8197081971270.750.64
91973319744140.840.82
101974111975130.840.44
111976719761260.390.17
12197761978181.030.81
131978819781250.790.67
141979719791261.070.5
151980819801250.950.9
161981819811140.790.5
171982101983580.930.94
18198491985371.281
19198571986171.050.74
201986719861260.860.81
21198771988280.910.88
2219889198910140.820.92
23199181992270.530.48
24199261993290.520.48
251993719952201.050.86
261996819961250.480.33
271998101999140.680.33
28200092001481.280.94
292001820011140.760.67
302003720031261.430.97
31200472005281.340.98
322005102006140.780.46
332007320082121.080.86
342008620081270.860.48
352009920091240.930.63
362012720121261.331
37201372014281.460.91
Statistics of the above given 37 drought periodsMin30.390.17
Mean6.920.940.73
Max201.461
Table 4. Kendall’s tau and Pearson’s correlations between drought characteristics LD, S ¯ , and A ¯ .
Table 4. Kendall’s tau and Pearson’s correlations between drought characteristics LD, S ¯ , and A ¯ .
LD S ¯ LD A ¯ S ¯ A ¯
Kendall’s tau0.150.270.63
Pearson‘s0.100.320.79
Table 5. Parameters ( θ or ρ ) and RMSE values of bivariate and trivariate copula fitting simulations, where the three smallest RMSE values are written in bold for each analysis.
Table 5. Parameters ( θ or ρ ) and RMSE values of bivariate and trivariate copula fitting simulations, where the three smallest RMSE values are written in bold for each analysis.
Copula FunctionVariables of Bivariate CopulasVariables of Trivariate Copula
LD & S ¯ LD & A ¯ S ¯ & A ¯ LD & S ¯ & A ¯
( θ / ρ )RMSE( θ / ρ )RMSE( θ / ρ )RMSE( θ / ρ )RMSE
AMH0.5140.0620.8950.056----
Clayton0.2800.0640.7280.0582.3950.0430.9300.067
Frank1.2720.0612.4250.0588.1700.0303.0260.065
GH1.1150.0651.2710.0642.5140.0291.4250.070
Joe1.1200.0701.2690.0752.9660.0421.5380.084
Normal0.2420.0590.4260.0550.8390.0260.5280.054
t-Student0.2440.0610.4260.0560.8380.0260.5280.056
Table 6. Univariate drought return periods (T).
Table 6. Univariate drought return periods (T).
T (Years)Drought Characteristics
LD (Month) S ¯ (-) A ¯ (-)
1.463.00--
1.473.24-0.17
1.483.260.390.18
2.535.710.870.73
3.066.210.940.79
4.156.921.020.86
40.5714.201.441.00
48.1915.011.46-
116.8320.00--
Table 7. Kolmogorov–Smirnov statistics and parameters of best theoretical CDF distributions (Figure 3) for the drought characteristics LD , S ¯ , and A ¯ .
Table 7. Kolmogorov–Smirnov statistics and parameters of best theoretical CDF distributions (Figure 3) for the drought characteristics LD , S ¯ , and A ¯ .
Drought CharacteristicDistributionParametersTest Statistic
LDWakebyα = 6.575; β = 3.344; γ = 1.470; δ = 0.34; ξ = 3.1760.10
S ¯ Pearson 5α = 167.61; β = 563.17; γ = −2.440.08
A ¯ Johnson SBγ = −0.733; δ = 0.711; λ = 0.937; ξ = 0.0880.08
Water EISSN 2073-4441 Published by MDPI AG, Basel, Switzerland RSS E-Mail Table of Contents Alert
Back to Top