1. Introduction
Drought is a natural phenomenon in which the natural water availability for a region is unusually low over an extended period, and the whole precipitation cycle is affected. According to studies based on changes in rainfall patterns in South Korea, an abrupt increase in greenhouse gases is the major cause of heavy rain, severe drought and heavy snow in some regions [1,2]. The Palmer Drought Severity Index (PDSI) was calculated using the climate change scenario based on the regional climate change model for the period of 1971–2100 [3]. It was concluded that the drought risk is likely to increase in South Korea despite an increase in general precipitation during the 21st century. The Standardized Precipitation Index (SPI) and PDSI were also studied together to reflect the seasonal trends of drought across South Korea [4].
In probability methods, the complex phenomenon of drought should not be identified by only one characteristic. The stochastic nature of droughts should be expressed using different drought variables. Drought identification and characterization are primary requirements for drought frequency analysis. Various indices have been used to measure drought characteristics, and their use depends on the type of results needed. Drought indices have also been derived from other hydrological and ecological variables [5,6,7]. Drought indices based on different climatic and hydrological variables may depict different regional and temporal patterns [8]. These indices are mostly based on a deficit in precipitation or discharge [9] to identify the drought. The precipitation deficit-based drought indices are more reliable and effective because they directly reflect the deficit in rainfall relative to a predetermined threshold. Furthermore, despite orographic and land friction effects, precipitation is likely to behave more homogeneously than streamflow. In addition, precipitation has often been recorded for a longer time period and tends to be less affected by human activities than streamflow. Therefore, in this study, a drought event is defined using the truncation level approach, which is applicable to precipitation time series and of direct relevance to the water industry and to the environmental demands of the river system. The truncation level approach defines droughts as periods during which the rainfall is below a certain truncation level [10].
Due to the complex nature of meteorological drought events, one drought variable is unable to provide a comprehensive evaluation. A bivariate distribution should be derived to express the correlated drought variables. In flood studies, bivariate normal distributions [11], bivariate exponential distributions [12], bivariate gamma distributions [13] or rainfall-runoff hydrological models [14,15] are often applied. The major drawback of these bivariate distributions is that the individual behavior of the random variables must be characterized by the same parametric family of univariate distributions [16]. In drought studies, the joint distributions are obtained analytically, either by assuming the drought characteristics to be independent and identically distributed random variables [17] or by assuming that they share the same marginal distribution function and have explicit bivariate forms (e.g., bivariate normal, bivariate exponential, bivariate gamma) [18]. However, these two assumptions are not satisfied in most cases, because the drought variables (drought duration and severity) are highly correlated and may follow different marginal distributions. Copula theory provides a solution to these problems: a copula can preserve the dependence structure and the different distributional characteristics of the drought variables. In addition, joint distributions of drought duration and severity can be derived using the nonparametric kernel estimator method [19] and the entropy-based method [20].
Copula theory, introduced by [21], has been used to join univariate distribution functions to form multivariate distribution functions based on the dependence structure among random variables. During the last decade, copulas have emerged as a new multivariate modeling method in hydrology [22]. The copulas commonly used in hydrology belong to two types: the elliptical type (normal and Student t) and the Archimedean type (Clayton, Gumbel, Joe, Frank and Ali-Mikhail-Haq) [22,23]. Before a copula is fitted, marginal distributions are fitted to drought duration and severity. Those marginal distributions include exponential (exp), gamma (gam), generalized extreme value (gev), generalized logistic (glo), generalized normal (gno), generalized Pareto (gpa), Gumbel (gum), lognormal (ln3), Pearson Type 3 (pe3) and Weibull (wei). The parameters of copulas are usually estimated using the maximum pseudo-likelihood method or the inversion of Kendall’s tau [23]. To select the appropriate marginal distributions and copulas, goodness of fit is usually assessed using the root mean square error (RMSE), the Kolmogorov–Smirnov (KS) test, the Anderson–Darling (AD) test, the Akaike information criterion (AIC), ordinary least squares (OLS) and the Bayesian information criterion (BIC) [24].
2. Materials and Methods
Precipitation in South Korea has high spatial and temporal variability [25]. Furthermore, most of the precipitation in South Korea falls during the summer season, which coincides with the typhoon season in the western North Pacific. Given the complex topographical and climatic features of South Korea, the inability to recognize spatial characteristics is one of the main obstacles to drought analysis. Because the effects of drought spread slowly to adjacent areas, spatio-temporal analysis of drought is gaining importance among engineers and hydrologists for the design, planning and management of water resource structures, and it is therefore necessary to investigate the spatial and temporal characteristics of drought across South Korea. In this paper, drought risk analysis using ten commonly-used univariate probability distributions and six copulas (elliptical and Archimedean) was employed to analyze the spatial and temporal changing properties of drought. In addition, the top twenty historical drought events were selected for temporal analysis and for analyzing the changes in different bivariate return periods.
2.1. Study Area and Data
The Korean Meteorological Administration (KMA) manages climatological data at over 70 rainfall stations throughout South Korea. This study collected monthly precipitation data from 55 stations covering more than 35 years (1980–2015). Only 55 rainfall stations were selected because precipitation data were unavailable or incomplete at the remaining stations. The Standardized Precipitation Index (SPI) proposed by [26] was used in this study to identify the duration of drought events and to evaluate their severity. SPI is computed by fitting long-term rainfall data to the gamma distribution on any desired time scale, namely, 3 months, 6 months, 12 months and 24 months. The parameters of the gamma distribution are computed through the maximum likelihood estimation method. The cumulative probabilities of the fitted gamma distribution are then transformed to a standard normal distribution with a mean of zero and a standard deviation of one [26]. The advantage of the SPI approach is that it allows a reliable and relatively easy comparison of precipitation deficit in the desired period between different locations and climates. This is because, in SPI, the rainfall is already normalized, so the current rainfall is compared with the long-term average. In this study, the SPI time scale was chosen as 6 months, matching the time scale of dry and wet alternations in South Korea.
The characteristics of drought were extracted using the theory of run [10], which identifies drought duration and drought severity from the values that fall below a selected truncation level. A detailed description of the theory of run is provided by [10,27]. The duration $D$ of any drought was defined as the period of rainfall deficit, i.e., the cumulative time of negative SPI values preceded and followed by positive SPI values. The severity of any drought period starting at the $i$-th month was defined as:
$$S_i = -\sum_{j=i}^{i+D-1} \mathrm{SPI}_j$$
In other words, the SPI values cumulated over the drought duration, as defined by [26], measure the magnitude of a drought event and are called the drought severity. In this study, −0.99 was selected as the truncation level in the run analysis. Therefore, the drought duration is the period during which the SPI value was below −0.99, and the drought severity is the cumulative deficit during that drought event.
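The run analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `extract_droughts`, the sample SPI values, and the convention of reporting severity as a positive number (the negated sum of SPI over the run) are assumptions made for this example.

```python
from typing import List, Tuple


def extract_droughts(spi: List[float], threshold: float = -0.99) -> List[Tuple[int, int, float]]:
    """Return (start_index, duration_in_months, severity) for every run below threshold."""
    events = []
    start = None   # index where the current run began, None when not in a drought
    total = 0.0    # running sum of SPI inside the current run
    for i, value in enumerate(spi):
        if value < threshold:
            if start is None:
                start, total = i, 0.0
            total += value
        elif start is not None:
            # Run has ended: severity is the negated cumulative SPI (assumed sign convention).
            events.append((start, i - start, -total))
            start = None
    if start is not None:  # series ends while still in a drought
        events.append((start, len(spi) - start, -total))
    return events


# Hypothetical 6-month SPI values, for illustration only.
spi_series = [0.3, -1.2, -1.5, -0.4, 0.8, -1.0, -2.0, -1.1, 0.1]
events = extract_droughts(spi_series)
for start, duration, severity in events:
    print(f"start={start} duration={duration} severity={severity:.2f}")
```

Each tuple pairs one duration with one severity, which is the bivariate sample that the marginal distributions and copulas are later fitted to.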
2.2. Marginal Distributions and Copulas
Copula distributions and cumulative distribution functions (CDFs) of probability distributions are shown in Table 1 and Table 2, respectively. In this study, the L-moment method is used to estimate the parameters of these marginal CDFs for drought duration and severity. Candidate probability distributions considered for this study are exponential (exp, 2 parameters), gamma (gam, 2 parameters), generalized extreme value (gev, 3 parameters), generalized logistic (glo, 3 parameters), generalized normal (gno, 3 parameters), generalized Pareto (gpa, 3 parameters), Gumbel (gum, 2 parameters), lognormal (ln3, 3 parameters), Pearson Type 3 (pe3, 3 parameters) and Weibull (wei, 3 parameters).
The root mean square error (RMSE) and the Kolmogorov–Smirnov (KS) test [28] were used to choose the best-fitted marginal distribution.
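As an illustration of this workflow, the sketch below fits the 2-parameter exponential distribution by L-moments and scores the fit with RMSE and the KS statistic. It is only a sketch under stated assumptions: the function names, the sample severities, and the use of the Weibull plotting position i/(n+1) for the empirical probabilities in the RMSE are choices made for this example, not details taken from the study.

```python
import math
from typing import Callable, List, Tuple


def sample_l_moments(data: List[float]) -> Tuple[float, float]:
    """First two sample L-moments (l1, l2) from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))  # i is the 0-based rank
    return b0, 2.0 * b1 - b0


def fit_exponential(data: List[float]) -> Tuple[float, float]:
    """L-moment estimates (location, scale) of the 2-parameter exponential distribution."""
    l1, l2 = sample_l_moments(data)
    scale = 2.0 * l2      # lambda_2 = scale / 2
    loc = l1 - scale      # lambda_1 = location + scale
    return loc, scale


def exponential_cdf(x: float, loc: float, scale: float) -> float:
    return 0.0 if x <= loc else 1.0 - math.exp(-(x - loc) / scale)


def rmse_and_ks(data: List[float], cdf: Callable[[float], float]) -> Tuple[float, float]:
    """RMSE against plotting positions (assumed i/(n+1)) and the KS statistic."""
    x = sorted(data)
    n = len(x)
    fitted = [cdf(v) for v in x]
    empirical = [(i + 1) / (n + 1) for i in range(n)]
    rmse = math.sqrt(sum((f - e) ** 2 for f, e in zip(fitted, empirical)) / n)
    ks = max(max((i + 1) / n - f, f - i / n) for i, f in enumerate(fitted))
    return rmse, ks


# Hypothetical drought severities, for illustration only.
severities = [1.1, 1.3, 1.8, 2.4, 2.9, 3.6, 4.8, 6.5]
loc, scale = fit_exponential(severities)
rmse, ks = rmse_and_ks(severities, lambda v: exponential_cdf(v, loc, scale))
print(f"loc={loc:.3f} scale={scale:.3f} RMSE={rmse:.3f} KS={ks:.3f}")
```

Repeating the same scoring for each candidate distribution and keeping the one with the smallest RMSE or KS statistic mirrors the selection step described above. Drought durations are integer-valued and therefore tied, so the KS statistic is only approximate for them.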
Let $X = (X_1, X_2, \ldots, X_n)$ be an $n$-dimensional random vector with continuous marginal cumulative distribution functions (CDFs) $F_1, F_2, \ldots, F_n$. Following [16], the joint CDF $H$ of $X$ can be expressed as follows:
$$H(x_1, x_2, \ldots, x_n) = C\left(F_1(x_1), F_2(x_2), \ldots, F_n(x_n)\right)$$
where the unique function $C$ is called the copula. The construction of a multivariate joint distribution model for $H$ is accomplished in two parts: computation of the marginal CDFs ($F_1, F_2, \ldots, F_n$) and computation of the copula model ($C$).
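The two-part construction can be illustrated for the bivariate case with a Gumbel–Hougaard copula. The sketch below is an assumption-laden example rather than the fitted model of this study: the exponential marginals, their scale values and the dependence parameter `theta` are hypothetical.

```python
import math


def gumbel_copula(u: float, v: float, theta: float) -> float:
    """Gumbel-Hougaard copula C(u, v) for theta >= 1 (theta = 1 gives independence)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    a = (-math.log(u)) ** theta
    b = (-math.log(v)) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))


def exp_cdf(x: float, scale: float) -> float:
    """Marginal CDF, assumed exponential here purely for illustration."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def joint_cdf(d: float, s: float, theta: float, dur_scale: float, sev_scale: float) -> float:
    """H(d, s) = C(F_D(d), F_S(s)): marginals first, then the copula."""
    return gumbel_copula(exp_cdf(d, dur_scale), exp_cdf(s, sev_scale), theta)


# Hypothetical parameters: mean duration 3 months, mean severity 4, theta = 2.
p_joint = joint_cdf(d=6.0, s=8.0, theta=2.0, dur_scale=3.0, sev_scale=4.0)
print(f"P(D <= 6, S <= 8) = {p_joint:.4f}")
```

Because the marginals enter only through their CDF values, duration and severity may follow different distribution families without changing the copula step.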
In this study, the parameters of copulas were estimated using the maximum pseudo-likelihood [23] method. Candidate copula families used for the analysis were the elliptical and Archimedean copulas. Elliptical copulas include normal (Gaussian) and Student’s t, and Archimedean copulas include Clayton, Gumbel, Joe and Frank (Table 1). The parametric bootstrap-based Cramér–von Mises test ($S_n$) [29] and Akaike information criterion (AIC) [30] were used to assess the goodness of fit of all copulas. Bivariate joint distributions are estimated using the copula package in R.
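The study itself relies on maximum pseudo-likelihood in R. As a simpler stand-in that needs no numerical optimizer, the sketch below shows the rank transform to pseudo-observations and the inversion of Kendall’s tau for two one-parameter Archimedean families. It is explicitly a swapped-in technique, not the estimator used in this study, and the function names and sample data are hypothetical.

```python
from typing import List


def pseudo_observations(values: List[float]) -> List[float]:
    """Scaled ranks r / (n + 1); tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n + 1) for r in ranks]


def kendall_tau(x: List[float], y: List[float]) -> float:
    """Kendall's tau-a by direct pair counting, O(n^2); tied pairs contribute zero."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


def gumbel_theta_from_tau(tau: float) -> float:
    return 1.0 / (1.0 - tau)          # valid for 0 <= tau < 1


def clayton_theta_from_tau(tau: float) -> float:
    return 2.0 * tau / (1.0 - tau)    # valid for 0 < tau < 1


# Hypothetical duration (months) and severity pairs, for illustration only.
durations = [1, 2, 3, 4, 5]
severities = [1.2, 3.0, 2.5, 6.1, 5.0]
tau = kendall_tau(durations, severities)
theta_gumbel = gumbel_theta_from_tau(tau)
theta_clayton = clayton_theta_from_tau(tau)
print(tau, theta_gumbel, theta_clayton)
```

The pseudo-observations are the inputs a pseudo-likelihood would be maximized over; the tau-based values are often used as starting points for that optimization.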
2.3. Return Period in a Bivariate Framework
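Estimation of the return period has special importance in the planning and management of water resources. Let $D$ denote the drought duration and $S$ denote the drought severity; the return period in univariate settings can then be defined as [31,32]:
$$T_D = \frac{E(L)}{1 - F_D(d)}, \qquad T_S = \frac{E(L)}{1 - F_S(s)}$$
where $T_D$ and $T_S$ indicate the return periods of the drought duration and severity, respectively, and $F_D(d)$ and $F_S(s)$ indicate the cumulative distribution functions of drought duration and drought severity, respectively. For annual streamflow series, […]. Here, $E(L)$, the expected interarrival time of drought events, can be calculated using the theory of run and the Markov theorem [33], where […] and […]. The unit of $E(L)$ is months. According to [18], there are two cases of the bivariate return period: (i) $D \ge d$ and $S \ge s$ (both drought variables exceed their specified values); (ii) $D \ge d$ or $S \ge s$ (at least one drought variable exceeds its specified value):
$$T_{DS}^{\wedge} = \frac{E(L)}{P(D \ge d \wedge S \ge s)} = \frac{E(L)}{1 - F_D(d) - F_S(s) + C\left(F_D(d), F_S(s)\right)}$$
$$T_{DS}^{\vee} = \frac{E(L)}{P(D \ge d \vee S \ge s)} = \frac{E(L)}{1 - C\left(F_D(d), F_S(s)\right)}$$
where $C\left(F_D(d), F_S(s)\right)$ indicates the copula-based joint distribution function of the drought characteristics, ∧ denotes “and”, and ∨ denotes “or”.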
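A small numerical sketch may help fix the two bivariate cases. It assumes the standard copula-based forms, in which the mean interarrival time of droughts is divided by the joint exceedance probability for the “and” case (1 − F_D − F_S + C) and for the “or” case (1 − C). The Gumbel–Hougaard copula, the marginal probabilities, `theta` and the 10-month interarrival time are hypothetical values chosen for illustration.

```python
import math
from typing import Dict


def gumbel_copula(u: float, v: float, theta: float) -> float:
    """Gumbel-Hougaard copula C(u, v) for theta >= 1."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return math.exp(-(((-math.log(u)) ** theta + (-math.log(v)) ** theta) ** (1.0 / theta)))


def return_periods(f_d: float, f_s: float, theta: float, mean_interarrival: float) -> Dict[str, float]:
    """Univariate and bivariate return periods, in the unit of mean_interarrival."""
    c = gumbel_copula(f_d, f_s, theta)
    return {
        "T_D": mean_interarrival / (1.0 - f_d),
        "T_S": mean_interarrival / (1.0 - f_s),
        "T_and": mean_interarrival / (1.0 - f_d - f_s + c),  # D >= d and S >= s
        "T_or": mean_interarrival / (1.0 - c),               # D >= d or  S >= s
    }


# Hypothetical inputs: F_D(d) = 0.9, F_S(s) = 0.8, theta = 2, mean interarrival 10 months.
periods = return_periods(f_d=0.9, f_s=0.8, theta=2.0, mean_interarrival=10.0)
for name, value in periods.items():
    print(f"{name} = {value:.1f} months")
```

For any copula the “or” return period cannot exceed the smaller univariate return period, and the “and” return period cannot fall below the larger one, which gives a quick sanity check on computed values.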
2.4. Kendall Return Period
The standard definition of the return period may lead to under- or over-estimation of the correct values, so another definition of the bivariate return period was introduced by [31]. The Kendall return period, also known as the secondary return period, is the average time between the occurrences of two supercritical drought events [34,35]. The primary return period, the one usually adopted in drought design applications, may only provide partial and vague information about the realization of the events of interest. In fact, it only predicts that a critical event is expected to appear once in a given time interval (i.e., an average forecast). However, it would be more important to be able to calculate (1) the probability that a supercritical (destructive) event will show up at any given realization of the process (e.g., for any storm) and (2) how long it will take, on average, for a supercritical event to appear. Both questions can be answered using Kendall’s return period: the first one by $K_C$ defined below and the second one by considering the secondary return period given by Equation (7). The secondary return period provides a precise indication for performing risk analysis and may also yield useful hints for numerical simulations [36]. In addition, the use of $K_C$ and the secondary return period is the more appropriate approach to problems of (multivariate) risk assessment. In this study, we also adopted the secondary return period for the evaluation of drought risk within South Korea and compared it with the joint return periods computed using the routine computation procedure. To calculate Kendall’s return period, the occurrence probability should be computed first. Kendall’s distribution $K_C$ can be defined as $K_C(t) = P\left[C\left(F_D(D), F_S(S)\right) \le t\right]$. The related secondary return period was obtained via:
$$T_K = \frac{E(L)}{1 - K_C(t)} \qquad (7)$$
where $E(L)$ is the mean interarrival time of drought events, $t$ indicates the critical probability level, and $K_C$ denotes the Kendall distribution function. The $K_C$ for copulas of the Archimedean family is as follows:
$$K_C(t) = t - \frac{\varphi(t)}{\varphi'(t^{+})}$$
where $\varphi'(t^{+})$ denotes the right derivative of the generating function $\varphi$. For instance, this function for the Gumbel–Hougaard copula with parameter $\theta$ can be defined as:
$$K_C(t) = t - \frac{t \ln t}{\theta}$$
The critical probability levels ($t$) were computed as follows:
$$t = K_C^{-1}\left(1 - \frac{E(L)}{T_K}\right)$$
Kendall’s return period is easier to understand from a practical point of view. Suppose $T_K$ is the critical return period specified via the design requirements. In water resource management, engineers are interested in designing a water-supply system that can deliver a suitable amount of water under a specified extreme drought event that (on average) occurs once every $T_K$ years. Then, by inverting Equation (7), a critical probability level can be computed, and subcritical, non-threatening events can also be identified.
The Kendall distribution function related to the Clayton copula with parameter $\theta$ can be expressed as follows:
$$K_C(t) = t + \frac{t\left(1 - t^{\theta}\right)}{\theta}$$
Similarly, the value of $K_C$ can be extended to Frank copulas:
$$K_C(t) = t - \frac{e^{\theta t} - 1}{\theta} \ln\left(\frac{e^{-\theta t} - 1}{e^{-\theta} - 1}\right)$$
A detailed description of the $K_C$ function for other copula families can be found in [23]. The coherent notation of the multivariate threshold and the total order in multidimensional Euclidean spaces, used for the calculation of the secondary return period, was introduced by [31]. They introduced notation for the multivariate quantile and proposed different methods to identify critical design events in the presence of several dependent variables.
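The Kendall return period and its inversion can be sketched numerically as follows. The closed forms used here are the standard Archimedean results K_C(t) = t − φ(t)/φ′(t) for the Gumbel–Hougaard and Clayton generators; the bisection routine, the parameter values and the 10-month mean interarrival time are assumptions introduced for this example.

```python
import math
from typing import Callable


def kendall_cdf_gumbel(t: float, theta: float) -> float:
    """K_C(t) for the Gumbel-Hougaard copula, 0 < t <= 1."""
    return t - t * math.log(t) / theta


def kendall_cdf_clayton(t: float, theta: float) -> float:
    """K_C(t) for the Clayton copula, theta > 0."""
    return t + t * (1.0 - t ** theta) / theta


def kendall_return_period(t: float, k_c: Callable[[float], float], mean_interarrival: float) -> float:
    """Secondary (Kendall) return period for critical probability level t."""
    return mean_interarrival / (1.0 - k_c(t))


def critical_level(target_period: float, k_c: Callable[[float], float],
                   mean_interarrival: float, tol: float = 1e-10) -> float:
    """Invert the return-period relation by bisection; K_C is nondecreasing in t."""
    target = 1.0 - mean_interarrival / target_period
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if k_c(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical setting: Gumbel-Hougaard copula with theta = 2, mean interarrival 10 months.
theta = 2.0
k_gumbel = lambda t: kendall_cdf_gumbel(t, theta)
t_crit = critical_level(target_period=100.0, k_c=k_gumbel, mean_interarrival=10.0)
t_check = kendall_return_period(t_crit, k_gumbel, mean_interarrival=10.0)
print(f"critical level t = {t_crit:.4f}, recovered return period = {t_check:.2f} months")
```

Events whose copula value falls below the critical level are subcritical in the sense described above; only events above it count toward the secondary return period.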