Vulnerability Assessment of Dam Water Supply Capacity Based on Bivariate Frequency Analysis Using Copula

: The vulnerability of the water supply capacity of a dam is defined as the expected deficit volume from a typical water deficit event. In this study, a water deficit event was assumed to be a rectangle composed of the deficit duration and deficit intensity whose occurrence probability was then estimated by the bivariate frequency analysis based on the copula method. This approach is different from the conventional one based on the assumption of the same occurrence probability for all events. This proposed method was applied to the Namgang dam in Korea as an example and the resulting estimate of the vulnerability was compared with the conventional method. First, the ‘OR’ concept was found to be better than the ‘AND’ concept in the calculation of the occurrence probability. Additionally, based on the consideration of multicollinearity, it could be concluded that the occurrence probability should be estimated by considering the water deficit intensity and duration. For the Namgang dam, the vulnerability was determined to be 9.11 × 10 6 m 3 , which is about 3% of the total storage capacity. This estimated vulnerability is also about 70% of the amount estimated by applying the conventional method with the same occurrence probability for all water deficit events. The interdependent and independent


Introduction
Since most of the annual total rainfall in Korea is concentrated in the rainy season from June to September, many multi-purpose dams have been constructed to save water resources. However, due to the increasing demand for domestic, industrial, agricultural, and environmental water use, several river basins are still exposed to the danger of a water shortage [1][2][3][4][5]. Many hydrologists have claimed that additional water resources should be secured by constructing more dams [6][7][8][9]. However, it is also argued that any project for securing additional water resources cannot be started before the quantitative evaluation of existing dams [10].
In Korea, the concept of firm yield had been used as one of the main criteria for designing a dam. Based on this concept of firm yield, the water supply capacity was determined by the water budget analysis with the aim of a specific period of maximum drought (generally one or two years) [11]. In the 1970s, the inflow condition recorded during the period of the drought from 1967 to 1968 was the most severe one considered. In the 1980s, the concept of firm yield was still applied to the design of a dam, but the inflow data recorded during the last 20 years could be considered. In the 1990s, the concept of reliability was introduced into Korea.
The reliability may be the most widely used evaluation criterion for the water supply capacity of a dam [12][13][14][15][16]. The term reliability is generally used in Korea and sometimes the volume reliability and resiliency are estimated as references. These concepts are all easy to understand and easy to apply. However, the concept of vulnerability has never been used because it is rather complicated in its application. The vulnerability may be the most comprehensive measure for the water supply capacity of a dam. In addition, it can provide more detailed information about the water supply capacity such as the water deficit volume.
Generally, the vulnerability indicates the extent to which a system experienced the exposure to a risk factor [17][18][19][20]. In a water resource area, the vulnerability is defined as the expected deficit volume from a typical water deficit event [12]. The vulnerability can be calculated as the sum of the volume of all the water deficit events multiplied by their occurrence probabilities. Unfortunately, the occurrence probability of a water deficit event is not easy to estimate. Thus, it is generally assumed to be constant. However, this assumption of constant occurrence probability results in the overestimation of the vulnerability since it counts the extreme events too highly. To estimate the occurrence probability, a multivariate frequency analysis of water deficit events is required. As can be imagined, a water deficit event is quantified by many components like the total deficit volume, deficit duration, peak deficit volume, etc.
This study focuses on the estimation of the occurrence probability of a water deficit event by applying the copula method. For this frequency analysis, a water deficit event should be modeled rather simply but systematically. In this study a water deficit event is assumed to be a rectangle composed of the deficit duration and deficit intensity. The deficit volume is simply the multiplication of deficit duration and deficit intensity. Even though the selection problem of two variables among three remains, the bivariate frequency analysis can be done using the copula. Once the occurrence probability of a water deficit event is estimated, the vulnerability can also be estimated by applying its definition. This study considers the Namgang dam in Korea as an example and the resulting estimate of the vulnerability will be compared with that based on the conventional assumption of constant occurrence probability.

Risk Assessment Measures for a Water Supply System
The reliability, resiliency, and vulnerability are regarded as representative water supply safety measures for a water resource system. To explain these three measures, it is important to define a water deficit event. As can be seen in Figure 1, these water supply safety measures are defined by the characteristics of water deficit events. The water deficit event in this study means the condition in which the actual water supply is less than the design of the water supply. The shaded area of Figure  1 indicates an example of a water deficit event. In this figure, v(j), the shaded area indicates the deficit volume for the j-th water deficit event and d(j) indicates the deficit duration for the j-th water deficit event. Figure 1. Definition of water deficit volume (v(i)) and duration (d(i)) (the water deficit intensity is calculated by dividing the deficit volume by the deficit duration).

Figure 1.
Definition of water deficit volume (v(i)) and duration (d(i)) (the water deficit intensity is calculated by dividing the deficit volume by the deficit duration). The reliability may be the oldest measure that has been used to assess the performance of a water resource systems [16]. In general, the reliability refers to the probability that the system will satisfy the design water supply, which can be expressed using the equation below.
where X is the state variable of the water resource system expressed as satisfaction or failure and S is the satisfaction state. A way of calculating the reliability is a method of using the ratio of the period in which the water resource system satisfies the design water supply. The reliability calculated in this way is called the temporal reliability. The method of calculating the temporal reliability can be expressed with the equation below.
where α is the (temporal) reliability, which has the probability value from 0 to 1, M is the number of water deficit events, and T is the entire period to be analyzed. The temporal reliability is simple and convenient to calculate because it only considers the satisfaction or failure of the design water supply for a certain period. However, this measure fails to evaluate the details of the water deficit event. For example, the temporal reliability cannot provide any quantitative information about the degree of damage. The resiliency is another measure to evaluate how quickly the water resource system returns to the satisfactory state after it has entered the failure state. The resiliency can be expressed as the conditional probability of the satisfactory state at the current time step given the failure state at the previous time step. The resiliency is defined by Equation (3).
where β is the resiliency as a conditional probability and X t is a state variable at time t. To calculate the resiliency, the probability must be calculated that the system is in the failure state (F) at time t − 1 and in the satisfactory state (S) at time t. This probability can be calculated by using Z t , which is defined using the equation below.
Thus, the probability that the system is in the failure state at time t − 1 and in the satisfactory state at time t can be calculated by 1 T T ∑ t=1 Z t . As the probability of being failed at time t − 1 (the denominator of Equation (3)) equals 1 − α, the degree of resiliency is expressed by using the equation below.
Lastly, the vulnerability is used to estimate the extent of the deficit volume from a typical water deficit event. Based on the definition of Hashimoto et al. [12], the vulnerability can be calculated as the sum of the volume of all the water deficit events multiplied by their occurrence probabilities. The vulnerability is expressed using the equation below. where γ is the vulnerability, which has the same unit of water deficit volume (m 3 ), and e(j) is the probability of occurrence of the j-th water deficit event. In this scenario, it should be remembered that the sum of e(j) is equal to one, i.e., M ∑ j=1 e(j) = 1.
In real practice, the reliability has been most frequently used since it is easy to calculate and easy to understand. On the other hand, neither the resiliency nor the vulnerability have been used that much. The resiliency shows just one aspect of the water resource system similar to the reliability. Even though the vulnerability considers both the volume and occurrence probability of water deficit events, the calculation of the occurrence probability has been a difficult problem. As a simplification, the occurrence probability has been generally assumed as e(j) = 1/M [15,21,22]. Thus, the estimated vulnerability cannot be the true value of representing the given water resource system. This study focuses on the vulnerability of a water resource system. Since a water deficit event is defined by its several components, like the deficit volume, deficit duration, etc., the calculation of its occurrence probability cannot be based on univariate frequency analysis. In this study, the copula method was introduced to solve this problem of estimating the occurrence probability. The vulnerability analysis result from a copula-based method will become more relevant information for the management of existing dam and decision-making of new dam construction.

Basic Concept of the Copula
Sklar [23] introduced the copula as a mathematical concept for the formation of a multivariate distribution function. The copula has been applied to derive one joint probability density function based on various PDFs [24][25][26]. By applying copula, any kind of PDF can be combined to one joint PDF. As a result, joint PDF representing the characteristics of various PDFs can be derived. The fundamental equation explaining the copula can be expressed by using the equation below.
where H is the cumulative probability that is calculated using the joint PDF, C is the copula, and F i (x i ) is the cumulative probability that is calculated using the i-th PDF. Equation (7) shows that the copula can be used to combine n PDFs into a single joint PDF. There are three major copula families: the Archimedean copula, the elliptical copula, and the Marshall-Olkin copula [27]. In this study, the Archimedean and elliptical copula families were selected for analysis. The reason for the selection is that copula belongs to these two families, which is easy to construct and is known for its wide applicability [24]. A total of four copula models were considered in this study including the Clayton model [28], the Frank model [29], the Gumbel-Hougaard model [30], and the Gaussian model [31]. Among them, the Clayton model, the Frank model, and the Gumbel-Hougaard model are classified as the Archimedean copula. Additionally, the Gaussian model is classified as the elliptical copula. Table 1 summarizes the joint PDFs of these copula models. In Table 1, θ is the parameter for copula model. u 1 and u 2 are the cumulative probabilities calculated by the marginal PDF of each two target variables [32]. Φ is the cumulative distribution function of the standard gaussian distribution. Table 1. Joint PDF of copula models considered in this study.

Copula Model
Joint PDF

Parameter Estimation
In this study, Kendall's τ, which is known as the rank correlation of the data, was used to calculate the parameter of the copula model. The parameter of the copula model in this study can be easily estimated using the Kendall's τ because this study considered the copula models with only one parameter. The equation for Kendall's τ of the two target variables can be expressed with the equation below [25].
where n is the number of data, (X i , Y i ) and (X j , Y j ) are data pairs for i = j, P n is the number of cases , and n C 2 is the number of combinations for the selection of the two variables among n elements. The parameter θ of the copula, then, was estimated by using the relationship between parameter θ and Kendall's τ. Table 2 summarizes the relationships for the copula models that are considered in this study. Table 2. Relationship between the Kendall's τ and the parameter θ.

Copula Model
Relationship between Kendall's τ and the Parameter θ Each copula model has a valid range for the parameter θ and only the parameter θ estimated in that range is significant. The valid range for the Clayton model is [0, ∞) and the valid range for the Gumbel-Hougaard model is [1, ∞). In the case of the Gaussian model, the valid range is [−1, 1]. For the Frank model, the valid range is (−∞, ∞) except for the origin [24].

Determination of the Optimal Copula Model
Among four copula models considered in this study, the optimal copula model was determined based on an Akaike information criterion (AIC) test statistic [33]. Although the AIC is not appropriate for the goodness-of-fit test, it could be used to determine the best model among many options. For example, Zhang and Singh [32] explained how to select the best copula model among many options by comparing AIC of each copula.
The AIC test statistic quantifies the difference between the empirical cumulative probability and the cumulative probability calculated from the copula model. The empirical cumulative probability for given data can be calculated using the Gringorten plotting position formula [34]. The Gringorten plotting position formula can be expressed by using the equation below.
Number of(x ≤ x i and y ≤ y i ) − 0.44 n + 0.12 (9) In Equation (9), H(x i , y i ) is the empirical cumulative probability of the data (x i , y i ).
The difference between the empirical cumulative probability and the cumulative probability of the copula model is quantified with the mean squared error (MSE) [35]. The MSE for AIC is defined using the formula below.
In Equation (10), x e is the cumulative probability from the copula model and x o is the empirical cumulative probability. Lastly, the AIC test statistic can be calculated using the formula below. AIC = n log(MSE) + 2(number of fitted parameters) (11) In this case, the copula model with the lower AIC test statistic is evaluated as the better copula model for the given data. Therefore, the optimal copula model for the given data is the model with the lowest AIC value [33].
However, having the lowest AIC does not guarantee the absolute goodness-of-fit of the copula model. To verify the absolute goodness-of-fit of the copula model, the Kolmogorov-Smirnov (K-S) test was conducted on the optimal copula. In the K-S test, the goodness-of-fit of the model is evaluated based on the maximum absolute difference between the empirical cumulative probability and the cumulative probability calculated from the copula [36]. The absolute goodness-of-fit of the optimal copula model is examined by comparing the difference with the critical value of the K-S test at a certain significance level. In this study, the significance level of 5% was used for the critical value of the K-S test.

Occurrence Probability Calculated by the Copula Model
First, the occurrence probability can be calculated by considering the case where one or two variables exceed the threshold value. This occurrence probability is expressed as P or . In addition, the occurrence probability can be defined by the probability that both variables exceed the threshold value. This case is generally notated by P and . The P or and P and for the data (x i , y i ) can be calculated using the following equations.
where F X,Y (x i , y i ) is the joint cumulative probability of the copula model and F X (x i ) and F Y (y i ) are the cumulative probabilities of x i and y i , respectively.

Study Area and Data
The Nam river is a major tributary of the Nakdong river, which covers the southeastern part of the Korean Peninsula. The Namgang dam is located 80 km from the confluence point of the Nakdong river ( Figure 2a). Figure 2b shows the river network and outlet of the Namgang dam basin and Figure 2c is the Digital Elevation Model (DEM) of the Namgang dam basin (Figure 2c) where the highest point is marked as the "Jiri mountain." The basin area of the Namgang dam is 3466 km 2 and the main channel length is 10 8 km. Even though it has a large basin area, the total storage capacity (3.1 × 10 8 m 3 ) and annual water supply capacity (5.7 × 10 8 m 3 ) of the Namgang dam are relatively small. For reference, the basin area of the Kurobe dam is 188.5 km 2 , which is about 5% of that for the Namgang dam.
However, the total storage capacity of the Kurobe dam is about 1.9 × 10 8 m 3 , which is about 61% of that for the Namgang dam.
Water 2018, 10, x FOR PEER REVIEW 7 of 18 reference, the basin area of the Kurobe dam is 188.5 km 2 , which is about 5% of that for the Namgang dam. However, the total storage capacity of the Kurobe dam is about 1.9 × 10 8 m 3 , which is about 61% of that for the Namgang dam.
(a) Location of the Namgang dam basin.
(b) River network and outlet of the Namgang dam basin.
(c) DEM of the Namgang dam basin. This study used the daily water supply time series from the Namgang dam from 1981 to 2012. The water deficit event is defined as the consecutive day in a water deficit condition when the dam water supply is less than the design water supply. In this study, all the water deficit events were considered for further analysis and a total of 45 water deficit events were collected.
A water deficit event can be quantified by the deficit duration, deficit volume, mean deficit volume (or, deficit severity), peak deficit volume, and more. Since the structure of a water deficit event is rather complicated, a simplified structure is generally introduced. The simplest way of representing a water deficit event may be to use a rectangular pulse. The rectangular pulse has been widely used for modeling rainfall events [37][38][39][40][41][42], as well as drought events [43]. This study also adopted this simplification to represent a water deficit event with its deficit duration, deficit volume, and deficit intensity (i.e., the deficit volume divided by its deficit duration). Table 3 summarizes the basic statistics (mean, standard deviation, and range) of these components of the water deficit events considered in this study. The means of the deficit duration, deficit volume, and deficit intensity were calculated as 15.0 day, 13.5 × 10 6 m 3 , and 12.3 × 10 6 m 3 /day, This study used the daily water supply time series from the Namgang dam from 1981 to 2012. The water deficit event is defined as the consecutive day in a water deficit condition when the dam water supply is less than the design water supply. In this study, all the water deficit events were considered for further analysis and a total of 45 water deficit events were collected.
A water deficit event can be quantified by the deficit duration, deficit volume, mean deficit volume (or, deficit severity), peak deficit volume, and more. Since the structure of a water deficit event is rather complicated, a simplified structure is generally introduced. The simplest way of representing a water deficit event may be to use a rectangular pulse. The rectangular pulse has been widely used for modeling rainfall events [37][38][39][40][41][42], as well as drought events [43]. This study also adopted this simplification to represent a water deficit event with its deficit duration, deficit volume, and deficit intensity (i.e., the deficit volume divided by its deficit duration). Table 3 summarizes the basic statistics (mean, standard deviation, and range) of these components of the water deficit events considered in this study. The means of the deficit duration, deficit volume, and deficit intensity were calculated as 15.0 day, 13.5 × 10 6 m 3 , and 12.3 × 10 6 m 3 /day, respectively. The standard deviations are all proportional to the means with their coefficient of variations of approximately (0.57, 0.66, and 1.25) for the deficit duration, deficit volume, and deficit intensity, respectively. The ranges of those components were also summarized as (1 to 161) day for the deficit duration, (0.1 to 118.9) × 10 6 m 3 for the deficit volume, and (0.1 to 2.7) × 10 6 m 3 /day for the deficit intensity.  Figure 3 shows the box-plot for the basic components of water deficit events observed at the Namgang dam. In Figure 3, each of the outliers outside the 1.5 inter quartile range is marked by the 'X' symbol. There were three outliers for the deficit duration ( Figure 3a) including one for the deficit volume ( Figure 3b) and no outlier was found for the deficit intensity ( Figure 3c). All these outliers are from three water deficit events among which one event was extreme. The effect of these three events was also found to be big enough to severely distort the basic statistics. Therefore, in this study, they were excluded in the fitting procedure of the PDFs. After selecting the PDFs, the goodness-of-fit test was also repeated with all the water deficit events including those three rare events. The vulnerability assessment was also conducted with all the water deficit events including those three rare events.   Figure 3 shows the box-plot for the basic components of water deficit events observed at the Namgang dam. In Figure 3, each of the outliers outside the 1.5 inter quartile range is marked by the 'X' symbol. There were three outliers for the deficit duration (Figure 3a) including one for the deficit volume ( Figure 3b) and no outlier was found for the deficit intensity (Figure 3c). All these outliers are from three water deficit events among which one event was extreme. The effect of these three events was also found to be big enough to severely distort the basic statistics. Therefore, in this study, they were excluded in the fitting procedure of the PDFs. After selecting the PDFs, the goodness-of-fit test was also repeated with all the water deficit events including those three rare events. The vulnerability assessment was also conducted with all the water deficit events including those three rare events.

Probability Density Functions
To calculate the occurrence probability using the copula model, the PDF for each variable of the water deficit event must be determined. In this study, the exponential distribution was chosen for the deficit duration and deficit volume and the log-Gumbel distribution was chosen for the deficit intensity. The reason for selecting the exponential distribution is that the occurrence characteristic of water deficit events is known to follow the Poisson process. In addition, it was also considered that the deficit duration and deficit volume data were positively skewed and more than 50% of the data were distributed near zero. On the other hand, the log-Gumbel distribution for the deficit intensity

Probability Density Functions
To calculate the occurrence probability using the copula model, the PDF for each variable of the water deficit event must be determined. In this study, the exponential distribution was chosen for the deficit duration and deficit volume and the log-Gumbel distribution was chosen for the deficit intensity. The reason for selecting the exponential distribution is that the occurrence characteristic of water deficit events is known to follow the Poisson process. In addition, it was also considered that the deficit duration and deficit volume data were positively skewed and more than 50% of the data were distributed near zero. On the other hand, the log-Gumbel distribution for the deficit intensity was determined based on the comparison of the data histogram and several PDFs. In this study, normal distribution, log-normal distribution, gamma distribution, generalized extreme value distribution, Gumbel distribution, and log-Gumbel distribution were considered as options.
For parameter estimation of the PDF, the maximum likelihood method was applied. Figure 4 compares the histogram and the PDF for each component of the water deficit event. Each PDF in Figure 4 was tested for goodness-of-fit by the χ 2 test. The exponential distribution for the deficit duration was found to have 0.19 of the χ 2 statistic, which is far smaller than the criterion for the 5% significance level. The exponential distribution for the deficit volume was also found to have a smaller χ 2 statistic of 1.34. In the case of the log-Gumbel distribution for deficit intensity, the χ 2 statistic was calculated as 4.29. This is larger than that for the deficit duration and deficit volume but still smaller than the criterion for the 5% significance level. In conclusion, all distributions were found to be valid under the 5% significance level. As mentioned earlier, all the water deficit events were considered in this goodness-of-fit test.
Water 2018, 10, x FOR PEER REVIEW 9 of 18 was determined based on the comparison of the data histogram and several PDFs. In this study, normal distribution, log-normal distribution, gamma distribution, generalized extreme value distribution, Gumbel distribution, and log-Gumbel distribution were considered as options.
For parameter estimation of the PDF, the maximum likelihood method was applied. Figure 4 compares the histogram and the PDF for each component of the water deficit event. Each PDF in Figure 4 was tested for goodness-of-fit by the 2 χ test. The exponential distribution for the deficit duration was found to have 0.19 of the 2 χ statistic, which is far smaller than the criterion for the 5% significance level. The exponential distribution for the deficit volume was also found to have a smaller 2 χ statistic of 1.34. In the case of the log-Gumbel distribution for deficit intensity, the 2 χ statistic was calculated as 4.29. This is larger than that for the deficit duration and deficit volume but still smaller than the criterion for the 5% significance level. In conclusion, all distributions were found to be valid under the 5% significance level. As mentioned earlier, all the water deficit events were considered in this goodness-of-fit test.

Optimal Copula Model for the Water Deficit Events
The Kendall's τ is necessary for estimating the parameter of the copula models that are considered in this study. The Kendall's τ was estimated as 0.67 for the deficit volume and the deficit duration, 0.26 was estimated for the deficit intensity and the rainfall duration, and 0.52 was estimated for the deficit volume and the deficit intensity. For all bivariate data pairs, the positive correlation was estimated. Among the three, the strength of correlation for the deficit volume and the deficit duration was highest while that for the deficit volume and the deficit intensity was lowest. Figure 5 shows the scatter plot of three bivariate data pairs excluding the three outliers.

Optimal Copula Model for the Water Deficit Events
The Kendall's τ is necessary for estimating the parameter of the copula models that are considered in this study. The Kendall's τ was estimated as 0.67 for the deficit volume and the deficit duration, 0.26 was estimated for the deficit intensity and the rainfall duration, and 0.52 was estimated for the deficit volume and the deficit intensity. For all bivariate data pairs, the positive correlation was estimated. Among the three, the strength of correlation for the deficit volume and the deficit duration was highest while that for the deficit volume and the deficit intensity was lowest. Figure 5 shows the scatter plot of three bivariate data pairs excluding the three outliers.

Optimal Copula Model for the Water Deficit Events
The Kendall's τ is necessary for estimating the parameter of the copula models that are considered in this study. The Kendall's τ was estimated as 0.67 for the deficit volume and the deficit duration, 0.26 was estimated for the deficit intensity and the rainfall duration, and 0.52 was estimated for the deficit volume and the deficit intensity. For all bivariate data pairs, the positive correlation was estimated. Among the three, the strength of correlation for the deficit volume and the deficit duration was highest while that for the deficit volume and the deficit intensity was lowest. Figure 5 shows the scatter plot of three bivariate data pairs excluding the three outliers.   The parameter θ of the copula model was then estimated using the relationship between the Kendall's τ and the parameter θ , which is given in Table 2. The parameter θ for the Clayton, Frank, Gumbel-Hougaard, and Gaussian copula models, respectively, were determined, which is shown in Table 4. Every parameter θ was found to be in the valid range of the four copula models. The MSE and the AIC were estimated for selecting the optimal copula model for each case of the bivariate data pair (Table 5). Since the MSE and AIC values are all penalty measures, the lower value indicates the more favorable models. As a result, the Clayton model was selected as the optimal copula model for the deficit volume and deficit duration, the Gaussian model was selected for the deficit intensity and the deficit duration, and the Frank model was selected for the deficit volume and the deficit intensity.  The parameter θ of the copula model was then estimated using the relationship between the Kendall's τ and the parameter θ, which is given in Table 2. The parameter θ for the Clayton, Frank, Gumbel-Hougaard, and Gaussian copula models, respectively, were determined, which is shown in Table 4. Every parameter θ was found to be in the valid range of the four copula models. The MSE and the AIC were estimated for selecting the optimal copula model for each case of the bivariate data pair (Table 5). Since the MSE and AIC values are all penalty measures, the lower value indicates the more favorable models. As a result, the Clayton model was selected as the optimal copula model for the deficit volume and deficit duration, the Gaussian model was selected for the deficit intensity and the deficit duration, and the Frank model was selected for the deficit volume and the deficit intensity.  Figure 6 compares the empirical cumulative probability and cumulative probability calculated by the optimal copula model. In this figure, the x-axis represents the cumulative probability calculated by the optimal copula while the y-axis represents the empirical cumulative probability. The dotted lines in this figure indicate the critical value of the K-S test based on the 5% significance level. As can be seen from Figure 6, the discrepancies for all optimal copulas lie within the upper and lower limits of the 5% significance level (0.203 for n = 45 and α = 0.05). The goodness-of-fit for all optimal copulas were verified under the 5% significance level.
Water 2018, 10, x FOR PEER REVIEW 12 of 18 Figure 6 compares the empirical cumulative probability and cumulative probability calculated by the optimal copula model. In this figure, the x-axis represents the cumulative probability calculated by the optimal copula while the y-axis represents the empirical cumulative probability. The dotted lines in this figure indicate the critical value of the K-S test based on the 5% significance level. As can be seen from Figure 6, the discrepancies for all optimal copulas lie within the upper and lower limits of the 5% significance level (0.203 for n = 45 and α = 0.05). The goodness-of-fit for all optimal copulas were verified under the 5% significance level.
(a) Clayton copula model for deficit volume and duration.
(b) Gaussian copula model for the deficit intensity and duration.
(c) Frank copula model for the deficit volume and intensity. Figure 6. K-plots and K-S test criteria for three optimal copula models considered for the water deficit events observed at the Namgang dam.

Occurrence Probability
Using the copula model determined for each pair of two variables constituting the water deficit event, the occurrence probability of each water deficit event was estimated. Both the 'AND' and 'OR' concepts were applied in this calculation. Figure 7 compares the estimated occurrence probabilities Figure 6. K-plots and K-S test criteria for three optimal copula models considered for the water deficit events observed at the Namgang dam.

Occurrence Probability
Using the copula model determined for each pair of two variables constituting the water deficit event, the occurrence probability of each water deficit event was estimated. Both the 'AND' and 'OR' concepts were applied in this calculation. Figure 7 compares the estimated occurrence probabilities of each water deficit event for both the 'AND' and 'OR' concepts. The occurrence probability in Figure 7 is divided by the sum of the exceedance probabilities of all the water deficit events estimated by applying the determined copula model to satisfy the concept of vulnerability. The mean value of the exceedance probabilities of all the deficit events was 0.621 in the 'AND' case and 0.357 in the 'OR' case. The dotted line in Figure 7 indicates the occurrence probability considered in the conventional method. Since the number of events is 45, the occurrence probability is calculated to be 1/45 = 0.022. Figure 7 clearly shows the difference between the 'AND' and 'OR' cases. Since the occurrence probability in the 'AND' case is calculated by way of multiplication of both the occurrence probabilities of two variables, the resulting occurrence probability of a water deficit event can be much smaller or much bigger depending on the occurrence probability of each variable. Exactly the opposite behavior can be expected for the 'OR' case. In this case, the occurrence probability of a water deficit event is calculated as a way of summation and it is generally higher than that of one variable. For example, when considering the deficit volume and duration, in the 'AND' case, the occurrence probability of a water deficit event with deficit volume 35.0 × 10 6 m 3 that occurred in 1977 was estimated to be 0.00011. On the other hand, in the 'OR' case, the occurrence probability of the same event was calculated to be 0.0016, which is obviously much higher than that in the 'AND' case.
As can be seen from Figure 7a, the variation of estimated occurrence probabilities in the 'AND' case became wide enough to reach 0.068. On the other hand, as can be seen from Figure 7b, the variation of the estimated occurrence probabilities in the 'OR' case was much smaller than that in the 'AND' case. The maximum occurrence probability was estimated to be only 0.037. As a result, in the calculation of the vulnerability, the role of extreme events becomes minimal in the 'AND' case but can be much bigger in the 'OR' case. However, at this point, it is not clear if either of the 'AND' or 'OR' case is appropriate for the estimation of the vulnerability, which will be discussed in the following section.
of each water deficit event for both the 'AND' and 'OR' concepts. The occurrence probability in Figure  7 is divided by the sum of the exceedance probabilities of all the water deficit events estimated by applying the determined copula model to satisfy the concept of vulnerability. The mean value of the exceedance probabilities of all the deficit events was 0.621 in the 'AND' case and 0.357 in the 'OR' case. The dotted line in Figure 7 indicates the occurrence probability considered in the conventional method. Since the number of events is 45, the occurrence probability is calculated to be 1/45 = 0.022. Figure 7 clearly shows the difference between the 'AND' and 'OR' cases. Since the occurrence probability in the 'AND' case is calculated by way of multiplication of both the occurrence probabilities of two variables, the resulting occurrence probability of a water deficit event can be much smaller or much bigger depending on the occurrence probability of each variable. Exactly the opposite behavior can be expected for the 'OR' case. In this case, the occurrence probability of a water deficit event is calculated as a way of summation and it is generally higher than that of one variable. For example, when considering the deficit volume and duration, in the 'AND' case, the occurrence probability of a water deficit event with deficit volume 35.0 × 10 6 m 3 that occurred in 1977 was estimated to be 0.00011. On the other hand, in the 'OR' case, the occurrence probability of the same event was calculated to be 0.0016, which is obviously much higher than that in the 'AND' case.
As can be seen from Figure 7a, the variation of estimated occurrence probabilities in the 'AND' case became wide enough to reach 0.068. On the other hand, as can be seen from Figure 7b, the variation of the estimated occurrence probabilities in the 'OR' case was much smaller than that in the 'AND' case. The maximum occurrence probability was estimated to be only 0.037. As a result, in the calculation of the vulnerability, the role of extreme events becomes minimal in the 'AND' case but can be much bigger in the 'OR' case. However, at this point, it is not clear if either of the 'AND' or 'OR' case is appropriate for the estimation of the vulnerability, which will be discussed in the following section.  It was also found that the selection of the data pair constituting the water deficit events has a significant influence on the results of the occurrence probability. Theoretically, the occurrence probabilities calculated from the same deficit event have to be identical. However, due to the uncertainty in the process of estimating the parameter of the copula and calculating the occurrence probability, the three occurrence probability results show important differences. For example, in the 'AND' case, the occurrence probability of the water deficit event with a deficit volume of 33.0 × 10 6 m 3 , which occurred in 1978, was 0.00098 when considering the deficit volume and duration but 0.038 when considering the deficit intensity and duration and 0.0017 when considering the deficit volume and intensity. This problem was a little alleviated in the 'OR' case even though the difference was still significant. For example, the occurrence probability of the water deficit event of the same event in 1978 was 0.0056 when considering the deficit volume and duration, but it was 0.0087 when considering the deficit intensity and duration and 0.0067 when considering the deficit volume and intensity. Table 6 compares the vulnerabilities estimated for the three different cases of selecting the data pair and two different cases of estimating the occurrence probability. As mentioned in the previous section, the difference between 'AND' and 'OR' cases when estimating the occurrence probability was also very clear in the calculation of the vulnerability. In the case of considering the 'AND' concept, the occurrence probability of an extreme event becomes extreme. As a result, the effect of these extreme events on the vulnerability estimation becomes very limited. Due to this reason, the vulnerability was estimated to be very small in the 'AND' case and the difference among the three cases of selecting the data pair was also small. The smallest one was just 3.16 × 10 6 m 3 for the data pair of the deficit intensity and duration while the largest one was 3.38 × 10 6 m 3 for the data pair of the deficit volume and intensity. These estimated vulnerabilities are all on the level of 25% of that based on the conventional method of using the same occurrence probability (i.e., 13.54 × 10 6 m 3 ).

Vulnerability
On the other hand, the 'OR' case resulted in quite reasonable results of vulnerability estimation. Since the extreme events were all significantly considered in estimating the vulnerability, the vulnerability was estimated to be much higher than that in the 'AND' case. Additionally, depending on the selected data pair, the estimated vulnerability was significantly different. It was 5.34 × 10 6 m 3 for the data pair of the deficit volume and duration, but it was 9.11 × 10 6 m 3 for the data pair of the deficit intensity and duration. These estimated vulnerabilities are at the level of (40% to 70%) of that based on the conventional method. Lastly, it is necessary to select one of the three results summarized in Table 6. The vulnerability is estimated by considering that the deficit volume and duration is 5.34 × 10 6 m 3 , that the deficit intensity and duration is 9.11 × 10 6 m 3 , and that the deficit volume and intensity is 8.30 × 10 6 m 3 . Since the vulnerability was calculated differently for the three different pairs of the two variables considered even though any pair of the two variables defined the same water deficit event, it is necessary to select any one of them for further remediation of the dam water supply plan. For this selection problem, the concept of multicollinearity can be used as an indicator.
The multicollinearity is defined as the lack of independence or the presence of interdependence among independent variables in the regression analysis [44,45]. The interdependent and independent variables can lead to low quality of the resulting parameter estimates [46,47]. Additionally, the effect of the multicollinearity distortion increases with the increasing degree of correlation [48,49]. This problem is generally evaluated by quantifying the variance inflation factor from a variable [50,51]. The variance inflation factor is used as a measure to quantify the level of the multicollinearity in a multiple regression analysis. Furthermore, the above results in Table 6 could be expected by comparing the variance inflation factors estimated for the three components considered in this study. The variance inflation factor of the deficit duration was estimated as 3.46 and the deficit volume was estimated to be 5.87. In the case of the deficit intensity, the variance inflation factor is 2.89. It is generally assumed that, in the case that the variance inflation factor is higher than five, the multicollinearity problem becomes serious [52][53][54][55]. Based on this result, it could be concluded that the occurrence probability considering the deficit intensity and duration (i.e., the case excluding the deficit volume) is more reasonable than the other two cases. The vulnerability estimated by considering the deficit intensity and duration, 9.11 × 10 6 m 3 , becomes the most reasonable value for the Namgang dam.

Conclusions
This study proposed a method to estimate the occurrence probability of a water deficit event by applying the copula method. In this study, a water deficit event was assumed to be a rectangle composed of the deficit duration and deficit intensity. The deficit volume is simply the product of the deficit duration and deficit intensity. By selecting two variables among three, the bivariate frequency analysis was performed using the copula. The vulnerability was then estimated by multiplying the water deficit volume by its occurrence probability. This study considered the Namgang dam as an example and the resulting estimate of the vulnerability was compared with the conventional assumption of constant occurrence probability. The results are summarized below.
First, the exponential distribution was selected for the deficit duration and deficit volume and the log-Gumbel distribution was selected for the deficit intensity. The goodness-of-fit tested by the test has also validated the use of these PDFs. Additionally, based on the MSE and the AIC, the Clayton model was selected as the optimal copula model for the deficit volume and deficit duration. The Gaussian model was selected for the deficit intensity and the deficit duration and the Frank model was chosen for the deficit volume and the deficit intensity. The K-S test confirmed these selections of the copula model. Second, the occurrence probability of each water deficit event was estimated by using the copula model determined for each pair of two variables constituting the water deficit event. Both the 'AND' and 'OR' concepts were applied in this calculation and they were found to be very different from each other. In particular, the occurrence probability in the 'AND' case was estimated to be much smaller or much larger (i.e., wide variation) than that in the 'OR' case. It was also found that the selection of the data pair constituting the water deficit events had a significant influence on the occurrence probability. This problem was serious in the 'AND' case but was rather small in the 'OR' case even though the difference was still significant.
Third, the difference between the 'AND' and 'OR' cases when estimating the occurrence probability was also considered in calculating the vulnerability. In considering the 'AND' concept, the occurrence probability of an extreme event becomes extreme (i.e., a small occurrence probability). As a result, the effect of these extreme events on the vulnerability estimation becomes very limited. Due to this reason, the vulnerability was estimated to be very small in the 'AND' case and the differences among the three cases of selecting the data pair were also small. On the other hand, the 'OR' case resulted in quite reasonable results of vulnerability estimation. Since the extreme events were all significantly considered in estimating the vulnerability, the vulnerability was assessed to be much higher than that in the 'AND' case. The estimated vulnerabilities in the 'AND' case were on the level of 25%, but, in the 'OR' case, they were about 40% to 70% of that based on the conventional method.
Lastly, based on the concept of multicollinearity, it could be concluded that the occurrence probability considering the deficit intensity and duration is more reasonable than the other two cases. This conclusion was made by considering the variance inflation factor, which is a measure to evaluate the effect of the multicollinearity of a variable. The variance inflation factor was estimated as 2.89 for the deficit intensity, 3.46 for the deficit duration, and 5.87 for the deficit volume. It is generally assumed that the multicollinearity problem becomes serious in the case that the variance inflation factor is higher than five. As a final result in the application of the Namgang dam, the vulnerability was determined to be 9.11 × 10 6 m 3 , which is about 3% of the total storage capacity of 3.1 × 10 8 m 3 . This estimated vulnerability is also about 70% of that estimated by applying the same occurrence probability to all water deficit events.
The vulnerability analysis result from the copula-based method will become more relevant for the management of existing dam and decision-making of new dam construction. In addition, since this method is based on the copula model, any kind of marginal distribution for an analysis variable can be applied. Therefore, a vulnerability assessment method from this study can be broadly applied on a water-deficit event with various types of variables.
More case studies with various types of water resource systems need to be conducted to generalize the vulnerability assessment method in this study. Therefore, for further study, several vulnerabilities results from the global/domestic basin will be analyzed and compared.