3.1.1. Estimation of Copula Model Parameters for the Kuzlovec Torrent
The methodology presented in
Section 2 for the estimation of the
SSL values based on the measured
Q and
P was tested using the high-frequency data measured in the Kuzlovec torrent. An
IEs value of 6 h was found to be a reasonable choice for this torrent. Because the selected
IEs value is greater than the time of concentration (usually less than 1 h), consecutive events can be treated as independent [
73]. However, the selection of the
IEs can have a significant impact on the defined sample and consequently on the results [
74] because it influences both the total rainfall amounts (
P) and the
SSL values. This means that longer
IEs values could result in larger
P values due to lumping of consecutive events into one event. This would probably lead to larger
SSL values but we argue that the impact on the
Q values may be smaller. Thus, the dependence structure of the collected data sample may change (
Figure 4). Moreover, a larger
IEs value would also lead to more multiple-peaks events. Altogether, 21 complete events (
Q,
P, and
SSL), which were defined using the procedure described in
Section 2 (
Figure 2) and that occurred in all four seasons from June 2013 to May 2014, were used for estimating the parameters of the copula and of the marginal parametric CDFs (
Figure 4).
Table 2 shows the basic properties of the defined sample.
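The lumping effect described above can be illustrated with a minimal sketch. It assumes that IEs acts as a minimum dry period separating consecutive rainfall events, which is an interpretation of the text and not the procedure of Section 2. The function name `split_events` and the hourly series are hypothetical and purely illustrative.

```python
def split_events(rain_mm, min_dry_steps):
    """Group a rainfall series into events separated by at least
    `min_dry_steps` consecutive dry time steps; return event totals (P)."""
    events, current, dry = [], [], 0
    for r in rain_mm:
        if r > 0:
            current.append(r)
            dry = 0
        elif current:
            dry += 1
            if dry >= min_dry_steps:
                # Dry spell is long enough: close the current event.
                events.append(sum(current))
                current, dry = [], 0
    if current:
        events.append(sum(current))
    return events


# Synthetic hourly rainfall (mm): two bursts separated by 8 dry hours.
hourly = [2.0, 3.0, 0, 0, 0, 0, 0, 0, 0, 0, 4.0, 1.0] + [0.0] * 12

events_6h = split_events(hourly, 6)    # bursts stay separate events
events_12h = split_events(hourly, 12)  # bursts are lumped into one event
```

With the 6 h criterion the two bursts remain separate events, whereas a 12 h criterion merges them into a single event with a larger P, which is the mechanism by which the choice of IEs can change the sample.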
First, autocorrelation in the marginal data was tested.
Figure 3 shows the autocorrelation function (ACF) plots for the three variables for the data from the Kuzlovec torrent. There was no significant autocorrelation in the data (
Figure 3). This is confirmed by the Ljung-Box test statistics for the
P,
Q, and
SSL series, which were 1.8 (
p-value: 0.18), 1.7 (
p-value: 0.19) and 0.77 (
p-value: 0.38), respectively. The next step of the copula approach was to assess the dependence between the pairs of considered variables (the cor.test function was used [
48]; null hypothesis: variables are independent; alternative hypothesis: variables are not independent). Kendall’s correlation coefficients between the pairs of variables
Q-P,
Q-SSL, and
P-SSL were 0.70 (
z-value: 4.4;
p-value: 0.0001), 0.79 (
z-value: 5.0;
p-value: 6 × 10⁻⁷), and 0.70 (
z-value: 4.4;
p-value: 0.0001), respectively. Therefore, the null hypothesis of independence was rejected at the selected significance level of 0.05 for all three pairs. The calculated correlation coefficients indicate that the strength of dependence is similar for all pairs of variables. Moreover, the exchangeability test results for pairs of variables
P-Q,
P-SSL and
Q-SSL were 0.022 (
p-value: 0.99), 0.022 (
p-value: 0.98) and 0.021 (
p-value: 0.72), respectively. This indicates that the selection of symmetric copula functions (e.g., Clayton, Frank, Gumbel-Hougaard, Joe), which have one parameter to model the dependence among three variables, seems reasonable for the Kuzlovec case study.
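The reported test statistics can be cross-checked with a short sketch. It rests on two assumptions that are not stated in the text: the z-values follow the usual normal approximation for Kendall's tau without ties, and the Ljung-Box statistics are referred to a chi-square distribution with one degree of freedom (i.e., a single lag). The function names are illustrative.

```python
import math


def kendall_z(tau, n):
    """z-score of Kendall's tau under independence
    (normal approximation, no ties assumed)."""
    var_tau = 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))
    return tau / math.sqrt(var_tau)


def chi2_sf_df1(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


n_events = 21  # number of complete events in the Kuzlovec sample

# Kendall's tau values reported in the text.
reported_tau = {"Q-P": 0.70, "Q-SSL": 0.79, "P-SSL": 0.70}
z_values = {pair: kendall_z(tau, n_events) for pair, tau in reported_tau.items()}

# Ljung-Box statistics reported in the text (assumed: 1 degree of freedom).
reported_lb = {"P": 1.8, "Q": 1.7, "SSL": 0.77}
ljung_box_p = {name: chi2_sf_df1(q) for name, q in reported_lb.items()}

for pair, z in z_values.items():
    print(f"{pair}: z = {z:.2f}")
for name, p in ljung_box_p.items():
    print(f"{name}: Ljung-Box p-value = {p:.2f}")
```

Under these assumptions the sketch gives z-values and Ljung-Box p-values that agree with the reported ones to the precision given in the text.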
Before applying the copula function, marginal distribution functions were defined. Different distribution functions were tested and the most suitable ones were selected using the Akaike information criterion (AIC). For the
P,
Q and
SSL, the distribution functions with the minimum AIC value were the Gumbel, Pearson type 3 and log-Pearson type 3 distributions, respectively. These functions were also tested using the Cramér-von Mises test (
lmomco package) and the results were 0.044 (
p-value: 0.91), 0.082 (
p-value: 0.69) and 0.135 (
p-value: 0.44) for the
P,
Q and
SSL, respectively. This indicates that the selected distribution functions could not be rejected at the selected significance level (0.05). The marginal distribution parameters were estimated using the method of L-moments [
67]. The location, scale, and shape parameters for the Pearson type 3 CDF (
Q) were 31.1, 31.6, and 2.3, respectively. Likewise, the location and scale parameters for the Gumbel CDF (
P) were 19.2 and 13.7, respectively. Furthermore, the location, scale, and shape parameters for the log-Pearson type 3 CDF (SSL) were 3.0, 2.2, and −0.14, respectively. More information about the CDFs and their parameters can be found in [
67]. The next step of the procedure was to estimate the copula parameter and to select the most suitable copula function. Clayton, Frank, Joe and Gumbel-Hougaard copula functions were compared using the leave-one-out cross-validation model selection criterion [
65]. The criterion results were 24.4, 29.0, 25.0 and 16.1 for the Gumbel-Hougaard, Frank, Clayton and Joe copulas, respectively. According to these results, the Frank copula should be selected (it has the largest value of the criterion). This function could not be rejected by the Cramér-von Mises test [
63,
64] at the selected significance level of 0.05 (test statistic: 0.06;
p-value: 0.26). The Frank copula parameter (Equation (3)) was estimated (
θ = 10.5) using the maximum pseudo-likelihood method [
62]. Therefore, the Frank copula function was selected to define the model for estimating the
SSL values using the data from the Kuzlovec torrent.
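As a plausibility check on the estimated parameter, the following sketch evaluates a one-parameter Frank copula and the Kendall's tau it implies for a pair of variables. It assumes the standard Archimedean form of the Frank copula; whether this matches Equation (3) term by term is an assumption, and the function names are illustrative.

```python
import math

THETA = 10.5  # Frank copula parameter reported for the Kuzlovec torrent


def frank_cdf(u, theta=THETA):
    """Frank copula CDF for a sequence of margins u (standard Archimedean form)."""
    a = -math.expm1(-theta)  # 1 - exp(-theta)
    prod = 1.0
    for ui in u:
        prod *= -math.expm1(-theta * ui) / a  # (1 - exp(-theta*u_i)) / a
    return -math.log1p(-a * prod) / theta


def debye1(x, steps=20000):
    """First-order Debye function D1(x) = (1/x) * int_0^x t/(exp(t)-1) dt,
    approximated with the midpoint rule."""
    h = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t / math.expm1(t)
    return total * h / x


def frank_kendall_tau(theta=THETA):
    """Kendall's tau implied by a bivariate Frank copula with parameter theta."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))


tau_implied = frank_kendall_tau()
print(f"Implied Kendall's tau: {tau_implied:.3f}")
print(f"C(0.5, 0.5, 0.5) = {frank_cdf([0.5, 0.5, 0.5]):.3f}")
```

The implied pairwise tau is about 0.68, which is of the same order as the empirical coefficients of 0.70 to 0.79 reported above. A single parameter cannot reproduce three different pairwise values exactly, so some compromise of this kind is expected.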
3.1.2. Comparison with Other SSL Estimation Techniques and Estimation of SSL Values Based on Measured Q and P
The procedure described in
Section 2 was used to estimate the
SSL values based on the measured
Q and
P. For each pair of measured variables (
Q and
P), which represents a potential event without
SSL measurements, 10,000 possible
SSL values were obtained.
Figure 5 shows the distribution of the Equation (7) solutions for four events that occurred in different seasons. The median of all 10,000 possible
SSL values for one event was selected as the estimated
SSL value, which was eventually transformed to real space using the inverse PIT. Furthermore, the 50% confidence interval (bounded by the first and third quartiles) was also determined for each event and is presented in
Figure 5. Moreover, actual measured
SSL values are also shown in
Figure 5. It should be noted that the November 2013 and May 2014 events were among the events with the highest
SSL values. On the other hand, the June 2013 and August 2013 events can be defined as low-medium magnitude events. The results indicate that the copula is able to reproduce both low-medium and medium-high magnitude sediment transport events.
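The simulation step can be sketched as follows, working entirely in the unit (probability) space. The sketch assumes a trivariate Frank copula in its standard Archimedean form and draws the third margin conditionally on the first two by numerically inverting the conditional CDF. Whether this corresponds exactly to Equation (7) is an assumption. The function names and the input values u1 = u2 = 0.9 are illustrative and are not taken from the measured events.

```python
import math
import random

THETA = 10.5  # Frank copula parameter reported for the Kuzlovec torrent


def _g(us, theta):
    """a * exp(-sum of Frank generator values), with a = 1 - exp(-theta)."""
    a = -math.expm1(-theta)
    prod = a
    for u in us:
        prod *= -math.expm1(-theta * u) / a
    return prod


def cond_cdf_u3(u3, u1, u2, theta=THETA):
    """P(U3 <= u3 | U1 = u1, U2 = u2): ratio of second derivatives
    of the inverse generator of the Frank copula."""
    g3 = _g((u1, u2, u3), theta)
    g2 = _g((u1, u2), theta)
    return (g3 / (1.0 - g3) ** 2) / (g2 / (1.0 - g2) ** 2)


def solve_u3(w, u1, u2, theta=THETA, tol=1e-10):
    """Invert the conditional CDF by bisection: find u3 with CDF(u3) = w."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cond_cdf_u3(mid, u1, u2, theta) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_u3(u1, u2, n=10_000, seed=1):
    """Draw n conditional values of U3; return first quartile, median, third quartile."""
    rng = random.Random(seed)
    draws = sorted(solve_u3(rng.random(), u1, u2) for _ in range(n))
    return draws[n // 4], draws[n // 2], draws[(3 * n) // 4]


# Illustrative inputs: both Q and P at their 0.9 non-exceedance probability.
q25, median, q75 = simulate_u3(0.9, 0.9)
print(f"U3 quartiles: {q25:.3f}, {median:.3f}, {q75:.3f}")
```

The resulting quartiles and median are probabilities. Converting them to SSL values would require the quantile function of the fitted log-Pearson type 3 distribution, which is not reproduced in this sketch.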
In the next step of the study the proposed event-based copula methodology for estimating the
SSL values was compared with some other possible estimation techniques (Equations (8) and (9)). Based on 21 events, which were also used in constructing the event-based copula model (
Section 2), the MLR and EXP models were defined and tested to estimate the
SSL values. The least-squares method was used to estimate the MLR model parameters and the final constructed model was as follows
. Furthermore, the same method was applied to estimate the exponential model parameters, and the resulting model was:
. In the next step, the copula, MLR, and EXP models were used to estimate the
SSL values based on the measured
Q and
P variables.
Figure 6 shows diagnostic plots for the three compared models, while some commonly used performance criteria are presented in
Table 3. According to the different performance criteria [
69], the proposed copula model yields the worst fit among the tested models. However, the diagnostic plots (
Figure 6) demonstrate that the residuals are generally the smallest when using the copula model. The calculated residuals were approximately normally distributed and showed no autocorrelation. The median residual values were −1.2, −3.8, and 14.3 kg for the copula, MLR, and EXP models, respectively. The copula model reproduces the low-medium magnitude events better than the other two models. For the medium-high magnitude events the differences among the compared methods are smaller, and one could argue that the MLR and EXP models fit these events better than they fit the low-medium magnitude events. However,
Figure 5 shows that the copula model can also be used to model medium-high magnitude events. Furthermore, a comparison was also made for the single-peak and multiple-peaks events (
Figure 4). For the single-peak events the copula model yielded the smallest residual values, whereas for the multiple-peaks events the differences among the tested methods were smaller and the proposed copula model gave the worst results. However, one should keep in mind that only 4 events were defined as multiple-peaks events and that all of them consisted of 2 peaks. Moreover, the summary statistics for the estimated
SSL values indicate that the copula model gives the most accurate estimates of the measured
SSL values (comparison of
Table 2 and
Table 3). With the MLR model, the
SSL values for some smaller rainfall events (e.g.,
P < 10 mm) were estimated as negative (
Table 3), which is not physically meaningful. Moreover, the EXP model generally overestimated these smaller magnitude events (
Table 2 and
Table 3).
The fit between the measured and estimated data could be improved by including additional information in the model. Bed load data, antecedent soil moisture conditions and antecedent sediment transport data are some of the possible options, if this information is available. For the Kuzlovec torrent the relationship between the 1-day (
ANTP1), 3-day (
ANTP3) and 5-day (
ANTP5) antecedent rainfall and
SSL was also evaluated. The Kendall correlation coefficients for
ANTP1-
SSL,
ANTP3-
SSL and
ANTP5-
SSL were 0.17 (
p-value: 0.28), −0.08 (
p-value: 0.61) and 0.02 (
p-value: 0.89), respectively, which indicates that antecedent rainfall is not a good predictor of the
SSL for the Kuzlovec torrent. Another possible way to improve the copula model results is to use a copula function with more parameters (e.g., the Khoudraji-Liebscher copula), but over-parametrization could occur in such a case; in particular, the estimates of the fitted parameters can be uncertain when only 21 events are available. Moreover, a nonparametric distribution function could be used instead of the parametric distribution in order to reduce the number of parameters. In the Kuzlovec torrent an extreme flash flood occurred in August 2014 that caused intense sediment transport and changes in the location of the torrent channel thalweg [
75]. This means that additional measurements should be performed at this location in order to confirm that the proposed model (either the copula or the regression model) is still valid after this extreme event, which had a return period of more than 100 years according to the rainfall data analysis [
75].
Because the copula model gave the smallest residual values and provides a probabilistic estimate of the
SSL values, this method was chosen to estimate the
SSL values for the events when
SSC observations were not performed. Based on the methodology presented in
Section 2 the
SSL estimates were made for 92 events, which occurred in all four seasons. June, July, and August were classified as summer; September, October, and November as autumn; December, January, and February as winter; and March, April, and May as spring. For events where the accumulated rainfall (
P) did not exceed 1 mm, the
SSL values were not estimated and no further results are presented.
Table 4 shows the results of the estimation procedure for different seasons. The results demonstrate that approximately 3.6 t of
SSL were transported through the measuring cross section in the Kuzlovec torrent in the observed period (June 2013–May 2014) (
Table 4). The lower bound of the confidence interval was 2.1 t and the upper bound was 6.9 t (
Table 4). One can notice that most of the
SSL was transported during winter 2013/2014 (
Table 4), which was relatively warm and had little snow. The highest
P and
Q values are also characteristic of the winter period. Autumn 2013 contributed about 40% of the total
SSL, while in summer 2013 and spring 2014 together about 12% of
SSL was transported. A more comprehensive seasonality analysis could be done if more than one year of data were available. However, one should also keep in mind that from June 2013 to May 2014 a total of 1613 mm of rainfall was measured in the investigated basin, which is below the long-term average for the analyzed area (about 1700 mm). Furthermore, none of the rainfall events was really extreme. The maximum accumulated rainfall measured for one event was 89 mm; this event occurred in January and lasted about 2 days. This suggests that under different, more extreme hydro-meteorological conditions significantly more material could be transported even in a small torrential basin like Kuzlovec (up to ~5 or even 10 t/ha/year), because most of the material is generally transported during extreme events such as floods (e.g., [
1,
75]). One should keep in mind that for the Kuzlovec case study only 21 events were used for parameter estimation and fitting of the trivariate copula functions. This means that several types of uncertainties can affect estimation results due to the relatively small sample size: (i) model identifiability; (ii) estimates of the fitted parameters and (iii) ultimate estimates of the derived risks. However, the main aim of the study was to demonstrate that the proposed event-based copula model can be used for estimating
SSL values based on the known
Q and
P values. Thus, the proposed methodology was additionally tested on another case study (Gornja Radgona station on the Mura River) where 281 events were available to fit the proposed copula model and to make an estimation of the
SSL values using the copula model.