Flood frequency analysis is the foundation of water conservancy project planning and construction. Current flood frequency analysis methods usually assume that the flood series is consistent (stationary), that is, that the distribution form and the statistical properties of the flood series are fixed [1]. However, with climate change and the intensification of human activities, especially the construction of large-scale water conservancy and water conservation projects and the process of urbanization, the runoff yield and concentration mechanisms and the temporal and spatial distribution of flooding have changed [2]. This makes flood series inconsistent and makes the frequencies obtained from current frequency analysis methods unreliable [3]. Therefore, it is of great significance to study nonstationary flood frequency analysis methods [4].
Existing nonstationary flood frequency analysis methods in the literature include mixture distribution methods, conditional probability distribution methods, and time-varying moment methods. The main idea of the methods based on mixture distributions is that the individual values of the extreme series do not come from the same population [5]. That is, series formed by different hydrological processes do not follow the same distribution, so the overall series is assumed to consist of several sub-distributions. The methods based on conditional probability distributions divide the flood season into several periods according to differences in the flood formation mechanism, analyze the occurrence probability of the annual maximum in each period, and then derive the probability density function of the extreme series [6].
Unlike mixture distribution models and conditional probability distribution models, time-varying moment models assume that changes in climate and land surface have altered the physical processes and mechanisms of flood generation, so that the parameters of the distribution followed by the flood series are functions of time rather than constants. Much attention has been paid to time-varying moment models [7]. Vasiliades et al. [8] applied a time-varying moment model based on the Generalized Extreme Value (GEV) distribution to nonstationary frequency analysis. They assumed the parameters of the GEV distribution to be functions of time or other factors and applied goodness-of-fit tests to the model, thereby verifying the significance of the nonstationarity of hydrological series.
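The idea of a time-varying moment model can be illustrated with a minimal sketch in which only the GEV location parameter changes linearly with time. The parameter values, the linear trend, and the function names below are assumptions chosen for illustration; they are not taken from the cited study or from the Wangkuai data. The shape parameter uses the convention in which positive values give a heavy upper tail.

```python
import math


def gev_quantile(p, mu, sigma, xi):
    """Value not exceeded with probability p under GEV(mu, sigma, xi).

    Convention: xi > 0 gives a heavy upper tail; xi == 0 is the Gumbel case.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    y = -math.log(p)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV quantile function
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical parameter values, chosen only for illustration.
MU0, MU1 = 1200.0, -10.0  # location intercept (m^3/s) and trend (m^3/s per year)
SIGMA, XI = 400.0, 0.1    # scale and shape, held constant in this sketch


def location(t):
    """Location parameter as a linear function of time t (years since start)."""
    return MU0 + MU1 * t


def design_flood(t, p=0.99):
    """Flood quantile for non-exceedance probability p in year t."""
    return gev_quantile(p, location(t), SIGMA, XI)


if __name__ == "__main__":
    for t in (0, 24, 48):
        print(t, round(design_flood(t), 1))
```

Because only the location parameter moves in this sketch, every quantile shifts by the same amount each year. Letting the scale parameter vary as well would make high quantiles change at a different rate from the median.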
This study addresses models with time as a covariate for nonstationary flood frequency analysis based on GAMLSS (Generalized Additive Models for Location, Scale, and Shape) theory. GAMLSS was first proposed by Rigby and Stasinopoulos [9]. The model overcomes limitations of the generalized linear model (GLM) and the generalized additive model (GAM), greatly expands the range of available distribution types, and offers a variety of continuous and discrete distributions, including highly skewed and/or highly peaked ones. In addition, the systematic part of the model is richer: more complex parametric (linear or non-linear), semi-parametric, non-parametric, or random-effect terms can be introduced to relate the distribution parameters (mean, variance, etc.) to the explanatory variables. The GAMLSS model has been widely applied in the military, economics, medicine, and other fields [11]. Hydrologists have also conducted many studies using GAMLSS in recent years. Serinaldi and Kilsby [14] used the GAMLSS model to analyze the monthly rainfall data of 6 stations in England and Wales. They found that the model described the characteristics of the rainfall series well and performed better in fitting the relations among extreme rainfall events, rainfall and atmospheric circulation indices, and sea surface temperature. López and Francés [15] proposed two methods based on the GAMLSS model to investigate the nonstationary frequency analysis of the annual maximum flood records of 20 Spanish inland rivers. Their results illustrate that the nonstationarity of flood series caused by climate change and human activities cannot be ignored, and that GAMLSS provides a convenient and flexible framework for considering the influence of climate factors and human activities in nonstationary flood frequency analysis.
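For orientation, the general GAMLSS structure can be written as follows. The first equation is the standard form of the framework; the second pair is one simple parametric special case with time as the only covariate, given here as an assumption for illustration, since this section does not state the functional forms actually fitted.

```latex
% General GAMLSS: each distribution parameter \theta_k (k = 1, ..., p)
% is linked to explanatory variables through a monotonic link g_k.
g_k(\boldsymbol{\theta}_k) = \boldsymbol{\eta}_k
  = \mathbf{X}_k \boldsymbol{\beta}_k
  + \sum_{j=1}^{J_k} \mathbf{Z}_{jk} \boldsymbol{\gamma}_{jk},
  \qquad k = 1, \dots, p

% Illustrative special case (assumed): a two-parameter distribution whose
% location \mu_t and scale \sigma_t depend linearly on time t only.
g_1(\mu_t) = \beta_{10} + \beta_{11}\, t, \qquad
g_2(\sigma_t) = \beta_{20} + \beta_{21}\, t
```

Here $\mathbf{X}_k \boldsymbol{\beta}_k$ collects the parametric terms and $\mathbf{Z}_{jk} \boldsymbol{\gamma}_{jk}$ the additive (smoothing or random-effect) terms. Setting $\beta_{11} = \beta_{21} = 0$ recovers the stationary model.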
In flood frequency analysis, univariate probability distribution functions are usually used to estimate the occurrence probability or magnitude of the flood peak or volume in a certain region. However, flood events involve more than one characteristic variable, such as flood peak discharge, flood volume, and flood water level. To estimate the probability of flooding, one needs to know not only the high and extreme values of each variable, but also the likelihood of their occurring simultaneously [16]. The main weakness of univariate models is their difficulty in capturing the joint probability underlying multiple physical processes, which leads to underestimation of the associated occurrence probability [17]. For instance, if only rainfall is considered when estimating the flood risk for a catchment, the resulting estimate will be significantly lower than the true risk when there is a statistically significant dependence between the rainfall on the catchment and the downstream high water levels. Bivariate models are used to address this issue [18].
Copula functions, for which the marginal distribution of each variable is uniform, were adopted in the bivariate model with time as covariate in this study. They are popular in high-dimensional statistical applications because they allow one to model and estimate the distribution of random vectors by estimating the marginals and the copula separately. They can describe linear, non-linear, symmetric, and asymmetric relations between variables, and are simple, flexible, and adaptable in application. Copulas are therefore effective mathematical tools for constructing multivariate joint distributions and describing the correlation between variables. In recent years, they have been widely used in multivariate hydrological frequency analysis. In drought characteristics analysis, Mirabbasi et al. [19] used a copula function to establish the joint probability distribution of drought duration and drought degree. In rainfall frequency analysis, Zhang and Singh [20] adopted copula functions to construct the bivariate joint distributions of rainfall intensity and depth, rainfall intensity and duration, and rainfall depth and duration, respectively. The results were compared with a Gumbel mixture model and a two-dimensional normal transformation distribution model. Fu and Butler [21] used the copula method to separate the dependence structure of rainfall variables from their marginal distributions, and analyzed the different impacts of the dependence structure and the marginal distributions on system performance.
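The separation of marginals and dependence structure follows from Sklar's theorem. The first equation below is the standard bivariate statement; the second is a sketch of how it could carry over to flood peak $Q$ and volume $W_1$ with time-varying marginal parameters. The symbols $\theta_Q(t)$, $\theta_W(t)$, and $\theta_c$ are notation assumed here for illustration.

```latex
% Sklar's theorem: a joint CDF H with continuous marginals F_X and F_Y
% can be written through a unique copula C.
H(x, y) = C\bigl(F_X(x),\, F_Y(y)\bigr)

% Sketch with time-varying marginals (assumed notation): the marginal
% parameters depend on time t, while the copula parameter \theta_c
% is held fixed.
H_t(q, w) = C\bigl(F_Q(q \mid \theta_Q(t)),\; F_{W_1}(w \mid \theta_W(t));\; \theta_c\bigr)
```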
This paper takes Wangkuai Reservoir in the Daqing River Basin, a catchment undergoing substantial changes in climate and land use/land cover together with an increasing number of soil-water conservation projects, as a case study to construct both univariate and bivariate time-varying moment models for flood frequency analysis based on GAMLSS theory. The inflow flood peak and flood volume time series (1956–2004) of Wangkuai Reservoir were selected as basic data to discuss nonstationary univariate and peak-volume bivariate joint flood frequency analysis methods. Flood quantiles and the combined values of flood peak and flood volume under given exceedance probabilities were calculated. This study aims to provide new ideas and approaches for nonstationary flood frequency analysis under a changing environment.
2. Study Region and Data
Wangkuai Reservoir is located on the upper reaches of the Sha River in the Daqinghe Catchment (Figure 1). Its construction started in June 1958 and finished in September 1960. The control area of the reservoir is 3770 km², and the storage capacity is 13.89 × 10⁸ m³. The currently used design floods were calculated from flood data series under the assumption of stationarity. The watershed receives an average precipitation of 626.4 mm annually, mostly in the summer (70–80%). The annual mean temperature is 7.4 °C.
Since 1980, a series of water conservation measures have been carried out in the Wangkuai Reservoir catchment, such as closing land for reforestation and returning farmland to forest. Meanwhile, a number of small hydraulic structures have been constructed. These factors have increased the vegetation coverage rate and significantly changed the land surface, which has affected the flood process in this watershed and thus resulted in nonstationarity of the flood series, as revealed by many studies [22].
Flood runoff data were monitored for a period of 49 years, from 1956 to 2004, and collected on an hourly basis. The data were provided by the Hydrology and Water Resources Survey Bureau of Hebei Province. The maximum flood peak Q and the maximum one-day flood volume W1 of each year were selected for this work.
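The extraction of the two annual series from hourly records can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure documented by the data provider: records are assumed to be gap-free, sorted, hourly discharges in m³/s; the one-day volume is approximated by summing 24 consecutive hourly values times 3600 s; and windows that cross a calendar-year boundary are ignored. The function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SECONDS_PER_HOUR = 3600
WINDOW_HOURS = 24


def annual_peak_and_one_day_volume(records):
    """Return {year: (Q, W1)} from hourly (datetime, discharge) records.

    Q is the annual maximum discharge in m^3/s and W1 the largest volume
    in m^3 passing in any 24 consecutive hours of that calendar year.
    Assumes sorted, gap-free hourly data (an assumption of this sketch).
    """
    by_year = defaultdict(list)
    for stamp, discharge in records:
        by_year[stamp.year].append(discharge)

    result = {}
    for year, flows in by_year.items():
        peak = max(flows)
        if len(flows) <= WINDOW_HOURS:
            best = sum(flows)
        else:
            # Sliding-window sum over 24 consecutive hourly values
            running = sum(flows[:WINDOW_HOURS])
            best = running
            for i in range(WINDOW_HOURS, len(flows)):
                running += flows[i] - flows[i - WINDOW_HOURS]
                best = max(best, running)
        result[year] = (peak, best * SECONDS_PER_HOUR)
    return result


# Synthetic two-day record for illustration only: base flow of 10 m^3/s
# with a 10-hour flood pulse of 100 m^3/s.
start = datetime(1956, 8, 1)
flows = [100.0 if 20 <= h < 30 else 10.0 for h in range(48)]
records = [(start + timedelta(hours=h), q) for h, q in enumerate(flows)]
series = annual_peak_and_one_day_volume(records)
```

For the synthetic record, the best 24-hour window contains all ten pulse hours plus fourteen base-flow hours, so the volume is (10 × 100 + 14 × 10) × 3600 m³.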
5. Discussion and Conclusions
The changing environment makes the assumption of stationarity of flood sequences questionable. In this context, this paper constructed both univariate and bivariate nonstationary models with time as covariate based on GAMLSS theory for flood frequency analysis. The inflow flood peak Q and flood volume W1 series of Wangkuai Reservoir were used as basic data.
Within the framework of nonstationary flood frequency analysis, this paper adopted four two-parameter distributions (Gumbel, Weibull, Gamma, and Log-Normal) as candidate distributions, which have the characteristics of power distributions and exponential distributions simultaneously. In the univariate nonstationary model with time as covariate, the log-normal distribution performed best according to the AIC. The flood peak and volume series show a decreasing trend over time. The downward trend is more pronounced for high quantiles (such as the 95% quantile). In addition, the decrease is marked before 1980 and becomes gentler after 1980. This indicates that the flood series has changed under the changing environment.
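The model comparison described above can be sketched for the log-normal case. This is a simplified stand-in for a GAMLSS fit, assuming a linear trend in the log-scale mean and a constant log-scale standard deviation, for which the maximum likelihood estimates have a closed form. The data are synthetic and the function names are hypothetical; the models fitted in the study may use other functional forms.

```python
import math
from statistics import NormalDist


def fit_lognormal(times, flows, trend):
    """Maximum likelihood fit of a log-normal with mu(t) = a + b*t.

    With trend=False the slope b is fixed at 0 (stationary model).
    sigma is constant in both cases. Returns a, b, sigma, and the AIC.
    """
    n = len(flows)
    logs = [math.log(x) for x in flows]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    if trend:
        sxx = sum((t - mean_t) ** 2 for t in times)
        sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
        b = sxy / sxx
    else:
        b = 0.0
    a = mean_y - b * mean_t
    sq_resid = sum((y - (a + b * t)) ** 2 for t, y in zip(times, logs))
    sigma = math.sqrt(sq_resid / n)  # ML estimate, divides by n
    # Log-likelihood evaluated at the ML estimates
    loglik = (-sum(logs) - n * math.log(sigma)
              - 0.5 * n * math.log(2.0 * math.pi) - 0.5 * n)
    k = 3 if trend else 2  # number of estimated parameters
    return {"a": a, "b": b, "sigma": sigma, "aic": 2 * k - 2 * loglik}


def quantile(model, t, p):
    """Flood value not exceeded with probability p in year index t."""
    z = NormalDist().inv_cdf(p)
    return math.exp(model["a"] + model["b"] * t + model["sigma"] * z)


# Synthetic 49-year series for illustration only (not the Wangkuai data):
# a declining log-mean plus a deterministic alternating perturbation.
times = list(range(49))  # t = year - 1956
flows = [math.exp(7.0 - 0.02 * t + 0.3 * (-1) ** t) for t in times]

stationary = fit_lognormal(times, flows, trend=False)
nonstationary = fit_lognormal(times, flows, trend=True)
best = min((stationary, nonstationary), key=lambda m: m["aic"])
```

The model with the lower AIC is preferred. With a negative slope, `quantile(nonstationary, t, 0.95)` decreases with `t`, which mirrors the declining high quantiles reported above.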
Based on the optimal univariate models, copula functions were used to construct the dependence structure of Q and W1, with the two optimal log-normal distributions as marginal distributions. The results showed that the Gumbel-Hougaard copula provided the best joint distribution. The most likely events fluctuate in a similar way to the univariate results, and the combination values of flood peak and volume under the same OR-joint and AND-joint exceedance probabilities both display a decreasing trend. Meanwhile, the combination values of the nonstationary model intersect with the traditional stationary results between 1970 and 1980. That is, before 1970, the most likely combinations obtained with time-varying distribution parameters were larger than those obtained with fixed (stationary) parameters, whereas the opposite held after 1980.
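The OR-joint and AND-joint exceedance probabilities can be computed directly from the Gumbel-Hougaard copula, as the following minimal sketch shows. The copula parameter and the marginal probabilities used here are hypothetical values for illustration, not estimates from this study.

```python
import math


def gumbel_hougaard(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for 0 < u, v < 1 and theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    if not (0.0 < u < 1.0 and 0.0 < v < 1.0):
        raise ValueError("u and v must lie strictly between 0 and 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def or_exceedance(u, v, theta):
    """P(Q > q or W > w), where u = F_Q(q) and v = F_W(w)."""
    return 1.0 - gumbel_hougaard(u, v, theta)


def and_exceedance(u, v, theta):
    """P(Q > q and W > w), where u = F_Q(q) and v = F_W(w)."""
    return 1.0 - u - v + gumbel_hougaard(u, v, theta)


# Hypothetical inputs: both variables at their 99% marginal quantile,
# with an assumed dependence parameter theta = 3.
U, V, THETA = 0.99, 0.99, 3.0
p_or = or_exceedance(U, V, THETA)
p_and = and_exceedance(U, V, THETA)
```

With theta = 1 the copula reduces to independence, C(u, v) = uv. Larger theta means stronger upper-tail dependence, which raises the AND-joint probability and lowers the OR-joint probability for the same marginal quantiles. In a nonstationary setting, `u` and `v` would be evaluated from the time-varying marginal distributions for each year.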