2.1. Study Sites and Data
The seven sites, namely Seomjingang-dam, Soyangang-dam, Andong-dam, Yongdam-dam, Imha-dam, Chungju-dam, and Hwoingseong-dam, located in various parts of South Korea, were chosen for this study (Figure 1) because they have extensive flood data that are unlikely to be influenced by upstream anthropogenic activities. The dams are located on the major rivers of Korea: three within the Han River basin, two within the Nakdong River basin, one in the Geum River basin, and the remainder in the Seomjingang River basin. The dams were built to serve multiple purposes, such as flood control, water supply, and hydropower generation. Each reservoir at the studied dam sites sets a restricted water level (RWL) for flood control, and any water volume stored above the RWL must be released during the flood season, from 21 June to 20 September.
The sites are all characterized by a monsoon climate, with two-thirds of the annual rainfall received in summer. The flood event data sets were collected from the Water Resources Information System (WAMIS), available at http://www.wamis.go.kr/.
Table 1 provides attributes of the dam sites, including the site number, site name, drainage area, altitude, mean annual rainfall, mean annual runoff, record period, and number of floods used for the analysis. The drainage area varies from 209.0 km² to 6648.0 km², altitudes range from 147.5 to 268.5 m a.s.l., and the number of events used for analysis ranges from 44 to 51.
For each site, a set of flood events was selected from the observed hourly flows based on the peak-over-threshold concept. The threshold was set to yield, on average, about two flood events per year at each site. Each flood peak must exceed the threshold and be separated from the preceding and succeeding peaks by at least 5 days.
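The selection rule above (threshold exceedance plus a minimum 5-day separation between peaks) can be sketched as follows. The function name, the example series, and the threshold value are illustrative assumptions, not values from the study:

```python
import numpy as np

def select_flood_events(hourly_flow, threshold, min_separation_hours=5 * 24):
    """Peak-over-threshold event selection: keep peaks above the threshold
    that are at least `min_separation_hours` apart, preferring the larger
    peak whenever two candidates are too close together."""
    q = np.asarray(hourly_flow, dtype=float)
    # Scan candidate peaks in descending order of magnitude.
    order = np.argsort(q)[::-1]
    peaks = []
    for i in order:
        if q[i] < threshold:
            break  # all remaining values are below the threshold
        if all(abs(i - j) >= min_separation_hours for j in peaks):
            peaks.append(i)
    return sorted(peaks)
```

In practice the threshold would be tuned per site until roughly two events per year are retained, as described above.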
2.2. Empirical Methods for Estimating Instantaneous Peak Flow
Fuller [7] derived the relationship in Equation (1) in Table 2, using data from 24 river basins in the USA, to estimate the instantaneous peak flow (IPF, m³/s) from the maximum mean daily flow (MMDF, m³/s) and the drainage basin area A (km²). The regression coefficients can be modified for other regions by regressing the ratio of the IPF to the MMDF against the drainage area [6].
Sangal [13] proposed an empirical formula, shown in Equation (2), in which three consecutive days of mean daily flow (MDF) data are used, namely the MMDF of the peak day and the MDFs of the previous and the next day, assuming a triangular hydrograph.
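Sangal's relationship is commonly written as Qp = (4Q2 − Q1 − Q3)/2, where Q1, Q2, and Q3 are the MDFs of the previous, peak, and next day; the sketch below assumes this standard form (Equation (2) is not reproduced in this excerpt):

```python
def sangal_ipf(q_prev, q_peak, q_next):
    """Sangal's triangular-hydrograph estimate of the instantaneous peak
    flow (m^3/s) from three consecutive mean daily flows. Averaging a
    triangular hydrograph over the peak day gives
    Q2 = (Q1 + 2*Qp + Q3) / 4, which rearranges to the formula below."""
    return (4.0 * q_peak - q_prev - q_next) / 2.0
```

For example, a symmetric event with Q1 = Q3 = 100 m³/s and Q2 = 200 m³/s yields an estimated peak of 300 m³/s, consistent with the triangular assumption.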
Fill and Steiner [14] developed a formula, shown in Equation (3), similar to Sangal's, to estimate the instantaneous peak flow using the MMDF and the MDFs of the adjacent days. They additionally applied a correction factor k to obtain a better estimate of the IPF. The correction factor was obtained from a linear regression between the hydrograph shape factor and k for 50 cases in Fill and Steiner's study. As indicated by Chen et al. [6], Fill and Steiner's formula allows for a non-linear relationship between the IPF and the MDFs because k is itself a function of the MDFs.
Chen et al. [6] recently proposed a slope-based method for estimating the IPF that uses a sequence of MDFs and their rising and falling slopes to describe the shape of the MDF hydrograph. In the slope-based method, the IPF is expressed as in Equation (4) under the assumption that the intersection point of the extended rising and falling limbs of the MDF hydrograph is the associated IPF.
2.3. Steepness Index Unit Volume Flood Hydrograph Approach for Disaggregating Daily Flows
Tan et al. [15] developed a flow disaggregation method, termed the steepness index unit volume flood hydrograph (SIUVFH) approach, and applied it to the Latrobe River, Australia, to generate hourly flood hydrographs from daily flows modelled by SIMHYD, a simple lumped conceptual rainfall-runoff model, with satisfactory results. In the SIUVFH method, daily flood flows are disaggregated into sub-daily flows under the prerequisite of a strong relationship between the standardized instantaneous flood peak (Equation (5)) and the standardized daily flood hydrograph rising-limb steepness index (Equation (6)).
where the standardized instantaneous flood peak in Equation (5) is the instantaneous flood peak divided by the (n + 1)-day flood volume, accumulated from the preceding n days up to the peak day, as shown in Figure 2.
where the standardized daily flood hydrograph rising-limb steepness index in Equation (6) is computed from the daily flood peak and the daily flow on the preceding n days, standardized in the same way, as shown in Figure 2.
To use the SIUVFH method, independent flood hydrographs are first selected, and the hourly flows in each of them are standardized by dividing by the flood volume. The standardized hourly hydrograph is termed a unit volume flood hydrograph (UVFH). Subsequently, the standardized instantaneous flood peak and the standardized daily flood hydrograph rising-limb steepness index are calculated using Equations (5) and (6) for each of the independent reference flood hydrographs. By gathering these pairs of indices for all independent flood hydrographs, the relationship between the two indices is plotted. To disaggregate a daily flood event to an hourly time scale, the (n + 1)-day flood volume and the standardized daily flood hydrograph rising-limb steepness index of that event are calculated; these two quantities are accented by hats to distinguish the daily flood hydrograph to be disaggregated from the reference flood hydrographs. Finally, the hourly UVFH whose steepness index is closest to that of the daily event is scaled by multiplying it by the (n + 1)-day flood volume to obtain the hourly flood hydrograph up to the flood peak.
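The matching-and-scaling step just described can be sketched as a nearest-neighbour search on the steepness index. The function below follows Equations (5) and (6) in spirit only; the variable names and the exact standardization (division by the flood volume) are assumptions of this sketch:

```python
import numpy as np

def siuvfh_disaggregate(daily_peak, daily_prev, volume, reference_events):
    """Pick the reference unit-volume flood hydrograph (UVFH) whose
    standardized rising-limb steepness index is closest to that of the
    daily event, then rescale it by the event's flood volume.

    reference_events: list of (steepness_index, uvfh) pairs, where each
    uvfh is an hourly hydrograph already standardized by its own volume.
    """
    # Standardized steepness index of the daily event to disaggregate.
    s_hat = (daily_peak - daily_prev) / volume
    # Nearest-neighbour match on the steepness index.
    idx = int(np.argmin([abs(s - s_hat) for s, _ in reference_events]))
    _, uvfh = reference_events[idx]
    # Rescale the unit-volume hydrograph back to the event's volume.
    return volume * np.asarray(uvfh, dtype=float)
```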
2.4. Development of Artificial Neural Network (ANN)-Based Instantaneous Peak Flow (IPF) Estimation Method
An artificial neural network (ANN) is a type of machine-learning method that can adequately simulate non-linear complex systems without requiring a physical understanding of the underlying mathematical relationship between the input and output, and ANNs have been widely used in hydrological fields [23,24,25,26]. The development of an ANN consists of partitioning the data, identifying important input variables, optimizing the network architecture and its parameters, and evaluating the model performance.
Many studies have used cross-validation (CV) techniques, such as holdout CV and k-fold CV, in which the learning data are divided into two subsets: training and test datasets. More recently, the learning data are often divided into three subsets (training, validation, and test), where the validation set is used for early stopping. In the present study, the data were divided into three parts: a training dataset to calibrate the weights of the network, a validation dataset for early stopping of the training process to avoid overfitting, and a test dataset for evaluating the performance of the optimized network.
Inappropriate division of the data into the training, validation, and test datasets will affect the selection of the optimal ANN structure and its prediction performance. To reduce the high variance of the outputs, Monte-Carlo cross-validation (MCCV) [27], also known as repeated random subsampling or random splitting, was used to compose the sub-datasets. MCCV repeatedly performs cross-validation on randomly split datasets without replacement to obtain an ensemble of outputs and evaluate prediction errors. This approach is known to significantly reduce the variance of the model output [28]. Xu et al. [29], Barrow and Crone [30], and Lee et al. [31] successfully applied the MCCV method to hydrological prediction problems. However, MCCV has not previously been applied to flood peak estimation using ANN models with mean daily flows as input. Our study uses MCCV to generate an ensemble of flood peaks and to account for the uncertainty arising from data subsampling and initial weight assignment.
We evaluated the model performance by increasing the training portion from 50% to 80% of the total data size in increments of 10% (the remainder was divided equally between validation and test sets). The results indicated that the different proportions had an insignificant impact on the training and validation stages but a considerable impact on the testing stage. Focusing on the general performance on the test dataset, a split of 60% for training, 20% for validation, and 20% for testing was the optimal choice in this study. As indicated by Baxter et al. [32] and May et al. [33], such splitting ratios are known to be effective for ANN models. Therefore, the dataset available at each site was randomly split 100 times into three subsets of 60%, 20%, and 20% for training, validation, and testing, respectively. With the additional random assignment of initial weights between nodes, we performed the MCCV a total of 10,000 times, that is, 100 × 100 runs combining 100 data splits with 100 sets of initial connection weights for any one network structure.
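The repeated 60/20/20 random splitting underlying the MCCV scheme can be sketched as follows; the seed and function name are illustrative assumptions:

```python
import numpy as np

def mccv_splits(n_samples, n_repeats=100, train=0.6, valid=0.2, seed=0):
    """Monte-Carlo cross-validation: repeated random 60/20/20 partitions
    of the sample indices, drawn without replacement within each repeat."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train * n_samples))
    n_valid = int(round(valid * n_samples))
    splits = []
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)  # a fresh random ordering
        splits.append((perm[:n_train],                      # training
                       perm[n_train:n_train + n_valid],     # validation
                       perm[n_train + n_valid:]))           # testing
    return splits
```

In the study, each of the 100 splits would further be combined with 100 random initial weight sets, giving the 10,000 runs described above.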
Identifying the significant input variables that influence the output is important in developing an ANN model. Two types of ANN models were constructed for the IPF estimation. The first (ANN-1) used three consecutive days of mean daily flow, including the peak daily flow, as input variables, while the second (ANN-2) additionally used three consecutive days of areal rainfall amounts. The number of input variables was therefore 3 for ANN-1 and 6 for ANN-2, allowing us to investigate whether adding three consecutive days of rainfall data improves the simulation accuracy. The input and output data were normalized to between −1.0 and 1.0 to match the range of the activation function used in the hidden layer.
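The normalization to [−1.0, 1.0] mentioned above is a standard linear min-max rescaling; a minimal sketch:

```python
import numpy as np

def scale_to_unit_interval(x, x_min=None, x_max=None):
    """Linearly rescale values to [-1, 1], matching the output range of
    the tanh activation used in the hidden layer. The training-set min
    and max should be passed in and reused for validation and test data."""
    x = np.asarray(x, dtype=float)
    x_min = x.min() if x_min is None else x_min
    x_max = x.max() if x_max is None else x_max
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0
```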
In this study, a multi-layer feed-forward neural network with three layers (input, hidden, and output) was used as the IPF simulation model. The number of hidden layers was set to 1 for simplicity of the network, and different numbers of hidden nodes were tested to find an optimal network architecture. The choice of the number of hidden nodes, and hence the number of connection weights, is important because too many weights can result in overtraining, while too few can undertrain the network. Empirical relationships between the number of connection weights and the number of training samples have been suggested in the literature [34]. For example, Rogers and Dowla [35] suggested that the number of weights should not exceed the number of samples, Masters [36] suggested that the number of training samples should be two times the number of weights, and Weigend et al. [37] suggested 10 times. In our study, the training samples were limited, so we followed the empirical rule of Rogers and Dowla [35]. Accordingly, the number of weights should remain below about 30–40, so the desired number of hidden nodes was about 5–10 given 3–6 inputs. The desired number of hidden nodes can be further reduced by the addition of a bias term to the network. We initially set the maximum number of hidden nodes to 15, above the desired value, but finally chose the optimal number, which did not exceed 10, by trial and error.
To reduce the variance of the training results caused by the initial weights, 100 sets of initial weights between nodes were randomly assigned at the beginning of the training process. The backpropagation algorithm was used to update the weights of the network. A hyperbolic tangent sigmoid activation function was used in the hidden layer, and a linear activation function was used in the output layer.
The number of epochs for the training step depends strongly on the ANN structure and the complexity of the problem. In general, an ANN model improves with more training epochs, but too many epochs can cause overfitting, which reduces the generalization ability of the model on unseen data. However, because an early stopping method based on the validation set was used, the number of epochs was not a critical issue here. We trained the network for a sufficient number of epochs (5000) and terminated training when the validation error reached its minimum. The network at the epoch with the minimum validation root mean squared error (RMSE) was selected for the evaluation process.
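The network setup described above (tanh hidden layer, linear output, backpropagation, best-validation-RMSE snapshot) can be sketched with plain NumPy. This is a minimal illustrative implementation, not the study's code; the learning rate, seed, and toy defaults are assumptions:

```python
import numpy as np

def train_mlp(x_tr, y_tr, x_va, y_va, n_hidden=5, lr=0.01, max_epochs=5000):
    """One-hidden-layer feed-forward network (tanh hidden, linear output)
    trained by full-batch backpropagation, keeping the weights from the
    epoch with the lowest validation RMSE (early stopping by snapshot)."""
    rng = np.random.default_rng(0)
    n_in = x_tr.shape[1]
    w1 = rng.normal(0, 0.5, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    w2 = rng.normal(0, 0.5, (n_hidden, 1));    b2 = np.zeros(1)
    best = (np.inf, w1, b1, w2, b2)
    for _ in range(max_epochs):
        h = np.tanh(x_tr @ w1 + b1)      # hidden layer (tanh sigmoid)
        y_hat = h @ w2 + b2              # linear output layer
        err = y_hat - y_tr
        # Backpropagate the mean-squared-error gradient.
        g2 = h.T @ err / len(x_tr)
        gb2 = err.mean(axis=0)
        dh = (err @ w2.T) * (1.0 - h ** 2)   # tanh derivative
        g1 = x_tr.T @ dh / len(x_tr)
        gb1 = dh.mean(axis=0)
        w1 -= lr * g1; b1 -= lr * gb1; w2 -= lr * g2; b2 -= lr * gb2
        # Snapshot the weights whenever the validation RMSE improves.
        va_pred = np.tanh(x_va @ w1 + b1) @ w2 + b2
        va_rmse = np.sqrt(np.mean((va_pred - y_va) ** 2))
        if va_rmse < best[0]:
            best = (va_rmse, w1.copy(), b1.copy(), w2.copy(), b2.copy())
    return best  # (best validation RMSE, w1, b1, w2, b2)
```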
As the data splitting and initial weight assignment were repeated at random, different neural networks were created that gave different performances at the end of the training process. Therefore, we evaluated the network performance by averaging the errors in the output. This study adopted the relative root mean squared error (RRMSE), the coefficient of determination (R²), and the Nash–Sutcliffe efficiency (NSE) to evaluate the estimated results of the ANN models. The RRMSE, R², and NSE values were calculated using Equations (7)–(9), respectively.
where the variables in Equations (7)–(9) are the number of samples, the observed and predicted values, and the averages of the observed and predicted values, respectively.
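The three metrics can be sketched with their standard definitions. Since Equations (7)–(9) are not reproduced in this excerpt, the normalization of the RRMSE by the observed mean is an assumption; the R² and NSE forms below are the usual ones:

```python
import numpy as np

def rrmse(obs, pred):
    """Relative RMSE: RMSE normalized by the mean of the observations
    (one common definition; the paper's Equation (7) may differ)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean((obs - pred) ** 2)) / obs.mean()

def r_squared(obs, pred):
    """Coefficient of determination as the squared Pearson correlation
    between observed and predicted values."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    num = np.sum((obs - obs.mean()) * (pred - pred.mean())) ** 2
    den = np.sum((obs - obs.mean()) ** 2) * np.sum((pred - pred.mean()) ** 2)
    return num / den

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of residual variance
    to the variance of the observations (1 is a perfect fit)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
```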