Regional Flood Frequency Analysis Using An Artificial Neural Network Model

This paper presents the results of a study on the application of an artificial neural network (ANN) model to regional flood frequency analysis (RFFA). The study was conducted using streamflow data from 88 gauging stations across New South Wales (NSW), Australia. Five different models consisting of three to eight predictor variables (drawn from annual rainfall, drainage area, fraction forested area, potential evapotranspiration, rainfall intensity, river slope, shape factor and stream density) were tested. The results show that using a higher number of predictor variables does not always improve the performance of ANN-based RFFA models. For example, the model with three predictor variables performs considerably better than the models using a higher number of predictor variables, except for the one that contains all eight predictor variables. The model with three predictor variables exhibits smaller median relative error values for the 2- and 20-year return periods than the model containing eight predictor variables. However, for the 5-, 10-, 50- and 100-year return periods, the model with eight predictor variables shows smaller median relative error values. The proposed ANN modelling framework can be adapted to other regions in Australia and abroad.


Introduction
Globally, floods are the most damaging natural disasters, causing enormous economic loss and social disruption. In the last decade, floods accounted for roughly 45% of all disasters (and people affected by them) and caused an average of 6000 casualties each year [1]. Floods cause billions of dollars of damage annually worldwide, and even in Australia, the world's driest inhabited continent, flooding is the costliest natural disaster [1][2][3]. In recent years, floods have become more frequent and more disastrous due to global climate change [1].
One of the key steps in the flood risk assessment process is the estimation of design floods [2]. A design flood is the peak discharge used to design hydraulic structures (e.g., bridges, culverts, retaining walls), and the magnitude of the flood is represented by the annual exceedance probability (AEP) [4]. At-site flood frequency analysis is the commonly accepted method for estimating design floods in a gauged catchment if observed streamflow data are available for a number of years. However, there are numerous catchments where observed streamflow data are not available. Regional flood frequency analysis (RFFA) is considered the best option for estimating design floods in these catchments [5][6][7].
The efficiency of an RFFA technique primarily depends on two factors: (i) the sufficiency and accuracy of historical streamflow data in terms of record length and spatial coverage over the study area; and (ii) the adopted regionalization method, which explores flood characteristics in gauged catchments and transfers the relevant flood attributes to ungauged catchments. Some of the most successful RFFA techniques include (i) the index flood estimation method [8], (ii) the quantile regression method [9] and (iii) the parameter regression method [10,11]. All of these techniques are essentially linear, in that floods are linearly related to catchment characteristics either in the log domain or in the domain of the raw data [12,13].
While efforts have been made to develop non-linear RFFA methods to estimate design floods, the application of such methods is quite limited. In the past few years, some non-linear methods, such as artificial neural networks (ANN), gene expression programming (GEP) and fuzzy models, were developed and their efficiency was evaluated [12,13]. Aziz et al. [13] investigated the performances of ANN and GEP methods using streamflow records from 452 gauging stations across Australia. They compared the results with the quantile regression technique, which is a linear RFFA method, and concluded that the non-linear methods produced better results than the linear method.
The working principle of an ANN is similar to that of the human neural system [14]. Unlike other data processing methods, which learn through programming, an ANN model investigates the patterns in a data set and correlates them. The ANN consists of simple computing units called 'artificial neurons'. Each unit is connected to the other units via weight connectors. Each unit calculates the sum of all its weighted inputs and a bias, and then passes this sum through an activation function to produce its output.
In the last few decades, the ANN model (introduced by McCulloch and Pitts [15]) has been extensively used to solve various mathematical problems, especially in the field of medical science [14,16,17]. In recent years, the method has been used in engineering fields for forecasting and data compression. Lapedes and Farber [18] applied the ANN model to investigate non-linear data series and found that ANN models had better generalization capabilities than a regression-based model. ANN models are also more capable of identifying non-linear connections between observed and predicted data sets [19,20].
In the Australian context, studies on ANN-based RFFA modelling are limited. The majority of previous RFFA studies have been based on linear models [5,10]. Aziz et al. [21,22] applied ANN-based RFFA modelling; however, they applied a limited set of catchment characteristics in model building. The objectives of this study are: (i) to develop and test ANN-based RFFA models using a higher number of catchment attributes; and (ii) to recommend the best ANN-based RFFA model containing an optimum number of predictor variables for the catchments in New South Wales.

Principle of ANN
The concept of an ANN model is schematically represented in Figure 1. The first column represents the input variables (X) and the second column represents the weights assigned to those input variables (W). The output is determined for a given input vector. A constant input of 1 is also supplied to the neuron with its own weight, known as the bias (b). The bias allows the ANN to shift the activation function. It is important to note that bias is not essential for a network, but it helps to improve the performance of a network significantly [23]. Mathematically, the input (I) and output (Y) of a neuron in an ANN model can be represented as [23]:
$$I = \theta = \sum_{i} W_i X_i + b, \qquad Y = f(\theta) \tag{1}$$
where X is the input, W is the weight, b is the bias and θ is the sum of all weighted inputs. The output of a neuron can be presented by different activation functions. A sigmoid function (Equation (2)) is a commonly used activation function for such outputs [24]:
$$f(\theta) = \frac{1}{1 + e^{-\theta}} \tag{2}$$
This sigmoid function maps the data to values between 0 and 1 (Figure 2). A sigmoid function is adopted in this study because it is a bounded differentiable function. It is defined for all real inputs and at each point it produces a non-negative derivative [21,22]. Other activation functions could have been adopted; however, the use of the sigmoid function is deemed adequate in this study. The dependent and predictor variables in this study are normalized to achieve zero mean and unit variance for each of the variables.
In recent years, the application of ANN models has increased in rainfall-runoff modelling, groundwater modelling, streamflow forecasting and water quality modelling (e.g., [25][26][27][28][29][30]). Table 1 presents a list of relevant literature that used ANN successfully in solving hydrological and related problems. It can be seen that the majority of studies used the feed forward (FF) algorithm in the ANN architecture, back propagation (BP) for optimization, and log normal as a transfer function (Table 1).
In FF neural networks, information travels only forward through the network: from the input nodes, through the hidden layers and finally to the output nodes [23]. Optimization is used to fine-tune the weight factors in ANN modelling based on the errors in the previous iteration [23].
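The neuron model and the forward flow of information described in this section can be illustrated with a minimal Python sketch. It is not the configuration used in the study: the weights, biases and input values are hypothetical and untrained, every neuron uses the logistic sigmoid for simplicity, and the helper names (`standardize`, `neuron`, `feed_forward`) are introduced here only for illustration. The use of the population standard deviation in the normalization is also an assumption.

```python
import math
import statistics


def standardize(values):
    """Rescale a list of numbers to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumed)
    return [(v - mean) / std for v in values]


def sigmoid(theta):
    """Logistic sigmoid: maps a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-theta))


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the sigmoid."""
    theta = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(theta)


def feed_forward(inputs, layers):
    """Propagate inputs forward through a list of layers.

    Each layer is a list of (weights, bias) pairs, one pair per neuron.
    The outputs of one layer become the inputs of the next; nothing
    flows backwards.
    """
    activations = inputs
    for layer in layers:
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations


# Hypothetical, untrained weights: 3 inputs -> 2 hidden neurons -> 1 output.
hidden_layer = [([0.5, -0.2, 0.1], 0.0), ([-0.3, 0.8, 0.4], 0.1)]
output_layer = [([1.0, -1.0], 0.0)]

# Hypothetical standardized predictors (e.g. AREA, I62, SF) for one catchment.
x = [0.4, -1.2, 0.7]
y = feed_forward(x, [hidden_layer, output_layer])
```

In practice the weights would be adjusted iteratively by the optimization step, using the prediction errors from the previous iteration; that training loop is omitted here.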

Materials and Methods
Data from 88 stream gauging stations across New South Wales (NSW) were used to develop the ANN model (Figure 4). The selected gauges were situated on natural streams and free from any major regulation. The drainage areas of the selected gauging stations ranged from 8 to 1010 km². The first, second (median) and third quartile drainage areas were 142.5, 260 and 537.25 km², respectively. The streamflow data of the selected stream gauging stations were downloaded from the WaterNSW website [85]. The periods of annual maximum (AM) flow records at these stream gauging stations varied from 25 to 82 years. This study was conducted based on eight predictor variables: (i) mean annual rainfall (MAR); (ii) areal potential evapo-transpiration (MAE); (iii) drainage area (AREA); (iv) 6-hour duration rainfall intensity for a 2-year return period (I62); (v) shape factor (SF); (vi) stream density (SDEN); (vii) river slope (S1085); and (viii) proportion of forest (FOREST).
A 1:100,000 topographic map was used to delineate the catchment boundary and drainage area for each gauge. Rainfall intensity data (I62) were obtained from the online portal of the Australian Bureau of Meteorology (BOM) using the intensity-frequency-duration (IFD) estimation tool [86]. The MAR and MAE data at the catchment centroid were extracted from the BOM website [86]. The shape factor (SF) of a catchment was taken as the shortest distance between the catchment outlet and centroid divided by the square root of the drainage area. The SDEN was calculated as the ratio of the stream length (total length of all streams in a catchment) to the drainage area. Stream length was measured on a 1:100,000 topographic map using a digital distance meter. The forested area of a catchment was measured on a 1:100,000 topographic map using a planimeter. The minimum, maximum and median parameter values are presented in Table 2.
Five different ANN-based RFFA models were selected to assess their performances (Table 3). Model 1 consisted of all eight predictor variables. The other models consisted of more than one but fewer than eight variables. It is important to note that the models were selected based on the best potential combinations, but there could be other combinations as well. As recommended in previous RFFA studies, rainfall intensity was included in all five models and drainage area was also selected as a predictor variable [70,87].
Six flood quantiles, with return periods of 2 (Q2), 5 (Q5), 10 (Q10), 20 (Q20), 50 (Q50) and 100 (Q100) years, were used as dependent variables. Design flood discharges were estimated by fitting the Log Pearson type III (LP3) probability distribution function to the observed annual maximum flood discharges [88,89]. The analyses were conducted using the FLIKE software, which is commonly used for flood frequency modelling in Australia [90]. The main advantage of FLIKE is that it fits five probability distribution functions to a given set of flood data and identifies the best frequency model for that data set [91]. In this study, LP3 was selected because it was found to be the best-fit distribution for Australian stream gauge data [10].
Table 3. Five different ANN-based regional flood frequency analysis (RFFA) models, with the adopted catchment characteristics (predictor variables) represented by green colour.
Columns: Predictor Variable, Model 1, Model 2, Model 3, Model 4, Model 5.
Rows: Area, I62, MAR, SF, MAE, SDEN, S1085, Forest.
A 2-layer feed forward neural network with a backpropagation algorithm was selected in this study.
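The shape factor, stream density and design flood quantile calculations described earlier in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study. The study fitted the LP3 distribution with FLIKE, whereas the sketch swaps in a simple method-of-moments fit on log10 flows with the Wilson-Hilferty approximation for the frequency factor. The function names, units (km and km²) and input values are hypothetical.

```python
import math
import statistics


def shape_factor(outlet_to_centroid_km, area_km2):
    """Shortest outlet-to-centroid distance divided by sqrt(drainage area)."""
    return outlet_to_centroid_km / math.sqrt(area_km2)


def stream_density(total_stream_length_km, area_km2):
    """Total stream length divided by drainage area (units assumed: km/km^2)."""
    return total_stream_length_km / area_km2


def lp3_quantile(annual_maxima, return_period_years):
    """Estimate a flood quantile from an LP3 fit by the method of moments.

    Assumption: moments of log10(flow) and the Wilson-Hilferty
    approximation of the frequency factor. This is NOT the fitting
    procedure implemented in FLIKE. Needs at least three values.
    """
    logs = [math.log10(q) for q in annual_maxima]
    n = len(logs)
    mean = statistics.fmean(logs)
    std = statistics.stdev(logs)  # sample standard deviation
    # Bias-corrected sample skewness of the log flows.
    skew = n * sum((v - mean) ** 3 for v in logs) / ((n - 1) * (n - 2) * std ** 3)

    # For an annual maximum series, AEP = 1 / return period.
    aep = 1.0 / return_period_years
    z = statistics.NormalDist().inv_cdf(1.0 - aep)

    if abs(skew) < 1e-9:
        k = z  # zero skew: LP3 reduces to the log-normal case
    else:
        k = (2.0 / skew) * ((1.0 + skew * z / 6.0 - skew ** 2 / 36.0) ** 3 - 1.0)
    return 10.0 ** (mean + k * std)


# Hypothetical annual maximum flows (m^3/s) for one gauging station.
flows = [120.0, 85.0, 310.0, 45.0, 200.0, 150.0, 95.0, 60.0, 410.0, 175.0]
quantiles = {t: lp3_quantile(flows, t) for t in (2, 5, 10, 20, 50, 100)}
```

The resulting `quantiles` dictionary plays the role of the six dependent variables (Q2 to Q100) for a single station. The real records of 25 to 82 years would give more stable skewness estimates than this ten-value example.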
The selected 88 stream gauging stations were sub-divided into three groups. Group 1 consisted of 62 stations (70%), which were used for model training. Group 2 consisted of 13 stations (15%), which were used for validation. Group 3 consisted of the remaining 13 stations (15%), which were used to evaluate model performance in prediction. It is important to note that the gauges in each group were selected randomly. The analyses were conducted using a two-layer FF neural network with two hidden layers in each model setup. The Levenberg-Marquardt algorithm was used for model training. The activation function was represented in the model using a hyperbolic sigmoid function. All analyses were carried out using MATLAB software.
A set of statistical measures was used to evaluate model performance. These include root mean squared error (RMSE), root mean squared normalised error (RMSNE), relative root mean squared error (RRMSE), coefficient of determination (R²), mean bias (BIAS), relative mean bias in percent (rBIAS), absolute relative error (abs-RE) and quantile ratio (r, the ratio of predicted to observed discharge), as presented in Equations (3)-(10). These evaluation statistics were adopted from Blöschl et al. [92]. The equations are expressed in terms of the predicted and observed flood discharges, the mean observed discharge and the number of stations (n).
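The random station split and the evaluation statistics listed above can be sketched in Python as follows. The formulas are commonly used textbook definitions and are an assumption here: the exact forms of Equations (3)-(10) should be taken from Blöschl et al. [92]. Reporting abs-RE and r as medians across stations is also an assumption, and the function names and the seed are hypothetical.

```python
import math
import random
import statistics


def split_stations(station_ids, n_train=62, n_valid=13, seed=0):
    """Randomly split stations into training, validation and test groups."""
    ids = list(station_ids)
    random.Random(seed).shuffle(ids)  # fixed seed only for repeatability
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:]
    return train, valid, test


def evaluate(predicted, observed):
    """Compute error statistics using assumed standard definitions."""
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    rel_errors = [(p - o) / o for p, o in zip(predicted, observed)]
    mean_obs = statistics.fmean(observed)

    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "RMSE": rmse,
        "RMSNE": math.sqrt(sum(r ** 2 for r in rel_errors) / n),
        "RRMSE": rmse / mean_obs,
        "R2": 1.0 - ss_res / ss_tot,
        "BIAS": sum(errors) / n,
        "rBIAS": 100.0 * sum(rel_errors) / n,
        # Median across stations (assumed summary statistic).
        "abs-RE": statistics.median(abs(r) * 100.0 for r in rel_errors),
        "r": statistics.median(p / o for p, o in zip(predicted, observed)),
    }


train, valid, test = split_stations(range(88))

# Hypothetical predicted and observed discharges (m^3/s) at three test stations.
stats = evaluate(predicted=[12.0, 18.0, 33.0], observed=[10.0, 20.0, 30.0])
```

With 88 stations, the default arguments reproduce the 62/13/13 grouping. A quantile ratio `r` close to 1 and small values of the error statistics indicate better agreement between predicted and observed discharge.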

Results and Discussion
Model performances in predicting flood quantiles are presented in Table 4 in terms of abs-RE and R². The results show that all the models perform well for the majority of quantiles, except for Q2 by Model 1 (61%), Q5 by Model 4 (80%), and Q2 (320%) and Q5 (85%) by Model 5. Model 2 (two predictor variables) performed reasonably well, with lowest and highest RE values of 32.9% and 47.7%, respectively. Model 3 performed very well for all quantiles (RE values in the range of 29.48%-52.24%). Model 4 performed well for most quantiles, except Q5. Model 5 predictions are relatively poor for all quantiles, with the worst prediction for Q2 (RE of 319.61%). Overall, Model 5 produced the poorest results among the five models.
Coefficient of determination (R²) values for Q5, Q10, Q20 and Q50 show moderate accuracy for Model 1 (R² values ranging from 0.71 to 0.76). Model 2 performed well except for Q100 (R² value of 0.16). Model 3 performed well for Q2, Q5, Q10 and Q20. Overall, Models 1 and 3 produced better results than the other models. Model 4 performed the best for Q2 (R² = 0.74) and the worst for Q20. As expected, R² values for smaller ARIs (average recurrence intervals) are higher than those for larger ARIs. Model 5 performed the worst in terms of R², which is consistent with its RE values.
Model performances in terms of RMSNE and RRMSE are presented in Table 5 for all five models. In regard to RMSNE, the models performed well for most quantiles, except Q2 in the case of Model 1, Q20 for Model 4 and Q2, Q5 and Q100 for Model 5. The RRMSE value for Model 1 is highest for Q2, which indicates low accuracy of the prediction. Model 2 predictions are relatively better, with RMSNE values between 0.89 (Q2) and 2.47 (Q100). In addition, the best value of RRMSE is found for Q5 (0.48) and the poorest value is found for Q100 (0.85). Model 3 performed well for Q10 with a value of 0.75; however, for Q100, the RMSNE is poor with a value of 3.5. The RRMSE values for Q2, Q10 and Q20 are approximately the same (0.48). Furthermore, the values of RRMSE are the same (0.69) for Q50 and Q100. The highest error is associated with Q5, with a value of 0.82. Model 4 performed well for Q2 and Q10 with values of 0.53 and 0.56, respectively, while the best performance was for Q50. Like other indicators, Model 5 performed the worst for RMSNE and RRMSE.
Table 6 presents the comparison between the five models in terms of model predictions for different floods. There is no clear trend of underprediction or overprediction, nor any trend for small or large floods. All models overpredicted for some floods and underpredicted for others. Overall, Model 3 produced the best result (relative bias of 33%) and Model 4 produced the worst result (relative bias of 1258%).
Figure 5 shows a graphical comparison of model performance in predicting flood discharge for flood events of different magnitudes. In general, Models 2 and 4 overpredicted, while the other models underpredicted flood magnitude. Overall, the ratios of predicted to observed discharge for Model 2 are close to 1, which indicates better performance. Model 5 performed the worst, with a predicted discharge for Q2 that is more than double the observed value.
The parameter setup for Model 3 is similar to the current Australian Rainfall and Runoff (ARR) model for NSW, which consists of AREA, I62 and SF as predictor variables. While Model 1 (which includes all eight predictor variables) performs well with respect to RE for all quantiles except Q2, Model 3's performance is the best. It is important to note that none of the models is the best for all six quantiles. For example, with respect to RE, Model 3 is the best for predicting Q2 and Q20 but Model 1 is better for predicting Q5, Q10, Q50 and Q100. Model 4 performs poorly with respect to BIAS. Model 1 is the best based on R² except for Q2 and Q100.
Table 7 compares the performance of our two best models (Model 1 and Model 3) with similar RFFA studies based on ANN. It shows that both our models provide smaller median RE than Aziz et al. [22] for higher ARIs (10-100 years). Also, in terms of R², our models perform much better than those of Dawson et al. [58]. The best model (Model 3) contains three predictor variables: AREA (the main scaling factor in the flood generation process), I62 (the main input that triggers runoff) and SF (the factor that affects the travel time of generated runoff). It has been found that the other five predictor variables (MAR, MAE, SDEN, S1085 and FOREST) have minor roles in RFFA modelling in the study area.

Conclusions
In this study, ANN-based regional flood frequency models were developed to estimate design floods. The models were tested using observed discharge data from 88 catchments in NSW. Five models were tested considering two to eight predictor variables. The best combinations of predictor variables were identified based on model performance in predicting design floods with return periods of 2, 5, 10, 20, 50 and 100 years. Model performances were evaluated based on eight statistical error metrics. The study found that model predictions are better when all eight predictor variables are included. However, the results based on three predictor variables were found to be close to the results for eight predictor variables. The study concludes that a high number of predictor variables does not always improve the model predictions. The use of MAR, MAE, SDEN, S1085 and FOREST as predictor variables contributes little to predicting design floods for an ungauged catchment. The key predictor variables are catchment area, rainfall intensity and shape factor. The findings are very useful because these predictor variables are readily available for the majority of catchments. The results demonstrate the potential use of the ANN-based regional flood frequency model. However, further testing with larger data sets is necessary before it can be applied elsewhere.