# The Impact of Training Data Sequence on the Performance of Neuro-Fuzzy Rainfall-Runoff Models with Online Learning


## Abstract

^{2}), is located in Malaysia, with 10-min rainfall-runoff time series from which 30 major events are used. The second catchment, Dandenong (272 km^{2}), is located in Victoria, Australia, with daily rainfall and river stage (water level) data from which 11 years of data are used. DENFIS results were then compared with two groups of benchmark models: a regression-based data-driven model known as the Autoregressive Model with Exogenous Inputs (ARX) for both study sites, and the physical models Hydrologic Engineering Center–Hydrologic Modelling System (HEC–HMS) and Storm Water Management Model (SWMM) for the Sungai Kayu Ara and Dandenong catchments, respectively. DENFIS significantly outperformed the ARX model at both study sites. Moreover, DENFIS was found to be comparable, if not superior, to HEC–HMS and SWMM in the Sungai Kayu Ara and Dandenong catchments, respectively. A sensitivity analysis was then conducted on DENFIS to assess the impact of training data sequence on its performance. Results showed that starting the training with datasets that include high peaks can improve model performance. Moreover, datasets with more contrasting values that cover a wide range from low to high can also improve DENFIS performance.

## 1. Introduction

^{2}), Sungai Kayu Ara, and rainfall-water level modelling in a semi-urban temperate catchment (272 km^{2}), Dandenong. The results obtained from DENFIS are first compared against the Autoregressive Model with Exogenous Inputs (ARX), which serves as the data-driven benchmark model. Then, DENFIS results are compared with the physically-based benchmark models Hydrologic Engineering Center–Hydrologic Modelling System (HEC–HMS) [40] and Storm Water Management Model (SWMM) for the Sungai Kayu Ara and Dandenong catchments, respectively. This study was originally conducted on the Dandenong catchment with SWMM as the benchmark; the Sungai Kayu Ara catchment was added later to extend the study to an event-based R-R modelling problem. Since a calibrated HEC–HMS model with results was readily available from a previous study on this catchment, those results were adopted to save time. The two catchments therefore have different physically-based benchmark models. However, as these two models are not fundamentally different, no significant impact is expected on the validity of the comparisons presented in this article. For the second objective, the sequence of the training data was purposely manipulated for both catchments to evaluate the impact of training data sequence on DENFIS model performance.

## 2. Materials and Methods

#### 2.1. Study Sites

^{2} as shown in Figure 1. The main river of this basin originates from the reserved highland area of Penchala and Segambut. The Sungai Kayu Ara river basin is a tropical catchment, subject to the Northeast Monsoon from December to March and the Southwest Monsoon from June to September [41]. The annual mean rainfall of this catchment is approximately 2000 mm, as reported by Desa et al. [42]. The average daily temperature varies between 25 °C and 33 °C, while the mean monthly relative humidity falls within 70% to 90%, depending on location and season. The annual average evaporation rate is estimated at 4 to 5 mm/day. The majority of the catchment area has been flattened for development. Because tropical rains are normally short and intense, event-based R-R modelling with high-resolution data (5, 10, 15, 30, or 60 min R-R time series) is applicable. Therefore, this catchment was chosen to represent event-based R-R modelling in an urbanized tropical area, where high-resolution data (e.g., 10 min time steps) are required due to the nature of tropical R-R events. Moreover, this catchment has 10 rainfall stations, which makes the selection of inputs and model development quite challenging. Details of the rainfall and flow stations of the Sungai Kayu Ara catchment and the period over which the data are considered are provided in Appendix A. In this study, a total of 30 rainfall-runoff events were extracted from the 10-min rainfall-runoff time series between March 1996 and July 2004. Events were selected from the continuous time series according to three main criteria. Firstly, the selected event must have been recorded at a minimum of 6 of the 10 rainfall stations. Secondly, because the wetting front suction, one of the parameters of the Green and Ampt infiltration method, is influenced by the initial moisture content of the soil, the inter-arrival time between selected rainfall events was required to be greater than two days. Finally, to obtain the best results for direct runoff values, only rainfall events equal to or greater than 3.0 mm were considered as effective rainfall in this study.

^{2}. The primary creek in this catchment is Dandenong Creek, which originates from the Dandenong Ranges National Park and discharges into Port Phillip Bay via both Mordialloc Creek and Patterson River. Although farmlands and some forest pockets remain in the catchment, approximately 45% of the land has been urbanized. The data used in this study are the mean daily rainfall and river level readings for the stations Dandenong, Rowville and Heathmont. Eleven years of daily data from January 2005 to December 2015 are available. Rowville and Heathmont are the two upstream stations, with Heathmont having the highest elevation. This catchment was chosen to represent a larger catchment with multiple rainfall stations. Since this catchment is located in Australia, its rainfall regime was expected to differ significantly from that of the tropical catchment. Details of the rainfall and water level stations of the Dandenong catchment and the period over which the data are considered are provided in Appendix A.

#### 2.2. Dynamic Evolving Neural-Fuzzy Inference System (DENFIS)

_{thr}, which defines the maximum allowable distance between a new data point and the center of existing clusters. In other words, if the calculated distance for a new data point exceeds this threshold, that point will become a new cluster center. The aforementioned distance in the ECM algorithm follows the typical Euclidean distance between two vectors x and y as it is denoted in Equation (1):

_{thr}. The visual demonstration of such a mechanism can be seen in Figure 4 for a 2-D input space.

- Step 1. On receiving a new data point x_{i}, its distance to the centers of all n existing clusters (created previously) is calculated using ${D}_{ij}=\Vert {x}_{i}-{C}_{Cj}\Vert $ for $j=1,2,\dots ,n$, where j is the cluster index and C_{Cj} is the center of the jth cluster. If all examples of the data stream have been presented, the algorithm is complete.
- Step 2. The calculated distances D_{ij} are compared against the radii R_{j} of all existing clusters. If any cluster satisfies the condition ${D}_{ij}\le {R}_{j}$, then x_{i} belongs to the closest such cluster (denoted as C_{m}), with the minimum distance ${D}_{im}=\Vert {x}_{i}-{C}_{Cm}\Vert =\mathrm{min}(\Vert {x}_{i}-{C}_{Cj}\Vert )$ taken over the clusters with ${D}_{ij}\le {R}_{j}\left(j=1,2,\dots ,n\right)$. In this case, the new data point is adopted by an existing rule; therefore, no new cluster is created and no existing cluster is updated (the cases of x_{4} and x_{6} in Figure 4b,c), and the algorithm returns to Step 1. If D_{ij} > R_{j} for all clusters, the algorithm continues to the next step.
- Step 3. For all n existing cluster centers, the parameter S_{ij} is calculated for the input data x_{i} and clusters j = 1, 2, …, n, using S_{ij} = D_{ij} + R_{j}. The cluster that gives the minimum S_{ij} is denoted as cluster C_{a}, with center C_{Ca} and parameter S_{ia}. The algorithm then goes to the next step.
- Step 4. If S_{ia} > 2 × D_{thr}, the input data x_{i} does not belong to any existing cluster and a new cluster is created in the same way as in Step 0 (the cases of x_{3} and x_{8} in Figure 4); the algorithm then returns to Step 1. Otherwise (i.e., S_{ia} ≤ 2 × D_{thr}), the algorithm goes to the next step.
- Step 5. Since S_{ia} ≤ 2 × D_{thr}, the cluster C_{a} is updated by revising its center location and increasing its radius. The new radius is set as R_{a (new)} = S_{ia}/2, while the new center is located at the point on the line connecting x_{i} and C_{Ca} at a distance of R_{a (new)} from x_{i} (the cases of x_{2}, x_{5}, x_{7} and x_{9} in Figure 4). The algorithm then returns to Step 1.
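
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this study: the function name `ecm` and the argument `d_thr` are hypothetical, the plain (unnormalized) Euclidean distance is assumed, and Step 0, which is not listed above, is assumed to seed the first cluster at the first sample with zero radius, as in the ECM description of Kasabov and Song [32].

```python
import numpy as np

def ecm(samples, d_thr):
    """One-pass evolving clustering; returns (centers, radii)."""
    samples = np.asarray(samples, dtype=float)
    # Assumed Step 0: the first sample becomes the first cluster center (radius 0).
    centers = [samples[0].copy()]
    radii = [0.0]
    for x in samples[1:]:
        c = np.array(centers)
        r = np.array(radii)
        d = np.linalg.norm(c - x, axis=1)      # Step 1: distances D_ij
        if np.any(d <= r):                     # Step 2: x lies inside an existing cluster
            continue
        s = d + r                              # Step 3: S_ij = D_ij + R_j
        a = int(np.argmin(s))
        if s[a] > 2.0 * d_thr:                 # Step 4: create a new cluster at x
            centers.append(x.copy())
            radii.append(0.0)
        else:                                  # Step 5: enlarge and shift cluster a
            new_r = s[a] / 2.0
            direction = (c[a] - x) / d[a]      # unit vector from x towards the old center
            centers[a] = x + new_r * direction
            radii[a] = new_r
    return np.array(centers), np.array(radii)

if __name__ == "__main__":
    pts = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]
    print(ecm(pts, d_thr=0.1))
```

Because each sample is processed once and only the nearest cluster is touched, the resulting clusters depend on the order in which the samples arrive, which is the property examined later in the training-sequence analysis.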

x_{1}, x_{2}, …, x_{k} are the antecedent variables (inputs); and a_{0}, a_{1}, a_{2}, …, a_{k} are the linear function parameters to be optimized during the learning process using the training dataset. Further details on the learning mechanism in DENFIS can be found in Kasabov and Song [32].
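
To make the role of the parameters a_{0}, …, a_{k} concrete, the following sketch evaluates first-order Takagi–Sugeno rule consequents and combines them by a weighted average, which is the usual Takagi–Sugeno inference. It is a simplified illustration under stated assumptions: the function name `ts_output` is hypothetical, and the rule firing strengths (`weights`) are taken as given rather than computed from membership functions as DENFIS would do.

```python
import numpy as np

def ts_output(x, params, weights):
    """Weighted average of linear rule consequents.

    x       : input vector of length k
    params  : array of shape (n_rules, k + 1), each row [a_0, a_1, ..., a_k]
    weights : firing strength of each rule (assumed given, all non-negative)
    """
    x = np.asarray(x, dtype=float)
    params = np.asarray(params, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Each rule output is a_0 + a_1*x_1 + ... + a_k*x_k.
    rule_outputs = params[:, 0] + params[:, 1:] @ x
    return float(np.sum(weights * rule_outputs) / np.sum(weights))

if __name__ == "__main__":
    print(ts_output([1.0, 2.0], [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]], [1.0, 3.0]))
```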

#### 2.3. Benchmark Models

#### 2.3.1. Hydrologic Engineering Center–Hydrologic Modelling System (HEC–HMS)

#### 2.3.2. Storm Water Management Model (SWMM)

#### 2.3.3. Autoregressive Model with Exogenous Inputs (ARX)

n_{a} and n_{b} are the numbers of output and input antecedents, respectively; n_{k} is the delay associated with each input; e(t) is the true error term; and a_{i} and b_{j} are the model parameters to be optimized. This model is one of the most popular benchmarks for evaluating the performance of artificial intelligence (AI) techniques such as ANNs and NFS. Therefore, it is considered the baseline benchmark of this study. In other words, any proposed AI-based data-driven model that cannot outperform ARX is not worth adopting, because AI-based techniques are more complex than ARX.
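
As an illustration of how such a benchmark can be calibrated, the sketch below fits an ARX model by ordinary least squares. It assumes a single input series and the sign convention y(t) = Σ a_{i} y(t − i) + Σ b_{j} u(t − n_{k} − j + 1) + e(t); other texts move the output terms to the left-hand side, which flips the sign of a_{i}. The function name `fit_arx` is hypothetical and this is not the tool used in the study.

```python
import numpy as np

def fit_arx(y, u, na, nb, nk):
    """Least-squares ARX fit; returns (a, b) coefficient arrays."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(na, nk + nb - 1)               # first t with all antecedents available
    rows = []
    for t in range(start, len(y)):
        past_y = [y[t - i] for i in range(1, na + 1)]      # y(t-1) ... y(t-na)
        past_u = [u[t - nk - j] for j in range(nb)]        # u(t-nk) ... u(t-nk-nb+1)
        rows.append(past_y + past_u)
    phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.random(200)
    y = np.zeros(200)
    for t in range(2, 200):
        y[t] = 0.5 * y[t - 1] + 0.3 * u[t - 1] + 0.2 * u[t - 2]
    print(fit_arx(y, u, na=1, nb=2, nk=1))
```

With several rainfall stations, one block of lagged columns per station would be appended to each regression row in the same way.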

#### 2.4. Input Data Selection and Model Development

_{xy} is the covariance between variables x and y; and $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ are the variances of variables x and y, respectively. To select the optimal combination between MI and CC, Talei and Chua [43] proposed a ranking coefficient as:
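
For the correlation part of this input analysis, a short sketch is given below. It assumes the usual Pearson form, CC = σ_{xy} / √(σ_{x}^{2} σ_{y}^{2}), built from the covariance and variances defined above, and computes it between an output series and lagged copies of an input series. The function names are hypothetical; the MI term and the ranking coefficient of [43] are not reproduced here.

```python
import numpy as np

def correlation_coefficient(x, y):
    """Pearson correlation from the covariance and the two variances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return float(cov_xy / np.sqrt(x.var() * y.var()))

def lagged_cc(rain, flow, max_lag):
    """CC between rain(t - lag) and flow(t) for lag = 1 ... max_lag."""
    rain = np.asarray(rain, dtype=float)
    flow = np.asarray(flow, dtype=float)
    return {
        lag: correlation_coefficient(rain[: len(rain) - lag], flow[lag:])
        for lag in range(1, max_lag + 1)
    }

if __name__ == "__main__":
    rain = [0, 1, 0, 0, 2, 0, 0]
    flow = [0, 0, 1, 0, 0, 2, 0]   # synthetic: flow is rain delayed by one step
    print(lagged_cc(rain, flow, max_lag=2))
```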

#### 2.5. Performance Criteria

^{2}), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Relative Peak Error (RPE). The detailed formulation of these performance criteria is provided in Table 2.

^{2} are known as appropriate measures to assess the goodness-of-fit between observed and simulated values [52] and have been successfully used in several similar studies [53,54,55,56]. CE = 1 and R^{2} = 1 indicate a perfect match between observed and simulated values. Although CE and R^{2} seem to have similar functionality, CE has been found to be a more sensitive measure for extreme values, as it penalizes errors in extreme values more than R^{2} does [57]. RMSE, on the other hand, gives extra weight to outliers in the dataset; therefore, it is more biased towards errors in simulating high values [52,58,59]. MAE accounts for all deviations between observed and simulated values regardless of the magnitude of the data point; in other words, MAE gives no extra weight to errors on high or low values [57,60]. In addition to the overall goodness-of-fit, accurate prediction of peak flow is also important. Thus, RPE has been included in this study to evaluate the ability of the proposed models to accurately predict peak flows.
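
The criteria discussed above can be computed as in the following sketch. The formulas are the commonly used definitions (CE as the Nash–Sutcliffe efficiency, R^{2} as the squared Pearson correlation, and RPE as the absolute peak difference divided by the observed peak); they are assumed to match Table 2, which is not reproduced here, and the function name `performance` is hypothetical.

```python
import numpy as np

def performance(obs, sim):
    """Return CE, R2, RMSE, MAE and RPE for observed and simulated series."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = obs - sim
    ce = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    r2 = np.corrcoef(obs, sim)[0, 1] ** 2
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    rpe = abs(sim.max() - obs.max()) / obs.max()   # relative error of the peak value
    return {"CE": float(ce), "R2": float(r2), "RMSE": float(rmse),
            "MAE": float(mae), "RPE": float(rpe)}

if __name__ == "__main__":
    print(performance([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

In the example, a single error on the largest value lowers CE and raises RMSE noticeably, while MAE stays comparatively small, which mirrors the sensitivity differences described above.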

## 3. Results and Discussion

#### 3.1. Input Selection Results

R_{D}, RS_{R}, and RS_{H} refer to the Dandenong rainfall station, the upstream Rowville river stage station and the upstream Heathmont river stage station, respectively. Moreover, t is considered as the present time; therefore, t − t_{0} denotes t_{0} time steps before the present time. For example, R(t − 1) in a daily dataset represents the rainfall data for 1 day before the present day.

#### 3.2. DENFIS Performance on Event-Based R-R Modelling in Sungai Kayu Ara Catchment

D_{thr} was set to 0.1, as recommended by Chang et al. [30]. A sensitivity analysis was conducted to explore whether increasing or decreasing D_{thr} would improve model performance. As no significant improvement was observed, D_{thr} = 0.1 was fixed for the Sungai Kayu Ara catchment. DENFIS was compared against two benchmark models: (1) a regression-based model, ARX, and (2) a HEC–HMS model. The ARX model was developed using the same training events, while the number of inputs was found by trial and error to reach the best testing performance. In the Sungai Kayu Ara catchment, the best performing ARX model was achieved by using 18 rainfall and 1 discharge antecedents. For HEC–HMS, however, the results are adopted from previous studies conducted by Alaghmand et al. [61,62]. In those studies, HEC–HMS was calibrated with the same training events, with all 10 rainfall stations used as model inputs. Other parameters of the model, such as catchment area, imperviousness, and roughness, were also set based on the available data from the catchment.

#### 3.3. DENFIS Performance on Continuous R-R Data Modelling in Dandenong Catchment

D_{thr} = 0.1 was adopted to develop the DENFIS model in the Dandenong catchment. In this case also, a sensitivity analysis confirmed that the adopted value is adequate and can be fixed for this study. After calibration (training), the DENFIS model was validated with the testing dataset. For comparison purposes, two benchmark models, ARX and SWMM, were calibrated for the Dandenong catchment. The ARX model was calibrated on the same training dataset, and the best performing combination was achieved by using 10 rainfall and 3 water level antecedents. The SWMM model was calibrated using 1 arc-second resolution DEM data alongside the data obtained from 9 rainfall stations distributed across the catchment. The extra 6 rainfall stations were not used in developing the DENFIS and ARX models because their time series are discontinuous compared to those of the main 3 rainfall stations. Several parameters, such as total area, slope inclination and pervious/impervious areas, were loaded into the SWMM model. Other model parameters, such as catchment width, infiltration rate, and Manning's coefficients, were further adjusted using the Sensitivity-based Radio Tuning Calibration Tool (SRTC) [63], which allows parameter fine-tuning to improve model performance. SRTC minimizes the uncertainty of the inferred model parameters [64]. In this approach, an uncertainty value is assigned to each parameter based on the parameter type. The uncertainty values help to define the lower and upper limits of each parameter.

#### 3.4. Impact of Training Data Sequence on DENFIS Performance

^{3}/s were considered as L, between 40 and 70 m^{3}/s were considered as M, and larger than 70 m^{3}/s were considered as H. DENFIS was then trained with various combinations of these 3 data categories (i.e., low, medium, high) and validated with the testing datasets used in Section 3.2 and Section 3.3 to evaluate the model performance for each combination. For the Sungai Kayu Ara catchment, the 12 training events were distributed between the three categories of low, medium, and high based on their peak discharges, while the testing events remained the same (i.e., 18 testing events). For the Dandenong catchment, due to the uneven distribution of training data across the low, medium, and high categories, a 1-year representative dataset was chosen for each data category to avoid any potential impact of data length on this sensitivity analysis. In this way, only 3 out of 8 years of the training dataset are used for this sensitivity analysis, while the same testing dataset (the last 3 years of the time series) is kept for validation. For the selection of these three years, peak values of 2 m and 2.9 m were considered as the thresholds to separate L from M and M from H, respectively. A summary of the low, medium, and high data categories for both catchments of this study is provided in Table 6.
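
The categorization and re-ordering described above can be sketched as follows. The thresholds are passed as arguments (e.g., 40 and 70 m^{3}/s for event peaks, or 2 m and 2.9 m for annual peak stage); how values exactly on a threshold are assigned is an assumption, as is the choice to enumerate all six orderings of the three categories, since the exact set of combinations tested in this study is not listed in the text. The function names are hypothetical.

```python
from itertools import permutations

def categorize(peak, low_max, high_min):
    """Label a dataset L, M or H from its peak value."""
    if peak < low_max:
        return "L"
    if peak <= high_min:      # assumed: boundary values fall in M
        return "M"
    return "H"

def training_sequences(datasets_by_category):
    """Build one concatenated training sequence per ordering of L, M and H."""
    sequences = {}
    for order in permutations("LMH"):
        name = "".join(order)
        sequences[name] = [d for cat in order for d in datasets_by_category[cat]]
    return sequences

if __name__ == "__main__":
    peaks = {"event_a": 30.0, "event_b": 55.0, "event_c": 90.0}   # made-up peaks
    groups = {"L": [], "M": [], "H": []}
    for name, peak in peaks.items():
        groups[categorize(peak, low_max=40.0, high_min=70.0)].append(name)
    for label, seq in training_sequences(groups).items():
        print(label, seq)
```

Each resulting sequence would then be presented to the online learner in order, with the testing data held fixed, so that differences in performance can be attributed to the order of presentation.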

#### 3.5. Study Limitations and Future Research Direction

## 4. Conclusions

- DENFIS performed well in both event-based rainfall-runoff modelling (Sungai Kayu Ara catchment) and continuous rainfall-river stage simulation (Dandenong catchment) in terms of several goodness-of-fit criteria including CE, R^{2}, RMSE, and MAE. Its results were significantly superior to those obtained from the benchmark model ARX (e.g., in the Sungai Kayu Ara catchment, the DENFIS result of CE = 0.876 was significantly higher than the CE = 0.175 obtained by ARX) and were moderately better than those obtained by the physically-based benchmark models HEC–HMS and SWMM in the Sungai Kayu Ara and Dandenong catchments, respectively.
- In peak estimation in the Sungai Kayu Ara catchment, DENFIS produced results comparable to those of the HEC–HMS model in terms of RPE (RPE = 0.113 for DENFIS against RPE = 0.179 for HEC–HMS); however, HEC–HMS had more scattered RPE values with a few outliers. In the Dandenong catchment, DENFIS (RPE = 0.159) significantly outperformed SWMM (RPE = 0.363) in peak estimation.
- The systematic investigation of the impact of data sequence with low (L), medium (M), and high (H) categories of output data showed that the high-value category, H, contributes to the generation of more rules in both catchments. Moreover, in the Dandenong catchment, the combinations starting with contrasting categories (i.e., LH or HL) were found to be successful in improving the model performance. This was attributed to the fact that contrasting data available in the early stage of training can result in an appropriate initialization of the model parameters. This can also contribute to generating more diverse rules in the rule base, which can eventually improve the model performance. This finding can be very useful when users choose the training dataset.
- The findings of this study suggest the need for running a sensitivity analysis on the training dataset during the development of NFS models with local learning. Moreover, the promising results of the proposed AI-based data-driven model, DENFIS, show the potential advantages of this model in catchments with limited hydrological data.

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Table A1.** Details of rainfall and flow stations and their recorded data in the two catchments of this study.

Sungai Kayu Ara

| Station No. | Station ID | Station Name | Start Date | End Date | Coeff. of Variation |
|---|---|---|---|---|---|
| R1 | 3110004 | Balai Polis Sea Park | 1-March-1996 | 31-July-2004 | 4.07 |
| R2 | 3110006 | Tmn. Bukit Mayang Mas | 1-March-1996 | 31-July-2004 | 3.24 |
| R3 | 3110007 | Sek. Ren. China Yuk Chai | 1-March-1996 | 31-July-2004 | 3.25 |
| R4 | 3110009 | Tropicana Golf Resort | 1-March-1996 | 31-July-2004 | 3.68 |
| R5 | 3110010 | Balai Polis TTDI | 1-March-1996 | 31-July-2004 | 3.48 |
| R6 | 3110011 | Sungai Penchala Upstream | 1-March-1996 | 31-July-2004 | 3.42 |
| R7 | 3110012 | Masjid Jamek Sg.Penchala | 1-March-1996 | 31-July-2004 | 3.52 |
| R8 | 3110013 | TNB Bandar Utama | 1-March-1996 | 31-July-2004 | 4.40 |
| R9 | 3110014 | Sek. Men. Damansara Jaya | 1-March-1996 | 31-July-2004 | 3.11 |
| R10 | 3110015 | SRK BDR Sri Damansara | 1-March-1996 | 31-July-2004 | 3.64 |
| Q | 3111404 | Sungai Kayu Ara | 1-March-1996 | 31-July-2004 | 1.39 |

Dandenong

| Station No. | Station ID | Station Name | Start Date | End Date | Coeff. of Variation |
|---|---|---|---|---|---|
| R_{D} | 228204C | Dandenong | 1-January-2005 | 31-December-2015 | 2.74 |
| R_{R} | 228368A | Rowville | 1-January-2005 | 31-December-2015 | 3.01 |
| R_{H} | 228357A | Heathmont | 1-January-2005 | 31-December-2015 | 3.04 |
| RS_{D} | DADAN0322 | Dandenong | 1-January-2005 | 31-December-2015 | 2.49 |
| RS_{R} | DADAN0235 | Rowville | 1-January-2005 | 31-December-2015 | 2.92 |
| RS_{H} | DADAN0077 | Heathmont | 1-January-2005 | 31-December-2015 | 2.50 |

## References

- Cheng, X.; Noguchi, M. Rainfall-runoff modelling by neural network approach. In Proceedings of the International Conference on Water Resources & Environment Research, Kyoto, Japan, 29–31 October 1996; pp. 143–150.
- Keesstra, S.; Nunes, J.P.; Saco, P.; Parsons, T.; Poeppl, R.; Masselink, R.; Cerdà, A. The way forward: Can connectivity be useful to design better measuring and modelling schemes for water and sediment dynamics? Sci. Total Environ. **2018**, 644, 1557–1572.
- Masselink, R.J.H.; Temme, A.J.A.M.; Giménez, R.; Casalí, J.; Keesstra, S.D. Assessing hillslope-channel connectivity in an agricultural catchment using rare-earth oxide tracers and random forests models. Cuad. Investig. Geogr. **2017**, 43, 19–39.
- May, D.B.; Sivakumar, M. Prediction of urban stormwater quality using artificial neural networks. Environ. Modell. Softw. **2009**, 24, 296–302.
- Rossman, L.A. Storm Water Management Model User’s Manual, Version 5.0; National Risk Management Research Laboratory, Office of Research and Development, US Environmental Protection Agency: Cincinnati, OH, USA, 2010.
- Bergstrom, S.; Forsman, A. Development of a conceptual deterministic rainfall-runoff model. Nord. Hydrol. **1973**, 4, 240–253.
- Chen, X.Y.; Chau, K.W.; Wang, W.C. A novel hybrid neural network based on continuity equation and fuzzy pattern-recognition for downstream daily river discharge forecasting. J. Hydroinform. **2015**, 17, 733–744.
- Nourani, V.; Tahershamsi, A.; Abbaszadeh, P.; Shahrabi, J.; Hadavandi, E. A new hybrid algorithm for rainfall-runoff process modeling based on the wavelet transform and genetic fuzzy system. J. Hydroinform. **2014**, 16, 1004–1024.
- Maheswaran, R.; Khosa, R. Wavelets-based non-linear model for real-time daily flow forecasting in Krishna River. J. Hydroinform. **2013**, 15, 1022–1041.
- Jain, A.; Kumar, A.M. Hybrid neural network models for hydrologic time series forecasting. Appl. Soft Comput. J. **2007**, 7, 585–592.
- Shamseldin, A.Y. Artificial neural network model for river flow forecasting in a developing country. J. Hydroinform. **2010**, 12, 22–35.
- Asadnia, M.; Chua, L.H.; Qin, X.S.; Talei, A. Improved particle swarm optimization-based artificial neural network for rainfall-runoff modeling. J. Hydrol. Eng. **2014**, 19, 1320–1329.
- Meng, C.; Zhou, J.; Tayyab, M.; Zhu, S.; Zhang, H. Integrating artificial neural networks into the VIC model for rainfall-runoff modeling. Water **2016**, 8, 407.
- Semiromi, M.T.; Omidvar, S.; Kamali, B. Reducing computational costs of automatic calibration of rainfall-runoff models: Meta-models or high-performance computers? Water **2018**, 10, 1440.
- Cho, S.Y.; Quek, C.; Seah, S.X.; Chong, C.H. HebbR2-Taffic: A novel application of neuro-fuzzy network for visual based traffic monitoring system. Expert Syst. Appl. **2009**, 36 Pt 2, 6343–6356.
- Alilou, H.; Rahmati, O.; Singh, V.P.; Choubin, B.; Pradhan, B.; Keesstra, S.; Ghiasi, S.S.; Sadeghi, S.H. Evaluation of watershed health using Fuzzy-ANP approach considering geo-environmental and topo-hydrological criteria. J. Environ. Manag. **2019**, 232, 22–36.
- Keesstra, S.D.; Temme, A.J.A.M.; Schoorl, J.M.; Visser, S.M. Evaluating the hydrological component of the new catchment-scale sediment delivery model LAPSUS-D. Geomorphology **2014**, 212, 97–107.
- Mamdani, E.H.; Assilian, S. Experiment in linguistic synthesis with a fuzzy logic controller. Int. J. Man-Mach. Stud. **1975**, 7, 1–13.
- Takagi, T.; Sugeno, M. Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. **1985**, 1, 116–132.
- Sugeno, M.; Kang, G. Structure identification of fuzzy model. Fuzzy Sets Syst. **1988**, 28, 15–33.
- Jang, J.-S.R. ANFIS: Adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. **1993**, 23, 665–685.
- Nayak, P.C.; Sudheer, K.P.; Rangan, D.M.; Ramasastri, K.S. A neuro-fuzzy computing technique for modeling hydrological time series. J. Hydrol. **2004**, 291, 52–66.
- Nayak, P.; Sudheer, K.P.; Rangan, D.M.; Ramasastri, K.S. Short-term flood forecasting with a neurofuzzy model. Water Resour. Res. **2005**, 41.
- Remesan, R.; Shamim, M.A.; Han, D.; Mathew, J. Runoff prediction using an integrated hybrid modelling scheme. J. Hydrol. **2009**, 372, 48–60.
- Mukerji, A.; Chatterjee, C.; Raghuwanshi, N.S. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models. J. Hydrol. Eng. **2009**, 14, 647–652.
- Talei, A.; Chua, L.H.C.; Wong, T.S. Evaluation of rainfall and discharge inputs used by Adaptive Network-based Fuzzy Inference Systems (ANFIS) in rainfall–runoff modeling. J. Hydrol. **2010**, 391, 248–262.
- Talei, A.; Chua, L.H.C.; Quek, C. A novel application of a neuro-fuzzy computational technique in event-based rainfall–runoff modeling. Expert Syst. Appl. **2010**, 37, 7456–7468.
- Bartoletti, N.; Casagli, F.; Marsili-Libelli, S.; Nardi, A.; Palandri, L. Data-driven rainfall/runoff modelling based on a neuro-fuzzy inference system. Environ. Modell. Softw. **2018**, 106, 35–47.
- Zakhrouf, M.; Bouchelkia, H.; Stamboul, M. Neuro-Wavelet (WNN) and Neuro-Fuzzy (ANFIS) systems for modeling hydrological time series in arid areas. A case study: The catchment of Aïn Hadjadj (Algeria). Desalin. Water Treat. **2016**, 57, 17182–17194.
- Chang, T.K.; Talei, A.; Alaghmand, S.; Ooi, M.P.L. Choice of rainfall inputs for event-based rainfall-runoff modeling in a catchment with multiple rainfall stations using data-driven techniques. J. Hydrol. **2017**, 545, 100–108.
- Mosavi, A.; Ozturk, P.; Chau, K.W. Flood prediction using machine learning models: Literature review. Water **2018**, 10, 1536.
- Kasabov, N.K.; Song, Q. DENFIS: Dynamic evolving neural-fuzzy inference system and its application for time-series prediction. IEEE Trans. Fuzzy Syst. **2002**, 10, 144–154.
- Hong, Y.-S.T.; White, P.A. Hydrological modeling using a dynamic neuro-fuzzy system with on-line and local learning algorithm. Adv. Water Resour. **2009**, 32, 110–119.
- Chang, T.K.; Talei, A.; Quek, C.; Pauwels, V.R. Rainfall-runoff modelling using a self-reliant fuzzy inference network with flexible structure. J. Hydrol. **2018**, 564, 1179–1193.
- Luna, I.; Soares, S.; Ballini, R. An adaptive hybrid model for monthly streamflow forecasting. In Proceedings of the Fuzzy Systems Conference, London, UK, 23–26 July 2007; FUZZ-IEEE 2007. IEEE International: Piscataway, NJ, USA, 2007.
- Hong, Y.-S.T. Dynamic nonlinear state-space model with a neural network via improved sequential learning algorithm for an online real-time hydrological modeling. J. Hydrol. **2012**, 468, 11–21.
- Talei, A.; Chua, L.H.C.; Quek, C.; Jansson, P.E. Runoff forecasting using a Takagi–Sugeno neuro-fuzzy model with online learning. J. Hydrol. **2013**, 488, 17–32.
- Nguyen, P.K.-T.; Chua, L.H.C.; Talei, A.; Chai, Q.H. Water level forecasting using neuro-fuzzy models with local learning. Neural Comput. Appl. **2016**, 30, 1877–1887.
- Ashrafi, M.; Chua, L.H.C.; Quek, C.; Qin, X. A fully-online Neuro-Fuzzy model for flow forecasting in basins with limited data. J. Hydrol. **2017**, 545, 424–435.
- U S Army Corps of Engineers (USACE). Hydrologic Modeling System: Technical Reference Manual; U.S. Army Corps of Engineers, Hydrologic Engineering Center: Davis, CA, USA, 2000.
- Desa, M.; Niemczynowicz, J. Spatial variability of rainfall in Kuala Lumpur, Malaysia: Long and short term characteristics. Hydrol. Sci. J. **1996**, 41, 345–362.
- Desa, M.; Munira, M.N.; Akhmal, H.; Kamsiah, A.W. Capturing extreme rainfall events in Kerayong catchment. In Proceedings of the 10th International Conference on Urban Drainage, Copenhagen, Denmark, 21–26 August 2005.
- Talei, A.; Chua, L.H. Influence of lag time on event-based rainfall–runoff modeling using the data driven approach. J. Hydrol. **2012**, 438, 223–233.
- Onyutha, C. On Rigorous Drought Assessment Using Daily Time Scale: Non-Stationary Frequency Analyses, Revisited Concepts, and a New Method to Yield Non-Parametric Indices. Hydrology **2017**, 4, 48.
- Maier, H.R.; Dandy, G.C. Neural networks for the prediction and forecasting of water resources variables: A review of modelling issues and applications. Environ. Modell. Softw. **2000**, 15, 101–124.
- Mekanik, F.; Imteaz, M.; Talei, A. Seasonal rainfall forecasting by adaptive network-based fuzzy inference system (ANFIS) using large scale climate signals. Clim. Dyn. **2016**, 46, 3097–3111.
- Chau, K.; Wu, C. A hybrid model coupled with singular spectrum analysis for daily rainfall prediction. J. Hydroinform. **2010**, 12, 458–473.
- Alizadeh, M.J.; Kavianpour, M.R.; Kisi, O.; Nourani, V. A new approach for simulating and forecasting the rainfall-runoff process within the next two months. J. Hydrol. **2017**, 548, 588–597.
- Elshorbagy, A.; Corzo, G.; Srinivasulu, S.; Solomatine, D.P. Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology-Part 2: Application. Hydrol. Earth Syst. Sci. **2010**, 14, 1943–1961.
- He, J.; Valeo, C.; Chu, A.; Neumann, N.F. Prediction of event-based stormwater runoff quantity and quality by ANNs developed using PMI-based input selection. J. Hydrol. **2011**, 400, 10–23.
- Da Costa Couto, M.P. Review of input determination techniques for neural network models based on mutual information and genetic algorithms. Neural Comput. Appl. **2009**, 18, 891–901.
- Legates, D.R.; McCabe, G.J., Jr. Evaluating the use of ‘goodness-of-fit’ measures in hydrologic and hydroclimatic model validation. Water Resour. Res. **1999**, 35, 233–241.
- Boscarello, L.; Ravazzani, G.; Cislaghi, A.; Mancini, M. Regionalization of flow-duration curves through catchment classification with streamflow signatures and physiographic-climate indices. J. Hydrol. Eng. **2016**, 21, 05015027.
- Castellarin, A.; Galeati, G.; Brandimarte, L.; Montanari, A.; Brath, A. Regional flow-duration curves: Reliability for ungauged basins. Adv. Water Resour. **2004**, 27, 953–965.
- Massari, C.; Brocca, L.; Ciabatta, L.; Moramarco, T.; Gabellani, S.; Albergel, C.; De Rosnay, P.; Puca, S.; Wagner, W. The Use of H-SAF Soil Moisture Products for Operational Hydrology: Flood Modelling over Italy. Hydrology **2015**, 2, 2–22.
- Masseroni, D.; Galeati, G.; Brandimarte, L.; Montanari, A.; Brath, A. A reliable rainfall-runoff model for flood forecasting: Review and application to a semi-urbanized watershed at high flood risk in Italy. Hydrol. Res. **2017**, 48, 726–740.
- Abrahart, R.; Kneale, P.E.; See, L.M. Neural Networks for Hydrological Modeling; CRC Press: Boca Raton, FL, USA, 2004.
- Dawson, C.W.; See, L.M.; Abrahart, R.J.; Heppenstall, A.J. Symbiotic adaptive neuro-evolution applied to rainfall–runoff modelling in northern England. Neural Netw. **2006**, 19, 236–247.
- Brocca, L.; Melone, F.; Moramarco, T.; Morbidelli, R. Spatial-temporal variability of soil moisture and its estimation across scales. Water Resour. Res. **2010**, 46.
- Kisi, O.; Shiri, J.; Tombul, M. Modeling rainfall-runoff process using soft computing techniques. Comput. Geosci. **2013**, 51, 108–117.
- Alaghmand, S.; Bin Abdullah, R.; Abustan, I.; Vosoogh, B. GIS-based river flood hazard mapping in urban area (a case study in Kayu Ara River Basin, Malaysia). Int. J. Eng. Technol. **2010**, 2, 488–500.
- Alaghmand, S.; bin Abdullah, R.; Abustan, I.; Eslamian, S. Comparison between capabilities of HEC-RAS and MIKE11 hydraulic models in river flood risk modeling (a case study of Sungai Kayu Ara River basin, Malaysia). Int. J. Hydrol. Sci. Technol. **2012**, 2, 270–291.
- Choi, K.S.; Ball, J.E. Parameter estimation for urban runoff modelling. Urban Water **2002**, 4, 31–41.
- Akhter, M.S.; Hewa, G.A. The use of PCSWMM for assessing the impacts of land use changes on hydrological responses and performance of WSUD in managing the impacts at Myponga catchment, South Australia. Water **2016**, 8, 511.
- Onyutha, C. Influence of Hydrological Model Selection on Simulation of Moderate and Extreme Flow Events: A Case Study of the Blue Nile Basin. Adv. Meteorol. **2016**, 2016, 7148326.

**Figure 4.** Schematic mechanism of the Evolving Clustering Method (ECM) using the samples x_{1} to x_{9} in a 2-D space (adopted from Kasabov and Song [32]).

**Figure 5.** Comparison of RPE values for peak discharge of 18 testing events simulated by DENFIS and HEC–HMS models in Sungai Kayu Ara catchment.

**Figure 6.** Observed versus simulated discharge for 18 testing events in Sungai Kayu Ara catchment obtained by: (**a**) DENFIS and (**b**) HEC–HMS.

**Figure 7.** Comparison of RPE values for selected peak water levels simulated by DENFIS and SWMM models in Dandenong catchment.

**Figure 8.** Observed versus simulated river stage by: (**a**) DENFIS, and (**b**) SWMM models in Dandenong catchment.

**Figure 9.** Comparison of observed discharge time-series with DENFIS rule number progression for training combination HLM in the Sungai Kayu Ara catchment.

**Figure 10.** Comparison of observed water level time-series with DENFIS rule number progression for training combination LHM in Dandenong catchment.

Sungai Kayu Ara (10-min Interval Data) | Rainfall (mm), Training | Rainfall (mm), Testing | Discharge (m³/s), Training | Discharge (m³/s), Testing |
---|---|---|---|---|
Minimum | 0 | 0 | 3.20 | 0.10 |
Maximum | 26.5 | 48.0 | 135.00 | 180.90 |
Mean | 0.7 | 0.6 | 17.81 | 12.33 |
Standard Deviation | 2.2 | 2.5 | 21.85 | 22.15 |
Skewness | 4.8 | 6.5 | 2.70 | 3.90 |

Dandenong (Daily Data) | Rainfall (mm), Training | Rainfall (mm), Testing | River Stage (m), Training | River Stage (m), Testing |
---|---|---|---|---|
Minimum | 0 | 0 | 0 | 0 |
Maximum | 149.0 | 84.0 | 6.80 | 3.00 |
Mean | 2.0 | 1.9 | 0.18 | 0.16 |
Standard Deviation | 6.0 | 5.1 | 0.44 | 0.27 |
Skewness | 8.6 | 5.6 | 7.35 | 4.45 |

Performance Criteria | Formula | Unit | Range |
---|---|---|---|
Nash-Sutcliffe Coefficient of Efficiency | $CE = 1 - \frac{\sum_{i=1}^{n} \left(Q_{Obs,i} - Q_{Sim,i}\right)^{2}}{\sum_{i=1}^{n} \left(Q_{Obs,i} - \overline{Q}_{Obs}\right)^{2}}$ | Dimensionless | (−∞, 1] |
Coefficient of Determination | $R^{2} = \left[\frac{\sum_{i=1}^{n} \left(Q_{Obs,i} - \overline{Q}_{Obs}\right)\left(Q_{Sim,i} - \overline{Q}_{Sim}\right)}{\sqrt{\sum_{i=1}^{n} \left(Q_{Obs,i} - \overline{Q}_{Obs}\right)^{2}} \times \sqrt{\sum_{i=1}^{n} \left(Q_{Sim,i} - \overline{Q}_{Sim}\right)^{2}}}\right]^{2}$ | Dimensionless | [0, 1] |
Root Mean Square Error | $\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(Q_{Sim,i} - Q_{Obs,i}\right)^{2}}{n}}$ | m^{3}/s | [0, +∞) |
Mean Absolute Error | $\mathrm{MAE} = \frac{\sum_{i=1}^{n} \lvert Q_{Sim,i} - Q_{Obs,i} \rvert}{n}$ | m^{3}/s | [0, +∞) |
Relative Peak Error | $\mathrm{RPE} = \frac{\lvert Q_{p,Obs} - Q_{p,Sim} \rvert}{Q_{p,Obs}}$ | Dimensionless | [0, +∞) |
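
As a rough illustration of the criteria in the table above, the following Python sketch computes each one for a pair of observed and simulated series. The function names and the four-value sample series are hypothetical and are not taken from the study. The sketch also assumes that the peak values in RPE are the maxima of each series, regardless of when they occur.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def nash_sutcliffe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """CE: 1 minus squared error relative to the variance of observations."""
    mean_obs = _mean(obs)
    numerator = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - numerator / denominator


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """R^2: squared Pearson correlation between observed and simulated."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_sim = sum((s - mean_sim) ** 2 for s in sim)
    return (cov / (math.sqrt(ss_obs) * math.sqrt(ss_sim))) ** 2


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def relative_peak_error(obs: Sequence[float], sim: Sequence[float]) -> float:
    # Assumption: the peak is the maximum of each series, ignoring timing.
    peak_obs, peak_sim = max(obs), max(sim)
    return abs(peak_obs - peak_sim) / peak_obs


# Made-up sample values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 5.0]

print("CE  :", nash_sutcliffe(observed, simulated))
print("R^2 :", r_squared(observed, simulated))
print("RMSE:", rmse(observed, simulated))
print("MAE :", mae(observed, simulated))
print("RPE :", relative_peak_error(observed, simulated))
```

In this toy case the simulated series tracks the observed shape closely but overshoots the peak, so R^2 stays higher than CE. This is one reason the two criteria are typically reported side by side.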

Catchment | Selected Inputs |
---|---|
Sungai Kayu Ara | R2(t − 2), R7(t − 1), R9(t − 8), Q(t − 1) |
Dandenong | R_{D}(t − 1), RS_{R}(t − 1), RS_{H}(t − 1) |
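
To make the lag notation concrete, the sketch below assembles input vectors for the Sungai Kayu Ara selection from aligned time series. It assumes that R2, R7, and R9 are rainfall series, that Q is the discharge series, and that the model output is Q(t). The helper name `build_lagged_samples` and the sample values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_samples(
    series: Dict[str, Sequence[float]],
    lags: List[Tuple[str, int]],
    target: str,
) -> Tuple[List[List[float]], List[float]]:
    """Return (inputs, targets), where each input row holds series[name][t - lag]."""
    max_lag = max(lag for _, lag in lags)
    n_steps = len(series[target])
    inputs: List[List[float]] = []
    targets: List[float] = []
    # The first max_lag time steps lack a complete history, so they are skipped.
    for t in range(max_lag, n_steps):
        inputs.append([series[name][t - lag] for name, lag in lags])
        targets.append(series[target][t])
    return inputs, targets


# Made-up series of ten time steps each, for illustration only.
data = {
    "R2": [float(v) for v in range(0, 10)],
    "R7": [float(v) for v in range(10, 20)],
    "R9": [float(v) for v in range(20, 30)],
    "Q": [float(v) for v in range(100, 110)],
}

# R2(t - 2), R7(t - 1), R9(t - 8), Q(t - 1), as listed for Sungai Kayu Ara.
selected = [("R2", 2), ("R7", 1), ("R9", 8), ("Q", 1)]

X, y = build_lagged_samples(data, selected, target="Q")
for row, value in zip(X, y):
    print(row, "->", value)
```

Because the largest lag is eight steps, the first usable sample is at t = 8, so a ten-step record yields only two samples here.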

**Table 4.** Average CE, R^{2}, RMSE, MAE, and RPE values over the 18 testing events obtained from the Dynamic Evolving Neural Fuzzy Inference System (DENFIS), Hydrologic Engineering Center-Hydrologic Modelling System (HEC–HMS), and Autoregressive Model with Exogenous Inputs (ARX) models for the Sungai Kayu Ara catchment.

Model | CE (-) | R^{2} (-) | RMSE (m^{3}/s) | MAE (m^{3}/s) | RPE (-) |
---|---|---|---|---|---|
DENFIS | 0.876 | 0.899 | 5.056 | 2.100 | 0.113 |
HEC–HMS | 0.595 | 0.876 | 7.218 | 4.261 | 0.179 |
ARX | 0.175 | 0.545 | 10.032 | 7.401 | 0.451 |

**Table 5.** CE, R^{2}, RMSE, MAE, and RPE values for the testing river stage time series simulated by DENFIS, SWMM, and ARX models in the Dandenong catchment.

Model | CE (-) | R^{2} (-) | RMSE (m) | MAE (m) | RPE (-) |
---|---|---|---|---|---|
DENFIS | 0.803 | 0.808 | 0.121 | 0.056 | 0.159 |
SWMM | 0.686 | 0.696 | 0.153 | 0.067 | 0.363 |
ARX | 0.689 | 0.797 | 0.150 | 0.062 | 0.320 |

**Table 6.** Training data categories of low, moderate, and high for the Sungai Kayu Ara and Dandenong catchments.

Catchment | Low | Moderate | High |
---|---|---|---|
Sungai Kayu Ara (Events) | 1, 3, 4, 8, 9 | 2, 5, 7, 10 | 6, 11, 12 |
Dandenong (Year) | 2007 | 2012 | 2011 |
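
The six training orders examined in the sensitivity analysis are the permutations of the three categories. A minimal sketch of how such sequences could be assembled for the Sungai Kayu Ara events is given below. The function name `training_sequences` is hypothetical, and the sketch assumes that events within a category are presented in the order listed in Table 6, which the table itself does not state.

```python
from itertools import permutations
from typing import Dict, List

# Event numbers per category for Sungai Kayu Ara, as listed in Table 6.
CATEGORY_EVENTS: Dict[str, List[int]] = {
    "L": [1, 3, 4, 8, 9],
    "M": [2, 5, 7, 10],
    "H": [6, 11, 12],
}


def training_sequences(categories: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Map each category ordering (e.g. 'HLM') to its concatenated event list."""
    sequences: Dict[str, List[int]] = {}
    for order in permutations(categories):
        label = "".join(order)
        # Assumption: within a category, events keep the order given in the table.
        sequences[label] = [event for cat in order for event in categories[cat]]
    return sequences


sequences = training_sequences(CATEGORY_EVENTS)
for label, events in sequences.items():
    print(label, events)
```

Each resulting list would then be fed to the online learner one event at a time, so the only thing that changes between runs is the order in which the same data is seen.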

**Table 7.** Rules count and average DENFIS performance in simulating the 18 testing events of the Sungai Kayu Ara catchment when trained with different sequences of the Low (L), Moderate (M), and High (H) datasets.

Parameters | LMH | LHM | MLH | MHL | HLM | HML |
---|---|---|---|---|---|---|
Rules Count | 18 | 18 | 18 | 18 | 20 | 20 |
Rules Distribution | 5, 3, 10 | 5, 10, 3 | 7, 4, 7 | 7, 11, 0 | 14, 0, 6 | 14, 6, 0 |
CE | 0.805 | 0.779 | 0.810 | 0.781 | 0.845 | 0.819 |
R^{2} | 0.842 | 0.817 | 0.839 | 0.833 | 0.868 | 0.857 |
RMSE (m^{3}/s) | 5.635 | 5.860 | 5.557 | 5.835 | 5.248 | 5.319 |
MAE (m^{3}/s) | 2.462 | 2.711 | 2.400 | 2.697 | 2.335 | 2.434 |

**Table 8.** Rules count and overall DENFIS performance in simulating the 3 years of testing data in the Dandenong catchment when trained with different sequences of the Low (L), Moderate (M), and High (H) datasets.

Parameters | LMH | LHM | MLH | MHL | HLM | HML |
---|---|---|---|---|---|---|
Rules Count | 15 | 15 | 14 | 13 | 15 | 14 |
Rules Distribution | 5, 8, 2 | 5, 8, 2 | 8, 4, 2 | 8, 2, 3 | 9, 4, 2 | 9, 2, 3 |
CE | 0.408 | 0.771 | 0.442 | 0.712 | 0.758 | 0.713 |
R^{2} | 0.524 | 0.784 | 0.527 | 0.714 | 0.778 | 0.750 |
RMSE (m) | 0.276 | 0.172 | 0.268 | 0.193 | 0.177 | 0.192 |
MAE (m) | 0.199 | 0.088 | 0.185 | 0.117 | 0.093 | 0.143 |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Chang, T.K.; Talei, A.; Chua, L.H.C.; Alaghmand, S.
The Impact of Training Data Sequence on the Performance of Neuro-Fuzzy Rainfall-Runoff Models with Online Learning. *Water* **2019**, *11*, 52.
https://doi.org/10.3390/w11010052
