Flood Prediction Using Machine Learning: Literature Review

Floods are among the most destructive natural disasters and are highly complex to model. Research on the advancement of flood prediction models has contributed to risk reduction, policy suggestion, minimization of the loss of human life, and reduction of the property damage associated with floods. To mimic the complex mathematical expressions of the physical processes of floods, machine learning (ML) methods have, during the past two decades, contributed greatly to the advancement of prediction systems, providing better performance and cost-effective solutions. Due to the vast benefits and potential of ML, its popularity has increased dramatically among hydrologists. By introducing novel ML methods and hybridizing existing ones, researchers aim to discover more accurate and efficient prediction models. The main contribution of this paper is to demonstrate the state of the art of ML models in flood prediction and to give insight into the most suitable models. In this paper, the literature in which ML models were benchmarked through a qualitative analysis of robustness, accuracy, effectiveness, and speed is particularly investigated to provide an extensive overview of the various ML algorithms used in the field. The performance comparison of ML models presents an in-depth understanding of the different techniques within the framework of a comprehensive evaluation and discussion. As a result, this paper introduces the most promising prediction methods for both long-term and short-term floods. Furthermore, the major trends in improving the quality of flood prediction models are investigated. Among them, hybridization, data decomposition, algorithm ensemble, and model optimization are reported as the most effective strategies for the improvement of ML methods. This survey can be used as a guideline for hydrologists as well as climate scientists in choosing the proper ML method according to the prediction task.


Introduction
Among natural disasters, floods are the most destructive, causing massive damage to human life, infrastructure, agriculture, and the socioeconomic system. Governments, therefore, are under pressure to develop reliable and accurate maps of flood risk areas and to plan for sustainable flood risk management focusing on prevention, protection, and preparedness [1]. Flood prediction models are of significant importance for hazard assessment and extreme event management. Robust and accurate prediction contributes highly to water resource management strategies, policy suggestions and analysis, and evacuation modeling [2]. Thus, the importance of advanced systems for short-term and long-term prediction of floods and other hydrological events is strongly emphasized to alleviate damage [3]. However, predicting flood lead-time and occurrence location is fundamentally complex due to the dynamic nature of climate conditions. Therefore, today's major flood prediction models are mainly data-specific and involve various simplifying assumptions [4]. Thus, to mimic the complex mathematical expressions of physical processes and basin behavior, such models benefit from specific techniques (Hykin, 1992), e.g., event-driven, empirical black box, lumped and distributed, stochastic, deterministic, continuous, and hybrid [5].
Physical models [6] such as global circulation [7] have long been used to predict hydrological events, including the coupled effects of atmosphere, ocean, and floods. Although physical models have shown great capabilities for predicting a diverse range of flooding scenarios, they often require various types of hydro-geomorphological monitoring data sets and intensive computation, which prohibits short-term prediction [8]. Furthermore, as [9] states, the development of physically based models often requires in-depth knowledge of and expertise in hydrological parameters, which is reported to be highly challenging. Moreover, numerous studies suggest that there is a gap in the short-term prediction capability of physical models (Hudson et al., 2011). For instance, on many occasions such models failed to predict properly [10]. van den Honert and McAneney [10] documented the failure to predict the floods that occurred in Queensland, Australia, in 2010. Similarly, numerical prediction models [11] are reported to be deterministic and unreliable due to systematic errors [12].
In addition to numerical and physical models, data-driven models also have a long tradition in flood modeling and have recently gained more popularity. Data-driven methods of prediction assimilate the measured climate indices and hydro-meteorological parameters to provide better insight. Among them, the statistical models of autoregressive moving average (ARMA) [13], multiple linear regression (MLR) [14], and autoregressive integrated moving average (ARIMA) [15] are the most common flood frequency analysis (FFA) methods for modeling flood prediction. FFA was among the early statistical methods used to predict floods [16]. Regional flood frequency analyses (RFFA) [17], as more advanced versions, have been reported to be more efficient than physical models in terms of computation cost and generalization. Assuming floods to be stochastic processes, they can be predicted using certain probability distributions from historical streamflow data [18]. For instance, the climatology average method (CLIM) (Wang et al. 2003), empirical orthogonal function (EOF) [19], multiple linear regression (MLR), quantile regression techniques (QRT) [20], and Bayesian forecasting models (Biondi and De Luca) have been widely used for predicting major floods. However, they are reported to be unsuitable for short-term prediction, and in this context they need major improvement with respect to accuracy, complexity of usage, computation cost, and robustness. Furthermore, for reliable long-term prediction, at least a decade of data from measurement gauges should be analyzed for a meaningful forecast [21]. In the absence of such a dataset, however, FFA can be done using hydrologic models of RFFA, e.g., MISBA [22] and Sacramento [23], as reliable empirical methods with regional applications where streamflow measurements are unavailable. In this context, distributed numerical models have been used as an attractive solution [24]. Nonetheless, they do not provide quantitative flood predictions, and their forecast skill level is "only moderate" and lacks accuracy [25].
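The idea of treating flood peaks as draws from a probability distribution estimated from historical streamflow can be illustrated with a minimal sketch. It is not taken from the reviewed studies: the discharge values are hypothetical, and the Weibull plotting position, T = (n + 1) / m for the m-th largest of n annual maxima, is assumed here as one common empirical estimate of the return period.

```python
def empirical_return_periods(annual_peaks):
    """Return (peak, return_period_years) pairs, largest peak first.

    Uses the Weibull plotting position T = (n + 1) / m, where m is the
    rank of the peak (1 = largest) among n annual maxima.
    """
    n = len(annual_peaks)
    ranked = sorted(annual_peaks, reverse=True)
    return [(peak, (n + 1) / rank) for rank, peak in enumerate(ranked, start=1)]


# Hypothetical annual peak discharges in m^3/s (illustrative values only).
peaks = [310.0, 120.0, 450.0, 200.0, 275.0, 180.0, 390.0, 150.0, 230.0]
table = empirical_return_periods(peaks)
for peak, period in table:
    print(f"{peak:7.1f} m^3/s  ~ {period:4.1f}-year event")
```

With only nine years of record, the largest peak is assigned a return period of ten years, which illustrates why at least a decade of gauge data is needed before such estimates become meaningful.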
The drawbacks of the physically based and statistical models mentioned above encourage the use of advanced data-driven models, e.g., machine learning (ML). A further reason for the popularity of such models is that they can numerically formulate flood nonlinearity solely from historical data, without requiring knowledge of the underlying physical processes. Data-driven prediction models using ML are promising tools, as they are quicker to develop and need minimal inputs. ML is a field of AI used to induce regularities and patterns; compared with physical models, it offers easier implementation, low computation cost, fast training, validation, testing, and evaluation, high performance, and relatively low complexity [26]. The continuous advancement of ML methods over the last two decades has demonstrated their suitability for flood forecasting, with an acceptable rate of outperforming conventional approaches [27]. A recent investigation [28], which compared the performance of a number of physical and ML prediction models, showed higher accuracy for the ML models. Furthermore, the literature includes numerous successful experiments in quantitative precipitation forecasting (QPF) using ML methods for different lead-time predictions, e.g., [29,30]. In comparison with traditional statistical models, ML models have been used for prediction with greater accuracy [31]. Ortiz-García et al. [32] describe how ML techniques can efficiently model complex hydrological systems such as floods. Many ML algorithms, e.g., artificial neural networks (ANN) [33], neuro-fuzzy [34,35], support vector machine (SVM) [36], and support vector regression (SVR) [37], have been reported to be effective for both short-term and long-term flood forecasting. In addition, it has been shown that the performance of ML can be improved through hybridization with other ML methods, soft computing techniques, numerical simulation, and/or physical models. Such applications have provided more robust and efficient models that can effectively learn the complex flood system in an adaptive manner. Although the literature includes numerous performance evaluations of individual ML models, e.g., [38][39][40][41], no definite conclusion has been reported as to which models function better in a given application. Nonetheless, ML algorithms have important characteristics that need to be carefully taken into consideration. The first is that they are only as good as their training, in which the system learns the target task from past data. If the data are scarce or do not cover the varieties of the task, learning falls short, and the models cannot perform well when they are put to work. The second aspect is the capability of each ML algorithm, which may vary across different types of tasks. This can also be called the "generalization problem", which indicates how well the trained system can predict cases it was not trained for, i.e., whether it can predict beyond the range of the training dataset. For example, some algorithms may perform well in short-term prediction but not in long-term prediction. These characteristics of the algorithms need to be clarified with respect to the type and amount of available training data and the type of prediction task, e.g., water level or streamflow. In this review, we look into examples of the use of various ML algorithms for various types of tasks. At the abstract level, we decided to divide the target tasks into short-term and long-term prediction. We then reviewed the ML applications in flood-related tasks, structuring the ML methods as single methods and hybrid methods. Hybrid methods are those that combine more than one ML method.
Here, we should note that this paper surveys the ML models used for flood prediction at sites where rain gauges or intelligent sensing systems are used. Our goal is to survey prediction models with various lead-times to a flood at a particular site. From this perspective, spatial flood prediction is not covered in this study, as we do not study the prediction models used to estimate or identify the location of a flood. In fact, we are concerned only with the lead-time at an identified site.

Method and outline
This survey identifies the state of the art of ML methods for flood prediction by reviewing recent articles from Scopus. Scopus is Elsevier's citation database, which mainly includes peer-reviewed articles in top-level subject fields in the form of book series, proceedings, and journals. The Scopus database has been used for this survey because of its multidisciplinary scope, which allows easier searching outside the discipline, its depth of coverage, and its inclusion of more recent works. To choose an article for our survey, four quality measures have been considered for each article, i.e., source normalized impact per paper (SNIP), CiteScore, SCImago journal rank (SJR), and h-index. The papers are reviewed in terms of flood resource variables, ML methods, prediction type, and the obtained results. Among the articles identified through the search queries, those that include a performance evaluation and comparison of ML methods have been given priority for inclusion in the review, in order to identify the ML methods that perform better in a particular application.
The applications in flood prediction can be classified according to the flood resource variables, i.e., water level, river flood, soil moisture, rainfall-discharge, precipitation, river inflow, peak flow, river flow, rainfall-runoff, flash flood, rainfall, streamflow, seasonal stream flow, flood peak discharge, urban flood, plain flood, groundwater level, rainfall stage, flood frequency analysis, quantiles flood, surge level, extreme flow, storm surge, typhoon rainfall, and daily flows [42]. Among these key flood resource variables, rainfall and the spatial examination of the hydrologic cycle have the most remarkable role in runoff and flood modeling [43]. This is why quantitative rainfall prediction, including avalanches, slush flow, and melting snow, has traditionally been used for flood prediction, especially for flash floods or short-term flood prediction [44]. However, rainfall prediction has been shown to be inadequate for accurate flood prediction (Ferraris et al., 2002). For instance, the prediction of streamflow in a long-term flood prediction scenario depends not only on rainfall but also on soil moisture estimates in a catchment (Robertson et al., 2013). Although high-resolution precipitation forecasting is essential, the other flood resource variables have also been considered in the literature (Bliefernicht and Bardossy 2007). Thus, the methodology of this literature review aims to include the most effective flood resource variables in the search queries.
A combination of these flood resource variables and ML methods has been used to build the complete list of search queries. Note that the ML methods for flood prediction may vary significantly according to the application, dataset, and prediction type. For instance, the ML methods used for short-term water level prediction are significantly different from those used for long-term streamflow prediction.

The search query includes three main search terms. The flood resource variables are considered as term 1 of the search, which includes the 25 keywords mentioned above. Term 2 of the search includes the ML algorithms. Gori, M. (2017) provides a complete list of ML methods, of which the 25 most popular algorithms in engineering applications have been used as search keywords. Term 3 includes the four search terms most often used in describing flood prediction, i.e., "prediction" or "estimation" or "forecast" or "analysis". The search returns a total of 6596 article records.
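The way the three search terms combine into a list of queries can be sketched as follows. This is only an illustration under stated assumptions: the exact query syntax is not given in this section, so the boolean form used here (term 1 AND term 2 AND any of term 3) is assumed, the function name is hypothetical, and the keyword lists are abbreviated to three entries each.

```python
from itertools import product

# Abbreviated keyword lists; the survey uses 25 variables and 25 algorithms.
FLOOD_VARIABLES = ["water level", "streamflow", "rainfall-runoff"]
ML_ALGORITHMS = ["artificial neural network", "support vector machine", "random forest"]
PREDICTION_TERMS = ["prediction", "estimation", "forecast", "analysis"]


def build_queries(variables, algorithms, prediction_terms):
    """Build one boolean query string per (variable, algorithm) pair.

    The query syntax is an assumption made for illustration only.
    """
    term3 = " OR ".join(f'"{t}"' for t in prediction_terms)
    return [
        f'"{variable}" AND "{algorithm}" AND ({term3})'
        for variable, algorithm in product(variables, algorithms)
    ]


queries = build_queries(FLOOD_VARIABLES, ML_ALGORITHMS, PREDICTION_TERMS)
print(len(queries))
print(queries[0])
```

Under this assumed form, the full lists of 25 variables and 25 algorithms would yield 625 queries to run one by one.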

To create an ML prediction model, the historical records of flood events, in addition to real-time cumulative data from a number of rain gauges or other sensing devices for various return periods, are often used. The sources of the dataset are traditionally rainfall and water level, measured either by ground rain gauges or by the relatively new remote sensing technologies of satellites, multisensor systems, and/or radars [45]. Remote sensing is an attractive tool for capturing higher-resolution data in real time. In addition, the high resolution of weather radar observations often provides a more reliable dataset than rain gauges [46]. Thus, building a prediction model on a radar-based rainfall dataset is reported to provide higher accuracy in general [47]. Whether a radar-based dataset or ground gauges are used to create a prediction model, the historical hourly, daily, and/or monthly data are divided into individual sets to construct and evaluate the learning models. To do so, the individual sets of data include training, validation, verification, and testing sets. The principles of the ML modeling workflow and the strategy for flood modeling are described in detail in the literature, e.g., [37,48]. Figure 2 represents the basic flow for building an ML model. The major ML algorithms applied to flood prediction include ANNs [49], neuro-fuzzy [50], ANFIS [51], support vector machines (SVM) [52], wavelet neural network (WNN) [53], and multilayer perceptron (MLP) [54]. In the following, a brief description and background of these fundamental ML algorithms are presented.
ANN, as an efficient mathematical modeling system with efficient parallel processing, is able to mimic the biological neural network by interconnecting neuron units. Among all ML methods, ANNs, as the most popular learning algorithms, are known to be versatile and efficient in modeling complex flood processes, with high fault tolerance and accurate approximation [28]. In comparison with traditional statistical models, the ANN approach has been used for prediction with greater accuracy [55]. ANN algorithms have been the most popular for modeling flood prediction since their first use in the 1990s [56]. Instead of a catchment's physical characteristics, ANNs derive meaning from historical data. Thus, ANNs are considered reliable data-driven tools for constructing black box models of the complex and non-linear relationship between rainfall and flood [57], as well as for river flow and discharge forecasting [58].
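The division of a historical record into separate sets before model building can be sketched as follows. This is a minimal illustration, not the procedure of any reviewed study: the 70/15/15 percentages, the function name, and the water levels are assumptions, verification is folded into the test set for brevity, and the split is kept chronological so that the model is always evaluated on observations later than those it was trained on.

```python
def chronological_split(records, train_pct=70, val_pct=15):
    """Split a time-ordered sequence into training, validation, and test sets.

    The order is preserved so that the test set holds the most recent
    observations. The default percentages are illustrative assumptions.
    """
    if train_pct <= 0 or val_pct < 0 or train_pct + val_pct >= 100:
        raise ValueError("percentages must leave room for a test set")
    n = len(records)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return records[:train_end], records[train_end:val_end], records[val_end:]


# Hypothetical hourly water levels (m), oldest first.
levels = [1.2, 1.3, 1.3, 1.5, 1.9, 2.4, 2.8, 2.6, 2.2, 1.8,
          1.6, 1.5, 1.4, 1.4, 1.6, 2.0, 2.7, 3.1, 2.9, 2.3]
train, val, test = chronological_split(levels)
print(len(train), len(val), len(test))
```

A random shuffle before splitting is avoided here because neighbouring hours of a flood hydrograph are strongly correlated, which would leak information from the test period into training.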
Furthermore, a number of surveys, e.g., [59], suggest ANN as one of the most suitable modeling techniques, providing acceptable generalization ability and speed compared with most conventional models. [60,61] provided reviews of ANN applications in flood modeling. ANNs have already been used successfully for numerous flood prediction applications, e.g., streamflow forecasting [62], river flow [63,64], rainfall-runoff [65], precipitation-runoff modeling [66], and water quality (Maier and Dandy, 1996) sequences. In ANN, backpropagation (BP) refers to a multi-layered NN in which the weights are calculated by propagating the error gradient backward. In BP, the learning cycle has more phases, using an activation function to send signals to the other nodes. Among the various ANNs, the backpropagation ANN (BPNN) is identified as the most powerful prediction tool suitable for flood time series prediction (Govindaraju and Rao, 2000). Extreme learning machine (ELM) [70] is an easy-to-use form of feed-forward neural network (FFNN) with a single hidden layer. Here, ELM is studied under ANN methods. ELM for flood prediction has recently attracted the interest of hydrologists and has been used to model short-term streamflow with promising results [71,72].

Multilayer perceptron (MLP)
The vast majority of ANN models for flood prediction are trained with a BPNN [73]. While BPNNs are widely used in this realm today, the MLP, as an advanced representation of ANNs, has recently gained popularity [74]. MLP [75] is a class of FFNN that uses the supervised learning of BP to train a network of interconnected nodes in multiple layers. Simplicity, non-linear activation, and a high number of layers are the characteristics of MLP. Due to these characteristics, the model has been widely used in flood prediction and other complex hydrogeological models [76]. In an assessment of the ANN classes used in flood modeling, MLP models were reported to be more efficient, with better generalization ability. Nevertheless, MLP is generally found to be more difficult to optimize [77]. The back-percolation learning algorithm calculates the propagation error in the hidden network nodes individually for a more advanced modeling approach.
Here, it is worth mentioning that MLP, more than any other variation of ANNs, e.g., FFNN, BPNN, and FNN, has gained popularity among hydrologists. Furthermore, due to the vast number of case studies using the standard form of MLP, it has diverged from regular ANNs. In addition, the authors of articles on flood prediction using MLP refer to their models as MLP models.
From this perspective, we decided to devote a separate section to MLP.

Adaptive neuro-fuzzy inference system (ANFIS)
Fuzzy logic of Zadeh [78] is a qualitative modeling scheme, a soft computing technique that uses natural language. Fuzzy logic, as a simplified mathematical model, works by incorporating expert knowledge into a fuzzy inference system (FIS). FIS further mimics human learning through an approximation function with less complexity, which provides great potential for non-linear modeling of extreme hydrological events [79,80], and particularly of floods [81]. For instance, [82] studied its use for river level forecasting. Lohani et ... shows easier implementation and better generalization capability through the one-pass subtractive clustering algorithm, which avoids several rounds of random selection.

Wavelet neural network (WNN)
Wavelet transform (WT), as a mathematical tool, can be used to extract information from various data sources by analyzing local variations in time series (Venkata Ramana et al. 2013).
The WT has, in fact, significantly positive effects on modeling performance. Wavelet transforms support reliable decomposition of an original time series to improve data quality. The accuracy of prediction is improved through the discrete WT (DWT), which decomposes the original data into bands, leading to an improvement in flood prediction lead-times (Partal and Cigizoglu 2009). The DWT decomposes the initial data set into individual resolution levels to extract better-quality data for model building [83]. Owing to their beneficial characteristics, DWTs have been widely used in flood time series prediction. In flood modeling, DWTs have been widely applied to, e.g., rainfall-runoff (Ravansalar 2017), daily streamflow [84], and reservoir inflow [85]. Furthermore, hybrid models of DWTs, e.g., the wavelet-based neural network (WNN) [86], which combines the WT and FFNN, and the wavelet-based regression model [87], which integrates the WT and multiple linear regression (LR), have been used in time series prediction of floods [88]. The application of WNN to flood prediction has been reviewed by [53], who concluded that WNN can greatly enhance model accuracy. In fact, WNNs have most recently gained popularity in flood modeling owing to their potential for enhancing time series data [39], in applications such as daily flow [89], rainfall-runoff [90], water level [91], and flash flood [92].
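The decomposition of a time series into bands by the DWT can be illustrated with a minimal sketch. The Haar wavelet is assumed here purely for simplicity; the reviewed studies may use other wavelet families and several decomposition levels, and the streamflow values are hypothetical.

```python
import math


def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence.

    Returns (approximation, detail): the low-frequency band and the
    high-frequency band, each half as long as the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, recovering the original sequence."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


# Hypothetical daily streamflow values (m^3/s).
flow = [12.0, 14.0, 30.0, 26.0, 18.0, 16.0, 15.0, 15.0]
approx, detail = haar_dwt(flow)
restored = haar_idwt(approx, detail)
print(approx)
print(detail)
```

In a hybrid such as WNN, the approximation and detail bands, rather than the raw series, would be supplied as inputs to the network; that training step is omitted here.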

Support vector machine (SVM)
Hearst et al. [93] proposed and classified the support vector (SV) as a nonlinear search algorithm using statistical learning theory. Later, SVM [94] was introduced as a class of SV for classification, to minimize overfitting and reduce the expected error of a learning machine. SVM, which enjoys great popularity in flood modeling, is a supervised learning machine based on statistical learning theory and the structural risk minimization rule. The training algorithm of SVM builds models that assign new examples to classes as non-probabilistic binary linear classifiers, minimizing the empirical classification error and maximizing the geometric margin via inverse problem solving. SVM has been used to predict a quantity forward in time based on training from past data. Over the past two decades, SVM has also been extended as a regression tool, known as support vector regression (SVR) [95].
SVMs are today known as robust and efficient ML algorithms for flood prediction [96]. SVM and SVR have emerged as alternative ML methods to ANNs, with high popularity among hydrologists for flood prediction. They use the statistical learning theory of structural risk minimization (SRM), which provides a unique architecture that delivers great generalization and superior efficiency.
Most importantly, SVMs are suitable for both linear and non-linear classification and for efficient mapping of the inputs into feature spaces [97]. Thus, they have been applied in numerous flood prediction cases with promising results, excellent generalization ability, and better performance than ANNs and MLRs, e.g., extreme rainfall [98], precipitation [32], rainfall-runoff [99], reservoir inflow [100], streamflow [101], flood quantiles [37], flood time series [102], and soil moisture [103]. Unlike ANN, SVM is more suitable for nonlinear regression problems, identifying the global optimal solution in flood models [104]. Although the high computation cost of SVM and its unrealistic outputs, which stem from its heuristic and semi-black box nature, can be demanding, the least squares support vector machine (LS-SVM) has greatly improved performance with acceptable computational efficiency [105]. The alternative approach of LS-SVM is to solve a set of linear equations instead of a complex quadratic programming problem [106].
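The point that LS-SVM replaces the quadratic programming problem with a linear system can be made concrete with a minimal sketch of LS-SVM regression. It assumes the standard dual formulation, in which the bias b and the coefficients alpha solve the block system [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T; the RBF kernel, the parameter values, the function names, and the rainfall and discharge data are all illustrative assumptions rather than settings from the reviewed studies.

```python
import math


def rbf_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


def fit_ls_svm(xs, ys, gamma=100.0, width=1.0):
    """Fit LS-SVM regression by one linear solve; returns (bias, alphas)."""
    n = len(xs)
    # First row encodes the constraint sum(alpha) = 0.
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(xs[i], xs[j], width) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term from the squared-error loss
        system.append(row)
    solution = solve_linear_system(system, [0.0] + list(ys))
    return solution[0], solution[1:]


def predict_ls_svm(x, xs, bias, alphas, width=1.0):
    """Evaluate the fitted model at a new input x."""
    return bias + sum(a * rbf_kernel(x, xi, width) for a, xi in zip(alphas, xs))


# Hypothetical data: rainfall depth (cm) and resulting discharge (m^3/s).
rain = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
discharge = [1.0, 1.8, 3.1, 4.2, 4.8, 5.1]
bias, alphas = fit_ls_svm(rain, discharge)
print(predict_ls_svm(2.5, rain, bias, alphas))
```

Because every training point receives a non-zero coefficient, this formulation gives up the sparsity of the standard SVM in exchange for the simpler solve.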

Decision tree (DT)
The ML method of DT is one of the contributors to predictive modeling, with wide application in flood simulation. DT uses a tree of decisions from branches to the target values of leaves. In classification trees (CT), the final variables in a DT contain a discrete set of values, where leaves represent class labels and branches represent conjunctions of feature labels. When the target variable in a DT has continuous values and an ensemble of trees is involved, it is known as a regression tree (RT) [107]. Regression and classification trees share some similarities and differences. As DTs are classified as fast algorithms, they have become very popular in ensemble forms for modeling and predicting floods [108]. The classification and regression tree (CART) [109,110], a popular type of DT used in ML, has been applied successfully to flood modeling, yet its applicability to flood prediction has not been fully investigated [111]. Random forest (RF) [112] is another popular DT method for flood prediction [113]. RF includes a number of tree predictors. Each individual tree creates a set of response predictor values associated with a set of independent values. Further, an ensemble of these trees selects the best choice of classes [96]. [114] introduced RF as an effective alternative to SVM, which often delivers higher performance in flood prediction modeling. Later, [115] compared ...
To improve the accuracy of import data and better dataset management, the ensemble mean is proposed as a powerful approach coupled with ML methods [117,118]. Empirical mode decomposition (EMD) [119] and ensemble EMD (EEMD) [120] are widely used for flood prediction [121,122]. Nevertheless, EMD-based forecast models are also subject to a number of drawbacks (Huang et al., 2014). The literature includes numerous studies on improving decomposition and prediction models in terms of performance, additivity, and generalization ability [123].
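Returning to the regression trees described in this section, the basic building block of such a tree, a single split on one feature with a constant prediction in each leaf, can be sketched as follows. This is a simplified illustration and not the CART or RF implementation of any reviewed study; the function name and the rainfall and water level samples are hypothetical.

```python
def best_split(xs, ys):
    """Find the threshold on a single feature that minimizes squared error.

    Returns (threshold, left_mean, right_mean); samples with x <= threshold
    fall into the left leaf, the rest into the right leaf.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


# Hypothetical samples: 24 h rainfall (mm) and peak water level (m).
rainfall = [5.0, 10.0, 15.0, 60.0, 70.0, 80.0]
level = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
threshold, low_leaf, high_leaf = best_split(rainfall, level)
print(threshold, low_leaf, high_leaf)
```

A full regression tree applies this search recursively to each leaf, and an ensemble such as RF averages many trees grown on resampled data.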

Classification of ML methods and applications
The state of the art of ML methods for flood prediction identifies the most popular methods. ANFIS, MLP, WNN, EPS, DT, RF, CART, and ANN are identified as the most popular ML modeling methods. The figure shows an apparent continuous increase and notable progress in the use of novel hybrid methods. ... or in some cases 18 h or 24 h. The daily prediction can be a forecast 1, 2, or 6 days ahead. The monthly forecast can be, for instance, up to three months ahead. Nevertheless, in hydrology, the definitions of short-term and long-term vary with the phenomenon being studied. Short-term flood prediction often refers to hourly, daily, and weekly ahead predictions, which are used for warning systems, while long-term predictions are mostly used for policy analysis purposes. According to Tang et al. (2008), if the prediction lead-time to a flood is three days longer than the confluence time, the prediction is considered long-term. From this perspective, in this study we consider lead-times greater than a week to be long-term predictions. Nonetheless, it is observed that the characteristics of the ML methods used vary significantly according to the period of prediction. Thus, dividing the survey into short-term and long-term is essential.
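The convention adopted above, in which lead-times of up to one week count as short-term and anything longer as long-term, can be written as a small helper. The function and constant names are hypothetical and serve only to make the rule explicit.

```python
from datetime import timedelta

# Threshold adopted in this survey: lead-times beyond one week are long-term.
LONG_TERM_THRESHOLD = timedelta(weeks=1)


def classify_lead_time(lead_time):
    """Label a forecast lead-time as 'short-term' or 'long-term'."""
    if lead_time <= timedelta(0):
        raise ValueError("lead-time must be positive")
    return "long-term" if lead_time > LONG_TERM_THRESHOLD else "short-term"


print(classify_lead_time(timedelta(hours=24)))
print(classify_lead_time(timedelta(days=30)))
```

The label depends only on the lead-time, not on the type or duration of the flood being predicted.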
Here, it is also worth emphasizing that in this paper the prediction lead-time has been classified as "short-term" or "long-term". Although "flash floods" happen in a short period of time with great destructive power, they can be predicted with either a "short-term" or a "long-term" lead-time to the actual flood. In fact, this paper is concerned with lead-times rather than the duration or type of a flood. If the lead-time of the prediction of a flash flood is short-term, it is studied as a short-term lead-time. Yet flash floods can sometimes be predicted with long lead-times. In other words, a flash flood might be predicted one month ahead. In this case, the prediction is considered long-term. Regardless of the type of flood, we focus only on the lead-time. In this study, the ML methods have been reviewed in two different classes: single methods and hybrid methods. Figure 4 represents the taxonomy of the research. Step 1: running the queries one by one; step 2: checking the results of the search and initiating the next search; step 3: identifying the comparative studies on ML models of prediction, refining the results, and building the database; step 4: identifying whether it is a long-term or a short-term prediction; step 5: identifying whether it is a single or hybrid method and constructing the relevant tables two and three; step 6: identifying whether it is a single or hybrid method; step 7: constructing the relevant tables four and five. Four tables provide the lists of studies on different prediction techniques, which are organized, comprehensive surveys of the literature.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 5 October 2018 | doi:10.20944/preprints201810.0098.v1

Short-term flood prediction with ML
Short-term lead-time flood prediction is considered an important research challenge, specifically in highly urbanized areas, where timely warning to residents reduces damage (Golding, 2009). In addition, short-term prediction contributes highly to water resource management. Even with the recent improvements in NWP models, artificial intelligence (AI) methods, and ML, short-term prediction remains a challenging task [124]. This section is divided into two subsections, on single and hybrid ML methods, to investigate each group of methods individually.

Short-term flood prediction using single ML methods
To gain insight into the performance of ML methods, a comprehensive comparison is required. Table 2 presents a summary of the major ML methods, i.e., ANNs, MLP, NARX, M5 model trees, DTs, CART, SVR, and RF, followed by a comprehensive performance comparison of single ML methods in short-term flood prediction. A review and discussion of these methods follow, to identify the most suitable methods presented in the literature.
Kim and Barros [125] modified an ANN model to improve short-term lead-time flood forecasting by considering atmospheric conditions. They used satellite data from the ISCCP-B3 data set [149]. The data set includes hourly rainfall from 160 rain gauges within the region. The ANN was reported to be far more accurate than the statistical models. In similar work, [33] developed an ANN forecast model for hourly lead-times. Their study used various data sets consisting of the meteorological and hydrodynamic parameters of three typhoons. Tests of the ANN forecast models showed promising results for a 5 h lead-time. In another attempt, Danso-Amoako [126] provided a rapid system for predicting floods with ANN, a reliable forecasting tool for rapidly assessing floods. An R2 value of 0.70 for the ANN model indicates that the tool is suitable for predicting the flood variables with high generalization ability. Panda, Pramanik and Bala [128] compared the accuracy of ANN with that of the physical model MIKE 11 for short-term water level prediction. The data set includes hourly discharge and water level between 2006 and 2009. The data for the year 2006 were used for testing; the RMSE values were 0.89 and 1.00 for ANN and MIKE 11, respectively. The results indicate that the ANN model performs better than MIKE 11. Kourgialas, Dokou and Karatzas [127] created a modeling system for predicting extreme flow based on ANN for 3 h, 12 h, and 19 h ahead of the flood. They analyzed five years of hourly data to investigate the effectiveness of ANN in modeling extreme flood events. The results were reported to be highly effective compared with conventional hydrological models. Lohani, Goel and Bhatia [129] improved the real-time forecasting of rainfall-runoff of floods, and the results were compared with the T-S fuzzy model and the subtractive clustering based T-S (TSC-T-S) fuzzy model. They concluded, however, that the fuzzy model provides more accurate prediction with longer lead-time. The hourly rainfall data from 1989 to 1995 at the gauge site, in addition to the rainfall during the monsoon, were used. Pereira Filho and dos Santos [130] compared the AR model with ANN in simulating the forecast stage level and streamflow. The data set was created from independent flood events, radar-derived rainfall, and streamflow rain gauges available between 1991 and 1995. AR and ANN were employed to model short-term floods in an urban area using streamflow and weather data. They showed that the ANN performed better in verification, and it was proposed as a better alternative to the AR model.
Ahmad and Simonovic [132] used a BPNN to predict peak flow using causal meteorological parameters. The data set includes daily discharge data for 1958-1997 from a gauging station. The BPNN proved to be a fast and accurate approach with the ability to generalize to other locations with similar rivers. Furthermore, to improve the simulation of daily streamflow using BPNN, [133] used division-based backpropagation to obtain satisfactory results. Raw data from local evaporation and rainfall gauges over six years were used for short-term flood prediction of streamflow time series. The data set of one decade from 1988 was used for training and the data set of the five years after that was used for testing. The BPNN model provided promising results; however, it lacks efficiency in using raw data for time series prediction of streamflow. In addition, [134] showed the application of BPNN to assess flash floods using measured data. The data set includes water quality data at 5 min frequency and rainfall data at 15 min frequency over 20 years from two rain gauge stations. Their experiments present ANN models as ML methods that are simple to apply yet still require expert knowledge from the user. In addition, their ANN prediction model showed a strong ability to deal with noisy data sets. Ghose [135] predicted daily runoff using a BPNN prediction model. Daily water level data for the two years 2013-2015 were used as the data set. The BPNN model, with an efficiency of 96.4% and an R2 of 0.94, is reported to be accurate for flood prediction.
Pan, Cheng and Cai [136] compared the performance of ELM and SVM for short-term streamflow prediction. Both methods delivered a similar level of accuracy. However, ELM is suggested as the faster method in parameter selection and learning. [131] also compared MLRA, fuzzy c-means, ANN, and MLP over a common data set of sites to investigate the efficiency and accuracy of the ML methods. As a result, MLP and ANN were proposed as the best methods. Chang, Chen, Lu, Huang and Chang [137], [138] modeled multi-step urban flood forecasts using a BPNN and a nonlinear autoregressive network with exogenous input (NARX) for hourly forecasts. The results demonstrate that the NARX network works better than the BPNN in short lead-time prediction. The NARX network produces an average R2 value of 0.7. This study suggests that the proposed NARX models are effective in urban flood prediction. Furthermore, Valipour et al. (2013) show how the accuracy of ANN models can be increased through integration with autoregressive models.
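A NARX-type model predicts the next value of a series from its own recent values together with recent values of an exogenous input such as rainfall. The sketch below only illustrates how such lagged input vectors could be assembled before being passed to a network; it is not the configuration used in [137], [138]. The function name, the lag orders n_y and n_x, and the sample values are assumptions made for this example.

```python
def narx_samples(y, x, n_y, n_x):
    """Build (inputs, target) pairs for a NARX-style one-step-ahead model.

    y   : target series, e.g. hourly water level (hypothetical values)
    x   : exogenous series, e.g. hourly rainfall (hypothetical values)
    n_y : number of past target values fed back as inputs (assumed lag order)
    n_x : number of past exogenous values used as inputs (assumed lag order)
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    start = max(n_y, n_x)
    samples = []
    for t in range(start, len(y)):
        past_y = list(y[t - n_y:t])   # y(t-n_y), ..., y(t-1)
        past_x = list(x[t - n_x:t])   # x(t-n_x), ..., x(t-1)
        samples.append((past_y + past_x, y[t]))
    return samples


# Hypothetical water level (m) and rainfall (mm/h) series.
level = [1.0, 2.0, 3.0, 4.0, 5.0]
rain = [10.0, 20.0, 30.0, 40.0, 50.0]
pairs = narx_samples(level, rain, n_y=2, n_x=1)
```

Each pair holds the lagged inputs and the value to be predicted; any regression model, including a BPNN, could then be trained on these pairs.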
Bruen and Yang [139] modeled real-time rainfall-runoff forecasting for different lead-times using FFNN, ARMA, and functional networks. Here, the functional networks [150] were compared with an FFNN model. The models were tested using a storm time series data set. As a result, functional networks allow quicker training in predicting rainfall-runoff processes with different lead-times. The models were able to predict floods with short lead-time. [141] estimated the water level-discharge using M5 trees and ANN. The data set was collected over the period 1990 to 1998, and the inputs were supplied by computing the average mutual information. The ANN and M5 model tree performed similarly in terms of accuracy. [143] tested four DT models, i.e., alternating decision trees (ADT), reduced error pruning trees (REPT), logistic model trees (LMT), and NBT, using a data set of 200 floods. The ADT model is reported to perform better for flash flood prediction, allowing speedy determination of flood-susceptible areas. In other research, [142] compared the performance of NBT and DT prediction models using geomorphological disposition parameters. Both models and their hybrids were compared in terms of prediction accuracy in a catchment. The advanced DTs were found promising for flood assessment in flood-prone areas. They conclude that an independent data set and benchmarking against other ML methods are required to judge the accuracy and efficiency of the method. [148] worked on a data set including more than 100 tropical cyclones (TCs) affecting a watershed for hourly prediction of precipitation. The performance of MLP, CART, CHAID, exhaustive CHAID, MLR, and CLIM was compared. The evaluation results show that MLP and DTs provide better prediction. [140] applied a dynamic ANN as well as a Z-R relation approach to construct a one-hour-ahead prediction model. The data set includes the three-dimensional radar data structure of typhoon events and rain gauge data from 1990 to 2004, covering various typhoons. The results indicate that the ANN performs better.
Aichouri, Hani, Bougherira, Djabri, Chaffai and Lallahem [144] implemented an MLP model for flood prediction and compared the results with the traditional MLR model. Daily rainfall-runoff data from 1986 to 2003 were used for model building. The results and comparative study indicate that the MLP approach performs better for river rainfall-runoff. In similar research, [145] modeled and predicted the river rainfall-runoff relationship using MLP and MLR trained on six years of daily rainfall data (1990 to 1995). The data from 1996 were then used for testing to select the best-performing network model. The reported R2 values for the ANN and MLR models, 0.888 and 0.917 respectively, are presented as showing that the MLP approach predicts much better than MLR. [146] proposed a number of data-based flood prediction models for daily streamflow using MLP, WT, MLR, ARIMA, and ANN. The data set includes two time series of streamflow and meteorological data covering the record from 1970 to 2001. The results showed that MLP, WT, and ANN performed generally better. Yet the proposed WT prediction model was evaluated as less accurate than ANN and MLP for a one-week lead-time. [147] designed optimal ANN and MLP models for the prediction of river level. This study indicates that an optimization tool for the ANN network can greatly improve the prediction quality. The candidate inputs include river levels and mean sea level pressure (SLP) for the period 2001-2002. The MLP is identified as the most accurate model for short-term river flood prediction.
Nayak and Ghosh [98] used SVM and ANN to predict hourly rainfall-runoff using weather patterns. An SVM classifier model for rainfall prediction was used and the results were compared to ANN and another advanced statistical technique. The SVM model appeared to predict extreme floods better than ANN. Furthermore, the SVM model proved to function better in terms of uncertainty. Gizaw and Gan [37] developed SVR and ANN models for creating RFFA to estimate regional flood quantiles and to assess climate change impact. The data set includes daily precipitation data obtained from gauges from 1950 to 2016. RMSE and R2 were used to evaluate the models. The SVR model estimated regional floods more accurately than the ANN model. SVR was reported to be a suitable choice for predicting future floods under the uncertainty of climate change scenarios. In a similar attempt, [96] provided effective real-time flood prediction using a rainfall data set measured by radar. Two models, RF and SVM, were developed and their prediction performances compared. The comparison reveals the effectiveness of SVM in real-time flood forecasting.
To improve the quality of prediction in terms of accuracy, generalization, uncertainty, longer lead-time, speed, and computation cost, there has been an ever-increasing trend toward building hybrid ML methods. These hybrid methods are numerous, including the more popular ones such as ANFIS and WNN and further novel algorithms, e.g., SVM-FR, HEC-HMS-ANN, SAS-MP, SOM-R-NARX, wavelet-based NARX, WBANN, WNN-BB, RNN-SVR, RSVRCPSO, MLR-ANN, FFRM-ANN, and EPSs. Table 3 presents these methods, and a review of the methods and applications follows along with a discussion of the ML methods. In addition, the model with human interaction can provide better performance. In another similar research, [34] developed an ANFIS model based on a precipitation data set that provides reliable hourly prediction with an R2 of more than 0.85. The results were reported to be highly satisfactory for typhoon season. [153] used ANFIS at ungauged sites of 151 catchments, and the results were evaluated and compared to the ANN, NLR, and NLR-R models using a jackknife procedure. The evaluation shows that the ANFIS model provided higher generalization capability compared to the NLR and ANN models. The ANFIS model implements an efficient mechanism for forecasting the flood region and provides insight from the data, leading to prediction. Rezaeianzadeh (2014) presents a number of forecasting systems for daily flow prediction using ANN, ANFIS, MLR, and MNLR. The performance of the models was measured with RMSE and R2. The data set includes precipitation data from various meteorological stations. The evaluation shows that the MNLR models, with lower RMSE values, perform better than the ANFIS, MLR, and ANN models. Furthermore, MNLR is suggested as a low-cost and efficient model for daily flow prediction. In a similar attempt, Choubin, Darabi et al. (2018) evaluated the accuracy of ANFIS, this time alongside three common ML modeling tools: CART, SVM, and MLP. The evaluation suggests that the CART model performed better. Therefore, CART is strongly suggested as a reliable prediction tool for hydro-meteorological data sets. Kim and Singh [74] developed three models for flood prediction: generalized regression ANN (GRNNM), Kohonen self-organizing feature maps ANN (KSOFM-NNM), and MLP. In the performance evaluation, KSOFM-NNM was more accurate than MLP and GRNNM in forecasting flood discharge. The hybrid models were shown, overall, to overcome the difficulties of using single ANN models. [155] proposed an advanced ensemble model that combines FR and SVM to build spatial models for flood prediction. The results were compared with DT. The data set includes an inventory map of flood prediction in various locations.
To build the model, up to 100 flood locations were used for training and validation. The evaluation results showed a high success rate for the ensemble model. The results demonstrated efficiency, accuracy, and speed in the susceptibility assessment of floods.
Young, Liu and Wu [156] developed a hybrid physical model by integrating the HEC-HMS model with SVM and ANN for accurate rainfall-runoff modeling during typhoons. The hybrid models HEC-HMS-SVR and HEC-HMS-ANN have acceptable capability for hourly prediction. However, the SVR model has much better generalization ability and accuracy in runoff discharge prediction. It is concluded that the predictions of HEC-HMS can be improved through ML hybridization. [157] proposed SAS-MP, a hybrid of wavelet and season-multilayer perceptron for daily rainfall prediction. The season algorithm is a novel decomposition technique used to improve data quality. The resulting hybrid model is referred to as the W-SAS-MP model. The data set includes daily rainfall data for the three decades since 1974. W-SAS-MP was reported to be highly efficient in enhancing the accuracy and lead-time of daily rainfall prediction.
Chang, Shen and Chang [158] developed a hybrid ANN model for real-time forecasting of regional floods in an urban area. The advanced hybrid model SOM-R-NARX is an integration of NARX [...] The comparison of the hybrid method against each of the individual algorithms used in the study demonstrates the effectiveness of the proposed method. [165] developed a hybrid prediction model by integrating ANN with a nonlinear perturbation model (NLPM), defined as NLPM-ANN, to improve the efficiency and accuracy of rainfall-runoff prediction. The NLPM-ANN model was benchmarked against two models, LPM and NLPM-API, on a data set of daily rainfall-runoff for the period 1973-1999. They reported that NLPM-ANN works better than LPM and NLPM-API. The results of case studies on various watersheds support the accuracy of the model.
Through an EPS model, [166] aimed to limit the range of uncertainties in runoff simulations and flood prediction. The classifier ensembles include MLP, SVM, and RF. Note that the ensemble of MLP is a novel approach in flood prediction. The proposed EPS presents a number of integrated models and simulation runs. The model was successfully validated using a data set of precipitation data from various rain gauges during the 2013-2014 storm season. Using the EPS model decreased the uncertainty in forecasting, indicating that the prediction system is reliable and robust in estimating flood duration and destructive power. In another case, [167] developed an EPS model of six ANNs for daily streamflow prediction based on daily high-flow data from the 2013-2014 storm season. The proposed model has a fast development time and also provides probabilistic forecasts to deal with uncertainties in prediction. The ensemble prediction system was reported to be highly useful and robust.

Comparative performance analysis
Accuracy, reliability, robustness, consistency, generalization, and timeliness are suggested as the basic criteria for evaluating a reliable prediction (Singh 1989). Timeliness is among the most important criteria, and it is only achieved by using robust yet simple models. Furthermore, the performance of prediction models is often evaluated through root mean square error (RMSE), mean error (ME), mean squared error (MSE), Nash coefficients (E), and R2, the square of the correlation coefficient (CC). In this survey, the values of R2 and RMSE are considered for performance evaluation. CC and RMSE can be defined as:
$$\mathrm{CC} = \frac{\sum_{i=1}^{n}(O_i-\bar{O})(P_i-\bar{P})}{\sqrt{\sum_{i=1}^{n}(O_i-\bar{O})^2\,\sum_{i=1}^{n}(P_i-\bar{P})^2}} \qquad (1)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(P_i-O_i)^2} \qquad (2)$$
Here, $n$ is the total number of years to be predicted, $P_i$ is the predicted value and $O_i$ the observed value for year $i$, and $\bar{P}$ and $\bar{O}$ are their respective means. Generally, an R2 > 0.8 is considered an acceptable prediction, while a lower RMSE suggests a better prediction. Overall, flood forecasting models are reported as accurate if the RMSE is close to 0 and R2 is close to 1. The specific intended purpose, computational cost, and data set are our major consideration criteria. Furthermore, generalization ability, speed and cost of implementation and operation, ease of use, low-cost maintenance, robustness, and accuracy of simulation are the other important criteria for evaluating the methods.
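The two metrics used throughout this survey can be computed as in the following minimal sketch. It assumes the standard Pearson form of CC and the usual definition of RMSE; the function names and the sample values are hypothetical and serve only as an illustration.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    squared_errors = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared_errors / n)


def correlation_coefficient(predicted, observed):
    """Pearson correlation coefficient (CC); its square gives R2."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((o - mean_o) * (p - mean_p)
                     for p, o in zip(predicted, observed))
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance / math.sqrt(spread_p * spread_o)


# Hypothetical observed and predicted peak discharges.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
error = rmse(predicted, observed)
r2 = correlation_coefficient(predicted, observed) ** 2
```

In this example the prediction has a constant bias, so CC and R2 are high even though the RMSE is not zero, which is one reason the two metrics are reported together.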
Here, it is worth mentioning that the value of RMSE can differ across studies. In addition, the value of RMSE in some studies has been calculated for various sites. To present a fair evaluation of RMSE, we made sure that the unit of RMSE is the same, and where multiple RMSE values were reported, the average was calculated. We also double-checked for any possible errors. Generally, ANNs are suggested as a promising means for short-term prediction. Although they performed weakly in a few early studies, especially in terms of generalization (Piotrowski et al., 2006; Napolitano et al. 2010), better methodologies for higher-performance ANNs in handling big data sets have yielded better results. In this context, the BPNN and functional networks are suggested to be difficult for users to implement. However, the model has been shown to be a reasonably accurate, efficient, and fast approach with the ability to deal with noisy data sets. The NARX network, however, performs better than BPNN. Nevertheless, the accuracy can be enhanced through integration with autoregressive models. MLP and DTs provide prediction yields as acceptable as those of ANNs. Among DTs, the ADT provided the fastest and most accurate prediction capability in determining floods. Though not to the same degree as ANNs, the Rotation Forest (RF) and M5 model tree (MT) are reported to be efficient and robust. Nguyen (2015) and Taksandel and Mohod (2013) proposed RF-based models as being as effective as ANNs and suitable for long lead-times too.
Along with ANNs, the SVM has also been seen as a relatively effective ML tool for rainfall-runoff modeling and classification, with better generalization ability and performance. In many cases, SVM performed even better, especially for very short lead-times (Zakaria and Shabri 2012). In particular, SVM-based models provided promising performance for hourly prediction. Nevertheless, the prediction ability decreased for longer lead-times. This issue has been addressed through the LS-SVM model, which also has better generalization ability (Yoon et al., 2011; Samui 2011). Generally, SVM is reported to be a suitable choice for evaluating the uncertainty in predicting hazardous flood quantiles, which reveals the effectiveness of SVM in real-time flood forecasting. Overall, the reviewed single prediction models can provide relatively accurate short-term forecasts. However, for predictions longer than 2 hours, hybrid models such as ANFIS and WNN perform better. The performance comparison of the ANFIS model with BPNN and AR models, with average correlation coefficients higher than 0.80, shows the superiority of ANFIS in a wide range of short-term flood prediction applications, e.g., water level, rainfall-runoff, and streamflow (for up to 24 hours). ANFIS has demonstrated a far superior ability in real-time flash flood estimation compared to most ANN-based models, in particular providing high accuracy and reliability 1-3 hours ahead of the flood. More advanced forms of ANFIS hybrid models tuned by SVR provide even better prediction accuracy and cost-effective computation for nonlinear and real-time flood prediction. Furthermore, ANFIS models presented higher generalization ability. Yet R2 decreases as the prediction lead-time increases. For daily flow, MNLR is suggested to have superior performance over the ANN, ANFIS, and MLR models. In cases where hydro-meteorological data are readily available, CART is superior among ANFIS, SVM, and MLP. The (T-S) fuzzy model is also a good choice. On the other hand, WNN performs significantly better than MLP, ANNs, and ANFIS for daily prediction. For accurate longer lead-time prediction, decomposition techniques such as DWT, autoregression, and the season algorithm can provide great advantages. Overall, novel hybrid models made of ML, soft computing, and statistical methods, e.g., KSOFM-NNM, SOM-R-NARX, WNARX, HEC-HMS-SVR, HEC-HMS-ANN, W-SAS-MP, WBANN, RSVRCPSO, and the ANN-hydrodynamic model, have been shown to overcome the drawbacks of most ML methods by enhancing prediction accuracy and lead-time, leading to more realistic flood models with even better susceptibility assessment. On the other hand, novel ensemble methods have not only improved the accuracy and robustness of predictions but have also contributed to limiting the range of uncertainties in models. Among the EPS methods, the ensembles of ANN, MLP, SVM, and RF have shown promising results.

Long-term flood prediction with ML
Long-term flood prediction is of significant importance for increasing knowledge and water resource management potential over longer periods, from weeks to months and annual prediction [168]. In the last decades, many notable ML methods such as ANN [57], ANFIS [51,169], SVM [170], SVR [170], WNN [40], and Bootstrap-ANN [40] have been used for long lead-time prediction with promising results. Recently, a number of studies, e.g., [171][172][173][174], have compared the performance of various ML methods for long lead-time flood prediction. However, it is still not clear which ML method performs better in long-term flood prediction. In this section, Tables 4 and 5 summarize these investigations and review the performance of the ML models in dealing with long-term prediction.

Long-term flood prediction methods using single ML methods
This section presents a comprehensive comparison of ML methods. Table 4 presents a summary of the major single ML methods used in long-term flood prediction, i.e., MLP, ANNs, SVM, and RT, followed by a comprehensive performance comparison. A review and discussion of these methods follows to identify the most suitable methods presented in the literature. For seasonal flood forecasting, Elsafi [175] proposed numerous ANNs and compared the results.
Water level data from different stations for the period 1970-1985 were selected for training, and 1986-1987 for verification. ANNs worked well, especially where the data set is not complete, providing a viable choice for accurate prediction. ANNs offer the possibility of reducing analytical costs by reducing the data analysis time. Similarly, [176] used ANN to develop a prediction model for precipitation. A historical data set for 1900-2001 from different stations was considered, and the ANN model was applied to various stations to evaluate the prediction performance. The authors conclude that the ANN models offered great forecasting skill for predicting long-term evapotranspiration and precipitation. [180] used an ANN model for stream assessment for long-term floods. The data set was collected from more than 100 sites of numerous flood streams. They concluded that the ANN model, compared to HBI, significantly improved the prediction ability by using geomorphic data. However, ANN has a generalization problem. Nevertheless, the ANN in this case proved useful to water managers. Singh [177] used a number of BPNNs to build prediction models of heavy rains and floods. The data set covers the period 1871-2010 for each individual month. The results indicate that the BPNN models are fast and robust with simple networks, which makes them a good option for forecasting non-linear floods.
[178] [...] This study advocated that the novel IIS-W-ANN should be considered an excellent flood forecasting model. Nevertheless, the model can be further optimized for better performance using optimization methods, e.g., [200][201][202][203]. In fact, such optimizers can complement the IIS-W-ANN in fine-tuning the hidden layer weights and biases for better prediction. Mekanik [190] [...] with minimal input requirements and less development time. [191] compared the performance of ANFIS, ANNs, and SVM. The data set includes monthly flow data from 1953 to 2004, where the period 2000-2004 was used for validation. ANFIS and SVM were evaluated as better for long-term prediction. [204] compared the performance of ANFIS, ANNs, and SVM for the monthly prediction of floods. The comparison indicated that the ML models were more accurate than the statistical models in predicting streamflow. Furthermore, ANN and ANFIS were more accurate than SVM. However, for low-flow prediction, the SVM and ANN models outperformed ANFIS. [193] proposed a modified variation of the hybrid NLPM-ANN model to predict wetness and floods. To do so, seasonal rainfall and wetness data from various stations were considered. The NLPM-ANN model was reported to be significantly superior to the models of previous studies. In another hybrid model, [194] investigated the performance of a modified EMD-SVM (M-EMDSVM) model for long lead-times and compared its accuracy with ANN and SVM models. The M-EMDSVM model was created by modifying EMD-SVM. The evaluation results show that the M-EMDSVM model is a better alternative to the ANN, SVM, and EMD-SVM models for long lead-time streamflow prediction. Furthermore, the M-EMDSVM model presents better stability, representativeness, and precision. Zhu, Zhou, Ye and Meng [195] contributed to the integration of ML with time series decomposition to predict monthly streamflow by estimating and comparing the accuracy of a number of models. To that end, they integrated SVM with discrete wavelet transform (DWT) and EMD. The hybrid models are called DWT-SVR and EMD-SVR. The results indicate that decomposition improves the accuracy of streamflow prediction, with DWT performing even better. Further comparisons of the SVR, EMD-SVR, and DWT-SVR models show that EMD-SVR and DWT-SVR are significantly more accurate than SVR for monthly streamflow prediction.
Araghinejad [197] presented the applicability of ensembles for probabilistic flood prediction in real-life cases. The study utilized K-nearest neighbor regression to combine individual networks and improve prediction performance. As an EPS of ANNs, the hybrid K-NN model was proposed to increase the generalization ability of neural networks, and the results were further compared with MLP, MLP-PLC, and ANN. Hourly water level data of the reservoir for 132 typhoons in the period 1971-2001 were used. The proposed EPS has promising generalization ability and prediction accuracy.
Bass and Bedient [196] proposed a hybrid surrogate-ML model for long-term flood prediction suitable for TCs. The methods used include ANN integrated with principal component analysis (PCA), Kriging integrated with PCA, and Kriging. The models were reported to be efficient and fast to build. The results demonstrate that the methodology has acceptable generalization ability, suitable for urbanized and coastal watersheds. [198] contributed to improving decomposition-ensemble prediction models by developing an EEMD-ANN model for monthly prediction. The performance comparison with SVM, ANFIS, and ANNs shows a significant improvement in accuracy. Ravansalar [199] compared the performance of WNN, ANN, and a novel hybrid model called wavelet-linear genetic programming (WLGP) in long-term streamflow prediction. The E value reached 0.87 for the WLGP model. The performance comparison showed that WLGP significantly increased the accuracy of the monthly approximation of peak streamflow.

Comparative performance analysis and discussions
In this section, the comparative performance analysis of ML methods for long-term prediction is presented. Figure 9 presents the values of RMSE and R2 for single ML methods, where ANNs, SVMs, and SVRs show better results. ANNs are the most widely used ML method due to their accuracy, high fault tolerance, and powerful parallel processing in dealing with complex flood functions, especially where the data set is not complete. However, generalization remains an issue with ANN. In this context, ANFIS, MLP, and SVM performed better than ANNs. Wavelet transforms are reported to be useful for decomposing the original time series, improving the ability of most ML methods by providing insight into the data set at various resolution levels as an appropriate form of data preprocessing. For instance, WNNs generally produce more consistent results compared to traditional ANNs. In both short-term [205] and long-term rainfall-runoff modeling [186], the accuracy, precision, and performance of most decomposed ML algorithms, e.g., WNN, are reported to be better than those trained using un-decomposed time series. However, despite the achievements of WNN, its predictions are not satisfactory for long lead-times. To increase the accuracy of longer lead-time predictions of up to one year, novel hybrids such as WARM, a hybrid of WNN and autoregressive models, and wavelet multi-resolution analysis (WMRA) are proposed. In other cases, the performance of models was greatly increased through decomposition to produce cleaner inputs. For example, wavelet-neuro-fuzzy models [206] are significantly more accurate and faster than single ANFIS and ANNs. Nonetheless, the uncertainty in prediction increases as the lead-time increases. Thus, the evaluation of model precision should be considered in future studies.
Data decomposition methods, e.g., autoregressive, wavelet transforms, wavelet-autoregressive, DWT, IIS, and EMD, have contributed greatly to developing hybrid methods with longer prediction lead-time, good stability, great representativeness, and higher accuracy. These data decomposition methods have been integrated with ANNs, SVM, WNN, and FR, and they are expected to gain more popularity among researchers. The other trend in improving prediction accuracy and generalization capability is EPS. In fact, recent ensemble methods have contributed well to speed, accuracy, and generalization. The EPS of ANNs and WNNs using BB sampling, genetic programming, simple average, stop training, Bayesian, data fusion, regression, and other soft computing techniques have so far shown promising results and better performance than their traditional ML counterparts. In ensembles, however, it is noted that human decision as an input variable provides superior performance compared to models without this important input. The most significant hybrid models, however, are the novel decomposition-ensemble prediction models suitable for monthly prediction. Their performance comparison with SVM, ANFIS, and ANNs shows a significant improvement in accuracy and generalization. Here, it is also worth mentioning the importance of further signal processing techniques, e.g., [206], in both long-term and short-term flood prediction.
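To illustrate the kind of decomposition that such hybrid models rely on, the following sketch applies a one-level discrete wavelet transform to a short series and reconstructs it. The Haar wavelet is chosen here only for simplicity and is an assumption; the reviewed studies may use other wavelets and multi-level schemes. The function names and the flow values are hypothetical.

```python
import math


def haar_dwt(series):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    scale = math.sqrt(2.0)
    pairs = [(series[i], series[i + 1]) for i in range(0, len(series), 2)]
    approximation = [(a + b) / scale for a, b in pairs]  # smooth trend
    detail = [(a - b) / scale for a, b in pairs]         # local fluctuation
    return approximation, detail


def haar_idwt(approximation, detail):
    """Inverse of haar_dwt: rebuilds the original series."""
    scale = math.sqrt(2.0)
    series = []
    for a, d in zip(approximation, detail):
        series.append((a + d) / scale)
        series.append((a - d) / scale)
    return series


# Hypothetical monthly streamflow values.
flow = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 3.0]
trend, fluctuation = haar_dwt(flow)
rebuilt = haar_idwt(trend, fluctuation)
```

In a decomposition-based hybrid, each sub-series (here `trend` and `fluctuation`) would typically be modeled separately, or passed together as inputs to the learner, before the forecasts are recombined.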

Conclusions
This paper presents an overview of machine learning models used in flood prediction and develops a classification scheme to analyze the existing literature. It divides the prediction models into two categories according to lead-time, and further divides the machine learning methods into two categories of hybrid and single methods. The state of the art of these classes is discussed and analyzed in detail, considering the performance comparison of the methods available in the literature. The performance of the methods has been evaluated in terms of R2 and RMSE, in addition to generalization ability, robustness, computation cost, and speed. Despite the promising results already reported in implementing the most popular machine learning methods, e.g., ANNs, SVM, SVR, ANFIS, WNN, and DTs, there has been significant research and experimentation toward further improvement and advancement. In this context, four major trends in improving the quality of prediction are reported in the literature. The first is novel hybridization, either through integration of two or more machine learning methods or integration of machine learning method(s) with more conventional means and/or soft computing. The second is the use of data decomposition techniques to improve the quality of the data set, which has contributed greatly to improving prediction accuracy. The third is the use of ensembles of methods, which dramatically increases the generalization ability of models and decreases the uncertainty of prediction. The fourth is the use of add-on optimizer algorithms to improve the quality of machine learning algorithms, e.g., for better tuning of ANNs to reach optimal neuronal architectures. It is expected that through these four key technologies, flood prediction will witness significant improvements for both short-term and long-term prediction. The advancement of novel ML methods depends heavily on the proper use of soft computing techniques in designing novel learning algorithms. This has been discussed in the paper, and soft computing techniques are introduced as the main contributors to developing the hybrid ML methods of the future.
Here, it is also worth mentioning that the multidisciplinary nature of this work was the most challenging difficulty to overcome in this paper. The contribution of coauthors from both realms of ML and hydrology was the key to success. Furthermore, the novel search methodology and the creative taxonomy and classification of the ML methods constitute the original contribution of the paper.
Author Contributions: Dr. Shamshirband and Prof. Ozturk, as the machine learning experts, contributed to investigation, methodology, supervision, communication, resources, data curation, and project management. Dr. Mosavi contributed to original draft preparation, data collection, formal data analysis, investigation, and critical review. Prof. Chau, as the hydrology expert, contributed to supervision, validation, revision, discussion, resources, improvement, and advisory.

Figure 1. Flowchart of search queries

3. State-of-the-art of ML methods in flood prediction

Figure 2. Basic flow for building ML model

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 5 October 2018 doi:10.20944/preprints201810.0098.v1
performance of ANN, SVM and RF in general applications to flood, where RF delivers better performance. Another major DT is the M5 decision tree algorithm [116]. The M5 constructs a DT through splitting the decision space and single attributes, which decreases the variance of the final variable. Further DT algorithms popular in flood prediction include reduced error pruning trees (REPT), naïve Bayes trees (NBT), chi-squared automatic interaction detector (CHAID), logistic model trees (LMT), alternating decision trees (ADT), and exhaustive CHAID (E-CHAID).
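The variance-decreasing split described for M5 can be sketched as follows. This is a simplified illustration for a single attribute, assuming the commonly cited standard deviation reduction criterion; it omits the linear models at the leaves and the pruning of a full M5 implementation, and the function names are hypothetical.

```python
import statistics


def sd_reduction(targets, left, right):
    """Drop in standard deviation of the target achieved by a binary split."""
    n = len(targets)
    return (statistics.pstdev(targets)
            - len(left) / n * statistics.pstdev(left)
            - len(right) / n * statistics.pstdev(right))


def best_split(values, targets):
    """Return (threshold, reduction) of the best split on one attribute.

    Returns (None, 0.0) when no split reduces the standard deviation.
    """
    pairs = sorted(zip(values, targets))
    all_targets = [t for _, t in pairs]
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        gain = sd_reduction(all_targets, all_targets[:i], all_targets[i:])
        if gain > best[1]:
            best = (threshold, gain)
    return best
```

Applied recursively to each resulting subset, such a search yields the tree structure; the candidate attribute with the largest reduction would be chosen at each node.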

3.7. Ensemble prediction systems (EPS)
The use of a multitude of ML modeling options has been introduced for flood modeling with a strong background (Bourdin et al., 2012). Thus, there is an emerging strategy to shift from a single prediction model to an ensemble of models suitable for a specific application, cost, and data set (Cunderlik et al., 2013). An ML ensemble consists of a finite set of alternative models, which typically allows more flexibility among the alternatives. Ensemble ML methods have a long tradition in flood prediction, and in recent years ensemble prediction systems (EPS) (Doycheva, Horn et al. 2017) have been proposed as an efficient prediction system providing an ensemble of N forecasts. In EPS, N is the number of independent realizations of a model probability distribution. EPS models generally use multiple ML algorithms to provide higher performance using an automated assessment and weighting system (Polikar, 2006). Such a weighting procedure is carried out to accelerate the performance evaluation process. The advantage of EPS is the timely and automated management and performance evaluation of the ensemble algorithms. Therefore, the performance of EPS for flood modeling in particular can be improved. EPS may use multiple fast learning or statistical algorithms as classifier ensembles, e.g., ANNs, MLP, DTs, rotation forest (RF) bootstrap, and boosting, to reach higher accuracy and robustness. The resulting ensemble prediction systems can be used to quantify the probability of floods based on the prediction rate used in the event (Molteni et al., 1996). Therefore, the quality of ML ensembles can be calculated based on the verification of the probability distribution. Shu (2004) presents a review of the applications of ensemble ML methods used in flood prediction. EPS has been demonstrated to have the capability to improve model accuracy in flood modeling (Cloke and Pappenberger, 2009; Siqueira, Collischonn, Fan and Chou).
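How an ensemble of N forecasts may be weighted and turned into a flood probability can be sketched as below. The weighting rule (weights inversely proportional to each member's validation error) and the function names are assumptions made for illustration; the surveyed EPS studies do not prescribe a single scheme.

```python
def inverse_error_weights(errors):
    """Normalized member weights proportional to 1 / error (assumed rule)."""
    if not errors or any(e <= 0 for e in errors):
        raise ValueError("errors must be a non-empty list of positive values")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def ensemble_forecast(member_forecasts, weights):
    """Weighted mean of the N member forecasts."""
    return sum(w * f for w, f in zip(weights, member_forecasts))


def exceedance_probability(member_forecasts, weights, threshold):
    """Weighted share of members forecasting a value above the threshold."""
    return sum(w for w, f in zip(weights, member_forecasts) if f > threshold)
```

For example, three members with validation errors of 1.0, 2.0, and 2.0 receive weights 0.5, 0.25, and 0.25, so the more accurate member dominates both the combined forecast and the estimated probability of exceeding a flood threshold.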

Figure 2 presents the major ML methods used for flood prediction and the number of articles in the literature over the last decade.

Figure 3. Major ML methods used for flood prediction in literature. Reference year 2008. Source: Scopus

Figure 5. Taxonomy of survey; ML methods for flood prediction

Figure 7. Comparative performance analysis of single methods of ML for short-term flood prediction using R2 and RMSE
Figure 10 represents the values of RMSE and R2 for hybrid methods of ML, where decomposition and ensemble methods outperform more traditional methods.

Figure 9. Comparative performance analysis of single methods of ML for long-term prediction

Table 1 represents the organization of the search queries. Furthermore, Figure 1 describes the survey search methodology.
Table 1. Organization of search queries

Table 2. Short-term prediction using single ML method

Table 3. Short-term flood prediction using hybrid ML methods

[152]t-Aparicio, Pérez-Sánchez, Pulido-Velazquez and María Cecilia [151] modeled flash floods using ANN and ANFIS applied to a data set collected from 14 different streamflow gauge stations. RMSE and R2 were used as evaluation criteria. The results show that ANFIS demonstrated a far superior ability for real-time flash flood estimation compared to ANN. Chang and Chang [152] constructed an accurate water level forecasting system based on ANFIS for 1-3 hours ahead of a flood. The ANFIS successfully provided accurate water level prediction. The hourly water levels of five gauges from 1971 to 2001 were used. They concluded that the ANFIS model can efficiently deal with a big data set through fast learning and reliable prediction. Further comparison shows that the ANFIS hybrid model tuned by SVR provides superior prediction accuracy and cost-effective computation for nonlinear and real-time flood prediction.

network [161] SOM. The big data set includes 55 rainfall events of daily rainfall. The evaluation suggests that SOM-R-NARX is accurate, with small RMSE and high R2. Furthermore, compared to CHIM, it provides prediction accuracy in hourly prediction. [159] proposed a wavelet-based NARX (WNARX) model for the daily forecasting of rainfall on a data set of gauge-based rainfall data for the period from 2000 to 2010. The prediction performance is further benchmarked against ANN, WANN, ARMAX, and NARX models, where WNARX was reported superior. Partal and Cigizoglu (2009) developed a model for daily prediction of precipitation with ANN and WNN models. In their case, WNN showed significantly better results, with an average value of 0.79 at various stations. Solgi et al. (2014) compared WNN with ANFIS for daily rainfall. As a result, the hybrid WNN algorithm performed better, with an R2 equal to 0.9 for daily lead time. [83] proposed a hybrid model of wavelet, bootstrap technique, and ANN, called WBANN, for improved accuracy and reliability of ANN models in short-term flood prediction. The performance of WBANN was compared with bootstrap-based ANNs (BANNs) and WNN. The wavelet decomposition significantly improved the ANN models. In addition, the bootstrap resampling produced consistent results. French, Mawdsley, Fujiyama and Achuthan [160] proposed a novel hybrid model of ANN and a hydrodynamic model for the accurate short-term prediction of extreme storm surge water levels. The ANN-hydrodynamic model generates realistic flood extents and a great improvement in model accuracy. [161] proposed a hybrid forecasting technique called RSVRCPSO to accurately estimate rainfall. RSVRCPSO is an integration of RNN, SVR, and a chaotic particle swarm optimization algorithm. The data set was obtained from three rain gauges for the period from 1985 to August 1997 and includes the data of nine typhoon events. The results suggest that the proposed model yields better performance for rainfall prediction. The RSVRCPSO model, in comparison with SVRCPSO, resulted in lower NMSE in learning and testing, which is a reason for its superiority in prediction. Data from 8 typhoon events between 2004 and 2005, consisting of rainfall and river stage pairs, were selected for model training. The results indicate that the hybrid FFRM-ANN model provides an efficient FFRM for accurate flood forecasting.
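The data decomposition step shared by the wavelet-based hybrids above (WNN, WNARX, WBANN) can be illustrated with a one-level discrete wavelet transform. The sketch assumes the Haar wavelet purely for brevity; the wavelet families and decomposition levels used in the cited studies are not stated here, and the function names are illustrative.

```python
import math


def haar_decompose(series):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    if len(series) == 0 or len(series) % 2 != 0:
        raise ValueError("series length must be a positive even number")
    scale = math.sqrt(2.0)
    approx = [(series[i] + series[i + 1]) / scale
              for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / scale
              for i in range(0, len(series), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose, recovering the original series."""
    scale = math.sqrt(2.0)
    series = []
    for a, d in zip(approx, detail):
        series.extend([(a + d) / scale, (a - d) / scale])
    return series
```

In a hybrid model of this kind, the approximation (trend) and detail (fluctuation) sub-series would typically be supplied to the ANN as separate inputs instead of the raw series.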
[164] Yang, Kuo, Tan, Lai, Chang, Lee and Hsu [162] proposed a monsoon rainfall enhancement (AME) based on ANNs, which is a hybrid form of linear regression and state space neural network (SSNN). The performance of the proposed model was benchmarked against the hybrid method of MLR-ANN. The data set includes the total rain, wind, and humidity measures from 1989-2008 based on 371 rain gauge stations of six typhoons. The results indicate that the method is highly robust, with better prediction accuracy in terms of R2, peak discharge, and total volume. Rajurkar, Kothyari and Chaube [163] modeled rainfall-runoff by integrating ANN and a simplified linear model. Furthermore, the data set includes the daily measurements of rainfall inputs in the period of. The hybrid model was found to be better in providing a theoretical forecasting representation of floods, with R2 equal to 0.728. Hsu, Lin, Fu, Chung and Chen [164] proposed a hybrid model integrating a flash flood routing model (FFRM) and ANN, called the FFRM-ANN model, to predict hourly river stages. The ANN algorithms in this study are FFNN and FBNN.

Table 4. Long-term flood prediction using single ML methods

[184]s et al. (2005) analyze the non-linear through modeling with BPNN and LLR-based models for long-term flood forecasting. The data set includes almost two decades of rainfall, outflow, inflow, evaporation, and water level data since 1988. Their evaluation concludes that LLR predicts better than the BFGSNN model in terms of performance and accuracy, with larger values of R2 and lower values of RMSE. However, BPNN outperformed the other methods with relatively good results. Nonetheless, among the ANN variations, Sahoo and Ray (2006) proposed the BPNN model as the most reliable ANN for long-term flood prediction. [179] also compared the performance of ANNs with BPNN and MLP in long-term prediction of flood discharge. Promising results were obtained using MLP. However, generalization remains an issue. Lin, Cheng and Chau [181] applied an SVM model for estimating streamflow and reservoir inflow for a long lead time. As benchmarks, they used ANNs and ARMA. The prediction models were built using monthly river flow discharges from the period 1974-1998 for training and 1999-2003 for testing. Through the comparison of the models' performance, it is demonstrated that SVM is a potential candidate for the prediction of long-term discharges and performed better than ANN. In a similar approach, [183] proposed an SVM-based model for estimating soil moisture using remote sensing data, and the results were compared to predictive models based on BPNN and MLR. Training was performed on data from 1998 to 2002 and testing on data from 2003 to 2005. The SVM model was shown to be more accurate and easier to build compared to BPNN and MLR. [182] employed the RT to model forest floods. Data from 2009-2012 at 50 sites were used for model building. Prediction of annual forest floods is reported through a combination of quantitative ground surveys, satellite imagery, hybrid machine learning tools, and future validation.

Table 5. Long-term flood prediction using hybrid methods
Banihabib and Behbahani [15] used a hybrid method of autoregressive ANN integrated with sigmoid and radial activity functions. The proposed hybrid method outperforms the conventional statistical methods of ARMA and ARIMA with lower RMSE. They reported that ARIMA is suitable for prediction of monthly and annual inflow, while the dynamic autoregressive ANN model with a sigmoid activity function could be used even for longer lead times. The data set for this study was a big data set of monthly discharge for the period 1960 to 2007. Adamowski [14] developed models based on ANN and WNN and compared the prediction performance with statistical methods. WNN was proposed as a more accurate prediction model, as was also confirmed earlier by Cannas et al. (2005) [185] for monthly rainfall-runoff forecasting. In a similar work, [187] compared the performance of ANN and WNN for prediction of peak flows and also reported WNN to be more reliable for simulating extreme event streams. Decomposition improved the results considerably. Higher levels of wavelet decomposition further improve the testing results. Statistical performance evaluation by RMSE shows considerable improvement in testing results. Venkata Ramana [188] also combined the wavelet technique with ANN for long-term flood prediction. They considered 74 years of data for the period 1901 to 1975. A data set of 44 years was used for calibration and the rest for validation of the model. Their results show relatively lower performance for ANNs compared to WNN models in modeling rainfall-runoff. Cannas et al. [185] proposed WNN for monthly rainfall-runoff prediction, which showed significant improvement over ANNs. In a similar attempt, Kasiviswanathan, He, Sudheer and Tay [186] used WNN and WNN-BB, an ensemble of WNN utilizing the block bootstrap (BB) sampling technique, to identify the robust modeling approach among ANN and WNN by assessing accuracy and precision. The data set includes measurements from 1912 to 2013 at several flow gauge stations. The results suggest WNN-BB as a robust model for long-term streamflow prediction for longer lead times of up to one year. However, Tantanee, Patamatammakul et al. (2005) proposed the hybrid wavelet-autoregressive model, called WARM, which performed more effectively for long lead times. Prasad [184] proposed another similar hybrid model integrating WNN and iterative input selection (IIS). The hybrid model was called IIS-W-ANN and was benchmarked against the M5 model tree. Their data set includes streamflow water level measurements of 40 years. The IIS-W-ANN hybrid model outperformed the M5 tree.
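The block bootstrap sampling behind an ensemble such as WNN-BB can be sketched as follows. The overlapping (moving) block variant shown here is an assumption, since the exact scheme of the cited study is not described in this survey; the function name and parameters are illustrative.

```python
import random


def block_bootstrap(series, block_length, rng):
    """Resample a series by concatenating randomly chosen contiguous blocks.

    Sampling whole blocks, rather than single points, preserves the
    short-range temporal dependence of a streamflow record.
    """
    n = len(series)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(series)")
    starts = range(n - block_length + 1)
    sample = []
    while len(sample) < n:
        start = rng.choice(starts)
        sample.extend(series[start:start + block_length])
    return sample[:n]  # trim the last block so the sample matches the input length
```

Calling the function repeatedly, for instance with `rng = random.Random(0)`, yields the resampled training sets on which the individual ensemble members would be fitted; the spread of their predictions then gives a measure of precision.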

used ANFIS to forecast the seasonal rainfall. Comparison of the performance and accuracy of the ANN model and a physical model shows promising results for ANFIS. The rainfall measurements of 1900-1999 were used for training and validation and the following decade for testing. As a result, ANFIS outperforms the ANN models in all cases, is comparable to POAMA, and is better than climatology. Furthermore, the study demonstrates the accuracy of ANFIS compared to global climate models. In addition, the study suggests ANFIS as an alternative tool for long-term prediction. ANFIS is reported to be easy to implement, with low complexity