Flood Prediction Using Machine Learning Models: Literature Review

Floods are among the most destructive natural disasters and are highly complex to model. Research on the advancement of flood prediction models has contributed to risk reduction, policy suggestion, minimization of the loss of human life, and reduction of the property damage associated with floods. To mimic the complex mathematical expressions of the physical processes of floods, machine learning (ML) methods have contributed greatly over the past two decades to the advancement of prediction systems, providing better performance and cost-effective solutions. Due to the vast benefits and potential of ML, its popularity has increased dramatically among hydrologists. By introducing novel ML methods and hybridizing existing ones, researchers aim to discover more accurate and efficient prediction models. The main contribution of this paper is to demonstrate the state of the art of ML models in flood prediction and to give insight into the most suitable models. In this paper, the literature in which ML models were benchmarked through a qualitative analysis of robustness, accuracy, effectiveness, and speed is particularly investigated to provide an extensive overview of the various ML algorithms used in the field. The performance comparison of ML models presents an in-depth understanding of the different techniques within the framework of a comprehensive evaluation and discussion. As a result, this paper introduces the most promising prediction methods for both long-term and short-term floods. Furthermore, the major trends in improving the quality of flood prediction models are investigated. Among them, hybridization, data decomposition, algorithm ensemble, and model optimization are reported as the most effective strategies for the improvement of ML methods.

characteristics of the algorithms need to be clarified with respect to the type and amount of available training data, and the type of prediction task, e.g., water level and streamflow.
In this review, we look into examples of the use of various ML algorithms for various types of tasks. At the abstract level, we decided to divide the target tasks into short-term and long-term prediction. We then reviewed ML applications for flood-related tasks, where we structured ML methods as single methods and hybrid methods. Hybrid methods are those that combine more than one ML method.
Here, we should note that this paper surveys ML models used for predictions of floods on sites where rain gauges or intelligent sensing systems are used. Our goal was to survey prediction models with various lead times to floods at a particular site. From this perspective, spatial flood prediction was not involved in this study, as we did not study prediction models used to estimate or identify the location of floods. In fact, we were concerned only with the lead time for an identified site.

Method and Outline
This survey identifies the state of the art of ML methods for flood prediction where peer-reviewed articles in top-level subject fields are reviewed. Among the articles identified, through search queries using the search strategy, those including the performance evaluation and comparison of ML methods were given priority to be included in the review to identify the ML methods that perform better in particular applications. Furthermore, to choose an article, four types of quality measure for each article were considered, i.e., source normalized impact per paper (SNIP), CiteScore, SCImago journal rank (SJR), and h-index. The papers were reviewed in terms of flood resource variables, ML methods, prediction type, and the obtained results.
The applications in flood prediction can be classified according to flood resource variables, i.e., water level, river flood, soil moisture, rainfall-discharge, precipitation, river inflow, peak flow, river flow, rainfall-runoff, flash flood, rainfall, streamflow, seasonal stream flow, flood peak discharge, urban flood, plain flood, groundwater level, rainfall stage, flood frequency analysis, flood quantiles, surge level, extreme flow, storm surge, typhoon rainfall, and daily flows [59]. Among these key influencing flood resource variables, rainfall and the spatial examination of the hydrologic cycle had the most remarkable role in runoff and flood modeling [60]. This is the reason why quantitative rainfall prediction, including avalanches, slush flow, and melting snow, is traditionally used for flood prediction, especially in the prediction of flash floods or short-term flood prediction [61]. However, rainfall prediction was shown to be inadequate for accurate flood prediction. For instance, the prediction of streamflow in a long-term flood prediction scenario depends on soil moisture estimates in a catchment, in addition to rainfall [62].
Although high-resolution precipitation forecasting is essential, other flood resource variables were also considered in the literature [63]. Thus, the methodology of this literature review aims to include the most effective flood resource variables in the search queries.
A combination of these flood resource variables and ML methods was used to implement the complete list of search queries. Note that the ML methods for flood prediction may vary significantly according to the application, dataset, and prediction type. For instance, ML methods used for short-term water level prediction are significantly different from those used for long-term streamflow prediction. Figure 1 represents the organization of the search queries and further describes the survey search methodology.
The search query included three main search terms. The flood resource variables were considered as term 1 of the search (<Flood resource variable1-n>), which included the 25 keywords mentioned above. Term 2 of the search (<ML method1-m>) included the ML algorithms. The collection of references [16,26,28,37,38,42,44] provides a complete list of ML methods, from which the 25 most popular algorithms in engineering applications were used as the keywords of this search. Term 3 included the four search terms most often used in describing flood prediction, i.e., "prediction", "estimation", "forecast", or "analysis". The total search resulted in 6596 articles. Among them, 180 original research papers were refined through our quality measures and included in the survey. Section 4 presents the survey of ML methods used for short-term flood prediction. Section 5 presents the survey of ML methods used for long-term flood prediction. Section 6 presents the conclusions.
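The combination of the three search terms can be sketched as a Cartesian product of keywords. The snippet below is a minimal illustration only: the keyword lists are short, hypothetical subsets of the 25 variables and 25 methods, and the generic boolean `AND`/`OR` syntax is an assumption rather than the query format of any particular database.

```python
from itertools import product

# Hypothetical subsets; the survey used 25 flood resource variables and 25 ML methods.
flood_variables = ["water level", "streamflow", "rainfall-runoff"]
ml_methods = ["ANN", "SVM", "ANFIS"]
# Term 3: the four descriptive terms named in the text.
task_terms = ["prediction", "estimation", "forecast", "analysis"]


def build_queries(variables, methods, tasks):
    """Build one query per (variable, method) pair, with the task terms OR-ed together."""
    task_clause = "(" + " OR ".join(f'"{t}"' for t in tasks) + ")"
    return [
        f'"{v}" AND "{m}" AND {task_clause}'
        for v, m in product(variables, methods)
    ]


queries = build_queries(flood_variables, ml_methods, task_terms)
print(len(queries))
print(queries[0])
```

With the full keyword lists, the same construction would produce one query for each pairing of a flood resource variable with an ML method.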

State of the Art of ML Methods in Flood Prediction
To create the ML prediction model, the historical records of flood events, in addition to real-time cumulative data from a number of rain gauges or other sensing devices for various return periods, are often used. The sources of the dataset are traditionally rainfall and water level, measured either by ground rain gauges or by relatively new remote-sensing technologies such as satellites, multisensor systems, and/or radars [62].
Nevertheless, remote sensing is an attractive tool for capturing higher-resolution data in real time. In addition, the high resolution of weather radar observations often provides a more reliable dataset compared to rain gauges [63]. Thus, building a prediction model based on a radar rainfall dataset was reported to provide higher accuracy in general [64].
Whether using a radar-based dataset or ground gauges to create a prediction model, the historical dataset of hourly, daily, and/or monthly values is divided into individual sets to construct and evaluate the learning models. To do so, the individual sets of data undergo training, validation, verification, and testing. The principle behind the ML modeling workflow and the strategy for flood modeling are described in detail in the literature [48,65]. Figure 2 represents the basic flow for building an ML model. The major ML algorithms applied to flood prediction include ANNs [66], neuro-fuzzy [67], adaptive neuro-fuzzy inference systems (ANFIS) [68], support vector machines (SVM) [69], wavelet neural networks (WNN) [70], and multilayer perceptron (MLP) [71]. In the following subsections, a brief description and background of these fundamental ML algorithms are presented.
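The division of a historical record into individual sets can be illustrated with a short sketch. It assumes a chronological (unshuffled) split, a common choice for time series, and hypothetical 60/20/20 proportions; neither is prescribed by the reviewed studies, and the water levels are synthetic.

```python
def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split a time series into training, validation, and test sets in time order."""
    n = len(series)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


# Synthetic hourly water levels in meters (illustrative values, not gauge data).
water_level = [2.0 + 0.01 * t for t in range(1000)]
train, val, test = chronological_split(water_level)
print(len(train), len(val), len(test))
```

Keeping the sets in time order means that the model is always evaluated on observations that come after those it was trained on, which mirrors how a forecast is used in practice.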

Artificial Neural Networks (ANNs)
ANNs are efficient mathematical modeling systems with efficient parallel processing, enabling them to mimic the biological neural network using inter-connected neuron units.
Among all ML methods, ANNs are the most popular learning algorithms, known to be versatile and efficient in modeling complex flood processes with a high fault tolerance and accurate approximation [39]. In comparison to traditional statistical models, the ANN approach was used for prediction with greater accuracy [72]. ANN algorithms have been the most popular for modeling flood prediction since their first usage in the 1990s [73]. Instead of a catchment's physical characteristics, ANNs derive meaning from historical data. Thus, ANNs are considered reliable data-driven tools for constructing black-box models of the complex and nonlinear relationships of rainfall and flood [74], as well as for river flow and discharge forecasting [75]. Furthermore, a number of surveys (e.g., Reference [76]) suggest ANN as one of the most suitable modeling techniques, providing an acceptable generalization ability and speed compared to most conventional models. References [77,78] provided reviews on ANN applications in flood. ANNs were already successfully used for numerous flood prediction applications, e.g., streamflow forecasting [79], river flow [80,81], rainfall-runoff [82], precipitation-runoff modeling [83], water quality [55], evaporation [56], river stage prediction [84], low-flow estimation [85], river flows [86], and river time series [57]. Despite the advantages of ANNs, there are a number of drawbacks associated with using ANNs in flood modeling, e.g., network architecture, data handling, and physical interpretation of the modeled system. Major drawbacks when using ANNs are the relatively low accuracy, the need for iterative parameter tuning, and the slow response of gradient-based learning processes [87]. Further drawbacks associated with ANNs concern precipitation prediction [88,89] and peak-value prediction [90].
Here, the extreme learning machine (ELM) was studied under the scope of ANN methods. ELM for flood prediction recently became of interest to hydrologists and was used to model short-term streamflow with promising results [93,94].

Multilayer Perceptron (MLP)
The vast majority of ANN models for flood prediction are trained with a backpropagation neural network (BPNN) [95]. While BPNNs are today widely used in this realm, the MLP, an advanced representation of ANNs, recently gained popularity [96]. The MLP [97] is a class of feed-forward neural network (FFNN) which utilizes the supervised learning of backpropagation (BP) for training the network of interconnected nodes of multiple layers. Simplicity, nonlinear activation, and a high number of layers are characteristics of the MLP. Due to these characteristics, the model was widely used in flood prediction and other complex hydrogeological models [98]. In an assessment of ANN classes used in flood modeling, MLP models were reported to be more efficient with better generalization ability. Nevertheless, the MLP is generally found to be more difficult to optimize [99]. Back-percolation learning algorithms are used to individually calculate the propagation error in hidden network nodes for a more advanced modeling approach.
Here, it is worth mentioning that the MLP, more than any other variation of ANNs (e.g., FFNN, BPNN, and FNN), gained popularity among hydrologists. Furthermore, due to the vast number of case studies using the standard form of MLP, it diverged from regular ANNs. In addition, the authors of articles in the realm of flood prediction using the MLP refer to their models as MLP models. From this perspective, we decided to devote a separate section to the MLP.
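To make the training principle concrete, the following minimal sketch trains a one-hidden-layer MLP with backpropagation using NumPy. The data are synthetic, and the layer size, activation function, learning rate, and number of epochs are arbitrary assumptions chosen for illustration; the sketch does not reproduce any model from the reviewed studies.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: two inputs mapped nonlinearly to one target (illustrative only).
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2).reshape(-1, 1)

# One hidden layer with tanh activation and a linear output unit.
n_hidden = 8
W1 = rng.normal(scale=0.5, size=(2, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

learning_rate = 0.05
losses = []
for epoch in range(500):
    # Forward pass through the layers.
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    err = y_hat - y
    losses.append(float(np.mean(err ** 2)))

    # Backward pass: propagate the output error to the hidden layer.
    grad_out = 2.0 * err / len(X)
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)

    # Gradient-descent update of all weights and biases.
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

print(f"MSE at first epoch: {losses[0]:.4f}, at last epoch: {losses[-1]:.4f}")
```

The need to choose the hidden-layer size and learning rate by trial is one practical face of the optimization difficulty noted for the MLP.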

Adaptive Neuro-Fuzzy Inference System (ANFIS)
The fuzzy logic of Zadeh [100] is a qualitative modeling scheme with a soft computing technique using natural language. Fuzzy logic is a simplified mathematical model, which works on incorporating expert knowledge into a fuzzy inference system (FIS). An FIS further mimics human learning through an approximation function with less complexity, which provides great potential for nonlinear modeling of extreme hydrological events [101,102], particularly floods [103]. For instance, Reference [104] studied river level forecasting using an FIS, as did Lohani et al. (2011) [4] for rainfall-runoff modeling for water level. As an advanced form of fuzzy-rule-based modeling, neuro-fuzzy presents a hybrid of the BPNN and the widely used least-square error method [46]. The Takagi-Sugeno (T-S) fuzzy modeling technique [4], which is created using neuro-fuzzy clustering, is also widely applied in RFFA [28].
Adaptive neuro-FIS, or so-called ANFIS, is a more advanced form of neuro-fuzzy based on the T-S FIS, first introduced in [67,77]. Today, ANFIS is known to be one of the most reliable estimators for complex systems. By combining ANN and fuzzy logic, ANFIS provides a higher capability for learning [101]. This hybrid ML method corresponds to a set of advanced fuzzy rules suitable for modeling nonlinear flood functions. An ANFIS works by applying neural learning rules for identifying and tuning the parameters and structure of an FIS. Through ANN training, the ANFIS aims at catching the missing fuzzy rules using the dataset [67]. Due to fast and easy implementation, accurate learning, and strong generalization abilities, ANFIS became very popular in flood modeling. The study of Lafdani et al. [60] further described its capability in modeling short-term rainfall forecasts with high accuracy, using various types of streamflow, rainfall, and precipitation data. Furthermore, the results of Reference [67] showed easier implementation and better generalization capability using the one-pass subtractive clustering algorithm, which avoided several rounds of random selection.
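The inference step of a first-order T-S fuzzy system, which ANFIS tunes through neural learning, can be sketched as follows. The two rules, their Gaussian membership parameters, and their linear consequents are hypothetical values chosen for illustration, and the training procedure itself is omitted.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_predict(inputs, rules):
    """First-order Takagi-Sugeno output for one input vector.

    Each rule is a tuple (centers, sigmas, coeffs, bias).
    """
    strengths, outputs = [], []
    for centers, sigmas, coeffs, bias in rules:
        # Firing strength: product of the membership degrees of all inputs.
        strength = 1.0
        for x, c, s in zip(inputs, centers, sigmas):
            strength *= gaussian_mf(x, c, s)
        strengths.append(strength)
        # Linear consequent of the rule.
        outputs.append(sum(a * x for a, x in zip(coeffs, inputs)) + bias)
    total = sum(strengths)
    # Weighted average of rule outputs using normalized firing strengths.
    return sum(w * o for w, o in zip(strengths, outputs)) / total


# Hypothetical rules; inputs are (rainfall in mm/h, upstream water level in m).
rules = [
    ((2.0, 1.0), (5.0, 1.0), (0.01, 0.5), 0.2),   # "low rainfall, low level"
    ((20.0, 3.0), (5.0, 1.0), (0.05, 0.9), 0.5),  # "high rainfall, high level"
]

low = sugeno_predict((2.0, 1.0), rules)
high = sugeno_predict((20.0, 3.0), rules)
print(low, high)
```

In an ANFIS, the membership parameters and the consequent coefficients shown here as constants would be the quantities adjusted during training.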

Wavelet Neural Network (WNN)
Wavelet transform (WT) [46] is a mathematical tool which can be used to extract information from various data sources by analyzing local variations in time series [50]. In fact, WT has significantly positive effects on modeling performance [105]. WT supports the reliable decomposition of an original time series to improve data quality. The accuracy of prediction is improved through discrete WT (DWT), which decomposes the original data into bands, leading to an improvement of flood prediction lead times [106]. DWT decomposes the initial dataset into individual resolution levels for extracting better-quality data for model building. DWTs, due to their beneficial characteristics, are widely used in flood time-series prediction. In flood modeling, DWTs were widely applied in, e.g., rainfall-runoff [51], daily streamflow [106], and reservoir inflow [107]. Furthermore, hybrid models of DWTs, e.g., wavelet-based neural networks (WNNs) [108], which combine WT and FFNNs, and wavelet-based regression models [109], which integrate WT and multiple linear regression (MLR), were used in time-series predictions of floods [110]. The application of WNN for flood prediction was reviewed in Reference [70], where it was concluded that WNNs can greatly enhance model accuracy. In fact, most recently, WNNs, due to their potential in enhancing time-series data, gained popularity in flood modeling [50], for applications such as daily flow [111], rainfall-runoff [112], water level [113], and flash floods [114].
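The decomposition of a series into bands at successive resolution levels can be illustrated with the Haar wavelet, the simplest DWT. The choice of the Haar wavelet and the streamflow values below are assumptions made for illustration; the reviewed studies may use other wavelet families.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]  # low-frequency band (trend)
    detail = [(a - b) / SQRT2 for a, b in pairs]  # high-frequency band (fluctuation)
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, recovering the series it was computed from."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


# Synthetic daily streamflow values (illustrative only).
flow = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 3.0]
a1, d1 = haar_dwt(flow)   # first resolution level
a2, d2 = haar_dwt(a1)     # second resolution level
reconstructed = haar_idwt(a1, d1)
print(a2, d2, d1)
```

In a WNN-style hybrid, sub-series such as `a2`, `d2`, and `d1` would serve as inputs to the neural network in place of the raw series.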

Support Vector Machine (SVM)
Hearst et al. [115] proposed and classified the support vector (SV) as a nonlinear search algorithm using statistical learning theory. Later, the SVM [116] was introduced as a class of SV, used to minimize over-fitting and reduce the expected error of learning machines. SVM is highly popular in flood modeling; it is a supervised learning machine based on statistical learning theory and the structural risk minimization rule. The training algorithm of SVM builds models that assign new examples to one of two classes, i.e., non-probabilistic binary linear classifiers, which minimize the empirical classification error and maximize the geometric margin via inverse problem solving. SVM is used to predict a quantity forward in time based on training from past data. Over the past two decades, the SVM was also extended as a regression tool, known as support vector regression (SVR) [117].
SVMs are today known as robust and efficient ML algorithms for flood prediction [118].
SVM and SVR emerged as alternative ML methods to ANNs, with high popularity among hydrologists for flood prediction. They use the statistical learning theory of structural risk minimization (SRM), which provides a unique architecture for delivering great generalization and superior efficiency. Most importantly, SVMs are suitable for both linear and nonlinear classification, and for the efficient mapping of inputs into feature spaces [119].
Unlike ANNs, SVMs are more suitable for nonlinear regression problems, to identify the global optimal solution in flood models [126]. Although the high computation cost of using SVMs and their unrealistic outputs might be demanding, due to their heuristic and semi-black-box nature, the least-square support vector machine (LS-SVM) highly improved performance with acceptable computational efficiency [127]. The alternative approach of LS-SVM involves solving a set of linear tasks instead of complex quadratic problems [128].
Nevertheless, a number of drawbacks remain, especially in the application of LS-SVM to seasonal flow prediction [129].
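The replacement of the quadratic program by a linear system can be sketched for LS-SVM regression. The sketch assumes the standard dual formulation with an RBF kernel, in which the bias `b` and the dual weights `alpha` are obtained from a single call to a linear solver; the regularization value `gamma`, the kernel width, and the data are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, width=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-sq_dist / (2.0 * width ** 2))


def lssvm_fit(X, y, gamma=100.0, width=1.0):
    """Solve the LS-SVM linear system for the bias b and dual weights alpha."""
    n = len(X)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0                      # constraint row: sum(alpha) = 0
    system[1:, 0] = 1.0                      # bias column
    system[1:, 1:] = rbf_kernel(X, X, width) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]


def lssvm_predict(X_train, b, alpha, X_new, width=1.0):
    """Evaluate the fitted model at new inputs."""
    return rbf_kernel(X_new, X_train, width) @ alpha + b


# Synthetic one-dimensional regression problem (illustrative only).
rng = np.random.default_rng(1)
X = np.linspace(0.0, 6.0, 40).reshape(-1, 1)
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=40)

b, alpha = lssvm_fit(X, y)
pred = lssvm_predict(X, b, alpha, X)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
print(f"training RMSE: {rmse:.3f}")
```

Because every training point receives a nonzero weight in this formulation, the solution is not sparse, which is a known trade-off against the standard SVM.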

Decision Tree (DT)
The ML method of DT is one of the contributors in predictive modeling with a wide application in flood simulation. DT uses a tree of decisions from branches to the target values of leaves. In classification trees (CT), the final variables in a DT contain a discrete set of values where leaves represent class labels and branches represent conjunctions of feature labels. When the target variable in a DT has continuous values and an ensemble of trees is involved, it is called a regression tree (RT) [130]. Regression and classification trees share some similarities and differences. As DTs are classified as fast algorithms, they became very popular in ensemble forms to model and predict floods [131]. The classification and regression tree (CART) [132,133], which is a popular type of DT used in ML, was successfully applied to flood modeling; however, its applicability to flood prediction is yet to be fully investigated [134]. The random forests (RF) method [69,135] is another popular DT method for flood prediction [136]. RF includes a number of tree predictors. Each individual tree creates a set of response predictor values associated with a set of independent values. Furthermore, an ensemble of these trees selects the best choice of classes [69]. Reference [137] introduced RF as an effective alternative to SVM, which often delivers higher performance in flood prediction modeling. Later, Bui et al. [138] compared the performances of ANN, SVM, and RF in general applications to floods, whereby RF delivered the best performance. Another major DT is the M5 decision-tree algorithm [139].
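The idea of combining many tree predictors can be sketched with bagged regression stumps, i.e., one-split regression trees fitted on bootstrap samples and averaged. This is a deliberate simplification and not the full RF algorithm (with a single input there is no random feature selection), and the rainfall and discharge values are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Find the single threshold on x that minimizes the squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = sum((y - l_mean) ** 2 for y in left)
        sse += sum((y - r_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # all x values identical: predict the mean everywhere
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, l_mean, r_mean = stump
    return l_mean if x < threshold else r_mean


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_ensemble(stumps, x):
    """Average the predictions of all stumps."""
    return mean(predict_stump(s, x) for s in stumps)


# Hypothetical rainfall (mm) and discharge (m^3/s) pairs with a step-like response.
rainfall = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
discharge = [10, 11, 9, 10, 12, 48, 50, 52, 49, 51]

stumps = fit_bagged_stumps(rainfall, discharge)
print(predict_ensemble(stumps, 2), predict_ensemble(stumps, 9))
```

For classification, the averaging step would be replaced by a majority vote over the trees.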

Ensemble Prediction Systems (EPSs)
A multitude of ML modeling options were introduced for flood modeling with a strong background [140]. Thus, there is an emerging strategy to shift from a single model of prediction to an ensemble of models suitable for a specific application, cost, and dataset.
ML ensembles consist of a finite set of alternative models, which typically allow more flexibility than the alternatives. Ensemble ML methods have a long tradition in flood prediction. In recent years, ensemble prediction systems (EPSs) [141] were proposed as efficient prediction systems to provide an ensemble of N forecasts. In EPS, N is the number of independent realizations of a model probability distribution. EPS models generally use multiple ML algorithms to provide higher performance using an automated assessment and weighting system [140]. Such a weighting procedure is carried out to accelerate the performance evaluation process. The advantage of EPS is the timely and automated management and performance evaluation of the ensemble algorithms. Therefore, the performance of EPS, for flood modeling in particular, can be improved. EPSs may use multiple fast-learning or statistical algorithms as classifier ensembles, e.g., ANNs, MLP, DTs, rotation forest (RF) bootstrap, and boosting, allowing higher accuracy and robustness. The subsequent ensemble prediction systems can be used to quantify the probability of floods, based on the prediction rate used in the event [142,143,144].
Therefore, the quality of ML ensembles can be calculated based on the verification of probability distribution. Ouyang et al. [145] and Zhang et al. [146] presented a review of the applications of ensemble ML methods used for floods. EPSs were demonstrated to have the capability for improving model accuracy in flood modeling [140,141,142,143,144,145,146]. To improve the accuracy of import data and to achieve better dataset management, the ensemble mean was proposed as a powerful approach coupled with ML methods [140,141]. Empirical mode decomposition (EMD) [142] and ensemble EMD (EEMD) [143] are widely used for flood prediction [144]. Nevertheless, EMD-based forecast models are also subject to a number of drawbacks [145]. The literature includes numerous studies on improving the performance of decomposition and prediction models in terms of additivity and generalization ability [146].
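The weighting of ensemble members and the use of an ensemble to quantify flood probability can be sketched as follows. Inverse-RMSE weighting and the member-fraction estimate of exceedance probability are simple assumed schemes for illustration, not the specific procedures of the cited EPS studies; the member names, errors, forecasts, and threshold are hypothetical.

```python
def inverse_error_weights(validation_rmse):
    """Weights proportional to 1/RMSE, normalized to sum to 1."""
    inverse = {name: 1.0 / err for name, err in validation_rmse.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_forecast(member_forecasts, weights):
    """Combine the member forecasts into a single weighted value."""
    return sum(weights[name] * value for name, value in member_forecasts.items())


def exceedance_probability(member_forecasts, threshold):
    """Fraction of ensemble members forecasting a value above the threshold."""
    values = list(member_forecasts.values())
    return sum(v > threshold for v in values) / len(values)


# Hypothetical validation RMSEs (m) and water-level forecasts (m) for one lead time.
validation_rmse = {"ANN": 0.20, "SVR": 0.25, "RF": 0.50}
forecasts = {"ANN": 3.1, "SVR": 2.9, "RF": 3.6}

weights = inverse_error_weights(validation_rmse)
combined = weighted_forecast(forecasts, weights)
p_flood = exceedance_probability(forecasts, threshold=3.0)
print(weights, combined, p_flood)
```

With only three members the probability estimate is very coarse; operational ensembles typically rely on many more realizations.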

Classification of ML Methods and Applications
The most popular ML modeling methods for flood prediction were identified in the previous section, including ANFIS, MLP, WNN, EPS, DT, RF, CART, and ANN. Figure 3 presents the major ML methods used for flood prediction. Considering the ML methods for application to floods, it is apparent that ANNs, SVMs, MLPs, DTs, ANFIS, WNNs, and EPSs are the most popular. These ML methods can be categorized as single and hybrid methods. In addition to the fundamental hybrid ML methods, i.e., ANFIS, WNNs, and basic EPSs, several different research strategies for obtaining better prediction evolved [137]. The strategies involved developing hybrid ML models using soft computing techniques, statistical methods, and physical models rather than individual ML approaches, whereby the extra components complement each other with respect to their drawbacks and shortcomings. The success of such hybrid approaches motivated the research community to explore more advanced hybrid models. Figure 4 presents the progress of single vs. hybrid ML methods for flood prediction in the literature over the past decade. The figure shows an apparent continuous increase and notable progress in using novel hybrid methods. Through Figure 4, the taxonomy of our research was justified, based on distinguishing hybrid and single ML prediction models.

[Figure: Machine Learning Methods in Literature; series: Hybrids, Singles]

forecasts can be, for instance, up to three months. In hydrology, the definitions of short-term and long-term in studying the different phenomena vary. Short-term predictions for floods often refer to hourly, daily, and weekly predictions, and they are used as warning systems. On the other hand, long-term predictions are mostly used for policy analysis purposes. Furthermore, if the prediction lead time to flood is three days longer than the confluence time, the prediction is considered to be long-term [37,58]. From this perspective, in this study, we considered a lead time greater than a week as a long-term prediction. It was observed that the characteristics of the ML methods used varied significantly according to the period of prediction. Thus, dividing the survey on the basis of short-term and long-term was essential.
Here, it is also worth emphasizing that, in this paper, the prediction lead-time was classified as "short-term" or "long-term". Although flash floods happen in a short period of time with great destructive power, they can be predicted with either "short-term" or "long-term" lead times to the actual flood. In fact, this paper is concerned with the lead times instead of the duration or type of flood. If the lead-time prediction to a flash flood was short-term, then it was studied as a short-term lead time. However, sometimes flash floods can be predicted with long lead times. In other words, flash floods might be predicted one month ahead. In this case, the prediction was considered as long-term.
Regardless of the type of flood, we only focused on the lead time.
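The classification rule adopted in this survey, where a lead time greater than one week counts as long-term, can be written as a short sketch. The function and constant names are hypothetical and serve only to illustrate the rule.

```python
from datetime import timedelta

# Threshold adopted in this survey: lead times above one week are long-term.
LONG_TERM_THRESHOLD = timedelta(weeks=1)


def classify_lead_time(lead_time):
    """Return 'long-term' if the lead time exceeds one week, else 'short-term'."""
    return "long-term" if lead_time > LONG_TERM_THRESHOLD else "short-term"


print(classify_lead_time(timedelta(hours=6)))   # a flash flood predicted hours ahead
print(classify_lead_time(timedelta(days=30)))   # a flash flood predicted a month ahead
```

The same flood type can therefore fall into either class, depending only on how far ahead the prediction is made.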
In this study, the ML methods were reviewed using two classes: single methods and hybrid methods. Figures 5 and 6 represent the taxonomy of the research. Step 1 involved running the queries one by one; step 2 involved checking the results of the search and initiating the next search; step 3 involved identifying the comparative studies on ML models of prediction, refining the results, and building the database; step 4 involved identifying whether it was a long-term or short-term prediction; steps 5 and 6 involved identifying whether it was a single or hybrid method and constructing Table 1; and step 7 involved constructing the other tables. The four tables provide the list of studies on different prediction techniques, which entail the organized comprehensive surveys of the literature.

Short-Term Flood Prediction with ML
Short-term lead-time flood predictions are considered important research challenges, particularly in highly urbanized areas, for timely warnings to residents so as to reduce damage [146]. In addition, short-term predictions contribute highly to water resource management. Even with the recent improvements in numerical weather prediction (NWP) models, artificial intelligence (AI) methods, and ML, short-term prediction remains a challenging task [147,148,149,150,151,152]. This section is divided into two subsections, single and hybrid methods of ML, to individually investigate each group of methods.

Short-Term Flood Prediction Using Single ML Methods
To gain insight into the performance of ML methods, a comprehensive comparison was required. Table 1 presents a summary of the major ML methods, i.e., ANNs, MLP, nonlinear autoregressive network with exogenous inputs (NARX), M5 model trees, DTs, CART, SVR, and RF, followed by a comprehensive performance comparison of single ML methods in short-term flood prediction. A review and discussion of these methods follow so as to identify the most suitable methods presented in the literature.
however, it lacked efficiency in using raw data for the time-series prediction of streamflow.
In addition, Reference [157] showed the application of BPNN for assessing flash floods using measured data. This dataset included 5-min-frequency water quality data and 15- However, ELM was suggested as a faster method for parameter selection and learning loops. Reference [154] also conducted a comparison between fuzzy c-means, ANN, and MLP using a common dataset of sites to investigate ML method efficiency and accuracy.
The MLP and ANN methods were proposed as the best methods. Chang, Chen, Lu, Huang, and Chang [160] and Reference [161]  This study suggested that the NARX model was effective in urban flood prediction.
Furthermore, Valipour et al. [24] showed how the accuracy of ANN models could be increased through integration with autoregressive (AR) models.
Bruen and Yang [162] modeled real-time rainfall-runoff forecasting for different lead times using FFNN, ARMA, and functional networks. Here, functional networks [173] were compared with an FFNN model. The models were tested using a storm time-series dataset.
The result was that functional networks allowed quicker training in the prediction of rainfall-runoff processes with different lead times. The models were able to predict floods with short lead times. Reference [164] Table 1 and also the accuracy analysis of Figure 3, where the values of R² and RMSE of the single ML methods were considered. The quality of ML model prediction, in terms of speed, complexity, accuracy, and ease of use, was continuously improved through using ensembles of ML methods, hybridization of ML methods, optimization algorithms, and/or soft computing techniques. This trend of improvement is discussed in detail in the discussion.

Short-Term Flood Prediction Using Hybrid ML Methods
To improve the quality of prediction, in terms of accuracy, generalization,  Table 3 presents these methods; a review of the methods and applications follows along with a discussion on the ML methods. proposed a novel hybrid model of ANN and a hydrodynamic model for the accurate short-term prediction of extreme storm surge water. The ANN-hydrodynamic model generated realistic flood extents and a great improvement in model accuracy. Reference [184] proposed a hybrid forecasting technique called RSVRCPSO to accurately estimate the rainfall. RSVRCPSO is an integration of RNN, SVR, and a chaotic particle swarm optimization algorithm (CPSO). The dataset was obtained from three rain gauges for the period from 1985 to August 1997 and included the data of nine typhoon events. The results suggested that the proposed model yielded better performance for rainfall prediction. The RSVRCPSO model, in comparison with SVRCPSO, resulted in lower RMSE in both learning and testing, indicating its superiority in prediction.
Pan et al. [185] proposed a monsoon rainfall enhancement (AME) based on ANNs, which was a hybrid form of linear regression and a state-space neural network (SSNN).
The performance of the proposed model was benchmarked against the hybrid method of MLR-ANN. This dataset included the total rain, wind, and humidity measures from 1989-2008 based on 371 rain gauge stations of six typhoons. The results indicated that the method was highly robust with a better prediction accuracy in terms of R², peak discharge, and total volume. Rajurkar et al. [186]  to deal with uncertainties in prediction. The ensemble prediction system was reported as highly useful and robust.

Comparative Performance Analysis
To evaluate a reliable prediction, the accuracy, reliability, robustness, consistency, generalization, and timeliness are suggested as the basic criteria (Singh 1989). The timeliness is one of the most important criteria, and it is only achieved through using robust yet simple models. Furthermore, the performance of the prediction models is often Here, it is worth mentioning that the value of RMSE can be different across various studies. In addition, the values of RMSE in some studies were calculated for various sites.
To present a fair evaluation of RMSE, we made sure that the unit of RMSE was the same, and, where a study reported multiple RMSEs, the average was calculated. We also double-checked for any possible error. The comparative performance analysis of single and hybrid ML methods for short-term flood prediction using R² and RMSE is presented in Figures 7 and 8, respectively. Generally, ANNs are suggested as promising means for short-term prediction.
Despite performing weakly in a few early studies, especially in the generalization aspect, better methodologies for higher-performance ANNs in handling big datasets yielded better results. In this context, the BPNN and functional networks are suggested as being difficult for users to implement. However, the models were shown to be reasonably accurate, efficient, and fast, with the ability to deal with noisy datasets. The NARX network performed better compared to BPNN, and its accuracy could be further enhanced through integration with autoregressive models. MLP and DTs provide prediction yields as acceptable as those of ANNs. Among DTs, the ADT model provided the fastest and most accurate prediction capability in determining floods. Although not as popular as ANNs, the rotation forest (RF) and M5 model tree (MT) were reported as efficient and robust. References [69,136], for example, proposed RF-based models that were as effective as ANNs and suitable for long lead times.
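The two comparison metrics, and the averaging of RMSE values reported for several sites, can be sketched as follows. R² is computed here as the coefficient of determination; this is an assumption, since some studies instead report the squared correlation coefficient. The observed and predicted values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same unit as the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_rmse_across_sites(site_rmses):
    """Average of per-site RMSE values that share the same unit."""
    return sum(site_rmses) / len(site_rmses)


# Illustrative water levels in meters (not taken from any reviewed study).
observed = [2.0, 3.0, 5.0, 4.0]
predicted = [2.5, 2.5, 4.5, 4.5]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
print(mean_rmse_across_sites([0.5, 0.25, 0.75]))
```

Because RMSE carries the unit of the predicted variable, values from different studies are comparable only after the unit check described above.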

Along with ANNs, the SVM was also seen as a relatively effective ML tool for rainfall-runoff modeling and classification with better generalization ability and performance. In many cases, SVM performed even better, especially for very short lead times [122,125]. In particular, SVM-based models provided promising performances for hourly prediction.
Nevertheless, the prediction ability decreased for longer lead times. This issue was addressed using the LS-SVM model, which also showed better generalization ability [127].
Generally, SVM was reported to be a suitable choice to evaluate the uncertainty in predicting hazardous flood quantiles, which revealed the effectiveness of SVM in real-time flood forecasting.
Overall, the reviewed single prediction models could provide relatively accurate short-term forecasts. However, for predictions longer than 2 h, hybrid models such as

Long-Term Flood Prediction with ML
Long-term flood prediction is of significant importance for increasing knowledge and water resource management potential over longer periods of time, from weekly to monthly and annual predictions [191]. In the past decades, many notable ML methods, such as ANN [74], ANFIS [68,192], SVM [193], SVR [193], WNN [51], and bootstrap-ANN [51], were used for long lead-time predictions with promising results. Recently, a number of studies (e.g., References [55,194-198]) compared the performances of various ML methods for long lead-time flood predictions. However, it is still not clear which ML method performs best in long-term flood prediction. In this section, Tables 4 and 5 present a summary of these investigations, and we review the performance of the ML models in dealing with long-term predictions.

Long-Term Flood Prediction Using Single ML Methods
This section presents a comprehensive comparison of ML methods. Table 4 presents a summary of the major single ML methods used in long-term flood prediction, i.e., MLP, ANNs, SVM, and RT, followed by a comprehensive performance comparison. A review and discussion of these methods follow, identifying the most suitable methods presented in the literature. For seasonal flood forecasting, Elsafi [197] proposed numerous ANNs and compared the results. The water level data from different stations from 1970-1985 were selected for training, and the data from 1986-1987 were used for verification. The ANNs worked well, especially where the dataset was not complete, providing a viable choice for accurate prediction. ANNs also offered the possibility of reducing analytical costs by shortening the data analysis time encountered in, e.g., [198]. Similarly, Reference [87] used ANNs to develop a prediction model for precipitation. A historical dataset from 1900-2001 covering different stations was considered, and the ANN model was applied to various stations to evaluate prediction performance. The authors concluded that the ANN models offered great forecasting skill for predicting long-term evapotranspiration and precipitation.
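The training and verification periods reported for [197] amount to a chronological split of the record. A minimal sketch of such a split follows, assuming annual records stored as (year, water level) pairs. The function name and the synthetic water levels are hypothetical, and only the year ranges are taken from the text.

```python
def split_by_year(records, train_range, verify_range):
    """Split (year, value) records chronologically into two disjoint sets.

    Both ranges are inclusive (first_year, last_year) tuples.
    """
    if train_range[1] >= verify_range[0]:
        raise ValueError("verification period must start after training ends")
    train = [r for r in records if train_range[0] <= r[0] <= train_range[1]]
    verify = [r for r in records if verify_range[0] <= r[0] <= verify_range[1]]
    return train, verify


# Synthetic water levels (m), purely illustrative; not data from any study.
records = [(year, 2.0 + 0.1 * (year - 1970)) for year in range(1970, 1988)]

train, verify = split_by_year(records, (1970, 1985), (1986, 1987))
print("training years:", train[0][0], "to", train[-1][0], "count:", len(train))
print("verification years:", [year for year, _ in verify])
```

Keeping the verification years strictly after the training years avoids evaluating a model on data that precede what it was trained on, which matters for time-ordered hydrological records.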
Reference [202] used an ANN model for stream assessment for long-term floods. This dataset was collected from more than 100 sites of numerous flood streams. They concluded that the ANN model, compared to Hilsenhoff's biotic index (HBI), significantly improved the prediction ability using geomorphic data. However, the ANN had generalization problems. Nevertheless, the ANN in this case proved useful to water managers.
Singh [199]  SVM was demonstrated as a potential candidate for the prediction of long-term discharges, outperforming the ANN. In a similar approach, Reference [205] proposed an SVM-based model for estimating soil moisture using remote-sensing data, and the results were   Table 4, as well as the accuracy analysis in Figure 9, where values of R² and RMSE for the single ML methods were considered. The quality of the ML model prediction, in terms of speed, complexity, accuracy, and ease of use, improved continuously through the use of ensembles of ML methods, hybridization of ML methods, optimization algorithms, and/or soft computing techniques. This trend of improvement is discussed in detail in the discussion.

Long-Term Flood Prediction Using Hybrid ML Methods
A critical review on the long-term flood prediction using hybrid methods is presented in  significantly increased the accuracy for the monthly approximation of peak streamflow.

Comparative Performance Analysis and Discussion
In this section, the comparative performance analysis of ML methods for long-term prediction is presented. Figure 9 presents the values of RMSE and R² for single ML methods, where ANNs, SVMs, and SVRs show better results. Figure 10 represents   Whether in short-term [227] or long-term rainfall-runoff modeling [50], the accuracy, precision, and performance of most decomposed ML algorithms (e.g., WNN) were reported as better than those of models trained on un-decomposed time series.
However, despite the achievements of WNNs, the predictions were not satisfactory for long lead times. To increase the accuracy of longer-lead-time predictions up to one year, novel hybrids such as WARM, which is a hybrid of WNN and an autoregressive model, and wavelet multi-resolution analysis (WMRA) were proposed. In other cases, the performance of models improved greatly when decomposition was used to produce cleaner inputs. For example, wavelet-neuro-fuzzy models [228] were significantly more accurate and faster than single ANFIS and ANNs. However, with an increase in the lead time, the uncertainty in prediction increased. ANNs showed significant improvements in accuracy and generalization. Figure 10 represents the comparative performance analysis of hybrid methods of ML for short-term prediction. Here, it is also worth mentioning the importance of further signal processing techniques (e.g., Reference [228]) for both long-term and short-term floods. This paper suggests that the drawbacks of major ML methods in terms of accuracy, uncertainty, performance, and robustness were mitigated through the hybridization of ML methods, as well as through ensemble variations of the ML methods. It is expected that this trend represents the future horizon of flood prediction.
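The idea of decomposing a time series before training can be illustrated with a minimal sketch of a single-level Haar wavelet transform. The choice of the Haar wavelet is an assumption made for simplicity, as the reviewed studies use various wavelets and decomposition levels. The function names and the streamflow values are hypothetical.

```python
import math

_SQRT2 = math.sqrt(2.0)


def haar_decompose(series):
    """One-level Haar transform: returns (approximation, detail) sub-series."""
    if len(series) < 2 or len(series) % 2:
        raise ValueError("series length must be even and at least 2")
    pairs = [(series[i], series[i + 1]) for i in range(0, len(series), 2)]
    approx = [(a + b) / _SQRT2 for a, b in pairs]  # smooth, low-frequency part
    detail = [(a - b) / _SQRT2 for a, b in pairs]  # fluctuations, high-frequency part
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose, recovering the original series."""
    series = []
    for a, d in zip(approx, detail):
        series.append((a + d) / _SQRT2)
        series.append((a - d) / _SQRT2)
    return series


# Hypothetical monthly streamflow values (m^3/s).
flow = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 3.0]

approx, detail = haar_decompose(flow)
print("approximation:", approx)
print("detail:", detail)
print("reconstructed:", haar_reconstruct(approx, detail))
```

In a WNN-type hybrid, sub-series such as `approx` and `detail` would be supplied to the network as inputs in place of the raw series. The sketch covers only the decomposition step, not the learning model.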

Conclusions
The current state of ML modeling for flood prediction is quite young and in the early stage of advancement. This paper presents an overview of machine learning models used in flood prediction, and develops a classification scheme to analyze the existing literature.
The survey covers the performance analysis and investigation of more than 6000 articles. Among them, we identified 180 original and influential articles in which the performance and accuracy of at least two machine learning models were compared. To do so, the prediction models were classified into two categories according to lead time, and further divided into categories of hybrid and single methods. The state of the art of these classes was discussed and analyzed in detail, considering the performance comparison of the methods available in the literature. The performance of the methods was evaluated in terms of R² and RMSE, in addition to the generalization ability, robustness, computation cost, and speed. Despite the promising results already reported in implementing the most popular machine learning methods, e.g., ANNs, SVM, SVR, ANFIS, WNN, and DTs, there was significant research and experimentation aimed at further improvement and advancement.
In this context, there were four major trends reported in the literature for improving the quality of prediction. The first was novel hybridization, either through the integration of two or more machine learning methods or the integration of a machine learning method(s) with more conventional means, and/or soft computing. The second was the use of data decomposition techniques to improve the quality of the dataset, which contributed substantially to improving the accuracy of prediction. The third was the use of an ensemble of methods, which dramatically increased the generalization ability of the models and decreased the uncertainty of prediction. The fourth was the use of add-on optimizer algorithms to improve the quality of machine learning algorithms, e.g., for better tuning the ANNs to reach optimal neuronal architectures. It is expected that, through these four key technologies, flood prediction will witness significant improvements for both short-term and long-term predictions. The advancement of these novel ML methods depends highly on the proper usage of soft computing techniques in designing novel learning algorithms. This fact was discussed in the paper, and the soft computing techniques were introduced as the main contributors in developing hybrid ML methods of the future.
Here, it is also worth mentioning that the multidisciplinary nature of this work was the most challenging difficulty to overcome in this paper. Having contributions from coauthors from both ML and hydrology was the key to success. Furthermore, the novel search methodology and the creative taxonomy and classification of the ML methods led to the original achievement of the paper.
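The third trend, the use of an ensemble of methods, can be illustrated with a minimal sketch that averages the forecasts of several models and reports their spread as a simple proxy for uncertainty. Equal-weight averaging is an assumption, and the reviewed studies may use other combination schemes. The function name and the forecast values are hypothetical.

```python
import statistics


def ensemble_forecast(member_forecasts):
    """Combine member forecasts step by step.

    Returns (means, spreads): the equal-weight mean and the population
    standard deviation across members at each time step.
    """
    lengths = {len(member) for member in member_forecasts}
    if len(member_forecasts) < 2 or len(lengths) != 1:
        raise ValueError("need at least two members of equal length")
    steps = list(zip(*member_forecasts))
    means = [statistics.fmean(step) for step in steps]
    spreads = [statistics.pstdev(step) for step in steps]
    return means, spreads


# Hypothetical peak-discharge forecasts (m^3/s) from three models.
members = [
    [100.0, 120.0, 150.0],
    [110.0, 118.0, 140.0],
    [90.0, 122.0, 160.0],
]

means, spreads = ensemble_forecast(members)
print("ensemble mean:", means)
print("ensemble spread:", spreads)
```

A wide spread at a given time step indicates that the members disagree, which can be read as higher forecast uncertainty for that step.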
For future work, conducting a survey on spatial flood prediction using machine learning models is highly encouraged. This important aspect of flood prediction was excluded from our paper due to the nature of modeling methodologies and the datasets used in predicting the location of floods. Nevertheless, the recent advancements in machine learning models for spatial flood analysis revolutionized this particular realm of flood forecasting, which requires separate investigation.