Bayesian Model Averaging: A Unique Model Enhancing Forecasting Accuracy for Daily Streamflow Based on Different Antecedent Time Series

Streamflow forecasting is a vital task for hydrology and water resources engineering, and the different artificial intelligence (AI) approaches have been employed for this purposes until now. Additionally, the forecasting accuracy and uncertainty estimation are the meaningful assignments that need to be recognized. The addressed research investigates the potential of novel ensemble approach, Bayesian model averaging (BMA), in streamflow forecasting using daily time series data from two stations (i.e., Hongcheon and Jucheon), South Korea. Six categories (i.e., M1–M6) of input combination using different antecedent times were employed for streamflow forecasting. The outcomes of BMA model were compared with those of multivariate adaptive regression spline (MARS), M5 model tree (M5Tree), and Kernel extreme learning machines (KELM) models considering four assessment indexes, root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), correlation coefficient (R), and mean absolute error (MAE). The results revealed the superior accuracy of BMA model over three machine learning models in daily streamflow forecasting. Considering RMSE values among the best models during testing phase, the best BMA model (i.e., BMA2) enhanced the forecasting accuracy of MARS1, M5Tree4, and KELM3 models by 5.2%, 5.8%, and 3.4% in Hongcheon station. Additionally, the best BMA model (i.e., BMA1) improved the forecasting accuracy of MARS1, M5Tree1, and KELM1 models by 6.7%, 9.5%, and 3.7% in Jucheon station. In addition, the best BMA models in both stations allowed the uncertainty estimation, and produced higher uncertainty of peak flows compared to that of low flows. As one of the most robust and effective tools, therefore, the BMA model can be successfully employed for streamflow forecasting with different antecedent times.


Introduction
Implementing a stable model to forecast streamflow can be influential for the fields of hydrology and water resources researches [1][2][3][4]. Streamflow forecasting, however, is an intricate project because of nonstationary time series and reliance on temporal and spatial parameters which have unclear and complicated components [5][6][7]. Increasing issue complications often depend on long antecedent times (or lead times) such as days and months [8][9][10][11]. Therefore, streamflow forecasting using different Sustainability 2020, 12, 9720 2 of 22 antecedent times can be categorized as universal assignment for hydrology and water resources researches [12][13][14][15][16].
Machine learning (ML) models have popular and flexible approaches for simulating and catching the nonlinear phenomena for science and engineering during three decades including multivariate adaptive regression spline (MARS), M5 model tree (M5Tree), and Kernel extreme learning model (KELM), etc. The MARS model has been successfully applied and employed for solving streamflow forecasting problems until now. Al-Sudani et al. [17] surveyed the ability of MARS incorporated with differential evolution (MARS-DE) model to forecast streamflow in Tigris River, Iraq. They investigated that the MARS-DE model provided a reliable forecasted accuracy for semi-arid streamflow. Adamowski et al. [18] managed the MARS model to forecast streamflow in Himalayan watershed, Uttaranchal State, India. They found that the MARS model preformed a superior forecasted accuracy compared to the artificial neural network (ANN) model. Tyralis et al. [19] utilized the MARS model for daily streamflow forecasting in 511 basins, USA. The MARS model, however, did not improve the performance of linear regression model obviously compared to the other models (e.g., extremely randomized trees, XGBoost, and polyMARS).
M5Tree model has also been utilized for perceiving the pros and cons of streamflow forecasting. Solomatine and Xue [20] applied the M5Tree model for flood forecasting in the Huai River, China. They provided that the forecasted accuracy of M5Tree model were similar with that of ANN models, and the hybrid model covering M5Tree and ANN indicated the best forecasted accuracy. Štravs and Brilly [21] developed the M5Tree model for low streamflow forecasting in the Sava River basin, Slovenia. They employed the recession streamflow data based on 7-day lead time for forecasting and showed the reliable accuracy. Sattari et al. [22] hired the M5Tree model for daily streamflow forecasting in the Sohu River, Turkey. They demonstrated that the M5Tree model forecasted 7-day lead time streamflow accurately. Adnan et al. [23] worked using the M5Tree model for monthly and daily streamflow forecasting in the Hunza River, Pakistan. This experiment said that the M5Tree model could not forecast monthly and daily streamflow effectively compared to the least square support vector machine (LSSVM) model. However, the diverse researches using multiple machine learning models can be found for streamflow forecasting including the MARS and M5Tree models from the published articles and reports. Yaseen et al. [24] evaluated the MARS and M5Tree models for monthly streamflow forecasting in Turkey and Iraq. This document explained that the LSSVM model, however, forecasted the monthly streamflow accurately compared to the MARS and M5Tree models. Yin et al. [25] utilized the MARS and M5Tree models for streamflow forecasting in a semiarid and mountainous region, Northwestern China. They evaluated that the performance of M5Tree model was superior to the support vector regression (SVR) and MARS models for 1-, 2-, and 3-day lead times forecasting. Kisi et al. [26] investigated the MARS and M5Tree models for streamflow forecasting in the Mediterranean region, Turkey. This article showed that the MARS and M5Tree model did not accomplish the outstanding performance compared to the LSSVM model. Rezaie-Balf et al. [27] handled the MARS and M5Tree models to forecast daily streamflow in Iran and South Korea. They indicated that the MARS model combined ensemble empirical mode decomposition (EEMD) was the effective method to forecast streamflow based on 1-, 2-, 3-, and 4-day lead times. Additionally, Rezaie-Balf et al. [28] explored the MARS and M5Tree models to forecast the reservoir inflow for Aswan High Dam, Egypt. They discovered that the MARS model embedded with complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) suggested the reliable accuracy to forecast dam inflow up to 6-month lead time.
Extreme learning machines (ELM) has also been accomplished to understand the nonlinear behavior of streamflow forecasting. Lima et al. [29] forecasted the daily streamflow using the ELM model in British Columbia, Canada. This research explained that the online sequential extreme learning machine (OSELM) model was trained utilizing abundant dataset to choose the optimal parameters, and generated the effective performance to forecast streamflow based on 1-, 2-, and 3-day lead times. Yadav et al. [30] verified the ELM model for streamflow forecasting in the Neckar River, Germany.
They illustrated that the OSELM model forecasted streamflow up to 6 h lead time accurately compared to the ANN, support vector machine (SVM), and genetic programming. Yaseen et al. [2] investigated the ELM model for forecasting monthly streamflow in the Tigris River, Iraq. They concluded that the ELM model surpassed the SVR and the generalized regression neural network (GRNN) models to forecast the monthly streamflow. Rezaie-Balf and Kisi [13] applied the ELM model for daily streamflow forecasting in the Tajan River, Iran. This study revealed that the evolutionary polynomial regression (EPR) model outperformed the multilayer perceptron neural network (MLPNN) and optimally pruned extreme learning machine (OPELM) models to forecast daily streamflow. Niu et al. [31] developed the ELM model for forecasting daily streamflow in Xinfengjiang Reservoir, China. They demonstrated that the ELM integrated with quantum particle swarm optimization (ELM-QPSO) model enhanced the performance accuracy of ELM model to forecast daily streamflow. Under addressed research, Kernel extreme learning machine (KELM), a special type of ELM model, has been considered.
BMA model is a unique approach to implement a mechanism and clarify the model uncertainty [32]. However, the limited researchers have developed and applied the BMA model for fields of hydrology and water resources engineering including streamflow, rainfall, and water stage, etc. Duan et al. [33] employed the BMA model to develop the stable hydrologic predictions. They surveyed that the BMA model carried out the effective probabilistic prediction compared to the original ensemble model. Jiang et al. [34] investigated the BMA model for evaluating the multi-satellite precipitation using simulated hydrological streamflows, South China. This research showed that the satellite streamflow was merged by the BMA model, and the simulated streamflow was improved effectively. Wang et al. [35] developed the BMA model for rainfall forecasting based on seasonal concept, Australia. They inspected that the BMA model outperformed the specific model with two fixed predictors to forecast the merging seasonal rainfall. Rathinasamy et al. [36] developed the BMA model for forecasting streamflow at different time-scales (i.e., daily, weekly, and monthly) in two stations, USA. They produced several wavelet Volterra to obtain ensemble BMA model. The BMA model coupling ensemble multi wavelet Volterra outperformed the single wavelet Volterra and the mean averaged ensemble wavelet Volterra clearly to forecast daily, weekly, and monthly streamflow. Liu and Merwade [37] developed the BMA model to operate the system of water stage prediction in the Black River watershed, Missouri and Arkansas, USA. They reported that the BMA model provided the accurate prediction for flood water stage. In addition, flood inundation range estimated from BMA flood map was more effective than the probabilistic flood inundation range.
It can be considered from literature reviews of the BMA model that there have been no previously published the articles using the BMA model to compare the performance accuracy of MARS, M5Tree, and KELM models for streamflow forecasting until now. The purposes of this article can be arranged as follows: (1) to evaluate various input category of streamflow data with different antecedent times, (2) to compare and assess the performance accuracy of multivariate adaptive regression spline (MARS), M5 model tree (M5Tree), Kernel extreme learning model (KELM), and Bayesian model averaging (BMA) models for streamflow forecasting, and (3) to map the uncertainty ranges utilizing the performance of novel BMA model.

Multivariate Adaptive Regression Spline (MARS)
MARS model (see Figure 1) does not require the particular presumptions of practical relationships between input and output indicators [38]. The performance of MARS model using spline functions gives larger flexibility than linear ones based on curvature and thresholds. The basic functions (BFs), which are assigned as smooth polynomials (e.g., splines), are built using two step approaches. In the first approach, the model performance is enhanced until probabilistic nodes are identified. The second includes the elimination of lowest real terms. Imagine y is an output indicator and X = X 1 , . . . , X p is an input indicator. Therefore, the actual response can be expressed using following Equation (1) [27,39].
where e = the error distribution. The MARS model achieves the approximate function (ƒ) using the BFs. The Equation (2) provides a linear combination of BFs and shared relation for the MARS model.
where individual λ m (x) = a spline function or output of two (or more) spline functions. The least squares method (LSM) can evaluate the coefficients β 0 (i.e., constant value). A model, therefore, can form the training error (e.g., having maximum reduction) using separating β 0 and basis pair.
The following pair is boosted to the addressed model based on the M BFs as [27,39]: where LSM can be applied for estimating β. When a novel BF is boosted to the model space, the associated interactions are recognized among the BFs. BFs are accumulated to the model for acquiring the maximum number of terms that deliver a sufficient fitness model. Then, a backward technique is applied to reduce the numbers of terms effectively. In the backward technique, BFs with the lowest accuracy are deleted to determine the best alternate model. Generalized cross validation (GCV), a method for comparing alternative models, can be represented as [27,39].
where M and N = the number of observations and BFs, respectively, MSE = mean squared error, and d = the penalty of each BF. To broaden the knowledge of MARS model, [27] provided the detailed theory and application using MARS models for streamflow forecasting.
where LSM can be applied for estimating β . When a novel BF is boosted to the model space, the associated interactions are recognized among the BFs. BFs are accumulated to the model for acquiring the maximum number of terms that deliver a sufficient fitness model. Then, a backward technique is applied to reduce the numbers of terms effectively. In the backward technique, BFs with the lowest accuracy are deleted to determine the best alternate model. Generalized cross validation (GCV), a method for comparing alternative models, can be represented as [27,39].
where M and N = the number of observations and BFs, respectively, MSE = mean squared error, and d = the penalty of each BF. To broaden the knowledge of MARS model, [27] provided the detailed theory and application using MARS models for streamflow forecasting.

M5 Model Tree (M5Tree)
M5Tree model (see Figure 2) is a layered algorithm to judge the connection between input and output indicators [27,40]. The classification and regression trees (CART) is the basic algorithm for developing M5Tree model [41]. The M5Tree model assembles a linear-based model to the specific division which calculates the class properties of data portion leading to the leaf [27]. The standard deviation reduction (SDR) can influence to build the tree of M5Tree model. Additionally, it can reinforce the expected error reduction for specific points using Equation (5).

M5 Model Tree (M5Tree)
M5Tree model (see Figure 2) is a layered algorithm to judge the connection between input and output indicators [27,40]. The classification and regression trees (CART) is the basic algorithm for developing M5Tree model [41]. The M5Tree model assembles a linear-based model to the specific division which calculates the class properties of data portion leading to the leaf [27]. The standard Sustainability 2020, 12, 9720 5 of 22 deviation reduction (SDR) can influence to build the tree of M5Tree model. Additionally, it can reinforce the expected error reduction for specific points using Equation (5).
where E = a group of demonstrations that reach the leaf and E i = a sub-group of input data to antecedent leaf. The pruning method was employed to suppress the overfitting burden and attain the accurate formation [27]. To apply the M5Tree model for streamflow forecasting, the previous articles (e.g., [20,27]) furnished the core approach to solve the addressed problems of streamflow forecasting.
Sustainability 2020, 12, x FOR PEER REVIEW 5 of 22 where E = a group of demonstrations that reach the leaf and Ei = a sub-group of input data to antecedent leaf. The pruning method was employed to suppress the overfitting burden and attain the accurate formation [27]. To apply the M5Tree model for streamflow forecasting, the previous articles (e.g., [20,27]) furnished the core approach to solve the addressed problems of streamflow forecasting.

Kernel Extreme Learning Machines (KELM)
ELM model (see Figure 3), one of novel training algorithms for feedforward neural networks (FFNN) with single hidden layer, was recommended by the previous article of Huang et al. [42] to lessen the handicaps of conventional training algorithm and enhance the model accuracy [43,44]. The training speed of ELM model, which generates the connection weights randomly in the hidden layer, is faster than that of other models. Additionally, the performance of ELM model shows robust generalizations with accurate control [42]. The aforementioned specification evaluates the ELM model as a superior model compared to other models with conventional training algorithm. The conventional version of the ELM model meets the disadvantages of providing diverse accuracies in various trials because of randomly assigned connection weights. To solve the weak point of standard ELM model, Huang et al. [45] supplied the Kernel ELM (i.e., KELM) model by improving the process of allocating random connection weights between the input and hidden layers, which explains briefly the theory of KELM model. Detailed demonstration can be found in published article of Huang et al. [45]. The conventional FFNN model (i.e., having single hidden layer) with N hidden nodes can be shown using Equation (6).
where g( ) ⋅ , i b , i W , and i β = transfer function, specified bias randomly, connection weights from hidden to output layer, and connection weights from hidden and output layer, respectively. Equation (6), therefore, can be re-written as [43].
where Y = N target values and H = the matrix of hidden layer.

Kernel Extreme Learning Machines (KELM)
ELM model (see Figure 3), one of novel training algorithms for feedforward neural networks (FFNN) with single hidden layer, was recommended by the previous article of Huang et al. [42] to lessen the handicaps of conventional training algorithm and enhance the model accuracy [43,44]. The training speed of ELM model, which generates the connection weights randomly in the hidden layer, is faster than that of other models. Additionally, the performance of ELM model shows robust generalizations with accurate control [42]. The aforementioned specification evaluates the ELM model as a superior model compared to other models with conventional training algorithm. The conventional version of the ELM model meets the disadvantages of providing diverse accuracies in various trials because of randomly assigned connection weights. To solve the weak point of standard ELM model, Huang et al. [45] supplied the Kernel ELM (i.e., KELM) model by improving the process of allocating random connection weights between the input and hidden layers, which explains briefly the theory of KELM model. Detailed demonstration can be found in published article of Huang et al. [45]. The conventional FFNN model (i.e., having single hidden layer) with N hidden nodes can be shown using Equation (6).
where g(·), b i , W i , and β i = transfer function, specified bias randomly, connection weights from hidden to output layer, and connection weights from hidden and output layer, respectively. Equation (6), therefore, can be re-written as [43].
Sustainability 2020, 12, 9720 6 of 22 where Y = N target values and H = the matrix of hidden layer.
where M = the number of nodes in the hidden layer. The connection weights in the output layer can be generated applying the Moore-Penrose generalized inverse (H + ) of hidden layer matrix.
Sustainability 2020, 12, x FOR PEER REVIEW 6 of 22 where M = the number of nodes in the hidden layer. The connection weights in the output layer can be generated applying the Moore-Penrose generalized inverse (H + ) of hidden layer matrix.
As one of accurate nonlinear regression models, the ELM model has been employed widely in the fields of hydrology and water resources engineering (e.g., [13,46]). In this study, 12 neurons and polynomial kernel were applied in hidden layer by applying trial and error process. Additionally, the regularization coefficient of KELM model was set to 10 to minimize difference between observed and forecasted streamflow values in both stations.

Input Layer
Hidden Layer Output Layer : : Randomly generated δ 0 β j Determined using Moore-Penrose

Bayesian Model Averaging (BMA)
BMA model, as a Bayesian inference, is implemented for model selection, forecasting, prediction, and estimation, and is also developed to combine the interferences and predictions from statistical models [32]. It can provide a criteria for simple model selection with limited simulations. One can simulate the parameter uncertainty utilizing a prior distribution as well as posterior parameter when requesting BMA model. This approach causes the brash inferences by neglecting uncertainty of candidate models [47]. This unique characteristics can provide a way to forecast natural behavior utilizing statistical post-processing approach [48][49][50]. Dismissing process, therefore, to acquire the posterior densities on BMA model parameters can be found from predictive probability density function (PDF) of x.
where 1 2 M f , f , , f  = the group for candidate models of a specific quantity x based on temporal and spatial scale, k θ = estimated parameter, and k ω = connection weights for the relative performance As one of accurate nonlinear regression models, the ELM model has been employed widely in the fields of hydrology and water resources engineering (e.g., [13,46]). In this study, 12 neurons and polynomial kernel were applied in hidden layer by applying trial and error process. Additionally, the regularization coefficient of KELM model was set to 10 to minimize difference between observed and forecasted streamflow values in both stations.

Bayesian Model Averaging (BMA)
BMA model, as a Bayesian inference, is implemented for model selection, forecasting, prediction, and estimation, and is also developed to combine the interferences and predictions from statistical models [32]. It can provide a criteria for simple model selection with limited simulations. One can simulate the parameter uncertainty utilizing a prior distribution as well as posterior parameter when requesting BMA model. This approach causes the brash inferences by neglecting uncertainty of candidate models [47]. This unique characteristics can provide a way to forecast natural behavior utilizing statistical post-processing approach [48][49][50]. Dismissing process, therefore, to acquire the posterior densities on BMA model parameters can be found from predictive probability density function (PDF) of x.
where f 1 , f 2 , . . . , f M = the group for candidate models of a specific quantity x based on temporal and spatial scale, θ k = estimated parameter, and ω k = connection weights for the relative performance of every ensemble member ( f k ). Therefore, the connection weights form the probability density and M k=1 ω k = 1 [51]. In BMA model' category, f k demonstrates a component PDF (g k (x f k θ k )) [52].

Assessment of Models Performance
To figure out the performance of MARS, M5Tree, KELM, and BMA models, four assessment indexes were handled.

Root Mean Square Error (RMSE)
The error discrepancy between observed and forecasted streamflow values can be assessed by root mean square error (RMSE) function [53]. Perfect forecasting can be judged by RMSE = 0. In case of the highest error caused by the peak and higher values, RMSE can be distorted [54] and exploited for model evaluation with absolute units [55].

Nash-Sutcliffe Efficiency (NSE)
Evaluating the models' capability can be accomplished by the Nash-Sutcliffe efficiency (NSE) function [56]. NSE = 0 when the squared difference between observed and forecasted streamflow values is immense to approve the variance in observed streamflow values. If NSE < 0, this indicates that the observed mean is better than forecasted one by the model [57]. If NSE = 1, all points are ideal category [58].

Correlation Coefficient (R)
The correlation coefficient (R) is defined as the ratio of dependent indicator from the independent one. If R = 0, it implies that streamflow cannot be forecasted using developed models, whereas if R = 1, it demonstrates that the observed and forecasted streamflows have a strong correlation.

Mean Absolute Error (MAE)
The mean absolute error (MAE) can supply better knowledge for a model's forecasting, and cannot be contemplated in the vicinity of higher or lower significance. However, it assesses all derivations from observed streamflow values in the same manner [59]. If MAE = 0, it defines that the employed models can forecast streamflow absolutely, while if MAE = 1, it describes that the observed and forecasted streamflows do not have any relationship for forecasting category.
where S obs and S f or are the observed and forecasted streamflow values; S obs and S f or are the observed and forecasted mean streamflow values; and n is total number of employed data.

Study Area and Data
Under the addressed research, two stations (i.e., Hongcheon and Jucheon) were appointed for forecasting streamflow of the Hongcheon and Jucheon Streams (e.g., branches of the Han River), South Korea. Hongcheon station is located at Hongcheon Bridge with a latitude of 37 • Table 1. The statistical evidence that the streamflow data have highly skewed distributions indicates the chaotic behavior of the studied data. Since the streamflow follows a complicated transformation of excess rainfall, surface, and subsurface flows, Salas et al. [60] reported that the streamflow bounced off chaotic behavior with low dimension for the outlet of specific watershed.

Study Area and Data
Under the addressed research, two stations (i.e., Hongcheon and Jucheon) were appointed for forecasting streamflow of the Hongcheon and Jucheon Streams (e.g., branches of the Han River), South Korea. Hongcheon station is located at Hongcheon Bridge with a latitude of 37°41' N and a longitude of 127°52´ E, and Jucheon station is located at Jucheon Bridge with a latitude of 37°16´ N and a longitude of 128°15´ E, respectively.  Figure  4. The properties of the used data for models' development are summed up in Table 1. The statistical evidence that the streamflow data have highly skewed distributions indicates the chaotic behavior of the studied data. Since the streamflow follows a complicated transformation of excess rainfall, surface, and subsurface flows, Salas et al. [60] reported that the streamflow bounced off chaotic behavior with low dimension for the outlet of specific watershed.    One of the important projects for streamflow forecasting is determination of the appropriate input variables [1,27]. For the application of forecasting model, the optimal selection of the best input combination based on the antecedent times was suggested using six different combinations. The detection of appropriate antecedent times for streamflow forecasting can be clarified as identification of catchment characteristics including area, shape, length, and slope, etc. The previous research demonstrated that the recent antecedent times (e.g., (t − 1), (t − 2), and (t − 3)) were better associated than the ancient ones [27,61]. Rezaie-Balf et al. [27] accomplished that the antecedent times for daily streamflow forecasting were determined as (t − 1)~(t − 4) days in Tajan (Iran) and Hongcheon (South Korea) rivers. Under the addressed study, the antecedent times were increased to verify the effective outcomes (e.g., forecasting accuracy) of input combinations based on the article of [27]. Thus, the six input combinations (i.e., six categories from M1 to M6) using six antecedent times values which are provided in Table 2 were employed for the implemented methods. Table 2. Different input combinations for streamflow forecasting.

Hongcheon Station
Forecasting accuracy of developed models during testing phase are provided in Table 3 Table 3 that the BMA models provided better performance than the MARS, M5Tree, and KELM models based on each category (i.e., Categories M1-M6) during testing phase. Therefore, the BMA2 (i.e., having t − 1 and t − 2 antecedent times) model supplied the best forecasting accuracy compared to the other models, whereas the M5-based models (i.e., MARS5, M5Tree5, KELM5, and BMA5) showed the worst performance considering all models and categories. The scatter diagrams between the observed and forecasted streamflow values using the best models (i.e., MARS1, M5Tree4, KELM3, and BMA2) based on each model (i.e., MARS, M5Tree, KELM, and BMA) and category for Hongcheon station are illustrated in Figure 5a-d including the exact (y = x) line, fitted line, and R value, respectively. It can be judged that the forecasted streamflow values based on the BMA2 model were more adjacent to the equivalent observed values during testing phase. Figure 6 supports the RMSE values for each model during testing phase in Hongcheon station. It can be observed from Figure 6 that the BMA models based on M1-M6 categories provided lower RMSE compared to other models during testing phase. Sustainability 2020, 12, x FOR PEER REVIEW 11 of 22       Figure 7a that the BMA2 model forecasted the observed streamflow closely compared to other models (i.e., MARS2, M5Tree2, and KELM2). Additionally, Figure 7b explains the uncertainty estimation using 95% prediction interval for the BMA2 model. It can be found from Figure 7b that the BMA2 model provided higher uncertainty of peak flows compared to that of low flows.
Sustainability 2020, 12, x FOR PEER REVIEW 12 of 22 Figure 7a,b present the comparison of methods in streamflow forecasting based on M2 category during testing phase in Hongcheon station. It can be judged from Figure 7a that the BMA2 model forecasted the observed streamflow closely compared to other models (i.e., MARS2, M5Tree2, and KELM2). Additionally, Figure 7b explains the uncertainty estimation using 95% prediction interval for the BMA2 model. It can be found from Figure 7b that the BMA2 model provided higher uncertainty of peak flows compared to that of low flows.
(a) Relationship between observed and forecasted daily streamflow (b) 95% prediction interval for uncertainty estimation In addition, Figure 8a,b provide that the comparison of streamflows based on M4 category during testing phase in Hongcheon station. Furthermore, it can be seen from Figure 8a that the BMA4 model better forecasted the observed streamflow than the alternative models (i.e., MARS4, M5Tree4, and KELM4). Additionally, Figure 8b represents the uncertainty estimation using 95% prediction interval for the BMA4 model. It can be seen from Figure 8b that the BMA4 model has also higher uncertainty in catching peak flows compared to that of low flows. In addition, Figure 8a,b provide that the comparison of streamflows based on M4 category during testing phase in Hongcheon station. Furthermore, it can be seen from Figure 8a that the BMA4 model better forecasted the observed streamflow than the alternative models (i.e., MARS4, M5Tree4, and KELM4). Additionally, Figure 8b represents the uncertainty estimation using 95% prediction interval for the BMA4 model. It can be seen from Figure 8b that the BMA4 model has also higher uncertainty in catching peak flows compared to that of low flows. Sustainability 2020, 12, x FOR PEER REVIEW 13 of 22 (a) Relationship between observed and forecasted daily streamflow (b) 95% prediction interval for uncertainty estimation  Table 4 supplies the forecasted accuracy of employed models during testing phase in Jucheon station. Bold values display the best category of each model (i.e., MARS, M5Tree, KELM, and BMA).  Table 4 that the BMA model provided better forecasting ability than the MARS, M5Tree, and KELM models considering each category. Based on all models and categories, the BMA1 (i.e., having t-1 antecedent time) model furnished the best forecasting compared to the other models, while the worst accuracy was accomplished by the M5based models during testing phase in Jucheon station.   Table 4 supplies the forecasted accuracy of employed models during testing phase in Jucheon station. Bold values display the best category of each model (i.e., MARS, M5Tree, KELM, and BMA).  Table 4 that the BMA model provided better forecasting ability than the MARS, M5Tree, and KELM models considering each category. Based on all models and categories, the BMA1 (i.e., having t − 1 antecedent time) model furnished the best forecasting compared to the other models, while the worst accuracy was accomplished by the M5-based models during testing phase in Jucheon station. Considering all models and categories, the observed and forecasted streamflow values using the best models (i.e., MARS1, M5Tree1, KELM1, and BMA1) are illustrated in Figure 9a-d including the exact (y = x) line, fitted line, and R value for Jucheon station, respectively. It can be seen that the forecasted streamflow values based on BMA1 model were more neighboring to the corresponding observed values during testing phase. Figure 10 Figure 11a,b present the comparison of methods in streamflow forecasting based on M1 category during testing phase in Jucheon station. It can be seen in Figure 11a that the BMA1 model provided better forecasting performance compared to other models (i.e., MARS1, M5Tree1, and KELM1). Additionally, Figure 11b explains the uncertainty estimation using 95% prediction interval for the BMA1 model. It can be considered from Figure 11b that the BMA1 model provided higher uncertainty of peak flows compared to that of low flows. Besides, Figure 12a,b yields the comparison of methods based on M4 category during testing phase in Jucheon station. It can be seen from Figure 12a that the BMA4 model produced better forecasting compared to other models (i.e., MARS4, M5Tree4, and KELM4). Additionally, Figure 12b describes the uncertainty estimation using 95% prediction interval for the BMA4 model. It can be considered from Figure 12b that the BMA4 model provided higher uncertainty for catching peak flows compared to that of low flows.

Jucheon Station
Sustainability 2020, 12, x FOR PEER REVIEW 16 of 22 Figure 11a,b present the comparison of methods in streamflow forecasting based on M1 category during testing phase in Jucheon station. It can be seen in Figure 11a that the BMA1 model provided better forecasting performance compared to other models (i.e., MARS1, M5Tree1, and KELM1). Additionally, Figure 11b explains the uncertainty estimation using 95% prediction interval for the BMA1 model. It can be considered from Figure 11b that the BMA1 model provided higher uncertainty of peak flows compared to that of low flows. Besides, Figure 12a,b yields the comparison of methods based on M4 category during testing phase in Jucheon station. It can be seen from Figure 12a that the BMA4 model produced better forecasting compared to other models (i.e., MARS4, M5Tree4, and KELM4). Additionally, Figure 12b describes the uncertainty estimation using 95% prediction interval for the BMA4 model. It can be considered from Figure 12b that the BMA4 model provided higher uncertainty for catching peak flows compared to that of low flows.
(a) Relationship between observed and forecasted daily streamflow (b) 95% prediction interval for uncertainty estimation

Discussion
The addressed research boosted that the BMA models based on each category correctly captured the nonlinear time series of streamflow and could carry out the accurate forecasting in both stations. The comparison of individual RMSE values among the best models in Hongcheon station supplied that the BMA2 model enhanced the accomplishment by 5.2%, 5.8%, and 3.4% compared to MARS1, M5Tree4, and KELM3 models during testing phase, respectively. Additionally, the comparison of individual RMSE values among the best models in Jucheon station furnished that the BMA1 model increased an efficiency by 6.7% (MARS1 model), 9.5% (M5Tree1 model), and 3.7% (KELM1 model) during testing phase.
Based on the category of the best models, the forecasted accuracy using the BMA model in Hongcheon and Jucheon stations was found to be slightly better than the other models. In addition, the best models in the Hongcheon station could be found considering the different category (i.e., MARS1, M5Tree4, KELM3, and BMA2 models), whereas the best models in the Jucheon station were discovered based on the M1 category (i.e., MARS1, M5Tree1, KELM1, and BMA1) during testing phase, respectively. The improvement difference between both stations might be derived from the

Discussion
The addressed research boosted that the BMA models based on each category correctly captured the nonlinear time series of streamflow and could carry out the accurate forecasting in both stations. The comparison of individual RMSE values among the best models in Hongcheon station supplied that the BMA2 model enhanced the accomplishment by 5.2%, 5.8%, and 3.4% compared to MARS1, M5Tree4, and KELM3 models during testing phase, respectively. Additionally, the comparison of individual RMSE values among the best models in Jucheon station furnished that the BMA1 model increased an efficiency by 6.7% (MARS1 model), 9.5% (M5Tree1 model), and 3.7% (KELM1 model) during testing phase.
Based on the category of the best models, the forecasted accuracy using the BMA model in Hongcheon and Jucheon stations was found to be slightly better than the other models. In addition, the best models in the Hongcheon station could be found considering the different category (i.e., MARS1, M5Tree4, KELM3, and BMA2 models), whereas the best models in the Jucheon station were discovered based on the M1 category (i.e., MARS1, M5Tree1, KELM1, and BMA1) during testing phase, respectively. The improvement difference between both stations might be derived from the characteristics (e.g., maximum and minimum values) of data available. The similar results can be found from the previous documents [2,62,63]. Additionally, comparison of two stations revealed that the employed models were more successful in forecasting streamflows of Jucheon station compared to Hongcheon station. This can be explained by the different properties of the data sets, for example, training data of Jucheon station have much more skewed distribution (skewness = 11.966, Table 1) than those of the other station. If the different models produced the best accuracy using the same data, the additional statistical skills (e.g., null hypothesis [64] and Akaike's information criterion [65]) are proposed to determine the best model for the undergoing project. Considering the previous researches for Bayesian approaches, Rasouli et al. [62] proposed that three machine learning models (i.e., Bayesian neural network (BNN), support vector regression (SVR), and Gaussian process (GP)) were utilized to forecast the daily streamflow using from 1-to 7-day lead time, British Columbia, Canada. The BNN model outperformed other models slightly. Wang et al. [35] proved that the BMA-ensemble-wavelet-Volterra model were superior to the wavelet-Volterra and ensemble-wavelet-Volterra models, obviously. Therefore, the forecasting accuracy of addressed research follows the previous researches.
As one of the continuous projects for streamflow forecasting, the different forecasting models (e.g., seasonal autoregressive integrated moving average (SARIMA) [66,67] and bootstrap aggregation (bagging) [68]), which demonstrated their superiority for temporal forecasting in previous literature, can be applied to compare and evaluate the performance accuracy of BMA model. In addition, different nature-inspired evolutionary algorithms and data pre-processing approaches can be joined with the BMA model to increase the forecasting accuracy of hydrological processes including streamflow, water stage, and groundwater, etc. Thus, to boost the forecasting accuracy of undergoing project, the continuous researches utilizing the BMA model, evolutionary algorithms, and data pre-processing techniques should be recommended for daily streamflow forecasting.

Conclusions
Accurate streamflow forecasting is a major problem of interest related to water resources and hydrology. This research evaluated the efficiency of Bayesian model averaging (BMA) model for daily streamflow forecasting in two different streams including Hongcheon and Jucheon stations, South Korea. Six categories (i.e., M1-M6) of input combination using different antecedent times were employed for streamflow forecasting. Additionally, the forecasting accuracy of the BMA model were compared with those of other models (i.e., MARS, M5Tree, and KELM) with respect to root mean square error (RMSE), Nash-Sutcliff efficiency (NSE), correlation coefficient (R), and mean absolute error (MAE).
The forecasting accuracy confirmed that the best BMA model (i.e., BMA2) increased an achievement by 5.2% (MARS1 model), 5.8% (M5Tree4 model), and 3.4% (KELM3 model) based on RMSE values among the best models during testing phase in Hongcheon station. Additionally, the best BMA model (i.e., BMA1) enhanced an accuracy by 6.7% (MARS1 model), 9.5% (M5Tree1 model), and 3.7% (KELM1 model) based on the best models during testing phase in Jucheon station. In addition, the best BMA models (i.e., BMA2 in Hongcheon station and BMA1 in Jucheon station) permitted the uncertainty estimation, and accomplished higher uncertainty of peak flows compared to that of low flows.
The addressed research outcomes suggested that the BMA model could be successfully employed for streamflow forecasting with different antecedent times for sustainable and efficient water management. For the continuous research, the different hybrid approaches such as coupling BMA model, evolutionary algorithm, and data pre-processing, can be recommended as a potential alternative methodology to enhance the forecasting accuracy based on diverse hydrological processes.