Suspended Sediment Modeling Using a Heuristic Regression Method Hybridized with Kmeans Clustering

: The accurate estimation of suspended sediments (SSs) carries signiﬁcance in determining the volume of dam storage, river carrying capacity, pollution susceptibility, soil erosion potential, aquatic ecological impacts, and the design and operation of hydraulic structures. The presented study proposes a new method for accurately estimating daily SSs using antecedent discharge and sediment information. The novel method is developed by hybridizing the multivariate adaptive regression spline (MARS) and the Kmeans clustering algorithm (MARS–KM). The proposed method’s efﬁcacy is established by comparing its performance with the adaptive neuro-fuzzy system (ANFIS), MARS, and M5 tree (M5Tree) models in predicting SSs at two stations situated on the Yangtze River of China, according to the three assessment measurements, RMSE, MAE, and NSE. Two modeling scenarios are employed; data are divided into 50–50% for model training and testing in the ﬁrst scenario, and the training and test data sets are swapped in the second scenario. In Guangyuan Station, the MARS–KM showed a performance improvement compared to ANFIS, MARS, and M5Tree methods in term of RMSE by 39%, 30%, and 18% in the ﬁrst scenario and by 24%, 22%, and 8% in the second scenario, respectively, while the improvement in RMSE of ANFIS, MARS, and M5Tree was 34%, 26%, and 27% in the ﬁrst scenario and 7%, 16%, and 6% in the second scenario, respectively, at Beibei Station. Additionally, the MARS–KM models provided much more satisfactory estimates using only discharge values as inputs.


Introduction
The rapidly growing global population has made freshwater resources scarce and compelled hydrologists to explore methods for better river management and water resource conservation [1]. Accurate modeling of suspended sediment load (SSL) plays a vital role in river restoration, pollution, and soil erosion reduction, thus solving challenges related to water quality, channel design, and the operation of hydraulic structures [2,3]. However, precise forecasting of SSL is challenging due to the concurrent effects of many meteorological and hydrological factors on sediment processes, such as wind speed, evaporation, precipitation, river discharge, water temperature, and ice packs. The variations of these parameters in space and time make the sediment dynamics highly complicated and nonlinear [4,5]. Many SSL estimation models have been developed in the literature, ranging from physically based to data-driven models. Physically based models require a large provided more accurate results than the standalone ANFIS models in estimating SSL at a USA gauging station.
This study selected the MARS model due to its shorter training process and better ability to model complex nonlinear processes without strong model assumptions than ANN models [30]. Another selected method was the M5Tree model due to its large data handling capability and smaller computational cost than the ANN and SVM models [29]. In recent years, MARS and M5Tree have been applied successfully in modeling runoff and sediment load [31][32][33][34][35][36][37][38][39]. Malik et al. [37] compared the performance of the MARS model with the SVM-based model (least square SVM) and two ANN models (radial basis and self-organizing map neural network) for estimating daily SSL at different gauging stations in Godavari catchment, India. Results indicated that the radial basis neural network and MARS models provided more satisfactory results than other data-driven models. Senthil Kumar et al. [38] evaluated the accuracy of the M5Tree model in predicting SSL and compared its performance with ANN coupled with backpropagation and Levenberg-Marquardt algorithms, fuzzy logic, and REPTree models. They obtained the most precise sediment concentration simulations using M5Tree.
Although the MARS and M5Tree models demonstrated promising results in previous studies for sediment modeling, both models could not efficiently capture the uncertainties in sediment time series data due to their complex behavior. The literature found that hybrid MARS and M5Tree models can provide more precise results than standalone datadriven models. A hybrid of the wavelet method and the M5Tree model (WM5Tree) was introduced by Goyal et al. [39] to estimate the sediment yield, and they found that the wavelet M5Tree provided more accurate results than ANN models. Nourani et al. [35] endorsed the findings of Goyal et al. [39]. They predicted the daily sediment load of the Lighvanchai and Upper Rio Grande rivers by comparing the WM5Tree with ANN and standalone M5Tree models and found that the hybrid M5Tree model outperformed the other models. Rahgoshay et al. [36] applied M5Tree, MARS, and hybrid of SVM with GA and particle swarm optimization (PSO) models to predict the sediment load of two earth dams. They found that SVM hybrid models (SVM with GA and SVM with PSO) provided more precise results than the MARS and M5Tree models.
The abovementioned literature revealed hybrid models still need improvement for precise modeling of suspended sediment. In this study, a new model is developed through hybridization of MARS with the Kmeans method (MARS-KM) to overcome standalone MARS models' weakness in precisely capturing uncertainties in sediment dynamics. The MARS model and many other machine learning models' main disadvantage is that they are very time-consuming, especially for large amounts of data with high variance, as in sediment load. For this purpose, the K means clustering method is utilized in this study. Therefore, the main contributions of this study are to (1) develop a novel model that introducing the K-means clustering into the MARS model for more accurate and faster estimation; (2) compare the prediction accuracy of the proposed MARS-KMeans model with the commonly used machine learning models, e.g., MARS, M5Tree, and ANFIS in sediment load estimation; and (3) select methods based on 50-50% data division.
K-means has been successfully used in recent literature to improve machine learning models' prediction accuracy due to its robust nature in estimation [40][41][42][43][44]. The application of hybrid the MARS-KMeans method is rarely found in the literature for prediction [45]. There is no published study in the literature that uses the MARS-KM method for modeling any variables in hydrology to the best of our knowledge. This gave impetus to this study.

Case Study and Data Analysis
The Jialing catchment was chosen in this study as a case study due to its vital role in the hydropower operation management of the world's biggest dam, i.e., the Three Gorges Reservoir. The selected catchment is the second largest tributary of the Yangtze River with a drainage area of 160,000 km 2 and one of the reservoir's main discharges and Sustainability 2021, 13, 4648 4 of 21 sediment sources. Thus, a precise estimation of sediment load input from the basin to the reservoir is very important for efficient reservoir operation. Two key gauging stations in the Jialing catchment, located at Guangyuan and Beibei, were chosen to predict SSL. These hydraulic gauging stations' geographical locations aid in understanding the overall sediment phenomena and contribution of the catchment to the main Yangtze River. The selected catchment also has two main tributaries, the Qu River on the main channel's left side and the Fu River on the right side (see Figure 1). To estimate the daily sediment load of both gauging stations, the daily data of runoff and suspended sediment from 1 January 2007, to 31 December 2015, were gathered from the Hydrological Yearbooks of the People's Republic of China. The time variation graphs of both stations' sediment loads are illustrated in Figure S1 (see Supplementary Materials). The related authorities checked the reliability and homogeneity of the data before releasing the data. Chinese national standard criteria were followed in obtaining streamflow and sediment measurements. The first vertical profiles (usually 10-30 profiles depending on the river width) are determined for measurement. Then, water depth and velocity of flow (utilizing a velocity meter) at each profile are measured. Flow velocity is recorded for different depths. Water samples for sediment concentrations (SCs) are collected from each depth, and these samples are dried and weighed in the lab. Finally, daily sediment loads are computed by multiplying the SCs by the streamflow [46,47]. Reservoir. The selected catchment is the second largest tributary of the Yangtze River with a drainage area of 160,000 km 2 and one of the reservoir's main discharges and sediment sources. Thus, a precise estimation of sediment load input from the basin to the reservoir is very important for efficient reservoir operation. Two key gauging stations in the Jialing catchment, located at Guangyuan and Beibei, were chosen to predict SSL. These hydraulic gauging stations' geographical locations aid in understanding the overall sediment phenomena and contribution of the catchment to the main Yangtze River. The selected catchment also has two main tributaries, the Qu River on the main channel's left side and the Fu River on the right side (see Figure 1). To estimate the daily sediment load of both gauging stations, the daily data of runoff and suspended sediment from 1 January 2007, to 31 December 2015, were gathered from the Hydrological Yearbooks of the People's Republic of China. The time variation graphs of both stations' sediment loads are illustrated in Figure S1 (see Supplementary Materials). The related authorities checked the reliability and homogeneity of the data before releasing the data. Chinese national standard criteria were followed in obtaining streamflow and sediment measurements. The first vertical profiles (usually 10-30 profiles depending on the river width) are determined for measurement. Then, water depth and velocity of flow (utilizing a velocity meter) at each profile are measured. Flow velocity is recorded for different depths. Water samples for sediment concentrations (SCs) are collected from each depth, and these samples are dried and weighed in the lab. Finally, daily sediment loads are computed by multiplying the SCs by the streamflow [46,47]. Table S1 reports the brief statistics of the used streamflow (Q) and sediment (S) data. In both stations, the Q and S have high skewness, indicating that the time series has many extreme values. An augmented Dickey-Fuller test was applied to the sediment data using the adftest MATLAB command to see if they are stationary or not. For the Guangyuan and Beibei stations, the test statistics were −27.53 and −21.67, while the critical value was −1.94. Test statistics indicate that the sediment data are stationary in both stations.

Adaptive Neuro-Fuzzy Inference System (ANFIS)
Time series models are traditionally used to predict future phenomena in different fields of study. The major drawback of these models is their dependence on the autocorrelation factor. Machine learning models have been introduced and applied to overcome this foremost weakness of time series prediction models. ANFIS is one of the machine learning models that has been extensively used around the globe. It is a fusion of two different techniques, namely, artificial neural networks and fuzzy logic. Zadeh [48] developed fuzzy logic based on semantic uncertainty, which has been extensively used for modeling different environmental phenomena, including air pollution estimation and water pollution prediction. This fuzzy inference system (FIS) is pillared on three theories, i.e., the rule-based system, model's database, and inference system. The first part determines the if-then rules, the second part defines the membership function, and the third phase uses the rules to produce the results. The main problem of FIS is its incapability to optimize membership function and adjust parameters automatically. To overcome this problem, an artificial neural network (ANN) is coupled with FIS to develop ANFIS. The ANFIS model uses the training features of ANN to adjust the membership function of fuzzy logic. It is a multilayer feed-forward network based on ANN learning capabilities and fuzzy thinking. ANFIS has many membership functions, such as Gaussian, trapezoidal, sigmoid, generally bell-shaped, triangular, etc. In real-life problems, the selection of appropriate membership functions is vital. In the current study, the ANFIS model was developed using different input combinations and applied to sediment prediction at the study sites, Guangyuan and Beibei. The basic architecture of ANFIS is given in Figure 2.
In Figure 2, X1 and X2 are inputs to the system, and f is the output. The rules for the ANFIS system are described in Equations (1) and (2) below: Rule 1: IF X1 is P1 and X2 is Q1, then f1 = s1 X1 + t1 X2 + v1 1 Figure 1. The geographical location of the study area.

Adaptive Neuro-Fuzzy Inference System (ANFIS)
Time series models are traditionally used to predict future phenomena in different fields of study. The major drawback of these models is their dependence on the autocorrelation factor. Machine learning models have been introduced and applied to overcome this foremost weakness of time series prediction models. ANFIS is one of the machine learning models that has been extensively used around the globe. It is a fusion of two different techniques, namely, artificial neural networks and fuzzy logic. Zadeh [48] developed fuzzy logic based on semantic uncertainty, which has been extensively used for modeling different environmental phenomena, including air pollution estimation and water pollution prediction. This fuzzy inference system (FIS) is pillared on three theories, i.e., the rule-based system, model's database, and inference system. The first part determines the if-then rules, the second part defines the membership function, and the third phase uses the rules to produce the results. The main problem of FIS is its incapability to optimize membership function and adjust parameters automatically. To overcome this problem, an artificial neural network (ANN) is coupled with FIS to develop ANFIS. The ANFIS model uses the training features of ANN to adjust the membership function of fuzzy logic. It is a multilayer feed-forward network based on ANN learning capabilities and fuzzy thinking. ANFIS has many membership functions, such as Gaussian, trapezoidal, sigmoid, generally bell-shaped, triangular, etc. In real-life problems, the selection of appropriate membership functions is vital. In the current study, the ANFIS model was developed using different input combinations and applied to sediment prediction at the study sites, Guangyuan and Beibei. The basic architecture of ANFIS is given in Figure 2.

M5 Tree Model
The M5 tree model was developed by Quinlan [51] and later rebuilt and enhanced by Wang and Witten [52]. This model is developed based on a decision tree concept, which links input and output variables. The model works in two steps. In the first step, input variables are divided into different groups based on linear regression. This minimizes the error of approximation between exact and forecast values. The standard deviation reduction plays an important role in fixing the division rule of the M5 tree model. Therefore, the decision tree depending upon input information is developed based on standard deviation reduction in the first step. In the second step, the tree is cropped from every leaf. Each further attempt involves finer classification levels, as these are further divided into branch and leaf nodes. The classification and regression tree (CART) algorithm is the basis of the M5 tree model. This model is trained based on the concept of standard deviation reduction (SDR), as discussed using the formula given below.
Here, SDR means standard deviation reduction; sd is expressed as a standard deviation; T represents a set of examples that reach the node; Ti is the ith outcome of the possible set. The data's standard deviation (SD) is less than the parent nodes. The M5 tree model is developed in the current study to forecast SSL for seven different input combinations.

Multivariate Adaptive Regression Splines (MARS)
MARS, developed by Friedman [53], has undergone many reforms to enhance its performance. MARS's significant advantage is that it does not need any particular assumptions for mapping the input-output relationship, which is a major limitation of the M5 tree model. Therefore, the MARS model covers the limitation of the M5 tree model, where the endpoints of the segments are treated as nodes. MARS is the non-parametric regression model and is useful to forecast continuous numeric outcomes. The model's beauty is its flexible steps to manage relationships, which are almost additive or contain relations with other input variables to the model. MARS can explain the complex and nonlinear association between predictor and response variables. The MARS model can also work with the use of both the backward and forward stepwise procedure. Andres et al. [54] explain its use to remove preventable variables to improve the forecasting accuracy In Figure 2, X 1 and X 2 are inputs to the system, and f is the output. The rules for the ANFIS system are described in Equations (1) and (2) below: where s i , t i , and v i are linear output parameters that are required to optimize during model training. The working of all layers of ANFIS is discussed below [49,50]. Layer 1 (fuzzification layer): here, every node is considered an adaptive node, and each node passes the external signal to the next layer In this step, the Gaussian membership function is used.
in which σ and c are membership function parameters. Therefore, the output of the first layer is as follows: Layer 2 (product layer): this layer produces the membership degree of inputs. All nodes of this layer are fixed nodes and labeled as ∏. The firing strength is calculated in this layer as Layer 3 (normalized layer): here, the firing strength ratio at i-th rule to the sum of firing strengths of all rules is calculated, Layer 4 (defuzzification layer): in this layer, each node is an adjustable node with the node function given below.
Here, w i is the output of the normalized firing strength, and {s i , t i , v i } are the parameters set of the node i.
Layer 5 (output layer): the final output of the model is calculated in this layer. It is the sum of incoming signals, as below.
The ANFIS model is highly capable of learning and classifying input-output data.

M5 Tree Model
The M5 tree model was developed by Quinlan [51] and later rebuilt and enhanced by Wang and Witten [52]. This model is developed based on a decision tree concept, which links input and output variables. The model works in two steps. In the first step, input variables are divided into different groups based on linear regression. This minimizes the error of approximation between exact and forecast values. The standard deviation reduction plays an important role in fixing the division rule of the M5 tree model. Therefore, the decision tree depending upon input information is developed based on standard deviation reduction in the first step. In the second step, the tree is cropped from every leaf. Each further attempt involves finer classification levels, as these are further divided into branch and leaf nodes. The classification and regression tree (CART) algorithm is the basis of the M5 tree model. This model is trained based on the concept of standard deviation reduction (SDR), as discussed using the formula given below.
Here, SDR means standard deviation reduction; sd is expressed as a standard deviation; T represents a set of examples that reach the node; T i is the ith outcome of the possible set. The data's standard deviation (SD) is less than the parent nodes. The M5 tree model is developed in the current study to forecast SSL for seven different input combinations.

Multivariate Adaptive Regression Splines (MARS)
MARS, developed by Friedman [53], has undergone many reforms to enhance its performance. MARS's significant advantage is that it does not need any particular assumptions for mapping the input-output relationship, which is a major limitation of the M5 tree model. Therefore, the MARS model covers the limitation of the M5 tree model, where the endpoints of the segments are treated as nodes. MARS is the non-parametric regression model and is useful to forecast continuous numeric outcomes. The model's beauty is its flexible steps to manage relationships, which are almost additive or contain relations with other input variables to the model. MARS can explain the complex and nonlinear association between predictor and response variables. The MARS model can also work with the use of both the backward and forward stepwise procedure. Andres et al. [54] explain its use to remove preventable variables to improve the forecasting accuracy and make this model perform better during the backward stepwise procedure. The stepwise forward process helps to select the appropriate input variables for the MARS model. Two basic functions with a range of inputs define the other variable. The variable Y, mapped from the variable X with c as the threshold, can be given below.
Here, two neighboring splines are used to have continuity in the basis function at the knot. The MARS model is widely used in many different fields, as it has properties to predict future approximation phenomena with good accuracy. In the present research, the MARS model is developed to forecast the sediment for seven different input combinations. The working procedure of MARS is shown in Figure 3. and make this model perform better during the backward stepwise procedure. The stepwise forward process helps to select the appropriate input variables for the MARS model. Two basic functions with a range of inputs define the other variable. The variable Y, mapped from the variable X with c as the threshold, can be given below.

0,
0, Here, two neighboring splines are used to have continuity in the basis function at the knot. The MARS model is widely used in many different fields, as it has properties to predict future approximation phenomena with good accuracy. In the present research, the MARS model is developed to forecast the sediment for seven different input combinations. The working procedure of MARS is shown in Figure 3.

K-Means (KM) Algorithm and MARS Hybrid Model
Lloyd [55] first introduced the KM technique in 1957 as a standard algorithm. Later, MacQueen [56] anticipated the term KM, a clustering technique based on the partitionbased cluster analysis. This algorithm has been used in many different areas of research. The distance matrix used in the KM model is the Euclidean distance [57]. The first phase takes K initial seeds of clustering, and then the mean Euclidean distance is compared with each initial seed. This helps to assign the closest cluster seed. The process is iterated until the error is below the threshold. The choice of initial seed and the numbers of clusters are crucial, as they decide the KM method's accuracy [58]. The KM algorithm classifies the objects based upon characteristics into K number of groups.
If Ji is the Euclidean distance, xk is the data vector, c is the number of clusters, and ci is the cluster center, an objective function, J is defined as In this study, the MARS model is coupled with the KM algorithm for better prediction results with the least error. The working procedure of MARS-KM is shown in Figure  4.

K-Means (KM) Algorithm and MARS Hybrid Model
Lloyd [55] first introduced the KM technique in 1957 as a standard algorithm. Later, MacQueen [56] anticipated the term KM, a clustering technique based on the partitionbased cluster analysis. This algorithm has been used in many different areas of research. The distance matrix used in the KM model is the Euclidean distance [57]. The first phase takes K initial seeds of clustering, and then the mean Euclidean distance is compared with each initial seed. This helps to assign the closest cluster seed. The process is iterated until the error is below the threshold. The choice of initial seed and the numbers of clusters are crucial, as they decide the KM method's accuracy [58]. The KM algorithm classifies the objects based upon characteristics into K number of groups.
If J i is the Euclidean distance, x k is the data vector, c is the number of clusters, and c i is the cluster center, an objective function, J is defined as (14) In this study, the MARS model is coupled with the KM algorithm for better prediction results with the least error. The working procedure of MARS-KM is shown in Figure 4.

Modeling Approaches and Accuracy Assessments
Four approaches, adaptive neuro-fuzzy inference system (ANFIS), multivariate adaptive regression splines (MARS), M5 model tree (M5Tree), and MARS with k-means clustering algorithm (MARS-KM), were used for modeling suspended sediment using various input combinations of sediment load (St: kg/s) and streamflow (Q: m 3 /s). The proposed models were developed using MATLAB software and compared using data at two stations, Guangyuan and Beibei. For each modeling approach, the sediment (kg/s) was modeled either separately using only the Q (m 3 /s) measured at previous lags or combined with St (kg/s) estimated at previous lags. Here, the sediment (St: kg/s) is the response variable. The explanatory variables considered were varied from one to more inputs, formed by a combination of several Q and St lag values. In total, seven input combinations were compared, denoted as combinations (i), (ii) … etc. In the first four input combinations, only streamflow inputs were considered: (i) Qt; (ii) Qt and Qt-1; (iii) Qt, Qt-1, and Qt-2; and (iv) Qt, Qt-1, Qt-2, and Qt-3. After selecting the best Q-based input combination, after the fourth combination, sediment inputs were added to the best Q-based combination. For example, for the ANFIS method, the input combinations considered are (v) Qt and St-1;

Modeling Approaches and Accuracy Assessments
Four approaches, adaptive neuro-fuzzy inference system (ANFIS), multivariate adaptive regression splines (MARS), M5 model tree (M5Tree), and MARS with k-means clustering algorithm (MARS-KM), were used for modeling suspended sediment using various input combinations of sediment load (St: kg/s) and streamflow (Q: m 3 /s). The proposed models were developed using MATLAB software and compared using data at two stations, Guangyuan and Beibei. For each modeling approach, the sediment (kg/s) was modeled either separately using only the Q (m 3 /s) measured at previous lags or combined with St (kg/s) estimated at previous lags. Here, the sediment (St: kg/s) is the response variable. The explanatory variables considered were varied from one to more inputs, formed by a combination of several Q and St lag values. In total, seven input combinations were compared, denoted as combinations (i), (ii) . . . etc. In the first four input combinations, only streamflow inputs were considered: (i) Qt; (ii) Qt and Qt-1; (iii) Qt, Qt-1, and Qt-2; and (iv) Qt, Qt-1, Qt-2, and Qt-3. After selecting the best Q-based input combination, after the fourth combination, sediment inputs were added to the best Q-based combination. For example, for the ANFIS method, the input combinations considered are (v) Qt and St-1; (vi) Qt, St-1, and St-2, and (vi) Qt, St-1, St-2, and St-3, where Qt-1 and St-1 indicate the streamflow and sediment load at time t-1 (one previous day in this study). Performance assessment using different input combinations allows importance evaluation of variables and determining the lag values as inputs. Additionally, two different training scenarios were compared: splitting the dataset into two equal subsets having 50% of total data in each subset and a permutation between the two. Here, the two scenarios are denoted as the first training-test (scenario 1) and second training-test (scenario 2). In the first scenario, data from 4 January 2007, to 3 July 2011, were used for models' training, while the remaining data from 4 July 2011, to 31 December 2015, were used for model testing at both stations. In the second scenario, the training and test data sets were swapped (data from 4 July 2011, to 31 December 2015, were used for the models' training and data from 4 January 2007, to 3 July 2011, were used for the models' testing). These two scenarios allowed comparing the overall models' accuracy for the total range of the dataset. Model accuracy was computed by comparing the measured and the modeled data at each station separately using Nash-Sutcliffe efficiency (NSE), root mean squared error (RMSE: kg/s), and mean absolute error (MAE: kg/s). Table 1 shows the results obtained using ANFIS models for seven input combinations and two scenarios. In Table 1, Qt-1 and St-1 indicate the streamflow and sediment load at time t-1 (one previous day in this study) and vice versa. For only the Q as input, i.e., from input combination (i) Qt to input combination (iv) Qt, Qt-1, Qt-2, and Qt-3, the models showed a moderate to low accuracy, with mean NSE, RMSE, and MAE of 0.514, 1856.25 kg/s, and 346.75 kg/s, respectively. The strong contribution of Qt is apparent, since there is no improvement in the models' performance after input combination (i). Instead, the mean NSE value dropped sharply from 0.608 to 0.382 (37.17%) after including time lags of Q as input in combinations (ii) to (iv). The mean RMSE also increased from 1663 to 2104 kg/s (20.96%) and the mean MAE from 326 to 382 kg/s or by 14.65%. This indicates that only Qt should be considered as a predictor for modeling St. The comparison of model performance for two scenarios, namely, the first training-test and the second training-test, showed little difference in the models' performance. The three statistical indices were relatively close to each other for the two scenarios. Table 1 shows a strong impact of sediment (St) on the ANFIS model accuracy. Beyond the input combination (iv), i.e., input combinations (v), (vi), and (vii), the ANFIS model showed a gradual performance improvement. The (St-1) combined with Qt (input combination (v)) provided the highest accuracy compared to the ANFIS model with only Qt as input (input combination (i)). The mean NSE increased by 8.85% for combination (v) compared to input combination (i). However, it should be noted that the importance of St depends greatly on the number of lags included. Including more lags as predictors, i.e., the inclusion of two lags, St-1 and St-2 (input combination (vi)) or three lags, St-1, St-2, and St-3 (input combination (vi)), leads to a reduction in mean NSE by 6.44 and 29.68%, and an increase in mean RMSE by 24.11%, respectively. Overall, the best accuracy using the ANFIS model was achieved for input combination (v) with a mean NSE of 0.667.

Comparison of Accuracy among Models: Guangyuan Station
The results obtained using the M5Tree model are reported in Table 2. The improvement achieved using the M5Tree model compared to the ANFS model was marginal. The M5Tree with the second input combination (Qt and Qt-1) yielded the best accuracy among the Q input-based models, with an average NSE value of 0.581. The prediction accuracy for the second input combination was higher than 1.20%, 11.87%, and 4.17% compared to that obtained using the input combinations of (i), (iii), and (iv), respectively. The difference in RMSE between the second and the fourth combination was the largest, an increase from 1716 to 1870 kg/s (8.23%.). To assess the impact of St on the model's performance, one to three St lags were combined with Qt and Qt-1 in input combinations (v) to (vii) ( Table 2). The RMSE showed a slight decrease by~1.87% and~8.31% for input combination (iv) and (v) (models without St and with St), respectively. Nearly the same accuracy was achieved for all models using input combinations (vi) and (vii). The NSE was markedly higher and ranged from 0.663 to 0.672, with an average of 0.668 for those two input combinations. The accuracy increased most distinctly, with a decrease in RMSE and MAE by~10.25 and~18.80%, on average, between input combination (ii) and (vi). However, the inclusion of St-3 did not increase the NSE or decrease the RMSE and MAE. Consequently, M5Tree using the sixth and seventh combinations can be considered as the best models.
The statistical performance of the MARS model for both training scenarios is shown in Table 3. The results showed moderate MARS model accuracy for the first four input combinations (i, ii, iii, and iv), with a mean NSE value ranging from 0.567 to 0.595. It indicates a marginal gradual increase, yet more significant than that observed using ANFIS and M5Tree models. However, the improvement in the models' accuracy by increasing the number of inputs from one (Qt) to four (Qt, Qt-1, Qt-2, and Qt-3) was almost marginal, less than~4.70% in NSE and~3.37% and~8.00%, in RMSE and MAE, respectively. Table 3 shows that the accuracy of the MARS models was higher for the second training-test dataset than the first training-test dataset for all input combinations. It means no substantial improvement with the increased number of inputs beyond two. Therefore, the combination (ii), having only Qt and Qt-1, was deemed for model development. The performances of MARS models improved significantly after the fourth input combination ( Table 3). The mean NSE increased from 0.595 to 0.759 or by~21.60%, the mean RMSE decreased from 1691 to 1312 kg/s or by~22.41%, and the mean MAE dropped from 350 to 241 kg/s or by~31.14%. The MARS model for the input combination (v) showed an overall higher accuracy than the sixth and seventh input combinations, having a slightly larger NSE value of 0.759. The MARS models' performance for the sixth and seventh combination was similar, with equal mean RMSE and MAE values of 1357 and 256 kg/s and a negligible mean NSE value of only 0.002. Overall, the MARS model for input combination (v) was the best model. The statistical performance of the MARS-KM model for different input combinations and training scenarios is given in Table 4. Good accuracy using MARS-KM was observed for all the input combinations in terms of all three statistical metrics. The performance was higher compared to the ANFIS, M5Tree, and MARS models. The mean RMSE was higher (1342 kg/s) for the first four input combinations ((i) to (iv)) than for the last four input combinations ((v) to (vii)) (1158 kg/s). The differences observed for the four first input combinations were as follows: (1) good prediction accuracy using the MARS-KM model for the fourth input combination (Qt, Qt-1, Qt-2, and Qt-3) with a mean NSE value of 0.813, a mean RMSE value of 1158 kg/s, and a mean MAE value of 210 kg/s; (2) a slight to significant difference of mean RMSE, equal to 3.105% and 13.711% between the fourth and first combinations, and second and third combinations, respectively; and (3) the MAE of MARS-KM models' accuracy dropped significantly for the fourth input combination. The inclusion of different lags of St as input showed a marked improvement in the models' accuracy. For the three last input combinations (v, vi, and vii), the mean RMSE and MAE values rapidly decreased (dropped from 1158 to 1144 kg/s), while the NSE slightly increased from 0.813 to 0.818. Overall, MARS-KM also showed better performance for the fifth input combination.
The comparison of four machine learning methods with the corresponding best input combination is shown in Table S2. The four models exhibited different accuracy varying with mean NSE ranging from 0.608 to 0.818, mean RMSE between 1143 and 1663 kg/s, and mean MAE between 177 and 332 kg/s. The MARS-KM enhanced the St prediction significantly, while the ANFIS showed the least accuracy compared to the other models. The results highlight that, although the models were developed using the same input variables, their capabilities in capturing the major variability and uncertainty in the dataset were variable. The most apparent difference between the models was between MARS-KM and the ANFIS. The MARS-KM improved accuracy compared to ANRIS by 21%, 31.27%, and 44.53% in terms of NSE, RMSE, and MAE, respectively. The M5Tree and MARS models lie between the two extremes. The differences between the two showed that the MARS algorithm generally enhanced the M5Tree accuracy by an average of 14.80% and Sustainability 2021, 13, 4648 13 of 21 6.95% reduction of RMSE and MAE, respectively. Table S2 indicates that, on average, the results obtained using different models differ significantly in terms of different metrics. For example, the variation of RMSE was lower than 12% between MARS and MARS-KM and 31% between ANFIS and MARS-KM. Overall, the models can be ranked in decreasing performance order as MARS-KM, MARS, M5Tree, and ANFIS.

Comparison of Accuracy among Models: Beibei Station
The results obtained using the ANFIS model for all input combinations and training scenarios at the Beibei Station are presented in Table 5. The results showed ANFIS models for the first four input combinations, (i) Qt; (ii) Qt and Qt-1; (iii) Qt, Qt-1, and Qt-2; and (iv) Qt, Qt-1, Qt-2, and Qt-3, yielded relatively similar mean RMSE and MAE, ranging from 3553 to 3628 kg/s and 610 to 710 kg/s, respectively, whereas the fourth input combination showed the highest mean RMSE and lowest NSE. Therefore, only Q t was combined with different lags of S to form the input combinations of (v), (vi), and (vii). The results showed a strong to moderate improvement in accuracy, with mean NSE value ranging from 0.602 to 0.693, mean RMSE ranging from 3130 to 3582 kg/s, and a mean MAE between 595 and 663 kg/s. The highest mean NSE value of 0.693 was found for the input combination (v). Relatively low mean NSE of 0.602 and large mean RMSE and MAE were found for the input combination (vii), highlighting the negligible contribution of St-2 and St-3 ( Table 5). The increasing number of input variables showed a minimal contribution to ANFIS model performance improvement at this station.
Results obtained for the M5Tree model are reported in Table 6. Suspended sediment simulated from the first four input combinations showed a weak and insignificant difference between the models, with a slight superiority of the first input combination (only the Qt). Table 6 shows the variation of the RMSE, MAE, and NSE values averaged for different input combinations. The results showed an apparent decrease in the model's performances from the input combination (i) to the input combination (iv). The mean RMSE and MAE values increased from 3892 to 4309 kg/s (9.67%) and from 566 to 612 kg/s (7.51%), respectively, and the mean NSE value gradually declined to its lowest value of 0.423. This negligible difference in the models' accuracy might be due to the marginal effect of the higher lag streamflow data, which have already been highlighted in the previous discussion. Therefore, the inclusion of only Qt was sufficient for predicting suspended sediment. The effect of an increasing number of input variables on M5Tree performances can also be observed in Table 6. There was a positive effect of St on model accuracy. Interestingly, the suspended sediment was more sensitive to its antecedent values than the Q. The lowest RMSE and MAE values of 3087 and 473 kg/s, respectively, and the mean NSE of 0.705 were obtained using the input combination (vii) (Qt, St-1, St-2, and St-3). The significant increase in the NSE value from 0.522 to 0.705 (18.3%) indicates that the inclusion of St-1, St-2, and St-3 has significantly contributed to M5Tree performances. Nevertheless, the sensitivity of the M5Tree model to the inclusion of the St was not the same for both training-test scenarios. The improvement was more significant for the second training-test data set.
The performances of the MARS model for different input combinations are shown in Table 7. An overall maximum mean NSE value of 0.545 was obtained using only the Qt as the input (input combination (i)), suggesting a weak level of agreement between measured and predicted suspended sediment. Mean RMSE and MAE values of 3822 and 631 kg/s were achieved using only the Qt, and the level of accuracy remained very low regardless of the number of Q lags included from one to four. Table 7 revealed Qt as the most dominant input variable. Therefore, only the first combination was coupled with different lag values of sediment. The St simulations for the input combinations (v), (vi), and (vii) showed better performance ( Table 7). The NSE ranged from 0.706 to 0.728, with a mean of 0.720. The performances of the input combination (vi) (Qt, St-1, and St-2) and input combination (vii) (Qt, St-1, St-2, and St-3) were quite similar, where the input combination (v) performed less well with a lower mean NSE (0.706) and higher RMSE (3070 kg/s) and MAE (534 kg/s) values. The above results indicate that the input combination (vii) should be used to obtain the best accuracy.  Table 8 summarizes the statistics of MARS-KM models. MARS-KM at Beibei Station showed exceptional performance compared to other models. The results indicated the model's higher ability to predict the suspended sediment independently and successfully, even without the inclusion of St lags as input. The NSE values for the seven input combinations were in the range of 0.810 to 8.17 with a mean of~0.754. The best accuracy was achieved using only the Qt and Qt-1 as inputs (input combination (ii)), with a mean NSE of 0.817. The RMSE and MAE for the best model (input combination (ii)) were much smaller, 2428 and~2428 kg/s, respectively. The results described above indicate the MARS-KM model's effectiveness in predicting suspended sediment. The relative performance of the four models with the best input combinations is presented in Table S3. The results demonstrated that the performances of the models were relatively far from excellent. None of the models, i.e., ANFIS, M5Tree, MARS, and MARS-KM, achieved an NSE higher than 0.90. Overall, the performances of MARS-KM were remarkably superior. The MARS-KM improved the mean NSE of ANFIS, M5Tree, and MARS by 12.4%, 11.2%, and 8.9%, respectively. Additionally, it reduced the RMSE of ANFIS, M5Tree, and MARS by 22.42%, 21.34%, and 19.91%, respectively, and the MAE by 32.60%, 15.22%, and 22.73%, respectively.
The suspended sediment estimation ability of machine learning models was further assessed through visual comparison with the in-situ data (Figure 5a-d). MARS-KM and MARS reproduced suspended sediment variation much better and thus enhanced M5Tree and ANFIS models' prediction accuracy. The scatterplots (Figures S2-S5, see Supplementary Materials) revealed that (i) the underestimated data points were much higher than the overestimated data points, and (ii) all the models failed to simulate large suspended sediment values. It is apparent from Figure 6 that the MARS-KM shows superiority in simulating cumulative sediment amounts compared to other alternatives.
In a recently published paper, Juez et al. [59] studied the sediment hysteresis, a direct link among sediment size, distal sediment supply, and proximal sediment data obtained through a laboratory experiment. The authors demonstrated that sediment in the channel downstream depends mainly on the time-varying sediment load with different hysteresis types. The study also highlighted that sediment availability is governed by the evolution of two important morphological parts of the riverbed, degradation and aggradation. The shapes of hysteresis loops have often been intrinsically correlated to these two morphological processes. Finally, one of the important findings of the abovereported investigation is that the sediment concentration-discharge hysterical behavior is increasingly likely to amount between the distal sediment supply and the proximal sediment availability. logical processes. Finally, one of the important findings of the above-reported investigation is that the sediment concentration-discharge hysterical behavior is increasingly likely to amount between the distal sediment supply and the proximal sediment availability. Suspended sediment concentration (SSC) and discharge (Q) are two variables with a temporal shift; consequently, it is important to focus on these approaches' weaknesses and limitations in modeling these kinds of variables. Based on the direct linking of SSC to Q, the empirical models can estimate SSC reasonably. However, data-driven models resulted in slightly better performances in predicting the amount of SSC. Interpretation of the empirical models is more straightforward, as physical, morphological, and hydrological processes are explicitly expressed with simplified equations. However, it is appropriate for data-driven black-box models to examine whether it is possible to simulate SSC with input values outside the data range used during their calibration. Indeed, both the empirical and data-driven models showed encouraging results in terms of model accuracy; physical interpretation of the data-driven models is a challenge, thus requiring further analysis in quantifying the correlation between the input and output variables in a more meaningful way. When extending the models to another watershed, the empirical models may be more practical because of local calibration. Additionally, most input variables (i.e., the Q) undergo a rapid momentary fluctuation, which is very hard to capture with data-driven models. This deficiency makes the data-driven models unable to generate a durable and continuous response. Clearly, although the advantages exist, the limitations are always existing, which should be addressed in future studies.  Suspended sediment concentration (SSC) and discharge (Q) are two variables with a temporal shift; consequently, it is important to focus on these approaches' weaknesses and limitations in modeling these kinds of variables. Based on the direct linking of SSC to Q, the empirical models can estimate SSC reasonably. However, data-driven models resulted in slightly better performances in predicting the amount of SSC. Interpretation of the empirical models is more straightforward, as physical, morphological, and hydrological processes are explicitly expressed with simplified equations. However, it is appropriate for data-driven black-box models to examine whether it is possible to simulate SSC with input values outside the data range used during their calibration. Indeed, both the empirical and data-driven models showed encouraging results in terms of model accuracy; physical interpretation of the data-driven models is a challenge, thus requiring further analysis in quantifying the correlation between the input and output variables in a more meaningful way. When extending the models to another watershed, the empirical models may be more practical because of local calibration. Additionally, most input variables (i.e., the Q) undergo a rapid momentary fluctuation, which is very hard to capture with data-driven models. This deficiency makes the data-driven models unable to generate a durable and continuous response. Clearly, although the advantages exist, the limitations are always existing, which should be addressed in future studies.

Conclusions
In this investigation, a new method was developed by hybridizing MARS and the Kmeans clustering algorithm to improve the accuracy of suspended sediment prediction. The models were developed using daily discharge and sediment data at two stations in China. The MARS-KM models' performance was compared with ANFIS, MARS, and M5Tree models using three statistical metrics, RMSE, MAE, and NSE, and graphical comparison. The following conclusions were reached from the outcomes of the presented work:

Conclusions
In this investigation, a new method was developed by hybridizing MARS and the Kmeans clustering algorithm to improve the accuracy of suspended sediment prediction. The models were developed using daily discharge and sediment data at two stations in China. The MARS-KM models' performance was compared with ANFIS, MARS, and M5Tree models using three statistical metrics, RMSE, MAE, and NSE, and graphical comparison. The following conclusions were reached from the outcomes of the presented work: The proposed MARS-KM considerably improved the accuracy of the ANFIS, MARS, and M5Tree methods. The increments in the RMSE of the three mentioned methods were by 39%, 30%, and 18%, and 24%, 22%, and 8% for the first and second scenarios at the Guangyuan Station, and by 34%, 26%, and 27%, and 7%, 16%, and 6% for the first and second scenarios at the Beibei Station, respectively.

•
The suspended sediment in the studied region is generally sensitive to its lagged values rather than the lag discharge values. However, the MARS-KM models could estimate suspended sediment satisfactory using only discharge (Q) as inputs. It is very important in practical applications, because the measurement of suspended sediment is often very difficult.

•
Comparison of models' ability in simulating cumulative suspended sediment loads also showed the superiority of MARS-KM compared to ANFIS, MARS, and M5Tree methods.
In this study, seasonality in the sediment load and discharge time series was not considered. It can be addressed in future studies. The methods may produce better estimates when season information is included in the models.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/su13094648/s1, Figure S1: Time variation graph of sediment load for (a) Guangyuan and (b) Beibei Station, Figure S2: Time variation graphs of the observed and estimated sediments by ANFIS, M5Tree, MARS and MARS-KM models in the test period at Guangyuan Station for the 1st training-test scenario, Figure S3: Time variation graphs of the observed and estimated sediments by ANFIS, M5Tree, MARS and MARS-KM models in the test period at Guangyuan Station for the 2nd training-test scenario, Figure S4: Time variation graphs of the observed and estimated sediments by ANFIS, M5Tree, MARS and MARS-KM models in the test period of Beibei Station-1st training-test scenario, Figure S5: Time variation graphs of the observed and estimated sediments by ANFIS, M5Tree, MARS and MARS-KM models in the test period at Beibei Station for the 2nd training-test scenario; Table S1: The statistical parameters of the data used in the study, Table S2: Performance of the best ANFIS, M5Tree, MARS and MARS-KM models in sediment prediction at Guangyuan Station, Table S3: Performance of the best ANFIS, M5Tree, MARS and MARS-KM models in sediment prediction at Beibei Station.