# Suspended Sediment Modeling Using a Heuristic Regression Method Hybridized with Kmeans Clustering


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Case Study and Data Analysis

km² and one of the reservoir’s main discharges and sediment sources. A precise estimate of the sediment load delivered from the basin to the reservoir is therefore important for efficient reservoir operation. Two key gauging stations in the Jialing catchment, located at Guangyuan and Beibei, were chosen for SSL prediction. The geographical locations of these gauging stations aid in understanding the overall sediment dynamics and the catchment’s contribution to the main Yangtze River. The catchment also has two main tributaries, the Qu River on the left side of the main channel and the Fu River on the right (see Figure 1). To estimate the daily sediment load at both gauging stations, daily runoff and suspended sediment data from 1 January 2007 to 31 December 2015 were gathered from the Hydrological Yearbooks of the People’s Republic of China. Time-variation graphs of the sediment loads at both stations are illustrated in Figure S1 (see Supplementary Materials). The responsible authorities checked the reliability and homogeneity of the data before release, and Chinese national standard criteria were followed for the streamflow and sediment measurements. First, vertical profiles (usually 10–30, depending on river width) are selected for measurement. Water depth and flow velocity (using a velocity meter) are then measured at each profile, with flow velocity recorded at several depths. Water samples for sediment concentrations (SCs) are collected at each depth, then dried and weighed in the laboratory. Finally, daily sediment loads are computed by multiplying the SCs by the streamflow [46,47].
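The final step above (load = concentration × streamflow) is simple dimensional arithmetic; a minimal sketch, with hypothetical sample values:

```python
# Daily suspended sediment load from sampled concentration and streamflow.
# The sample values below are hypothetical, for illustration only.

def sediment_load(concentration_kg_m3: float, discharge_m3_s: float) -> float:
    """Sediment load (kg/s) = sediment concentration (kg/m^3) * streamflow (m^3/s)."""
    return concentration_kg_m3 * discharge_m3_s

# Example: an SC of 0.8 kg/m^3 at a streamflow of 1500 m^3/s
load = sediment_load(0.8, 1500.0)
print(load)  # 1200.0 kg/s
```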

#### 2.2. Adaptive Neuro-Fuzzy Inference System (ANFIS)

X_{1} and X_{2} are inputs to the system, and f is the output. The rules for the ANFIS system are described in Equations (1) and (2) below:

Rule 1: If X_{1} is P_{1} and X_{2} is Q_{1}, then f_{1} = s_{1}X_{1} + t_{1}X_{2} + v_{1} (1)

Rule 2: If X_{1} is P_{2} and X_{2} is Q_{2}, then f_{2} = s_{2}X_{1} + t_{2}X_{2} + v_{2} (2)

where s_{i}, t_{i}, and v_{i} are linear output parameters that are optimized during model training. The operation of each ANFIS layer is discussed below [49,50]. The normalized firing strength is w̄_{i}, and {s_{i}, t_{i}, v_{i}} is the parameter set of node i.
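The two Sugeno-type rules above are combined as a weighted average of their linear consequents, using normalized firing strengths. The sketch below is illustrative only: the Gaussian membership functions and all parameter values are assumptions, not the trained values from this study.

```python
import math

# First-order Sugeno evaluation matching Equations (1) and (2): each rule
# output f_i = s_i*X1 + t_i*X2 + v_i is weighted by its normalized firing
# strength. Membership functions and parameters here are hypothetical; in
# ANFIS they are tuned during training.

def gauss(x, mean, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))

def sugeno_output(x1, x2, rules):
    """rules: list of ((mf1, mf2), (s, t, v)) tuples."""
    firing, outputs = [], []
    for (mf1, mf2), (s, t, v) in rules:
        firing.append(mf1(x1) * mf2(x2))        # firing strength (product T-norm)
        outputs.append(s * x1 + t * x2 + v)     # linear consequent f_i
    total = sum(firing)
    # weighted average with normalized firing strengths w_i / sum(w)
    return sum(w * f for w, f in zip(firing, outputs)) / total

rules = [
    ((lambda x: gauss(x, 0.0, 1.0), lambda x: gauss(x, 0.0, 1.0)), (1.0, 2.0, 0.5)),
    ((lambda x: gauss(x, 2.0, 1.0), lambda x: gauss(x, 2.0, 1.0)), (0.5, 1.0, 1.0)),
]
print(sugeno_output(1.0, 1.0, rules))  # ≈ 3.0: equal firing strengths average f_1 and f_2
```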

#### 2.3. M5 Tree Model

where T_{i} is the ith outcome of the possible set. After a split, the standard deviation (SD) of the data in the child nodes is less than that of the parent node. The M5 tree model is developed in the current study to forecast SSL for seven different input combinations.
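The split behavior described above follows the standard-deviation reduction (SDR) criterion used when growing an M5 tree; a minimal sketch with hypothetical data:

```python
import statistics

# Standard-deviation reduction (SDR), the M5 split criterion: a split is
# preferred when the size-weighted standard deviation of the child subsets
# T_i falls well below that of the parent set T. Sample data are hypothetical.

def sdr(parent, subsets):
    """SDR = sd(T) - sum(|T_i|/|T| * sd(T_i))."""
    n = len(parent)
    return statistics.pstdev(parent) - sum(
        len(t) / n * statistics.pstdev(t) for t in subsets
    )

parent = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
left, right = [1.0, 2.0, 3.0], [10.0, 11.0, 12.0]
print(sdr(parent, [left, right]))  # positive: this split strongly reduces spread
```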

#### 2.4. Multivariate Adaptive Regression Splines (MARS)

#### 2.5. K-Means (KM) Algorithm and MARS Hybrid Model

where d_{i} is the Euclidean distance, x_{k} is the data vector, c is the number of clusters, and c_{i} is the cluster center. An objective function J is defined as

J = Σ_{i=1}^{c} Σ_{x_{k} ∈ C_{i}} ‖x_{k} − c_{i}‖²
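Lloyd's algorithm, which minimizes this objective J, can be sketched in a few lines. The data, the number of clusters, and the naive initialization below are illustrative assumptions; in the hybrid model, clustering is followed by fitting a separate MARS model to each cluster.

```python
# Compact K-means (Lloyd's algorithm) minimizing the objective J above: the
# sum of squared Euclidean distances from each data vector x_k to its
# assigned cluster center c_i. Data and initialization are illustrative only.

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, c, iters=100):
    centers = list(points[:c])  # naive deterministic initialization
    clusters = []
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in range(c)]
        for p in points:
            i = min(range(c), key=lambda j: dist2(p, centers[j]))
            clusters[i].append(p)
        # update step: move each center to the mean of its cluster
        new_centers = [
            tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, clusters

def objective(centers, clusters):
    """J = sum over clusters i of ||x_k - c_i||^2 for each member x_k."""
    return sum(dist2(p, ci) for ci, cl in zip(centers, clusters) for p in cl)

# two well-separated 2-D blobs
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, clusters = kmeans(points, c=2)
print(sorted(centers), round(objective(centers, clusters), 3))
```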

## 3. Application and Results

#### 3.1. Modeling Approaches and Accuracy Assessments

(m³/s). The proposed models were developed using MATLAB software and compared using data at two stations, Guangyuan and Beibei. For each modeling approach, the sediment load (St: kg/s) was modeled either using only the streamflow Q (m³/s) measured at previous lags or using Q combined with St estimated at previous lags; here, the sediment load St is the response variable. The explanatory variables ranged from one to several inputs, formed by combinations of lagged Q and St values. In total, seven input combinations were compared, denoted (i), (ii), etc. The first four combinations used only streamflow inputs: (i) Qt; (ii) Qt and Qt-1; (iii) Qt, Qt-1, and Qt-2; and (iv) Qt, Qt-1, Qt-2, and Qt-3. After the best Q-based combination was identified, sediment inputs were added to it. For example, for the ANFIS method, the remaining combinations were (v) Qt and St-1; (vi) Qt, St-1, and St-2; and (vii) Qt, St-1, St-2, and St-3, where Qt-1 and St-1 denote the streamflow and sediment load at time t-1 (one day earlier in this study). Assessing performance across input combinations allows evaluating the importance of each variable and determining which lags to use as inputs. Additionally, two training scenarios were compared, splitting the dataset into two equal halves (50% of the data each) and then permuting their roles; these are denoted the first training-test (scenario 1) and second training-test (scenario 2). In the first scenario, data from 4 January 2007 to 3 July 2011 were used for model training, while the remaining data from 4 July 2011 to 31 December 2015 were used for model testing at both stations. In the second scenario, the training and test sets were swapped. Together, the two scenarios allow comparing overall model accuracy over the full range of the dataset.
Model accuracy was computed by comparing the measured and modeled data at each station separately using the Nash–Sutcliffe efficiency (NSE), root mean squared error (RMSE: kg/s), and mean absolute error (MAE: kg/s).
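The three accuracy statistics can be written out directly from their definitions; the observed and simulated series below are hypothetical:

```python
import math

# NSE, RMSE, and MAE as used for model evaluation. The obs/sim values are
# hypothetical, for illustration only.

def nse(obs, sim):
    """Nash–Sutcliffe efficiency: 1 - SSE / variance sum of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denom = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / denom

def rmse(obs, sim):
    """Root mean squared error (same units as the data, here kg/s)."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error (same units as the data, here kg/s)."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

obs = [100.0, 250.0, 400.0, 300.0]
sim = [110.0, 240.0, 390.0, 310.0]
print(rmse(obs, sim), mae(obs, sim), nse(obs, sim))  # 10.0 10.0 ~0.991
```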

#### 3.2. Comparison of Accuracy among Models: Guangyuan Station

#### 3.3. Comparison of Accuracy among Models: Beibei Station

Q_{t} was combined with different lags of S to form input combinations (v), (vi), and (vii). The results showed a moderate to strong improvement in accuracy, with mean NSE ranging from 0.602 to 0.693, mean RMSE from 3130 to 3582 kg/s, and mean MAE between 595 and 663 kg/s. The highest mean NSE (0.693) was found for input combination (v). A relatively low mean NSE of 0.602 and large mean RMSE and MAE were found for input combination (vii), highlighting the negligible contribution of St-2 and St-3 (Table 5). Increasing the number of input variables contributed little to improving ANFIS model performance at this station.

## 4. Conclusions

- The suspended sediment in the studied region is generally more sensitive to its own lagged values than to lagged discharge values. However, the MARS–KM models could estimate suspended sediment satisfactorily using only discharge (Q) as input. This is very important in practical applications, because measuring suspended sediment is often very difficult.
- Comparison of models’ ability in simulating cumulative suspended sediment loads also showed the superiority of MARS–KM compared to ANFIS, MARS, and M5Tree methods.

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Ampomah, R.; Hosseiny, H.; Zhang, L.; Smith, V.; Sample-Lord, K. A Regression-Based Prediction Model of Suspended Sediment Yield in the Cuyahoga River in Ohio Using Historical Satellite Images and Precipitation Data. Water
**2020**, 12, 881. [Google Scholar] [CrossRef] [Green Version] - Zarris, D.; Vlastara, M.; Panagoulia, D. Sediment Delivery Assessment for a Transboundary Mediterranean Catchment: The Example of Nestos River Catchment. Water Resour. Manag.
**2011**, 25, 3785–3803. [Google Scholar] [CrossRef] - Afan, H.A.; El-Shafie, A.; Mohtar, W.H.M.W.; Yaseen, Z.M. Past, present and prospect of an Artificial Intelligence (AI) based model for sediment transport prediction. J. Hydrol.
**2016**, 541, 902–913. [Google Scholar] [CrossRef] - Khan, M.Y.A.; Tian, F.; Hasan, F.; Chakrapani, G.J. Artificial neural network simulation for prediction of suspended sediment concentration in the River Ramganga, Ganges Basin, India. Int. J. Sediment Res.
**2019**, 34, 95–107. [Google Scholar] [CrossRef] - Bisoyi, N.; Gupta, H.; Padhy, N.P.; Chakrapani, G.J. Prediction of daily sediment discharge using a back propagation neural network training algorithm: A case study of the Narmada River, India. Int. J. Sediment Res.
**2019**, 34, 125–135. [Google Scholar] [CrossRef] - Adnan, R.M.; Liang, Z.; Trajkovic, S.; Zounemat-Kermani, M.; Li, B.; Kisi, O. Daily streamflow prediction using optimally pruned extreme learning machine. J. Hydrol.
**2019**, 577, 123981. [Google Scholar] [CrossRef] - Kumar, D.; Pandey, A.; Sharma, N.; Flügel, W.-A. Modeling Suspended Sediment Using Artificial Neural Networks and TRMM-3B42 Version 7 Rainfall Dataset. J. Hydrol. Eng.
**2015**, 20, C4014007. [Google Scholar] [CrossRef] - Adnan, R.M.; Yuan, X.; Kisi, O.; Adnan, M.; Mehmood, A. Stream Flow Forecasting of Poorly Gauged Mountainous Watershed by Least Square Support Vector Machine, Fuzzy Genetic Algorithm and M5 Model Tree Using Climatic Data from Nearby Station. Water Resour. Manag.
**2018**, 32, 4469–4486. [Google Scholar] [CrossRef] - Alizamir, M.; Kisi, O.; Muhammad Adnan, R.; Kuriqi, A. Modelling reference evapotranspiration by combining neuro-fuzzy and evolutionary strategies. Acta Geophys.
**2020**, 68, 1113–1126. [Google Scholar] [CrossRef] - Kisi, O.; Shiri, J.; Karimi, S.; Adnan, R.M. Three different adaptive neuro fuzzy computing techniques for forecasting long-period daily streamflows. In Big Data in Engineering Applications; Springer: Singapore, 2018; pp. 303–321. [Google Scholar]
- Yuan, X.; Chen, C.; Lei, X.; Yuan, Y.; Adnan, R.M. Monthly runoff forecasting based on LSTM–ALO model. Stoch. Environ. Res. Risk Assess.
**2018**, 32, 2199–2212. [Google Scholar] [CrossRef] - Muhammad Adnan, R.; Chen, Z.; Yuan, X.; Kisi, O.; El-Shafie, A.; Kuriqi, A.; Ikram, M. Reference Evapotranspiration Modeling Using New Heuristic Methods. Entropy
**2020**, 22, 547. [Google Scholar] [CrossRef] - Nourani, V.; Andalib, G. Daily and monthly suspended sediment load predictions using wavelet based artificial intelligence approaches. J. Mt. Sci.
**2015**, 12, 85–100. [Google Scholar] [CrossRef] - Mustafa, M.R.; Rezaur, R.B.; Saiedi, S.; Isa, M.H. River suspended sediment prediction using various multilayer perceptron neural network training algorithms—A case study in Malaysia. Water Resour. Manag.
**2012**, 26, 1879–1897. [Google Scholar] [CrossRef] - Nourani, V.; Kalantari, O.; Baghanam, A.H. Two Semidistributed ANN-Based Models for Estimation of Suspended Sediment Load. J. Hydrol. Eng.
**2012**, 17, 1368–1380. [Google Scholar] [CrossRef] - Kisi, O.; Yaseen, Z.M. The potential of hybrid evolutionary fuzzy intelligence model for suspended sediment concentration prediction. Catena
**2019**, 174, 11–23. [Google Scholar] [CrossRef] - Bakhtyar, R.; Ghaheri, A.; Yeganeh-Bakhtiary, A.; Baldock, T. Longshore sediment transport estimation using a fuzzy inference system. Appl. Ocean Res.
**2008**, 30, 273–286. [Google Scholar] [CrossRef] - Kabiri-Samani, A.; Aghaee-Tarazjani, J.; Borghei, S.; Jeng, D. Application of neural networks and fuzzy logic models to long-shore sediment transport. Appl. Soft Comput.
**2011**, 11, 2880–2887. [Google Scholar] [CrossRef] - Mianaei, S.J.; Keshavarzi, A.R. Prediction of riverine suspended sediment discharge using fuzzy logic algorithms, and some implications for estuarine settings. Geo-Mar. Lett.
**2010**, 30, 35–45. [Google Scholar] [CrossRef] - Kumar, A.; Kumar, P.; Singh, V.K. Evaluating Different Machine Learning Models for Runoff and Suspended Sediment Simulation. Water Resour. Manag.
**2019**, 33, 1217–1231. [Google Scholar] [CrossRef] - Kisi, O.; Zounemat-Kermani, M. Suspended Sediment Modeling Using Neuro-Fuzzy Embedded Fuzzy c-Means Clustering Technique. Water Resour. Manag.
**2016**, 30, 3979–3994. [Google Scholar] [CrossRef] - Vafakhah, M. Comparison of cokriging and adaptive neuro-fuzzy inference system models for suspended sediment load forecasting. Arab. J. Geosci.
**2013**, 6, 3003–3018. [Google Scholar] [CrossRef] - Rajaee, T.; Mirbagheri, S.A.; Zounemat-Kermani, M.; Nourani, V. Daily suspended sediment concentration simulation using ANN and neuro-fuzzy models. Sci. Total. Environ.
**2009**, 407, 4916–4927. [Google Scholar] [CrossRef] - Firat, M.; Güngör, M. Monthly total sediment forecasting using adaptive neuro fuzzy inference system. Stoch. Environ. Res. Risk Assess.
**2009**, 24, 259–270. [Google Scholar] [CrossRef] - Samet, K.; Hoseini, K.; Karami, H.; Mohammadi, M. Comparison between Soft Computing Methods for Prediction of Sediment Load in Rivers: Maku Dam Case Study. Iran. J. Sci. Technol. Trans. Civ. Eng.
**2018**, 43, 93–103. [Google Scholar] [CrossRef] - Chang, C.K.; Azamathulla, H.M.; Zakaria, N.A.; Ab Ghani, A. Appraisal of soft computing techniques in prediction of total bed material load in tropical rivers. J. Earth Syst. Sci.
**2012**, 121, 125–133. [Google Scholar] [CrossRef] [Green Version] - Mirbagheri, S.A.; Nourani, V.; Rajaee, T.; Alikhani, A. Neuro-fuzzy models employing wavelet analysis for suspended sediment concentration prediction in rivers. Hydrol. Sci. J.
**2010**, 55, 1175–1189. [Google Scholar] [CrossRef] [Green Version] - Rajaee, T. Wavelet and Neuro-fuzzy Conjunction Approach for Suspended Sediment Prediction. CLEAN Soil Air Water
**2010**, 38, 275–286. [Google Scholar] [CrossRef] - Rajaee, T.; Mirbagheri, S.A.; Nourani, V.; Alikhani, A. Prediction of daily suspended sediment load using wavelet and neurofuzzy combined model. Int. J. Environ. Sci. Technol.
**2010**, 7, 93–110. [Google Scholar] [CrossRef] [Green Version] - Chou, S.-M.; Lee, T.-S.; Shao, Y.E.; Chen, I.-F. Mining the breast cancer pattern using artificial neural networks and multivariate adaptive regression splines. Expert Syst. Appl.
**2004**, 27, 133–142. [Google Scholar] [CrossRef] - Adnan, R.M.; Liang, Z.; Parmar, K.S.; Soni, K.; Kisi, O. Modeling monthly streamflow in mountainous basin by MARS, GMDH-NN and DENFIS using hydroclimatic data. Neural Comput. Appl.
**2021**, 33, 2853–2871. [Google Scholar] [CrossRef] - Al-Sudani, Z.A.; Salih, S.Q.; Sharafati, A.; Yaseen, Z.M. Development of multivariate adaptive regression spline integrated with differential evolution model for streamflow simulation. J. Hydrol.
**2019**, 573, 1–12. [Google Scholar] [CrossRef] - Mehdizadeh, S.; Fathian, F.; Safari, M.J.S.; Adamowski, J.F. Comparative assessment of time series and artificial intelligence models to estimate monthly streamflow: A local and external data analysis approach. J. Hydrol.
**2019**, 579, 124225. [Google Scholar] [CrossRef] - Adnan, R.M.; Petroselli, A.; Heddam, S.; Santos, C.A.G.; Kisi, O. Short term rainfall-runoff modelling using several machine learning methods and a conceptual event-based model. Stoch. Environ. Res. Risk Assess.
**2021**, 35, 597–616. [Google Scholar] [CrossRef] - Nourani, V.; Molajou, A.; Tajbakhsh, A.D.; Najafi, H. A Wavelet Based Data Mining Technique for Suspended Sediment Load Modeling. Water Resour. Manag.
**2019**, 33, 1769–1784. [Google Scholar] [CrossRef] - Rahgoshay, M.; Feiznia, S.; Arian, M.; Hashemi, S.A.A. Simulation of daily suspended sediment load using an improved model of support vector machine and genetic algorithms and particle swarm. Arab. J. Geosci.
**2019**, 12. [Google Scholar] [CrossRef] - Malik, A.; Kumar, A.; Kisi, O.; Shiri, J. Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ. Sci. Pollut. Res.
**2019**, 26, 22670–22687. [Google Scholar] [CrossRef] - Kumar, A.R.S.; Ojha, C.S.P.; Goyal, M.K.; Singh, R.D.; Swamee, P.K. Modeling of Suspended Sediment Concentration at Kasol in India Using ANN, Fuzzy Logic, and Decision Tree Algorithms. J. Hydrol. Eng.
**2012**, 17, 394–404. [Google Scholar] [CrossRef] - Goyal, M.K. Modeling of Sediment Yield Prediction Using M5 Model Tree Algorithm and Wavelet Regression. Water Resour. Manag.
**2014**, 28, 1991–2003. [Google Scholar] [CrossRef] - Kim, C.M.; Parnichkun, M. Prediction of settled water turbidity and optimal coagulant dosage in drinking water treatment plant using a hybrid model of k-means clustering and adaptive neuro-fuzzy inference system. Appl. Water Sci.
**2017**, 7, 3885–3902. [Google Scholar] [CrossRef] [Green Version] - Wu, L.; Peng, Y.; Fan, J.; Wang, Y.; Huang, G. A novel kernel extreme learning machine model coupled with K-means clustering and firefly algorithm for estimating monthly reference evapotranspiration in parallel computation. Agric. Water Manag.
**2021**, 245, 106624. [Google Scholar] [CrossRef] - Georgogiannis, A. Robust k-means: A theoretical revisit. Adv. Neural Inf. Process. Syst.
**2016**, 29, 2891–2899. [Google Scholar] - Kondo, Y.; Salibian-Barrera, M.; Zamar, R. RSKC: An R package for a robust and sparse k-means clustering algorithm. J. Stat. Softw.
**2016**, 72, 5. [Google Scholar] [CrossRef] [Green Version] - Brodinová, Š.; Filzmoser, P.; Ortner, T.; Breiteneder, C.; Rohm, M. Robust and sparse k-means clustering for high-dimensional data. Adv. Data Anal. Classif.
**2019**, 13, 905–932. [Google Scholar] [CrossRef] [Green Version] - Affes, Z.; Kaffel, R.H. Forecast Bankruptcy Using a Blend of Clustering and MARS Model—Case of US Banks. SSRN Electron. J.
**2016**, 281, 27–64. [Google Scholar] [CrossRef] [Green Version] - Dai, S.; Yang, S.; Cai, A. Impacts of dams on the sediment flux of the Pearl River, southern China. Catena
**2008**, 76, 36–43. [Google Scholar] [CrossRef] - Zhang, W.; Wei, X.; Jinhai, Z.; Yuliang, Z.; Zhang, Y. Estimating suspended sediment loads in the Pearl River Delta region using sediment rating curves. Cont. Shelf Res.
**2012**, 38, 35–46. [Google Scholar] [CrossRef] - Zadeh, L.A. Fuzzy sets. Inf. Control.
**1965**, 8, 338–353. [Google Scholar] [CrossRef] [Green Version] - Parmar, K.S.; Bhardwaj, R. River Water Prediction Modeling Using Neural Networks, Fuzzy and Wavelet Coupled Model. Water Resour. Manag.
**2015**, 29, 17–33. [Google Scholar] [CrossRef] - Kisi, O.; Parmar, K.S. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. J. Hydrol.
**2016**, 534, 104–112. [Google Scholar] [CrossRef] - Quinlan, J.R. Learning with continuous classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, Tasmania, 16–18 November 1992; Volume 92, pp. 343–348. [Google Scholar]
- Wang, Y.; Witten, I.H. Induction of model trees for predicting continuous classes. In Proceedings of the Poster Papers of the European Conference on Machine Learning, 9th European Conference on Machine Learning, Prague, Czech Republic, 23–25 April 1997. [Google Scholar]
- Friedman, J.H. Multivariate Adaptive Regression Splines. Ann. Stat.
**1991**, 19, 1–67. [Google Scholar] [CrossRef] - De Andrés, J.; Lorca, P.; de Cos Juez, F.J.; Sánchez-Lasheras, F. Bankruptcy forecasting: A hybrid approach using Fuzzy c-means clustering and Multivariate Adaptive Regression Splines (MARS). Expert Syst. Appl.
**2011**, 38, 1866–1875. [Google Scholar] [CrossRef] - Lloyd, S. Least squares quantization in PCM. IEEE Trans. Inf. Theory
**1982**, 28, 129–137. [Google Scholar] [CrossRef] - MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June 1967; Volume 1, pp. 281–297. [Google Scholar]
- Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A K-Means Clustering Algorithm. J. R. Stat. Soc. Ser. C
**1979**, 28, 100–108. [Google Scholar] - Adnan, R.M.; Khosravinia, P.; Karimi, B.; Kisi, O. Prediction of hydraulics performance in drain envelopes using Kmeans based multivariate adaptive regression spline. Appl. Soft Comput.
**2021**, 100, 107008. [Google Scholar] [CrossRef] - Juez, C.; Hassan, M.A.; Franca, M.J. The Origin of Fine Sediment Determines the Observations of Suspended Sediment Fluxes Under Unsteady Flow Conditions. Water Resour. Res.
**2018**, 54, 5654–5669. [Google Scholar] [CrossRef]

**Figure 5.** Scatterplots of the observed and estimated sediments by the ANFIS, M5Tree, MARS, and MARS–KM models in the test period at (**a**) Guangyuan Station for the first training-test scenario, (**b**) Guangyuan Station for the second training-test scenario, (**c**) Beibei Station for the first training-test scenario, and (**d**) Beibei Station for the second training-test scenario.

**Figure 6.** Cumulative sediment amounts produced by the ANFIS, M5Tree, MARS, and MARS–KM models in the test period: (**a**) Guangyuan Station for the first training-test scenario, (**b**) Guangyuan Station for the second training-test scenario, (**c**) Beibei Station for the first training-test scenario, and (**d**) Beibei Station for the second training-test scenario.

**Table 1.** Performance of the ANFIS model for different input combinations and training-test scenarios at Guangyuan Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 1649 | 1658 | 1725 | 1834 | 1470 | 1503 | 1608 |
| | Second training-test | 1677 | 1929 | 2003 | 2374 | 3162 | 3484 | 4496 |
| | Mean | 1663 | 1794 | 1864 | 2104 | 2316 | 2494 | 3052 |
| MAE | First training-test | 340 | 302 | 338 | 377 | 296 | 296 | 319 |
| | Second training-test | 323 | 350 | 355 | 387 | 608 | 565 | 209 |
| | Mean | 332 | 326 | 347 | 382 | 452 | 431 | 264 |
| NSE | First training-test | 0.563 | 0.559 | 0.522 | 0.46 | 0.653 | 0.637 | 0.585 |
| | Second training-test | 0.652 | 0.543 | 0.504 | 0.303 | 0.68 | 0.611 | 0.353 |
| | Mean | 0.608 | 0.551 | 0.513 | 0.382 | 0.667 | 0.624 | 0.469 |

**Table 2.** Performance of the M5Tree model for different input combinations and training-test scenarios at Guangyuan Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 1756 | 1731 | 1963 | 1953 | 1622 | 1428 | 1428 |
| | Second training-test | 1700 | 1701 | 1702 | 1786 | 2047 | 1651 | 1651 |
| | Mean | 1728 | 1716 | 1833 | 1870 | 1835 | 1540 | 1540 |
| MAE | First training-test | 329 | 328 | 380 | 361 | 284 | 249 | 249 |
| | Second training-test | 311 | 310 | 314 | 334 | 304 | 269 | 269 |
| | Mean | 320 | 319 | 347 | 348 | 294 | 259 | 259 |
| NSE | First training-test | 0.505 | 0.519 | 0.381 | 0.394 | 0.577 | 0.672 | 0.672 |
| | Second training-test | 0.642 | 0.642 | 0.642 | 0.605 | 0.482 | 0.663 | 0.663 |
| | Mean | 0.574 | 0.581 | 0.512 | 0.500 | 0.530 | 0.668 | 0.668 |

**Table 3.** Performance of the MARS model for different input combinations and training-test scenarios at Guangyuan Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 1717 | 1676 | 1676 | 1667 | 1225 | 1311 | 1311 |
| | Second training-test | 1783 | 1705 | 1715 | 1715 | 1399 | 1402 | 1402 |
| | Mean | 1750 | 1691 | 1696 | 1691 | 1312 | 1357 | 1357 |
| MAE | First training-test | 318 | 327 | 347 | 351 | 265 | 286 | 286 |
| | Second training-test | 325 | 312 | 349 | 349 | 217 | 226 | 226 |
| | Mean | 322 | 320 | 348 | 350 | 241 | 256 | 256 |
| NSE | First training-test | 0.526 | 0.549 | 0.549 | 0.554 | 0.759 | 0.72 | 0.724 |
| | Second training-test | 0.607 | 0.64 | 0.636 | 0.636 | 0.758 | 0.757 | 0.757 |
| | Mean | 0.567 | 0.595 | 0.593 | 0.595 | 0.759 | 0.739 | 0.741 |

**Table 4.** Performance of the MARS–KM model for different input combinations and training-test scenarios at Guangyuan Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 1103 | 1400 | 1108 | 1029 | 1003 | 1003 | 1002 |
| | Second training-test | 1285 | 1284 | 1284 | 1286 | 1282 | 1284 | 1285 |
| | Mean | 1194 | 1342 | 1196 | 1158 | 1143 | 1144 | 1144 |
| MAE | First training-test | 216 | 280 | 223 | 232 | 187 | 187 | 185 |
| | Second training-test | 188 | 185 | 185 | 187 | 167 | 168 | 170 |
| | Mean | 202 | 233 | 204 | 210 | 177 | 178 | 178 |
| NSE | First training-test | 0.805 | 0.685 | 0.803 | 0.83 | 0.838 | 0.838 | 0.839 |
| | Second training-test | 0.796 | 0.796 | 0.796 | 0.795 | 0.797 | 0.796 | 0.796 |
| | Mean | 0.801 | 0.741 | 0.800 | 0.813 | 0.818 | 0.817 | 0.818 |

**Table 5.** Performance of the ANFIS model for different input combinations and training-test scenarios at Beibei Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 4003 | 3982 | 4064 | 4087 | 3591 | 3441 | 3909 |
| | Second training-test | 3163 | 3123 | 3159 | 3168 | 2668 | 3295 | 3254 |
| | Mean | 3583 | 3553 | 3612 | 3628 | 3130 | 3368 | 3582 |
| MAE | First training-test | 743 | 644 | 663 | 673 | 715 | 746 | 796 |
| | Second training-test | 677 | 576 | 643 | 632 | 474 | 511 | 529 |
| | Mean | 710 | 610 | 653 | 653 | 595 | 629 | 663 |
| NSE | First training-test | 0.52 | 0.525 | 0.505 | 0.499 | 0.614 | 0.645 | 0.546 |
| | Second training-test | 0.68 | 0.688 | 0.681 | 0.679 | 0.772 | 0.653 | 0.661 |
| | Mean | 0.600 | 0.607 | 0.593 | 0.589 | 0.693 | 0.649 | 0.602 |

**Table 6.** Performance of the M5Tree model for different input combinations and training-test scenarios at Beibei Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 4567 | 4771 | 4596 | 4728 | 4426 | 3221 | 3221 |
| | Second training-test | 3217 | 3218 | 3951 | 3890 | 3301 | 3019 | 2953 |
| | Mean | 3892 | 3995 | 4274 | 4309 | 3864 | 3120 | 3087 |
| MAE | First training-test | 733 | 782 | 726 | 794 | 680 | 495 | 495 |
| | Second training-test | 566 | 569 | 638 | 612 | 494 | 464 | 451 |
| | Mean | 650 | 676 | 682 | 703 | 587 | 480 | 473 |
| NSE | First training-test | 0.375 | 0.318 | 0.367 | 0.33 | 0.413 | 0.689 | 0.689 |
| | Second training-test | 0.669 | 0.668 | 0.5 | 0.516 | 0.651 | 0.708 | 0.721 |
| | Mean | 0.522 | 0.493 | 0.434 | 0.423 | 0.532 | 0.699 | 0.705 |

**Table 7.** Performance of the MARS model for different input combinations and training-test scenarios at Beibei Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 4250 | 4451 | 4431 | 4355 | 3453 | 3321 | 3265 |
| | Second training-test | 3394 | 3344 | 3375 | 3296 | 2686 | 2616 | 2650 |
| | Mean | 3822 | 3898 | 3903 | 3826 | 3070 | 2969 | 2958 |
| MAE | First training-test | 702 | 763 | 766 | 826 | 561 | 581 | 575 |
| | Second training-test | 560 | 602 | 589 | 643 | 507 | 483 | 462 |
| | Mean | 631 | 683 | 678 | 735 | 534 | 532 | 519 |
| NSE | First training-test | 0.459 | 0.406 | 0.412 | 0.432 | 0.643 | 0.67 | 0.681 |
| | Second training-test | 0.631 | 0.642 | 0.635 | 0.652 | 0.769 | 0.781 | 0.775 |
| | Mean | 0.545 | 0.524 | 0.524 | 0.542 | 0.706 | 0.726 | 0.728 |

**Table 8.** Performance of the MARS–KM model for different input combinations and training-test scenarios at Beibei Station.

| Statistics | Data Set | (i) | (ii) | (iii) | (iv) | (v) | (vi) | (vii) |
|---|---|---|---|---|---|---|---|---|
| RMSE | First training-test | 2664 | 2377 | 2664 | 2664 | 2258 | 2419 | 2403 |
| | Second training-test | 2950 | 2478 | 2914 | 2918 | 2921 | 2534 | 2525 |
| | Mean | 2807 | 2428 | 2789 | 2791 | 2590 | 2477 | 2464 |
| MAE | First training-test | 508 | 434 | 518 | 524 | 382 | 337 | 334 |
| | Second training-test | 429 | 367 | 421 | 417 | 485 | 458 | 471 |
| | Mean | 469 | 401 | 470 | 471 | 434 | 398 | 403 |
| NSE | First training-test | 0.787 | 0.831 | 0.787 | 0.787 | 0.847 | 0.825 | 0.827 |
| | Second training-test | 0.721 | 0.803 | 0.728 | 0.728 | 0.727 | 0.794 | 0.796 |
| | Mean | 0.754 | 0.817 | 0.758 | 0.758 | 0.787 | 0.810 | 0.812 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Adnan, R.M.; Parmar, K.S.; Heddam, S.; Shahid, S.; Kisi, O.
Suspended Sediment Modeling Using a Heuristic Regression Method Hybridized with Kmeans Clustering. *Sustainability* **2021**, *13*, 4648.
https://doi.org/10.3390/su13094648

**AMA Style**

Adnan RM, Parmar KS, Heddam S, Shahid S, Kisi O.
Suspended Sediment Modeling Using a Heuristic Regression Method Hybridized with Kmeans Clustering. *Sustainability*. 2021; 13(9):4648.
https://doi.org/10.3390/su13094648

**Chicago/Turabian Style**

Adnan, Rana Muhammad, Kulwinder Singh Parmar, Salim Heddam, Shamsuddin Shahid, and Ozgur Kisi.
2021. "Suspended Sediment Modeling Using a Heuristic Regression Method Hybridized with Kmeans Clustering" *Sustainability* 13, no. 9: 4648.
https://doi.org/10.3390/su13094648