Assessment of Soft Computing Techniques for the Prediction of Suspended Sediment Loads in Rivers

A key goal of sediment management is the quantification of suspended sediment load (SSL) in rivers. This research focused on a comparison of different means of suspended sediment estimation in rivers. This includes sediment rating curves (SRC) and soft computing techniques, i.e., local linear regression (LLR), artificial neural networks (ANN) and the wavelet-cum-ANN (WANN) method. Then, different techniques were applied to predict daily SSL at the Pirna and Magdeburg Stations of the Elbe River in Germany. By comparing the results of all the best models, it can be concluded that the soft computing techniques (LLR, ANN and WANN) better predicted the SSL than the SRC method. This is due to the fact that the former employed non-linear techniques for the data series reconstruction. The WANN models were the overall best performer. The WANN models in the testing phase showed a mean R² of 0.92 and a PBIAS of −0.59%. Additionally, they were able to capture the suspended sediment peaks with greater accuracy. They were more successful as they captured the dynamic features of the non-linear and time-variant suspended sediment load, while other methods used simple raw data. Thus, WANN models could be an efficient technique to simulate the SSL time series because they extract key features embedded in the SSL signal.


Introduction
Sedimentation is a nuisance in hydraulic and environmental engineering projects, such as dams, hydropower plants, canals and irrigation networks, wastewater treatment plants and water intakes, requiring sediment management measures for trouble-free operation of the facility. The problems caused by sedimentation include a reduction in channel conveyance, decrease in reservoir storage, blockage of the inlet to turbine, etc. [1,2]. Thus, estimation and forecasting the sediment load during the lifecycle of a project is a key design parameter in water resource planning and management.
Sediment load in rivers has been categorized into two types: suspended sediment load (SSL) and sediment bed load (SBL). Further, total annual river sediment in an alluvial river contains approximately 70 to 90 percent SSL [3]. The SSL comprises the main part of the river sediment transport and has a complex nature, in contrast with the SBL [4]. The SSL is considered one of the key factors affecting the landscapes [5] and pelagic environments [6,7]. It impacts river morphology, reservoir operation and useful life, as well as the functioning of hydraulic structures [8,9].
Sediments are produced by watershed erosion, triggered by the existence of erosive factors such as anthropic actions, climate, and specific features of the catchment. There are numerous variables and factors that participate in the dynamics of the hydrosedimentological processes that cause the detachment of the sediment particles from the watershed to their final influx in the river [10]. It is very difficult to estimate each of the individual erosional processes that contribute to sediment transport with any degree of certainty, let alone the final sediment load entering a river from the watershed. Therefore, the precise estimation and prediction of the SSL in a drainage basin has been a difficult task for hydrogeologists, environmentalists, and hydraulic engineers due to the aforementioned complex phenomena.
The scientific literature provides us with a large number of relationships, proposed by different researchers to estimate the sediment rate. These equations require several empirical parameters, depending on field conditions, for the precise computation of sediment load [11]. However, notwithstanding the availability of a large number of equations for the computation of sediment load, it is still a herculean task for an engineer to identify the appropriate equation for a particular river. Furthermore, none of these equations enjoys universal acceptability in the engineering community as far as the forecast of sediment transport rates in rivers is concerned.
Historically, several studies have been carried out to model sediment transport processes. Generally, mathematical models that require a large amount of input data and a prolonged computation time are used [12,13]. To overcome these complexities, numerous studies have been performed to simplify the complex phenomena by adopting practical approaches or methods that do not entail a physical mechanism. In this context, standard time-series approaches such as conventional sediment rating curves (SRC), multi-linear regression (MLR), and auto-regressive integrated moving average (ARIMA) are usually employed to predict hydro-meteorological variables, as well as SSL [14][15][16][17][18]. However, the main drawback of the standard time-series approaches is that they rely on linear methods designed for stationary data and are therefore limited in capturing the non-linear and non-stationary features of the hydro-sedimentological dataset.
Over the last two decades, thanks to technological advancements, soft computing techniques have increasingly been employed in hydro-sedimentological studies as an effective alternative modelling tool [19][20][21]. In this regard, artificial neural networks (ANN) and hybrid wavelet and neural networks (WANN) are the most popular techniques that have been applied.
The local linear regression model (LLR) is based on a non-parametric technique that is employed for the prediction of nonlinear time series [22]. It has been widely adopted for hydro-meteorological estimation such as solar radiation assessment [23][24][25], streamflow assessment [26,27], reservoir water level estimation [28] and SSL assessment [29].
The ANN technique mimics the biological brain and nervous system functioning with a self-learning capability. ANN has great potential in the modelling of complex and non-linear features of the hydro-meteorological time series. It is a black-box approach with no need for prior knowledge that has been applied to develop an efficient link between inputs and outputs. An extensive review of ANN applications in the hydrological field for the estimation and prediction of numerous hydrological parameters has been provided by the ASCE Task Committee [30,31]. In the last two decades, studies have shown that artificial neural networks (ANNs) have promising results in terms of modelling and forecasting streamflow [32,33], reservoir water level [34,35], and suspended sediment in rivers [2][36][37][38]. The ANN model is especially employed where basic physical interactions are not entirely known, but there are sufficient data to train a network.
Mirbagheri et al. [39] have compared various conventional methods and soft computing approaches for the prediction of suspended sediment load. Kisi and Shiri [40] compared different soft-computing techniques, such as Gene Expression Programming (GEP), ANFIS and ANNs, for a daily forecast of SSC. Olyaie et al. [41] compared the performance of various conventional sediment-rating curves (SRCs) and soft-computing approaches such as ANFIS and ANNs to estimate the daily SSL. Singh et al. [42] modelled suspended sediment using four different heuristic techniques and two regression-based techniques. Pektas et al. [43] investigated the extrapolation performance of the ANN models for suspended sediment data and concluded that the ANN model provides a closer estimation of the observed peaks than conventional models. Khan et al. [1] predicted the missing and future daily suspended sediment load with the ANN model for the Mangla reservoir, Pakistan, and used it as a boundary condition for the hydraulic model to calculate the bed level changes and delta advancement rate for greater precision.
In recent years, the wavelet transform (WT) analysis has become a powerful tool because it simultaneously interprets the temporal and frequency information of the signal, which improves the model performance. The wavelet transform analysis gives a time-frequency representation of a signal at various periods in the time domain, and it also analyzes the non-stationarities in time series [44]. Previous studies have also revealed that the WT analysis tool surpassed the Fourier transform in investigations of the non-stationary time series [45]. The WT analysis is a comparatively innovative approach in the field of water resource research, which can create different time series that classify probable trends, seasonal deviations, and internal correlation among different variables. Recently, numerous studies have reported on the employment of coupled wavelet transform and ANN (WANN) models to investigate and forecast the different parameters of hydrological and environmental engineering. Further, many former studies have indicated that the proficiency of short- or long-term forecasts of the hydrological data is enhanced by employing the hybrid WANN model approach owing to the advantage of multiresolution sub-time series as the ANN input data [2,46]. Dumka et al. [47] developed a hybrid WANN model for monthly rainfall-runoff modelling on the Bhakhara River, Uttarakhand. Saraiva et al. [48] enhanced ANN prediction accuracy in the context of the daily flow of intermittent rivers by employing WANN models. Bajirao et al. [2] estimated the SSL by using different soft computing approaches and concluded that the performance of the ANN model was enhanced by coupling WT analysis. Sharghi et al. [49] studied two different hydro-ecological rivers with different land-cover characteristics and concluded that the WANN models were successful in the prediction of SSL. Their assessment encompassed a river draining a small basin (142 km²) as well as a river with a large basin (≈10,000 km²). It was also suggested that the good quality of results obtained using WANN for SSL modelling merited its usage to model other hydro-environmental (groundwater, precipitation, etc.) processes.
Despite the adequate efficiency of the coupled WANN model approach, some drawbacks can be associated with WANN modelling. The WT analysis tool decomposes the primary time series data into different sub-series data with detailed hidden information. Therefore, a large volume of data is fed to the ANN model as input, which can cause model complexity, errors, non-convergence, and overtraining. Owing to the large number of time series input samples and the aforementioned WANN drawbacks, the selection of an appropriate input sample and data length to train the ANN models is the most important challenge to avoid errors and a complex dataset. Investigating the number of the appropriate input variables and the appropriate data length for model development is one of the particular aims of the current study, which aims to develop a precise model for predicting SSL.
In view of the aforementioned facts, the present study was undertaken to investigate and compare the performance of several soft computing techniques (LLR, ANN and WANN models) with a conventional modelling technique (SRC) for the estimation of daily SSL at the Pirna and Magdeburg Stations of Elbe River, Germany. The objectives of the current study are: (a) to evaluate the best technique for the prediction of daily SSL by employing different models at different sites; (b) to evaluate the selection of appropriate input variables and data length for a training dataset using the Gamma test and M-test; (c) to analyze the best-selected input parameters and validate the prediction performance based on sensitivity analysis.
The novelty of this research lies in the usage of the Hill-climbing model identification technique, i.e., Gamma test and the M test, for the prediction of daily SSL in rivers. As the hydrological processes are highly complex, dynamic, and non-linear, finding the best-input combination is important. In the literature, this is performed through a trial-and-error approach, which is laborious. In order to overcome this problem, a novel mathematical tool, the Gamma test (GT), which is a non-parametric test, is adopted for the selection of the most suitable input variables, which would lead to a reliable and smooth model. Furthermore, an important issue in data-driven models is the determination of a suitable data length for the model development, which is neither under-fitted nor over-fitted. In general, the literature suggested reserving 70% to 90% of the dataset to train a model. However, in order to find the optimal training data length, we used the M-test for model development. The M-test shows the relationship between datapoints and Gamma statistic (Γ) values, which can be used to determine the data length needed to obtain a uniform function convergence.

Sediment Rating Curve
Suspended sediment load is considered the main part of the transported sediments in rivers. The measurement of the daily suspended sediment concentration in rivers is a cumbersome and expensive process compared to the measurement of the daily river discharge. Hydrologists have proposed different approaches for the estimation of missing SSL, with some associated uncertainty [50,51]. The missing SSL estimation is based on the available measured river discharges and sediment concentrations and on the development of a regression relationship between river discharge (Q) and SSL. Hydrologists often use an empirical relation such as the sediment rating curve (SRC), which is a reliable method for the estimation and prediction of SSL. The sediment rating curve normally describes a functional association of SSL and Q, as shown in Equation (1), where SSL is the suspended sediment load in metric tons per day (t/day), C is the suspended sediment concentration in parts per million (PPM), Q is the flow discharge in cubic meters per second (m³/s), and a and b are constants. Further, the efficiency of the SRCs for the assessment of long-term catchment sediments can be improved by applying a correction factor (CF) in the SRC equation. In the present study, the Ferguson correction factor and the smearing correction factor were applied, among the different correction coefficients suggested by different scholars.
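As a rough illustration of how a rating curve of this kind is typically fitted, the sketch below assumes the common power-law form SSL = a·Q^b and estimates a and b by least squares on log10-transformed values. The power-law form, the function name `fit_rating_curve` and the synthetic data are assumptions made for illustration only and are not taken from the study.

```python
import numpy as np

def fit_rating_curve(q, ssl):
    """Fit SSL = a * Q**b by ordinary least squares in log10 space.

    The power-law form is an assumption of this sketch.
    Returns (a, b, residuals), with residuals in log10 units.
    """
    log_q = np.log10(np.asarray(q, dtype=float))
    log_ssl = np.log10(np.asarray(ssl, dtype=float))
    b, log_a = np.polyfit(log_q, log_ssl, 1)  # slope, intercept
    residuals = log_ssl - (log_a + b * log_q)
    return 10.0 ** log_a, b, residuals

# Synthetic, illustrative data (not from the Elbe record)
q_demo = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
ssl_demo = 0.05 * q_demo ** 1.5
a_hat, b_hat, log_residuals = fit_rating_curve(q_demo, ssl_demo)
```

Fitting in log space is what makes a bias correction necessary when the estimate is transformed back to the original units.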

Quasi-Maximum Likelihood Estimator of Ferguson Method
This coefficient was employed by Ferguson [52] and Horowitz and McConnell [53] to correct the bias introduced by the logarithmic transformation, assuming a normal distribution for the residual errors. This technique applies a correction coefficient derived from the square of the regression residual standard error. It is expressed in terms of the following quantities: e is the exponential function, s is the standard error of the regression equation, C_0 is the measured sediment concentration (t/day), C_e is the predicted sediment concentration (t/day), and n is the number of observations.
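A minimal sketch of this correction follows, assuming the form commonly cited for base-10 regressions: CF = exp(2.65 s²) with s² = Σ(log10 C_0 − log10 C_e)² / (n − 2). Both the constant 2.65 and the n − 2 denominator are assumptions of this sketch, and the function name is hypothetical.

```python
import numpy as np

def ferguson_correction(c_obs, c_pred):
    """Bias correction factor for a log10-space regression.

    Assumed form: CF = exp(2.65 * s**2), where s**2 is the residual
    variance of the log10 regression with n - 2 degrees of freedom.
    """
    log_res = np.log10(np.asarray(c_obs, dtype=float)) - np.log10(
        np.asarray(c_pred, dtype=float)
    )
    s2 = np.sum(log_res ** 2) / (log_res.size - 2)
    return float(np.exp(2.65 * s2))

# Illustrative values only: a perfect fit needs no correction
cf_perfect = ferguson_correction([10.0, 20.0, 40.0], [10.0, 20.0, 40.0])
cf_scatter = ferguson_correction([10.0, 20.0, 40.0, 80.0], [12.0, 18.0, 44.0, 70.0])
```

The factor is never below 1, so it always scales the back-transformed estimate upwards.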

Smearing Coefficient
This method, known as the non-parametric correction factor, is employed to remove bias without assuming a normal distribution of the residual errors [54]. It is computed from εi, the least-squares residuals of the regression model.
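The smearing idea can be sketched as the mean of the back-transformed residuals. The version below assumes a base-10 regression, so CF = (1/n) Σ 10^εi; the base and the function name are assumptions of this sketch.

```python
import numpy as np

def smearing_correction(log10_residuals):
    """Non-parametric (smearing) correction factor.

    Assumes the residuals come from a log10-space regression, so each
    residual is back-transformed with 10**eps before averaging.
    """
    eps = np.asarray(log10_residuals, dtype=float)
    return float(np.mean(10.0 ** eps))

cf_zero = smearing_correction([0.0, 0.0, 0.0])  # no scatter
cf_pair = smearing_correction([1.0, -1.0])      # (10 + 0.1) / 2
```

Because no distribution is assumed, the factor depends only on the empirical residuals.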

Local Linear Regression (LLR)
LLR is a non-parametric method that is generally employed to make a fast prediction with a high degree of accuracy. The LLR model is most effective and reliable in regions of high data density in the input space, but it is considerably less effective if the datapoints are scarce and distant from the query point. The main prerequisite for good model performance is the selection of the number of nearest neighbours, k, of the query point from the dataset; these neighbours are used to build a local linear model, which is then obtained by solving a linear system.
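The procedure described above can be sketched as follows: find the k nearest neighbours of the query point, fit a linear model with intercept to those neighbours only, and evaluate it at the query point. The function name, the Euclidean distance and the least-squares solver are illustrative choices, not the implementation used in the study.

```python
import numpy as np

def llr_predict(x_train, y_train, x_query, k=5):
    """Local linear regression prediction at a single query point."""
    x_train = np.asarray(x_train, dtype=float)   # shape (M, d)
    y_train = np.asarray(y_train, dtype=float)   # shape (M,)
    x_query = np.asarray(x_query, dtype=float)   # shape (d,)
    # k nearest neighbours of the query point (Euclidean distance)
    dist = np.linalg.norm(x_train - x_query, axis=1)
    idx = np.argsort(dist)[:k]
    # Local linear model y = c0 + c . x fitted by least squares
    design = np.column_stack([np.ones(idx.size), x_train[idx]])
    coef, *_ = np.linalg.lstsq(design, y_train[idx], rcond=None)
    return float(coef[0] + x_query @ coef[1:])

# Synthetic check on an exactly linear surface
rng = np.random.default_rng(1)
x_demo = rng.uniform(0.0, 1.0, size=(50, 2))
y_demo = 1.0 + 2.0 * x_demo[:, 0] + 3.0 * x_demo[:, 1]
pred = llr_predict(x_demo, y_demo, [0.3, 0.7], k=5)
```

A small k follows local structure closely but is sensitive to noise, while a large k smooths the prediction, which is why k is the main tuning parameter.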

Artificial Neural Networks (ANN)
The ANN is inspired by the biological (brain) neuron system that is well-suited to the modelling of non-linear and complicated tasks such as the estimation and forecasting of rainfall, runoff and river sediment. Generally, the feed-forward multilayer perceptron neural network (MLPNN) is chosen from the various ANN structures, and extensively applied for the classification and regression analysis of nonlinear datasets [55].
In the neural network, each neuron (node) has input and output variables. Each neuron determines its output value by applying the net function and the activation (transfer) function to its input variables. Generally, the net function takes the linear form $u = \sum_{i} W_i x_i + b$, where x_i is an input variable, W_i is the connection weight from the ith neuron in the input layer, and b is the bias or threshold value of the neuron. The net function (u) at a hidden node is transformed into output (y) using a non-linear activation function. The performance of the neurons can be improved by changing the transfer functions and varying parameters: for instance, the number of hidden layers, the number of neurons in a hidden layer, gains and thresholds. The neural network is trained by employing an appropriate learning algorithm that alters the connection weights among the neurons using the training dataset. These interconnection weights are fine-tuned by an error convergence technique until the simulated output best matches the targeted output. The weights are fixed after the successful completion of network learning.
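The computation at a single neuron, a weighted sum of the inputs plus a bias followed by a non-linear activation, can be sketched as below. The hyperbolic tangent is used as an example activation, and the weights and inputs are arbitrary illustrative numbers.

```python
import numpy as np

def neuron_output(x, w, b, activation=np.tanh):
    """Net function u = sum_i(w_i * x_i) + b followed by an activation."""
    u = float(np.dot(w, x) + b)
    return float(activation(u))

# Illustrative values: 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and tanh(0) = 0
y_out = neuron_output(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
```

Training adjusts `w` and `b` for every neuron so that the network output approaches the target values.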

Wavelet Transform (WT)
Wavelet transform (WT) analysis seems to be a leading approach compared to the Fourier transform (FT), particularly when analyzing non-stationary time series [56,57]. The WT analysis approach is preferred to the FT analysis because wavelets are localized in both the time and frequency domains, while the standard FT is only localized in the frequency domain. Owing to this ability of the WT, the changes in the hydrological processes can be analyzed.
The WT decomposes the data series by transforming it into its subcomponent "wavelets", a scaled and shifted version of the 'mother' wavelet. Scaling a wavelet indicates the process of amplifying or compressing the data series in time, and it is inversely proportional to the frequency. An amplified wavelet helps to observe the gradually varying changes in a data series, while a compressed wavelet helps to observe swift changes. On the other hand, shifting a wavelet simply means advancing or delaying the onset of the wavelet along the length of the data series. It helps to align a data series to examine a particular feature in the data.
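To make the decomposition idea concrete, the sketch below performs one level of a discrete wavelet transform with the Haar wavelet, chosen here only because it fits in a few lines. It splits a series into a smooth approximation and a detail component and then reconstructs the original. The function names and the padding rule for odd-length input are assumptions of this sketch.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        x = np.append(x, x[-1])  # pad odd-length input by repeating last value
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # slowly varying part
    detail = (even - odd) / np.sqrt(2.0)  # rapid changes
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt for even-length signals."""
    even = (approx + detail) / np.sqrt(2.0)
    odd = (approx - detail) / np.sqrt(2.0)
    out = np.empty(even.size * 2)
    out[0::2], out[1::2] = even, odd
    return out

approx, detail = haar_dwt([4.0, 6.0, 10.0, 12.0])
restored = haar_idwt(approx, detail)
```

Applying `haar_dwt` again to `approx` gives the next, coarser level, which is how a multi-level decomposition into sub-series is built up.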
The present study will not examine a detailed background theory of the wavelet transform. A mathematical summary of WT and the exhaustive literature on its applications are covered in Labat et al. [58].

Gamma Test (GT)
Hydrological processes are highly complex, dynamic, and non-linear. To find the best input combination, researchers usually resort to a trial-and-error approach, which is laborious and cumbersome. To overcome this problem, a novel mathematical tool, the Gamma test (GT), which is a non-parametric test, is adopted for the assessment of the best input variables, i.e., those capable of building a reliable and smooth model. The GT was derived by Stefánsson et al. [59] to carry out a nonlinear analysis for the estimation of the variance of the noise in the model output, called the best mean square error (MSE) or Gamma statistic (Γ), which is the error attainable by a smooth model when all the input data are considered. Further explanations regarding GT can be obtained from Tsui et al. [60], Durrant [61] and Jones et al. [62].
A primary dataset {(x_i, y_i), 1 ≤ i ≤ M} is used to construct an algorithm capable of capturing the link between the input x and the output y. The algorithm is developed on the premise that y is a function of x, which is decomposed into a smooth part and a noisy part. If f is a smooth function and r is the noise part that cannot be accounted for by any smooth data model, then the relationship is given as $y = f(x) + r$ [59].
The mean of the noise r can be taken as zero, since any constant bias can be absorbed into the unknown function f; the variance of r is then the Gamma statistic to be estimated.
Further, the outcome of the GT can be normalized by taking a scaled noise variance estimate, which generally ranges from 0 to 1, is named V_ratio, and is described as $V_{ratio} = \Gamma / \sigma^2(y)$, where σ²(y) denotes the variance of the output (y). The lower the value of V_ratio, the higher the predictability of the given output. The Gamma test could be applied to all combinations of inputs to find the best combination that produces the minimum absolute Gamma value. If there are m scalar inputs, then there are 2^m − 1 possible combinations, but this can produce numerous unrealistic input combinations. Thus, the best input combination was identified by employing various model identification approaches in the WinGamma software. In this study, the Hill-Climbing model identification technique is applied to find the best input combinations for the daily SSL prediction based on a minimum value of gamma (Γ) and V_ratio.
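The sketch below shows the Gamma test computation as it is usually described in the literature; it is an illustrative re-implementation, not the WinGamma software used in the study. For each of the first p nearest neighbours it computes the mean squared input distance δ(k) and half the mean squared output difference γ(k), then takes the intercept of the regression of γ on δ as Γ. The function name, the default p = 10 and the synthetic data are assumptions.

```python
import numpy as np

def gamma_test(x, y, p=10):
    """Return (Gamma statistic, V_ratio) for inputs x and output y."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    y = np.asarray(y, dtype=float)
    # Pairwise squared distances in input space (brute force, O(M^2) memory)
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :p]  # indices of the p nearest neighbours
    delta = np.take_along_axis(d2, nn, axis=1).mean(axis=0)
    gamma = 0.5 * ((y[nn] - y[:, None]) ** 2).mean(axis=0)
    slope, intercept = np.polyfit(delta, gamma, 1)
    return float(intercept), float(intercept / np.var(y))

# Synthetic example: smooth function plus noise of variance 0.2**2 = 0.04
rng = np.random.default_rng(0)
x_demo = rng.uniform(0.0, 2.0 * np.pi, size=500)
y_demo = np.sin(x_demo) + rng.normal(0.0, 0.2, size=500)
gamma_hat, v_ratio = gamma_test(x_demo, y_demo)
```

Repeating the call on different column subsets of `x` and keeping the subset with the smallest Γ and V_ratio mirrors the input-selection step; an M-test would repeat it for increasing numbers of rows.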

Data Normalization
The input and output data were transformed into a dimensionless state before passing through the GT and ANN models. Equation (10), used for data normalization, is $x_n = a \, \frac{x - x_{min}}{x_{max} - x_{min}} + b$, where x is the original value and x_n, x_min and x_max are the normalized, minimum and maximum values of the dataset, respectively. In this study, a and b were taken as 0.6 and 0.2, following the recommendation of Cigizoglu [63].
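A short sketch of this scaling is given below, assuming the linear form x_n = a·(x − x_min)/(x_max − x_min) + b, which with a = 0.6 and b = 0.2 maps the data onto the range 0.2 to 0.8. The function names are hypothetical, and the inverse transform is included because predictions have to be mapped back to physical units.

```python
import numpy as np

def normalize(x, a=0.6, b=0.2):
    """Scale data linearly so that min -> b and max -> a + b."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return a * (x - x_min) / (x_max - x_min) + b

def denormalize(x_n, x_min, x_max, a=0.6, b=0.2):
    """Invert normalize() given the original minimum and maximum."""
    return (np.asarray(x_n, dtype=float) - b) / a * (x_max - x_min) + x_min

raw = np.array([0.0, 5.0, 10.0])
scaled = normalize(raw)
restored = denormalize(scaled, raw.min(), raw.max())
```

Keeping the scaled values away from 0 and 1 leaves headroom for test-period values that fall somewhat outside the training range.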

Training and Testing of ANN and WANN Models
Several ANN structures are employed for meteorological and hydrological prediction, among which the Feed Forward Neural Networks are the most popular [64].
In this study, a feed-forward multilayer perceptron (MLP)-based ANN model with a three-layer network (input, hidden and output) is employed for the estimation and prediction of daily SSL. For the network-training method, the back-propagation (BP)-based Levenberg-Marquardt (LM) learning algorithm with the hyperbolic tangent sigmoid (tansig) activation function in the hidden layer and a linear function (purelin) in the output layer was applied to configure the MLP model. Further, the model performance can be enhanced by an appropriate selection of the number of neurons within a hidden layer. With too few neurons in a hidden layer, the learning process may be impaired, whereas too many neurons might increase computational time or cause a network-overfitting problem. To date, no clear rule has been suggested in the literature for the selection of the optimum number of neurons in the hidden layer. However, Olyaie et al. [41] proposed that the hidden layer neurons should range from √ 2n + m to the value 2n + 1, where n and m represent the number of input and output nodes, respectively. In this study, the number of neurons in the hidden layer is increased from 3 to 2n + 1 for both the ANN and the WANN models to avoid under- and over-fitting.
Furthermore, the coupled WANN model is very close to the ANN model. Before applying the ANN model, the input time series data are pre-processed through the WT analysis tool. The WT analysis serves to decompose the original time series data into subseries data of different time scales using the discrete wavelet transform (DWT) approach. The DWT has different types of mother wavelets, such as the Haar wavelet, Daubechies wavelet, Coiflet wavelet and biorthogonal wavelet, to decompose the original time series data. In this study, the Daubechies level four (db4) mother wavelet is applied due to its better performance in the sediment transport processes [46]. After the decomposition of the time series by WT, some of the resulting subseries need to be excluded if there is a poor correlation between the subseries and the observed data. Only the subseries data that have a significant correlation with the observed data are fed into an ANN model.
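The screening step described above, keeping only subseries that correlate meaningfully with the observed data, might look like the following sketch. The Pearson correlation, the threshold value and the function name are assumptions for illustration; the text does not state the criterion that was actually used.

```python
import numpy as np

def select_subseries(subseries, target, threshold=0.3):
    """Keep subseries whose |Pearson r| with the target meets a threshold.

    `subseries` maps a label to a 1-D array of the same length as `target`.
    The threshold is a hypothetical value chosen for this sketch.
    """
    target = np.asarray(target, dtype=float)
    kept = {}
    for name, series in subseries.items():
        r = np.corrcoef(np.asarray(series, dtype=float), target)[0, 1]
        if abs(r) >= threshold:
            kept[name] = float(r)
    return kept

# Illustrative data: one trend-like component, one rapidly alternating one
target_demo = np.arange(10.0)
components = {
    "approximation": 2.0 * target_demo + 1.0,
    "detail": np.array([1.0, -1.0] * 5),
}
kept = select_subseries(components, target_demo)
```

Only the retained components would then be passed to the ANN as inputs, which limits the input dimension and the risk of overtraining.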

Limitation and Uncertainties in SSL Prediction
The uncertainties in the description of any physical process arise either from limitations in measurement techniques, imperfect knowledge of the process being modelled, i.e., epistemic uncertainty, or both. In our case of SSL prediction through the soft computing technique, both of these sources are present. The measurement uncertainty comes into play through the accuracy of the input time-series data regarding SSL being fed to the model. The SSL comprises the bed sediment load component in suspension and the so-called wash load. The latter has no nexus with the sediment forming the channel bed, and consists of fine-sized particles originating from the catchment through sheet erosion. In rivers, sizes finer than 0.0625 mm are approximated as wash load. The SSL measurement is a delicate affair due to the variation in flow velocity and suspended sediment concentration over the depth. In principle, the SSL can be found by measuring the flow velocity and sediment concentration at a sufficient number of points over the cross-section and summing the product. In the field, the SSL is usually measured by an instrument known as a depth-integrating sampler. It is lowered at a uniform rate into the water from the surface to the channel bed, and then pulled back at the same rate. The sample concentration, thus collected, represents the mean concentration of the mixture, and when multiplied by the unit discharge, yields the sediment discharge per unit width. It can be appreciated that if there is a significant deviation from the mean sediment concentration over the vertical, there would be a corresponding error induced in the SSL. Usually, this deviation is maximal in streams where sand (size > 0.0625 mm) forms the predominant part of the suspended load, while it is minimal in streams where silt and clay form the bulk of the suspended load. Thus, the raw input time series of SSL has to be viewed against the backdrop of the type of sediment load carried by the river.
As regards uncertainty and limitations in coupled WANN models (epistemic uncertainty), some drawbacks can be associated with this type of modelling. The WT analysis tool decomposes the primary time series data into different sub-series data with detailed hidden information. Therefore, a large volume of data is fed to the ANN model as input, which can cause model complexity, errors, non-convergence, and overtraining. Owing to the large number of time series input samples and the aforementioned WANN drawbacks, the selection of an appropriate input sample and data length for the training of the ANN models is important to avoid errors and complexity in the dataset.
A further question to ponder regarding the efficacy of the present WANN model is its ability to predict SSL in highly turbid rivers. Generally, rivers in a temperate region carry less sediment than those in arid and semi-arid regions. Under such conditions, it is natural to wonder about the efficacy of the WANN technique, given the difference in sediment loading. However, it is suggested that this issue is more pertinent in physical models of flow and sediments than in data-driven models. A great majority of the former models only account for the water phase and assume that the sediment proportion is very small compared to the solvent. Therefore, in the case of a highly turbid flow, their predictions are less accurate, as turbulent eddy viscosity differs markedly for the sediment and water phases. However, in our case, the only physical input is the time series of flow and sediment load being fed to the model for training. Hence, the accuracy and length of the measured time series data assume a special importance, and if such data are available, then there is no reason that the present model would not perform equally well under varying turbidity conditions in different basins. Thus, the drawbacks associated with the WANN technique for SSL prediction are specific to numerical aspects, as identified above, and are not physical in nature.

Model Performance Evaluation
The goodness-of-fit measures applied to evaluate the simulation performance are percent bias (PBIAS), root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), and coefficient of determination (R²) (Bennett 2013, Rahman 2020). PBIAS measures the mean tendency of the simulated data to be larger or smaller than the observed data. The perfect value of PBIAS is 0, while values that lie within ±15% are considered acceptable. Positive and negative values identify model overestimation and underestimation, respectively. RMSE is a standard measure of the error of a model in predicting quantitative data. The NSE determines how well the computed values match the observed values, and R² describes the degree of collinearity between observed and computed data. R² lies between 0 and 1, while NSE can range from −∞ to 1; for both, higher values indicate less error, and values greater than 0.5 are considered acceptable. All these parameters were calculated by the following equations:
$\mathrm{PBIAS} = \frac{\sum_{i=1}^{n} (S_i - O_i)}{\sum_{i=1}^{n} O_i} \times 100$
$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (S_i - O_i)^2}$
$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} (O_i - O_{avg})^2}$
$R^2 = \left[ \frac{\sum_{i=1}^{n} (O_i - O_{avg})(S_i - S_{avg})}{\sqrt{\sum_{i=1}^{n} (O_i - O_{avg})^2 \, \sum_{i=1}^{n} (S_i - S_{avg})^2}} \right]^2$
where S is the simulated value, O is the observed value, n is the number of observed data entries, and avg denotes the average of the total values.
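The four measures can be computed as in the sketch below, which uses their standard definitions. The PBIAS sign convention, with positive values indicating overestimation, follows the description above; the function names and sample numbers are illustrative.

```python
import numpy as np

def pbias(sim, obs):
    """Percent bias; positive means the model overestimates on average."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(100.0 * np.sum(sim - obs) / np.sum(obs))

def rmse(sim, obs):
    """Root mean square error, in the units of the data."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def nse(sim, obs):
    """Nash-Sutcliffe efficiency; 1 is a perfect match."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def r_squared(sim, obs):
    """Squared Pearson correlation between observed and simulated values."""
    return float(np.corrcoef(obs, sim)[0, 1] ** 2)

# Illustrative values: every simulated value is 10% too high
obs_demo = np.array([1.0, 2.0, 3.0, 4.0])
sim_demo = 1.1 * obs_demo
scores = (pbias(sim_demo, obs_demo), rmse(sim_demo, obs_demo),
          nse(sim_demo, obs_demo), r_squared(sim_demo, obs_demo))
```

In this example R² is insensitive to the systematic 10% overestimate, while PBIAS and NSE register it, which is one reason for reporting several measures together.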

Gauging Stations
The Elbe River is the third largest waterway in Central Europe, which originates from the Giant Mountains (Krkonoše) in the Czech Republic, with a catchment area of 148,268 km² and a running length of 1094 km. Overall, one-third of the total length lies in the Czech Republic, while the rest of the course runs in Germany. The major parts of its catchment area lie in the low-mountain regions, which are primarily under agricultural use. A reach of about 600 km of the river downstream of the Czech Republic is free of barrages but has guide banks, groynes and flood protection structures. A detailed description of the catchment characteristics and incoming sediment challenges and their management for the Elbe River has been presented in several studies [65][66][67][68].
The available daily mean flow series and suspended sediment concentration data of the Elbe River at the Pirna gauging Station
The SSL at the two sites of interest on the river Elbe was measured by the Federal Institute of Hydrology in cooperation with the local boards of the German Waterways and Shipping Administration. Bulk water samples of 5 L were taken on every working day in the middle of the river channel from a boat. The samples were then filtered on paper filters and dried, and the filter residues were determined gravimetrically to yield the concentrations of total suspended matter in PPM. Further, there are small gaps in the measured data that mostly occurred on weekends and also a few large gaps in the available data from the years 1992-2019.
In this study, available daily mean flow and SSL data of around 27 years from November 1992 to November 2019 of both gauging stations were employed for training and testing all the developed models. The daily flow and SSL time series data for the Pirna and Magdeburg-Strombrücke gauging stations are shown in Figures 2 and 3, respectively. At the Pirna gauging station, the available observed Q and SSL data from 3 November 1992 to 21 June 2012 (around 73% of the total data) were selected for training and the data from 22 June 2012 to 30 November 2019 (around 27% of the total data) were selected for testing the model. At the Magdeburg-Strombrücke gauging station, the available observed Q and SSL data from 3 November 1992 to 9 June 2013 (around 75% of the total data) were selected for training and the data from 10 June 2013 to 30 November 2019 (around 25% of the total data) were selected for testing the model. The data length for the training and testing datasets was determined by employing the M-test. The training dataset has two benefits: first, it includes the maximum observed peaks of Q and SSL in the available data, and secondly, it also covers the significant possible variations in the dataset, which is helpful in model training. Furthermore, some peaks in Q and SSL are missing in the available data. Therefore, those peaks are not included in the training and testing period.

Statistical Analysis of Data
Several statistical parameters of the training, testing and whole datasets for the Pirna and Magdeburg-Strombrücke gauging stations are presented in Tables 1 and 2, respectively. Tables 1 and 2 include the minimum, maximum, mean, standard deviation (Sd), coefficient of variation (CV) and skewness coefficient (Csx) of the data. It should be noted that, like other empirical models, ANN models perform well only if they are not required to extrapolate beyond the range of the data employed for model development [69]. As Tables 1 and 2 show, the peak values of Q and SSL lie in the training dataset, and the statistical parameters of the training and testing datasets are comparatively similar, which enhances model performance. The statistics in Tables 1 and 2 also elucidate the variations in the Q and SSL data. The skewness coefficients at both gauging stations were low for the training and testing datasets. A low skewness is considered appropriate for model development because a greater skewness of the time series has a significant adverse impact on ANN performance [70].
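The statistics listed above can be computed with a few lines of code. The sketch below is illustrative only: the text does not state which estimators were used, so the sample standard deviation and the moment-ratio skewness are assumptions, and `describe` is a hypothetical helper name.

```python
import math

def describe(values):
    """Return min, max, mean, Sd, CV and skewness coefficient of a series."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator): an assumption.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Skewness as the moment ratio m3 / m2**1.5: an assumed definition.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,
        "csx": m3 / m2 ** 1.5,
    }

# Hypothetical SSL values in t/day, used only to exercise the function.
stats = describe([120.0, 95.0, 400.0, 150.0, 80.0, 2200.0])
print(stats)
```

Running the same function on the training and testing subsets separately gives the side-by-side comparison that the tables are used for.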

Selection of Optimal Input Combinations Based on Gamma Test (GT)
As hydrological processes are inherently dynamic, the current response depends on the present and former responses in the record of the hydrological system. It is therefore assumed that the present-day SSL ($S_t$) depends on the present-day river water discharge ($Q_t$), the discharges of the two preceding days ($Q_{t-1}$, $Q_{t-2}$) and the SSL of the two preceding days ($S_{t-1}$, $S_{t-2}$). Thus, the present-day SSL would be a function of the current and antecedent river water discharges and the antecedent SSL values, as follows:
$$S_t = f(Q_t, Q_{t-1}, Q_{t-2}, S_{t-1}, S_{t-2})$$
In this study, a total of five input variables ($Q_t$, $Q_{t-1}$, $Q_{t-2}$, $S_{t-1}$ and $S_{t-2}$) are considered for the modelling of SSL by analyzing the effects of various input combinations.
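The lagged input structure can be illustrated with a short sketch. It assumes gap-free, consecutive daily values (the real series has gaps that would need handling first), and the helper name `build_lagged_samples` is hypothetical.

```python
def build_lagged_samples(q, s):
    """Build inputs [Q_t, Q_t-1, Q_t-2, S_t-1, S_t-2] and targets S_t.

    Assumes q and s are equally long lists of consecutive daily values.
    """
    if len(q) != len(s):
        raise ValueError("q and s must have the same length")
    inputs, targets = [], []
    for t in range(2, len(q)):
        inputs.append([q[t], q[t - 1], q[t - 2], s[t - 1], s[t - 2]])
        targets.append(s[t])
    return inputs, targets

# Hypothetical four-day series: two usable samples result.
inputs, targets = build_lagged_samples([1.0, 2.0, 3.0, 4.0],
                                       [10.0, 20.0, 30.0, 40.0])
print(inputs[0], targets[0])
```

The first two days are lost because they have no complete two-day history.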
Based on these five input variables, a total of $2^m - 1$ (i.e., 31 for $m = 5$) input combinations can be used to model SSL. To find the best input combinations for both stations, the Gamma test (GT) was performed. For both gauging stations, the best 5 out of the 31 possible input combinations were selected based on the minimum Gamma statistic (Γ) and V-ratio criteria, as presented in Tables 3 and 4.
Table 3. Best input and output combinations based on GT for the Pirna gauging Station.
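The count of 31 candidate input combinations can be checked by enumerating every non-empty subset of the five variables. The variable labels below are plain-text stand-ins chosen for this sketch.

```python
from itertools import combinations

variables = ["Q_t", "Q_t-1", "Q_t-2", "S_t-1", "S_t-2"]

# Every non-empty subset of the five candidate inputs.
input_combinations = [
    subset
    for size in range(1, len(variables) + 1)
    for subset in combinations(variables, size)
]
print(len(input_combinations))
```

Each of these subsets would then be scored by the Gamma test, and the five with the lowest Gamma statistic and V-ratio retained.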

M-Test for the Selection of Training Data Length
After the selection of optimal input combinations through the GT, we determined a suitable training data length for model development, so that the model was neither under-fitted nor over-fitted. In general, the literature suggests that using 70% to 90% of the data for training is appropriate for the construction of a smooth model. To find the optimal training data length, we used the M-test. The M-test shows the relationship between the number of datapoints and the Gamma statistic (Γ), which guides the selection of a data length at which the statistic converges to a nearly uniform value. The results of the M-test for the Pirna and Magdeburg-Strombrücke gauging stations are displayed in Figures 4 and 5. The M-test identified a uniform convergence of the Gamma statistic to values of 0.000105 and 0.00104 at around 7000 datapoints (approximately 73% of the total data) and 7330 datapoints (approximately 75% of the total data) for the Pirna and Magdeburg-Strombrücke gauging stations, respectively.
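A rough sketch of how a Gamma statistic and an M-test curve can be computed is given below. It follows the commonly described near-neighbour formulation of the Gamma test (regressing half the mean squared output difference on the mean squared input distance over the first p neighbours and taking the intercept); the study's software, its value of p and its data scaling are not given here, so the function names, the default `p=10` and the brute-force neighbour search are all assumptions.

```python
def gamma_statistic(x, y, p=10):
    """Estimate the Gamma statistic (intercept of the gamma-delta regression).

    x: list of input vectors, y: list of outputs, p: number of near neighbours.
    Brute-force O(M^2) neighbour search; requires len(x) > p.
    """
    m = len(x)
    deltas, gammas = [0.0] * p, [0.0] * p
    for i in range(m):
        # Squared distances from point i to all other points, nearest first.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(x[i], x[j])), j)
            for j in range(m) if j != i
        )[:p]
        for k, (d2, j) in enumerate(neighbours):
            deltas[k] += d2 / m
            gammas[k] += (y[j] - y[i]) ** 2 / (2 * m)
    # Least-squares line gamma = intercept + slope * delta.
    md, mg = sum(deltas) / p, sum(gammas) / p
    slope = (sum((d - md) * (g - mg) for d, g in zip(deltas, gammas))
             / sum((d - md) ** 2 for d in deltas))
    return mg - slope * md

def v_ratio(x, y, p=10):
    """Gamma statistic divided by the variance of the output."""
    mean_y = sum(y) / len(y)
    var_y = sum((v - mean_y) ** 2 for v in y) / len(y)
    return gamma_statistic(x, y, p) / var_y

def m_test(x, y, sizes, p=10):
    """Gamma statistic for increasing numbers of leading datapoints."""
    return [(m, gamma_statistic(x[:m], y[:m], p)) for m in sizes]

# Noise-free toy data: the statistic should be close to zero.
x = [[float(i)] for i in range(30)]
y = [2.0 * i for i in range(30)]
print(m_test(x, y, [10, 20, 30], p=5))
```

In an M-test, the data length is chosen where the plotted statistic stops changing appreciably as more datapoints are added.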

The Standard Rating Curve (SRC) Approach
Normally, the conventional SRC is the preferred approach to estimate the SSL in rivers. For the development of the SRC model, the total observed data were divided into a training and a testing phase, as explained previously. The SRC approach was employed to develop a relationship between river water discharge (Q) and SSL, and the Q-SSL equations were derived with the USBR method (Equation (2)). By employing these equations, missing SSL values were estimated for those days on which only water discharge data existed. The effectiveness of this approach depends on the available paired datapoints that were employed to form the SRC. Further, two correction factors, i.e., the Ferguson correction factor (FCF) and the smearing correction factor (SCF), were calculated and applied to the derived equations for the estimation of SSL. The sediment rating curves derived for the Pirna and Magdeburg-Strombrücke gauging stations are displayed in Figure 6.
The performance of the different SRC approaches in estimating the SSL in the testing phase for the Pirna and Magdeburg-Strombrücke gauging stations is given in Table 5. Among all the sediment rating curve approaches, the SRC approach without any correction factor performed better than the SRC approaches with a bias correction factor at both gauging stations. However, the overall performance of all these approaches remained low: the PBIAS results showed that all SRC approaches overestimated the SSL, and the RMSE was high.
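As an illustration of the rating-curve idea, the sketch below fits a power law SSL = a·Q^b by least squares on log10-transformed data and computes two bias correction factors. The formulas for the correction factors are not given in this section, so the commonly cited forms are assumed: the Ferguson factor exp(2.651·s²), with s² the residual variance in log10 units, and Duan's smearing factor, the mean of the back-transformed residuals. The function name and sample data are hypothetical.

```python
import math

def fit_rating_curve(q, ssl):
    """Fit SSL = a * Q**b in log10 space; return a, b, FCF and SCF."""
    x = [math.log10(v) for v in q]
    y = [math.log10(v) for v in ssl]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    log_a = my - b * mx
    residuals = [yi - (log_a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance
    fcf = math.exp(2.651 * s2)                     # assumed Ferguson form
    scf = sum(10 ** e for e in residuals) / n      # assumed smearing form
    return 10 ** log_a, b, fcf, scf

# Hypothetical paired observations (Q in m3/s, SSL in t/day).
q_obs = [120.0, 250.0, 400.0, 800.0, 1500.0]
ssl_obs = [40.0, 150.0, 380.0, 1300.0, 4200.0]
a, b, fcf, scf = fit_rating_curve(q_obs, ssl_obs)

# Corrected estimates multiply the back-transformed curve by a factor.
ssl_fcf = [fcf * a * v ** b for v in q_obs]
print(a, b, fcf, scf)
```

Both factors are at least close to 1 and scale the estimates upward, which is consistent with the observation that the uncorrected curve performed better where the curve already overestimated the SSL.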

The Suspended Sediment Load (SSL) Estimation through Local Linear Regression Models
The LLR models were trained to predict the SSL for both the Pirna and Magdeburg-Strombrücke gauging stations by optimizing the number of nearest neighbours (NN) and a threshold value. The number of nearest neighbours is an important parameter in training the LLR model, and it depends on the data length. Jones [62] recommended that, if the data length is relatively short, the appropriate number of NN is in the range of 10-20, whereas for a larger input data length the number of NN should be increased to obtain a more precise solution. The threshold value in LLR is applied to filter the local eigenvectors, and its default value is around $10^{-6}$. Setting it to a low value or zero includes all the eigenvectors in the local model, while increasing the value filters out more eigenvectors. In this study, the optimal value of NN for each of the five best input combinations at each gauging station was selected by applying the increasing near neighbours test, choosing the value that gave the lowest and most nearly smooth line of the Gamma statistic and standard error. For instance, the value of NN for combination 1 of the Pirna station LLR model was taken as 22, where the Gamma statistic and the standard error are minimal and the line is smooth, as presented in Figure 9. Furthermore, the optimal threshold value was found to be 0.01 by trial and error. Subsequently, different LLR models were developed to compute the SSL for all the best combinations at each station, and the results for both stations are presented in Table 6.
To assess the efficiency of the LLR models, several statistical parameters were calculated (Table 6). The results showed that input combination 1 gave a relatively high accuracy at the Pirna station, while input combination 4 gave the best performance at the Magdeburg-Strombrücke station.
Table 6 shows that the testing values of R, $R^2$, RMSE and PBIAS of the best LLR models were 0.94, 0.89, 235.6 t/day and 8.78% at the Pirna station (Combination 1, NN 20) and 0.90, 0.82, 339.23 t/day and 3.17% at the Magdeburg-Strombrücke station (Combination 4, NN 22). By comparison, Table 5 shows that the corresponding values for the SRC approach were 0.77, 0.59, 464.27 t/day and 15.9% at the Pirna station and 0.82, 0.67, 475.85 t/day and 17.30% at the Magdeburg-Strombrücke station. These outcomes (Tables 5 and 6) reveal that the best LLR models are much better than the SRC model. The SRC approaches were not suitable for handling the complex sediment phenomena involved, although some adjustments (e.g., applying various fitted curves) could enhance the precision of the SRC approach.
The observed and predicted SSL, computed using the LLR models during the test phase for the Pirna and Magdeburg-Strombrücke stations, are presented in Figures 10 and 11, respectively.

The Suspended Sediment Load (SSL) Estimation through ANN Models
The best structure of the ANN models and the settings of their parameters were selected according to the lowest root mean square error (RMSE) of the training and testing sets. The optimization of the ANN depends on two factors, namely the ANN structure and the number of training iterations (epochs). Model efficiency can be enhanced by an appropriate selection of these two parameters for the training and testing of the ANN model. In this research, the best ANN model was developed using the tansig activation function in the hidden layer, the purelin activation function in the output layer, the Levenberg-Marquardt learning algorithm, $10^{-5}$ as the performance goal and 1000 epochs in a single-hidden-layer network. Another critical parameter is the number of nodes in the hidden layer. Olyaie et al. [41] proposed that the number of hidden layer neurons should range from $\sqrt{2n + m}$ to $2n + 1$. Following that recommendation, the number of neurons in the hidden layer was increased from 3 to 11 for input combinations 1 and 2 to construct several ANN models to estimate the SSL. The performance of the ANN model did not improve considerably when the number of hidden nodes was increased above the recommended threshold, as presented in Figure 12, which agrees with the findings of Abrahart and See [71], Rajaee [17], and Olyaie et al. [41]. To estimate model efficiency, the ANN model was applied to all the best combinations, 1 to 5, as given in Tables 3 and 4, for each gauging station. This was followed by the computation of different statistical indices, as presented in Table 7, for the evaluation of the best ANN model at each gauging station. The results revealed that combinations 1 and 2 provided the best accuracy in predicting the SSL for the Pirna and Magdeburg-Strombrücke stations, respectively.
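To make the network structure concrete, the sketch below evaluates the forward pass of a single-hidden-layer network with a tansig hidden layer and a purelin (identity) output, using five inputs and 11 hidden neurons, the largest hidden layer size tried. It is not the authors' implementation: the weights are random placeholders, Levenberg-Marquardt training is not shown, and the input values are hypothetical scaled numbers. The tansig function is mathematically equivalent to the hyperbolic tangent.

```python
import math
import random

def tansig(v):
    """Hyperbolic tangent sigmoid, equivalent to tanh(v)."""
    return math.tanh(v)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tansig hidden layer, linear (purelin) output."""
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

random.seed(0)
n_in, n_hidden = 5, 11
# Random placeholder weights; a trained model would supply these.
w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)]
            for _ in range(n_hidden)]
b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

# Hypothetical scaled inputs [Q_t, Q_t-1, Q_t-2, S_t-1, S_t-2].
y_hat = forward([0.2, 0.1, 0.3, 0.5, 0.4], w_hidden, b_hidden, w_out, b_out)
print(y_hat)
```

Varying `n_hidden` between 3 and 11 and retraining each time corresponds to the structure search described in the text.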
In combinations 1 and 2, given in Table 7, the ANN structure was (5, 11, 1) for both stations, i.e., 5 input, 11 hidden and 1 output neurons. Table 7 shows that the testing values of R, $R^2$, RMSE and PBIAS of the best ANN models were 0.95, 0.90, 233.48 t/day and 3.43% at the Pirna station (Combination 1, 11 nodes) and 0.91, 0.82, 335.71 t/day and 2.53% at the Magdeburg-Strombrücke station (Combination 2, 11 nodes).
Similarly, in Table 6, the above-mentioned parameters are 0.94, 0.89, 235.6 (t/day) and 8.78%, and 0.90, 0.82, 339.23 (t/day) and 3.17% for the LLR approach for the Pirna and Magdeburg-Strombrücke stations, respectively. The outcomes from Tables 6 and 7 reveal that the best ANN models are slightly superior to the best LLR models, but differences between the outputs of both techniques were not significant, and they could be taken as an alternate approach for modelling SSL.
The observed and predicted SSL of the best ANN models during the test phase for the Pirna and Magdeburg-Strombrücke stations are presented in Figures 13 and 14, respectively.

The Suspended Sediment Load (SSL) Estimation through WANN Models
In the hybrid WANN models, the original time series data of the different input combinations for each station, selected through the GT, were decomposed using the DWT approach and fed into the ANN models as input. The db4 mother wavelet was employed to decompose the input variables into various multi-frequency sub-signals at the appropriate decomposition level. The latter was determined by employing the empirical relation proposed by Kisi [72] and Alizadeh et al. [73]:
$$i = \mathrm{int}[\log(N)]$$
where $i$ is the appropriate decomposition level, $N$ is the number of datapoints and int[.] is the integer part function. Here, $N$ was taken as 7000 and 7330 datapoints for the Pirna and Magdeburg-Strombrücke gauging stations, respectively.
Therefore, the five selected input variables in combination 1 for the Pirna station and combination 2 for the Magdeburg-Strombrücke station were decomposed at level $i = 3$ by applying the db4 mother wavelet. Decomposing each selected input variable at level 3 generated one approximation (A3) and three details (D1, D2, D3), i.e., a total of four sub-signals. Thus, the five selected input variables ($Q_t$, $Q_{t-1}$, $Q_{t-2}$, $S_{t-1}$ and $S_{t-2}$) generated 20 (5 × 4) sub-signals. From all these decomposed sub-series, only those with a significant correlation with the observed data were fed into an ANN model to develop a hybrid WANN model for the prediction of SSL. Consequently, different WANN models were developed to compute the SSL for all the best combinations at each station.
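The decomposition level and the sub-signal count quoted above can be reproduced with a short calculation. The sketch assumes the logarithm in the empirical relation is base 10, which is the reading consistent with the reported level of 3 for both data lengths; the function name is hypothetical.

```python
import math

def decomposition_level(n_points):
    """Integer part of log10(N); base 10 is an assumption (see lead-in)."""
    return int(math.log10(n_points))

levels = {n: decomposition_level(n) for n in (7000, 7330)}

# Each variable yields one approximation plus one detail per level.
sub_signals_per_variable = levels[7000] + 1
total_sub_signals = 5 * sub_signals_per_variable
print(levels, total_sub_signals)
```

With a natural logarithm the level would come out as 8 instead, which does not match the text, hence the base-10 assumption.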
The predictive capacity of the different WANN models developed for both stations was assessed through statistical parameters such as R, $R^2$, RMSE and PBIAS. The performance of the different WANN models for the Pirna and Magdeburg-Strombrücke gauging stations in the training and testing phases is presented in Table 8. Table 8 shows that, among all the WANN models for both stations, the model with the input combination $Q_t$, $Q_{t-2}$, $S_{t-1}$ and $S_{t-2}$ had the least error and relatively higher accuracy. Further, the outcomes from Tables 7 and 8 revealed that the WANN models delivered a superior performance to the LLR and ANN models in suspended sediment load prediction.
The best WANN models for the Pirna and Magdeburg-Strombrücke gauging stations had testing values of R, $R^2$, RMSE and PBIAS of 0.97 and 0.95, 0.95 and 0.90, 160.26 and 248.26 t/day, and −0.73% and −0.46%, respectively (Table 8). They were more accurate than the best ANN models, which had testing values of R, $R^2$, RMSE and PBIAS of 0.95, 0.90, 233.48 t/day and 3.43% for the Pirna gauging station and 0.91, 0.82, 335.71 t/day and 2.53% for the Magdeburg-Strombrücke gauging station (Table 7).
$R^2$ values closer to 1 (the perfect-fit value), coupled with lower RMSE and PBIAS values (0 being the perfect-fit value), point to the WANN model as the most precise model in reproducing the observed SSL pattern at both stations, notwithstanding minor discrepancies. The best WANN model predictions in the testing phase were compared with the observations using scatter plots and time series graphs of observed versus predicted SSL, as shown in Figures 15 and 16 for the two stations, respectively.
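The four evaluation indices used throughout these comparisons can be sketched as below. Their formulas are not spelled out in this section, so two assumptions are made: $R^2$ is taken as the square of Pearson's R (consistent with the reported pairs of values), and PBIAS is defined so that positive values mean overestimation (consistent with how the SRC results are interpreted). The function name and sample data are hypothetical.

```python
import math

def evaluate(observed, predicted):
    """Return R, R2, RMSE and PBIAS (%) for paired series."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)
    rmse = math.sqrt(
        sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n
    )
    # Assumed sign convention: positive PBIAS indicates overestimation.
    pbias = 100.0 * sum(p - o for o, p in zip(observed, predicted)) / sum(observed)
    return {"R": r, "R2": r ** 2, "RMSE": rmse, "PBIAS": pbias}

# Hypothetical observed and predicted SSL values in t/day.
scores = evaluate([100.0, 250.0, 900.0, 400.0], [110.0, 240.0, 850.0, 430.0])
print(scores)
```

Some references define PBIAS with the opposite sign, so the convention should be checked before comparing values across studies.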

Comparison of Different Models Performance
The best models obtained with each technique were then compared with one another to determine the single most accurate model for the prediction of daily SSL in rivers, based on the performance evaluation indices R, $R^2$, RMSE and PBIAS. The WANN models outperformed the best LLR and ANN models and the SRC approaches in SSL prediction for both gauging stations, as presented in Figure 17.
Furthermore, the results of the performance evaluation parameters of the various models for both gauging stations are shown in Table 9. The best WANN models for the Pirna and Magdeburg-Strombrücke gauging stations had testing values of R, $R^2$, RMSE and PBIAS of 0.97 and 0.95, 0.95 and 0.90, 160.26 and 248.26 t/day, and −0.73% and −0.46%, respectively, and were better than the best ANN and LLR models, as well as the SRC, FCF and SCF approaches (Table 9). Overall, the results indicated that the WANN models for both stations had higher values of R and $R^2$ and lower values of RMSE and PBIAS. This established that they were the most accurate of the models in the prediction of the daily SSL in rivers. The models, in decreasing order of accuracy in the prediction of daily SSL, are as follows: WANN, ANN, LLR and the SRC approaches.
Figure 17. Comparison of different best models for the prediction of daily SSL by Taylor diagram.
The model efficiency was also assessed with regard to the ability to estimate the peak SSL values. The peak SSL values are usually considered the key aspect in the management of any river structure, as the design is based on those values. In this regard, the peak values were taken as the top 5% of the data from the original SSL time series. The performance of the different models was assessed through the statistical indices $R^2$, RMSE and PBIAS, and the results are given in Table 10. According to Table 10, the WANN models for both stations were more competent than the LLR, ANN and SRC models in capturing the peak SSL values.
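The selection of peak values as the top 5% of the observed series can be sketched as follows. The exact thresholding rule used in the study is not described, so this version (keep every pair whose observed value is at or above the value ranked at the 5% position) is an assumption, and `peak_pairs` is a hypothetical helper name.

```python
def peak_pairs(observed, predicted, top_fraction=0.05):
    """Return (observed, predicted) pairs for the largest observed values."""
    n_peaks = max(1, round(len(observed) * top_fraction))
    # Smallest observed value that still belongs to the top fraction.
    threshold = sorted(observed, reverse=True)[n_peaks - 1]
    return [(o, p) for o, p in zip(observed, predicted) if o >= threshold]

# Hypothetical series of 100 values; predictions slightly underestimate.
observed = [float(v) for v in range(1, 101)]
predicted = [0.9 * v for v in observed]
peaks = peak_pairs(observed, predicted)
print(len(peaks), min(o for o, _ in peaks))
```

The evaluation indices are then computed on these pairs only; ties at the threshold can make the selected set slightly larger than 5%.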
This research has established the superior performance of the ANN and the coupled ANN-WT (WANN) technique over the traditional SRC approach. It is pertinent to investigate the reasons behind this improved performance. The SRC is based on linear interpolation (no memory effect), while hydrological processes are inherently non-linear. The latter quality implies that a given state depends on a set of states before and after the particular time for which a predicted value of SSL is sought. The ANN's superior performance is explained by its non-linear interpolation capability. This capability comes from training the model on the entire time series, which endows it with a 'memory' that helps explain its success in better predicting SSL peaks. The WANN model is essentially an ANN that is fed time series data decomposed by the wavelet transform (WT). It goes a step further than the simple ANN, as the WT breaks the input time series down into component sub-series, which better capture the information contained in short-term, transient events such as suddenly appearing SSL peaks. Thus, the dynamic features of the studied phenomenon are better conveyed to the ANN model, which, in turn, enhances its predictive capacity.

Conclusions
Sediments carried by water are a nuisance, as they shorten the life of a reservoir and reduce the discharge-carrying capacity of channels, especially for tail-end users. Therefore, sediment management is central to river engineering, and much effort and energy are directed towards it. An important aspect of sediment management is the estimation of sediment, which is mostly found in suspended form in rivers and other water bodies.
This research focused on a comparison of the different means of suspended sediment estimation in rivers. This includes the traditional method, i.e., sediment rating curve (SRC) and soft computing techniques, i.e., local linear regression (LLR), artificial neural networks (ANN) and the wavelet-cum-ANN (WANN) method. All the methods require extensive data from the field, which were obtained from the two river gauging stations situated on the Elbe River in Germany. The data pertain to the daily mean flow and suspended sediment concentration.
The SRC represents a functional relationship, in power-law form, between the daily SSL in tons and the river volume flow rate. The LLR model is most effective and reliable in regions of high data density in the input space, but it is much less effective where the datapoints are scarce and distant from the query point. The ANN is inspired by the biological (brain) neuron system and is well-suited to the modelling of non-linear and complicated tasks such as the estimation and forecasting of rainfall, runoff and river sediment. In the WANN models, the original time series data of different input combinations for each station, selected through the Gamma test (GT), were decomposed using the discrete wavelet transform (DWT) approach and fed into the ANN models as input.
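As an illustration of the power-law form of the SRC, the sketch below fits SSL = a·Q^b by ordinary least squares on log-transformed data. This is one common way to fit such a curve and is an assumption here, not necessarily the fitting procedure of the study. The function names and the symbols `a` and `b` are hypothetical, and no correction for log-transformation bias is applied.

```python
import math

def fit_rating_curve(flow, ssl):
    """Fit ssl = a * flow**b by least squares in log-log space; returns (a, b)."""
    x = [math.log(q) for q in flow]   # requires strictly positive flows
    y = [math.log(s) for s in ssl]    # requires strictly positive loads
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope

def predict_rating_curve(flow, a, b):
    """Evaluate the fitted power law for each flow value."""
    return [a * q ** b for q in flow]
```

Because the curve maps each day's flow to a load independently of all other days, it carries no information about antecedent conditions, which is the limitation the soft computing models address.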
The models were built on the postulate that the present-day SSL depends on the volume flow rates of the present day and the two antecedent days, and on the sediment concentrations of the previous day and the day before it. Further, the Gamma test and the M-test were performed for the identification of optimal input combinations and the selection of a suitable data length for model development, respectively. Thus, by applying the hill-climbing model-identification technique, a total of five best input combinations were selected for each station based on the minimum Gamma statistic and V-ratio. Similarly, the M-test identified an optimal training data length of around 75% of the total data for accurate model development at both stations.
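The candidate input structure and the data split can be sketched as follows. The example builds the full set of five lagged candidates (Q_t, Q_{t-1}, Q_{t-2}, S_{t-1}, S_{t-2}) and splits the samples 75/25. The Gamma test that selects among these candidates is not reproduced, the function names are hypothetical, and a chronological split is an assumption, since the way the data were partitioned is not restated here.

```python
def build_lagged_samples(flow, conc, target):
    """Build rows [Q_t, Q_t-1, Q_t-2, S_t-1, S_t-2] paired with target[t]."""
    inputs, outputs = [], []
    # The first usable day is t = 2, because two antecedent days are needed.
    for t in range(2, len(target)):
        inputs.append([flow[t], flow[t - 1], flow[t - 2], conc[t - 1], conc[t - 2]])
        outputs.append(target[t])
    return inputs, outputs

def chronological_split(inputs, outputs, train_fraction=0.75):
    """Use the first train_fraction of samples for training, the rest for testing."""
    n_train = int(train_fraction * len(inputs))
    train = (inputs[:n_train], outputs[:n_train])
    test = (inputs[n_train:], outputs[n_train:])
    return train, test
```

Any of the selected input combinations corresponds to a subset of the five columns produced by `build_lagged_samples`.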
The goodness of fit of each model was assessed via R², RMSE and PBIAS. The SRC performed well, with an R² of 0.64, but it still overestimated the sediment load and was unable to capture the peak sediment rates, which are of great importance for design purposes. The best LLR model scored a mean R² of about 0.85, overtaking the best SRC results. Although it captured the peaks better than the SRC, it was still unable to match the observed SSL peaks. The ANN model achieved a mean R² of about 0.86 and a PBIAS of about 3%. A comparison between the LLR and ANN models showed that the ANN was slightly ahead of the LLR, but the difference was not significant and both may be considered equally good alternatives for the prediction of SSL. The WANN models in the testing phase showed a mean R² of 0.92 and a PBIAS of −0.59%. Overall, the results indicated that the WANN models for both stations had higher R² and lower RMSE and PBIAS values, thus establishing that they were more accurate than the other models in the prediction of daily SSL in rivers. In decreasing order of accuracy in the prediction of daily SSL, the models ranked as follows: WANN, ANN, LLR and SRC. Similarly, the WANN model was the most accurate of the models in reproducing the SSL peaks.
The soft computing methods (ANN, LLR, and WANN) performed better than the traditional technique (SRC), as they made use of non-linear techniques for data reconstruction. It can be concluded that, among all the models, the WANN models, which used decomposed data that captured the dynamic features of the non-linear and non-stationary SSL time series data, performed better than other models, which used simple raw data.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.