Improving Flow Discharge-Suspended Sediment Relations: Intelligent Algorithms versus Data Separation

Information on the transport of fluvial suspended sediment loads (SSL) is crucial because of its effects on water quality, pollutant transport and transformation, dam operations, and reservoir capacity. Adopting a reliable method to estimate SSL accurately is therefore a key concern for watershed managers, hydrologists, river engineers, and hydraulic engineers. One of the most common methods for estimating SSL or suspended sediment concentrations (SSC) is the sediment rating curve (SRC), which has several weaknesses. Here, we optimize the SRC equation using two main approaches. First, three well-recognized metaheuristic algorithms (genetic algorithm (GA), particle swarm optimization (PSO), and imperialist competitive algorithm (ICA)) were used together with two classical approaches (Food and Agriculture Organization (FAO) and non-parametric smearing estimator (CF2)) to optimize the coefficients of the SRC regression model. The second approach separates the data based on season and flow discharge (Qw) characteristics. A support vector regression (SVR) model using only Qw as an input was also employed for SSC estimation, and the results were compared with the SRC and its optimized versions. Metaheuristic algorithms improved the performance of the SRC model, and the PSO model outperformed the other algorithms. The results also indicate that model performance was directly related to the temporal separation of data: estimates of SSC improve when the data are more homogeneous and confined to a limited range of climatic conditions. Moreover, optimizing the SRC through metaheuristic models was much more effective than separating data in the SRC model. Finally, with the same input data, SVR was superior to the SRC model and its optimized versions.


Introduction
Adequate, up-to-date information about sediment loads in rivers is important for hydraulic, river engineering, and water resources projects [1,2]. Sediment transport changes the channel dynamics and the ecologic and hydraulic conditions of a river [3]. Following the definitions of Einstein [4], a river can carry sediment in two distinct modes, bed load and suspended load, which depend on particle size, weight, and shape, and on the ambient hydraulic conditions. The fluvial sediment load is commonly measured directly or calculated indirectly [3]. Typically, due to a lack of facilities, technical constraints, difficulty in accessing remote areas, and high costs, direct and continuous collection of sediment data is not possible, especially in developing countries like Iran [5]. A review of previous studies shows that many researchers have used indirect methods or alternative approaches for estimating bed load [6][7][8][9] and suspended load [6,10]. Most of these sediment transport functions require comprehensive information on the channel, flow conditions, and sediment characteristics [11]. Therefore, simple methods based on accessible parameters are a more practical alternative for estimating sediment load; such relationships can be derived using regression analysis [3]. The sediment rating curve (SRC) is the most common regression-based model and has been applied throughout the world in different environmental conditions to estimate suspended sediment in rivers [12][13][14][15][16][17][18][19][20]. This method expresses the empirical relationship between suspended sediment loads (SSL) or suspended sediment concentrations (SSC) and flow discharge (Qw) by power, linear, or polynomial functions [13,19][20][21][22][23].
However, the SRC has several weaknesses that generate inaccuracies in the modeling and estimation of SSL. The first limitation arises from the bias caused by the logarithmic transformation of data during the computation of model coefficients, which leads to underestimation [13,24]. The second limitation is associated with the extrapolation of sediment values at high discharge [25]. Most of the values used to construct the SRC are measured at low flows, whereas the majority of the suspended load in rivers and streams is transported during flood periods. This generally leads to higher error in suspended load estimation at high discharge [12,26]. The third limitation is the typically wide scatter of values around the regression line when long-term suspended sediment data are plotted against discharge [13]. This scatter can arise because, in the SRC model, suspended sediment is estimated from discharge alone, while the amount of suspended sediment at equal discharge values may vary between events (e.g., rainfall, snowmelt, base flow) [27]. The same effect can be seen within a flood hydrograph, where the amount of suspended sediment at a given discharge at the beginning of the rising limb can be quite different from that on the falling limb (i.e., hysteresis effects [28]). This is because a large amount of sediment is usually available for transport on the rising limb and at peak flow, whereas less sediment is available on the falling limb; in some cases, however, this hysteresis can be reversed [26]. In addition, the occurrence of successive floods reduces the sediment load in a river, which may result in different amounts of sediment load at equal discharges [29,30].
To overcome these limitations of SRCs, various methods have been suggested. Among these are classical correction coefficient models such as the Food and Agriculture Organization (FAO) method, the quasi-maximum likelihood estimator (QMLE), and the non-parametric smearing estimator (CF2) [24,31,32]. Despite their differences, the common aim of these models is to increase the accuracy of the values calculated by the SRC. Another way of tackling this problem is to prepare different SRC models based on time separation of data (e.g., seasons, months) and on flow characteristics (e.g., high water and low water periods, similar hydrological periods, discharge classes). Various studies have confirmed that classifying data into hydrological groups and increasing the time resolution can lead to better model performance [16,17,19,26,33][34][35][36][37][38][39][40]. The goal of data separation is to reduce data scatter around the regression line and thereby increase the estimation power of the model.
Alternatively, to increase the accuracy of SSL estimation, the use of intelligent algorithms as a type of nonlinear network has recently revolutionized forecasts and estimates of suspended sediment in rivers worldwide [41][42][43][44][45][46][47][48]. Metaheuristic algorithms are considered one of the most important types of intelligent algorithms for optimization problems across a wide spectrum of applications. These algorithms search the solution space randomly but purposefully [49], and perform well on various optimization problems, such as nonlinear, non-convex, or noisy functions [50]. The application of different types of metaheuristic optimization algorithms to water resources projects has been increasing in recent years [51][52][53][54][55][56]. To optimize SRC coefficients using metaheuristic algorithms, Tabatabaei et al. [57] compared the genetic algorithm (GA) and the nondominated sorting genetic algorithm II (NSGA-II) with traditional methods, namely FAO, QMLE, and smearing. Their results indicated the superiority of NSGA-II over the other methods. Tabatabaei and Salehpour Jam [58] used GA and particle swarm optimization (PSO) algorithms; Pour et al. [59] used GA and ant colony optimization (ACO) algorithms; Ebrahimi et al. [60] used GA and the honey-bees mating optimization (HBMO) algorithm; and Altunkaynak [61] used GA to optimize the SRC. All of these studies reported high efficiency for these algorithms. Furthermore, the teaching-learning based optimization (TLBO) and artificial bee colony (ABC) algorithms were employed to optimize the coefficients in regression equations for estimating SSL, and the results indicated that ABC and TLBO were more efficient than traditional methods [62].
Considering the global use of the SRC method among experts and researchers, as well as its limitations, our study focuses on finding the most effective methods for improving such applications. Specifically, our study aims to improve the flow discharge-suspended sediment relation by: (1) optimizing the coefficients of the SRC using metaheuristic algorithms, namely the imperialist competitive algorithm (ICA), PSO, and GA, and classic corrective methods (FAO and CF2); (2) improving the SRC based on time separation of data (seasonal) and flow characteristics; and (3) developing a support vector regression (SVR) model using only Qw as an input to provide conditions similar to those in the SRC method. When intelligent models are developed, the main emphasis is often on the optimum design of the networks and the choice of input data, whereas the separation of input data has rarely been considered. Therefore, in our study, a comprehensive comparison of different SSC estimation models is conducted to determine the most effective methods.

Study Area
The Boostan dam watershed, with an area of 1533.3 km², is located between 37°24′05″ and 37°47′33″ North latitude and 54°29′30″ and 56°05′35″ East longitude and is a sub-basin of the GorganRoud basin, Golestan province, Iran (Figure 1). This watershed was selected as a case study due to the availability of data and the absence of major abstractions or dams in the upstream reaches. The Boostan dam watershed has an elevation range from 108 to 2174 m a.s.l. (mean = 753 m) and an average slope gradient of 23%. According to the Amberje climate classification, the regional climate types include moderate semi-humid, cold humid, cold arid, and moderate semi-arid in different parts of the watershed. Average annual precipitation, average annual temperature, and relative humidity of the region are 483 mm, 17.8 °C, and 68.5%, respectively.

Data and Data Preprocessing
To conduct this research, Qw and SSC data for a period of 44 years (from 1969 to 2013) from the Tamar hydrometric station (located at the outlet) were collected and employed in modeling. The data were not completely continuous and contained numerous missing values; a total of only 687 records were used after deleting the outliers using box plots. In Iran, due to the high costs and labor of direct measurements, only one or two samples are collected each month. The statistical parameters of the field data used in this study are presented in Table 1. Figure 2 shows the time series graph for Qw and SSC during the statistical period. The dataset was divided into two groups at a 70:30 ratio for model building (the training dataset) and model evaluation (the testing dataset) in such a way that both datasets were relatively consistent in terms of statistical parameters. We tried to include the maximum and minimum data values in the training dataset. The statistical parameters for the training and testing phases are presented in Tables 2 and 3, respectively. Because of the very small and very large values and the high skewness, it can be inferred that SSC modeling is a complex process.

Note for Tables 1 to 3: the tables report the minimum, maximum, mean, standard deviation, skewness, and coefficient of variation of the data; Qw is flow discharge and SSC is suspended sediment concentration.

Considering the high variation of data (from very small to very large values), normalizing the data could improve the modeling process [62]. Hence, all the data were normalized within the range of 0 to 1 for the intelligent algorithms as follows [63,64]:

$$X_{norm} = \frac{X_{ori} - X_{min}}{X_{max} - X_{min}} \qquad (1)$$

where $X_{min}$ and $X_{max}$ represent the minimum and maximum values among the original data and $X_{norm}$ and $X_{ori}$ represent the normalized and original data, respectively.
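The min-max normalization described above can be illustrated with a short Python sketch. The function names and the sample discharge values are hypothetical and are not taken from the study; the inverse transform is included only because normalized model outputs usually need to be mapped back to physical units.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1] using the sample minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("all values are identical; the range is zero")
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(norm_values, x_min, x_max):
    """Map normalized values back to the original units."""
    return [x_min + v * (x_max - x_min) for v in norm_values]


# Hypothetical discharge values in m3/s (illustration only).
qw = [0.5, 2.0, 8.0, 20.0]
qw_norm = min_max_normalize(qw)
qw_back = denormalize(qw_norm, min(qw), max(qw))
```

In practice the minimum and maximum would normally be taken from the training dataset and reused for the testing dataset, which is one reason for including the extreme values in the training set.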

Sediment Rating Curve (SRC)
The most frequently used form of the SRC is a power function fitted by log-transformation and least squares regression, which is defined in general terms as [13]:

$$Q_s = a\,Q_w^{\,b} \qquad (2)$$

where $Q_w$ is the flow discharge (m³/s), $Q_s$ is the suspended sediment concentration (mg/L) or suspended sediment discharge (metric tons/day), and the regression coefficients a and b can be related to characteristics of soil erodibility and fluvial erosion, respectively [65].
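As a minimal sketch of how Equation (2) is typically fitted, the following Python code performs ordinary least squares on the log10-transformed data and back-transforms the intercept. The function names and the synthetic data are assumptions made for illustration; base-10 logarithms are also an assumption, although the fitted coefficients do not depend on the base.

```python
import math


def fit_src(qw, qs):
    """Fit Qs = a * Qw**b by least squares on log10(Qw) versus log10(Qs)."""
    x = [math.log10(q) for q in qw]
    y = [math.log10(s) for s in qs]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                       # slope in log space = exponent b
    a = 10 ** (mean_y - b * mean_x)     # anti-log of the intercept = coefficient a
    return a, b


def predict_src(a, b, qw):
    """Estimate suspended sediment for each discharge value."""
    return [a * q ** b for q in qw]


# Synthetic, noise-free data generated from a = 50 and b = 1.5 (not from the study).
qw = [1.0, 2.0, 4.0, 8.0]
qs = [50.0 * q ** 1.5 for q in qw]
a, b = fit_src(qw, qs)
```

With real, scattered data the back-transformed estimates are biased low, which is the motivation for the correction factors discussed next.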

Conventional Correction Factors

Food and Agriculture Organization (FAO)
The FAO correction factor is expressed in Equation (3) and used instead of coefficient "a" in Equation (2) [31]:

$$a_{FAO} = \frac{\bar{Q}_s}{\bar{Q}_w^{\,b}} \qquad (3)$$

where $\bar{Q}_s$ is the mean of $Q_s$, $\bar{Q}_w$ is the mean of $Q_w$, and b is the coefficient used in Equation (2).
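The two conventional corrections (the FAO coefficient above and the smearing estimator CF2 described in the next subsection) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the FAO form mean(Qs) / mean(Qw)**b follows the verbal definition in the text, the smearing factor is taken as the mean of 10 raised to the log10 residuals, and all function names and data are hypothetical.

```python
import math


def fao_coefficient(qw, qs, b):
    """FAO-corrected coefficient, assumed here as mean(Qs) / mean(Qw)**b."""
    mean_qs = sum(qs) / len(qs)
    mean_qw = sum(qw) / len(qw)
    return mean_qs / mean_qw ** b


def smearing_factor(qw, qs, a, b):
    """CF2, assumed as the mean of 10**e_i with e_i = log10(observed) - log10(estimated)."""
    residuals = [math.log10(s) - math.log10(a * q ** b) for q, s in zip(qw, qs)]
    return sum(10 ** e for e in residuals) / len(residuals)


# Hypothetical data: the SRC with a = 10, b = 2 fits the first two points exactly
# and underestimates the third by a factor of two.
qw = [1.0, 2.0, 4.0]
qs = [10.0, 40.0, 320.0]
a, b = 10.0, 2.0

a_fao = fao_coefficient(qw, qs, b)
cf2 = smearing_factor(qw, qs, a, b)
corrected = [cf2 * a * q ** b for q in qw]   # CF2 multiplies the SRC estimate
```

Both corrections rescale the curve without changing the exponent b, so they cannot alter the shape of the relation, only its level.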
Non-Parametric Smearing Estimator (CF2)
CF2 is a non-parametric correction factor used to improve the values estimated by the SRC model. CF2 is calculated using Equations (4) and (5) as follows [32]:

$$CF_2 = \frac{1}{n}\sum_{i=1}^{n} 10^{e_i} \qquad (4)$$

$$e_i = \log Q_{so} - \log Q_{se} \qquad (5)$$

where n is the sample size, $e_i$ is the error term or residual for each sample, $Q_{so}$ is the observed suspended sediment concentration or discharge, and $Q_{se}$ is the estimated suspended sediment concentration or discharge. CF2 is used in the SRC model as follows:

$$Q_s = CF_2 \cdot a\,Q_w^{\,b} \qquad (6)$$

Genetic Algorithm (GA)
GA as an intelligent model is a nonlinear search and optimization technique based on the concepts of natural genetics and Darwin's evolutionary theory, which was first proposed by Holland [66] in the early 1970s. The steps for conducting GA are summarized as follows [66,67]:

1. Developing a set of initial random answers; these answers, which are the primary solutions to the problem, are called chromosomes, and each one is made up of a set of genes. In the present study, the coefficients a and b in the SRC model are considered as genes and form a chromosome.

2. Comparing, ranking, and selecting the best chromosomes; after the initial population of chromosomes is developed, the efficiency of each chromosome in estimating the suspended sediment must be determined. At this point, using Equation (2) and the values of the genes in each chromosome (a and b coefficients), the amount of suspended sediment for the training data is estimated. Then, the suitability of that chromosome is determined using the objective function (root mean squared error (RMSE)) as [57]:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2} \qquad (7)$$

where n is the number of training data and $O_i$ and $S_i$ are the ith observed and estimated SSC in mg/L, respectively. After the suitability of the initial population of chromosomes is determined in the natural selection stage, the 50% least efficient chromosomes are removed from the initial population.

3. Selecting pairs (parents) for reproduction; at this stage, using selection operators, a pair of chromosomes from the set retained in the previous stage is chosen as the parents of the next generation. To accomplish this, the widely used roulette wheel selection method was applied [61]. In this method, chromosomes with more favorable answers are more likely to be selected.

4. Crossover; new and better chromosomes are produced to further investigate the solution space (the space containing possible coefficients for the SRC model). In this study, the blending method was employed to combine genes from the parent chromosomes and perform the reproduction. In each generation, the number of reproductions was determined by a parameter called the crossover rate.

5. Mutation; mutation is a mechanism that leads to a completely random change in the genes of chromosomes (answers to the problem). This prevents early convergence and getting stuck in local minima, enabling a better search within the answer space.

6. Convergence; convergence implies that the GA, by repeating successive generations, is no longer able to find better answers to the problem. There are various ways to stop the genetic algorithm, e.g., reaching a given number of generations, reaching a certain level of error, or a lack of significant progress in error reduction.
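The six GA steps above can be sketched compactly in Python. This is a simplified illustration, not the authors' Matlab implementation: the roulette wheel weights (inverse of RMSE), the use of the crossover rate as a per-child probability, the uniform-reset mutation, the search bounds, and the synthetic data are all assumptions, and the number of generations is reduced to keep the example fast.

```python
import math
import random


def rmse(params, qw, obs):
    """Objective function (Equation (7)) for the SRC Qs = a * Qw**b."""
    a, b = params
    return math.sqrt(sum((o - a * q ** b) ** 2 for q, o in zip(qw, obs)) / len(obs))


def ga_fit(qw, obs, bounds, pop_size=100, crossover_rate=0.75,
           mutation_rate=0.1, generations=100, seed=0):
    rng = random.Random(seed)
    cost = lambda c: rmse(c, qw, obs)
    # Step 1: random initial chromosomes, each holding the genes (a, b).
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: rank by RMSE and remove the worst 50%.
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]
        history.append(cost(survivors[0]))
        # Step 3: roulette wheel selection (weights assumed to be 1 / RMSE).
        weights = [1.0 / (cost(c) + 1e-12) for c in survivors]
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.choices(survivors, weights=weights, k=2)
            # Step 4: blend crossover, a random convex combination of the parents.
            if rng.random() < crossover_rate:
                beta = rng.random()
                child = [beta * g1 + (1 - beta) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = list(p1)
            # Step 5: mutation resets a gene to a random value within its bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            children.append(tuple(child))
        pop = survivors + children
    # Step 6: stop after a fixed number of generations.
    best = min(pop, key=cost)
    return best, cost(best), history


# Synthetic, noise-free data generated from a = 5, b = 1.2 (not from the study).
qw = [0.5, 1.0, 2.0, 4.0, 8.0]
obs = [5.0 * q ** 1.2 for q in qw]
bounds = [(0.0, 20.0), (0.0, 3.0)]   # hypothetical search ranges for a and b
best, best_cost, history = ga_fit(qw, obs, bounds)
```

Because the surviving half of each generation is carried over unchanged, the best RMSE recorded in `history` cannot increase from one generation to the next.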

Particle Swarm Optimization (PSO)
The PSO algorithm, proposed by Kennedy and Eberhart [68], is a social search algorithm inspired by the swarm behavior of bird flocks or fish searching for food. The steps for performing the PSO are summarized as follows:

1. Generation of the initial random population with random positions and velocities, each member of which is called a particle (the a and b coefficients in the SRC model are assumed to be equivalent to one particle).

2. Evaluation of the cost or fitness of each particle; at this stage, the amount of suspended sediment for the training dataset is estimated using Equation (2) and the values for each particle (a and b coefficients). Then, their suitability is evaluated using the objective function (Equation (7)).

3. Recording the best position for each particle (pbest) and the best position among all particles (gbest); at this step, each particle moves through the search space at an adjustable speed and retains its best previous position in memory. In addition, the best position found by the group is shared with all particles. Each particle is represented by a position vector and a velocity vector. The pbest of each particle is obtained by comparing its current position with the best position it has achieved so far, and the best of all pbest positions is identified as gbest.

4. Updating the position and velocity vectors of all particles; in this step, the transition of the particles to new positions is evaluated. The velocity and position of each particle are updated by Equations (8) and (9), respectively, written here in the standard PSO form:

$$v_i^{t+1} = \omega\,v_i^{t} + R_1 r_1\left(pbest_i - x_i^{t}\right) + R_2 r_2\left(gbest - x_i^{t}\right) \qquad (8)$$

$$x_i^{t+1} = x_i^{t} + v_i^{t+1} \qquad (9)$$

where $v_i^{t}$ and $x_i^{t}$ are the velocity and position of particle i, pbest and gbest represent the best personal position and the best position among all particles, respectively, t represents the iteration number, $R_1$ and $R_2$ are learning parameters which determine the movement slope of the local search, $r_1$ and $r_2$ are uniform random numbers in [0, 1], and ω is the inertia coefficient.

5. Convergence test; the algorithm is repeated for a predetermined number of generations or until the problem converges to an optimal solution.
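The PSO steps can be sketched as follows. The sketch uses the standard velocity and position updates with uniform random factors; clipping positions to the search bounds, the scale of the initial velocities, the bounds themselves, and the synthetic data are assumptions added for the example and are not described in the source.

```python
import math
import random


def rmse(params, qw, obs):
    """Objective function (Equation (7)) for the SRC Qs = a * Qw**b."""
    a, b = params
    return math.sqrt(sum((o - a * q ** b) ** 2 for q, o in zip(qw, obs)) / len(obs))


def pso_fit(qw, obs, bounds, n_particles=100, R1=2.0, R2=2.0, omega=0.7,
            iterations=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial positions (a, b) and small random velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.1 * rng.uniform(lo - hi, hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    # Steps 2 and 3: evaluate each particle, record pbest and gbest.
    pbest = [p[:] for p in pos]
    pbest_cost = [rmse(p, qw, obs) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(iterations):
        for i in range(n_particles):
            # Step 4: velocity update, then position update, per dimension.
            for d, (lo, hi) in enumerate(bounds):
                u1, u2 = rng.random(), rng.random()
                vel[i][d] = (omega * vel[i][d]
                             + R1 * u1 * (pbest[i][d] - pos[i][d])
                             + R2 * u2 * (gbest[d] - pos[i][d]))
                # Clipping to the bounds is an assumption of this sketch.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            cost = rmse(pos[i], qw, obs)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    # Step 5: stop after a fixed number of iterations.
    return gbest, gbest_cost, history


# Synthetic, noise-free data generated from a = 5, b = 1.2 (not from the study).
qw = [0.5, 1.0, 2.0, 4.0, 8.0]
obs = [5.0 * q ** 1.2 for q in qw]
bounds = [(0.0, 20.0), (0.0, 3.0)]   # hypothetical search ranges for a and b
gbest, gbest_cost, history = pso_fit(qw, obs, bounds)
```

The shared `gbest` is the "shared memory" that lets every particle be pulled toward the best region found by any member of the swarm.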

Imperialist Competitive Algorithm (ICA)
The ICA, proposed by Atashpaz-Gargari and Lucas [69], solves complex optimization problems by imitating the process of social, economic, and political evolution of countries. The steps for performing the ICA are summarized as follows:

1. Generating the random initial countries (the a and b coefficients in the SRC model are assumed to be equivalent to one country).

2. Dividing the countries into two categories based on the objective function of the problem (Equation (7)). Countries with the lowest values of the objective function are assumed to be imperialists and the rest are colonies.

3. Determining the number of colonies of each imperialist; to this end, the power of each imperialist must be evaluated. The stronger the imperialist, the greater the number of its colonies.

4. Applying the assimilation policy after the formation of the initial empires; in this algorithm, the assimilation policy is modeled as the movement of colonies towards their imperialists.

5. Revolution; revolution can be considered as a sudden and random change in the situation of the colonized countries.

6. Comparing the colonies and imperialists (intra-group competition); sometimes a colony, by moving towards an imperialist, reaches a new situation in which it has a lower cost function than the imperialist. In this case, the colony and the imperialist change positions.

7. Evaluation of empires (intergroup competition); at this stage, a colony is removed from a weaker empire and transferred to another empire. If the empire has no colony, its imperialist is transferred as a colony to another empire. As a result, during colonial competition, the power of larger empires gradually increases and weaker empires are eliminated.

8. Continuing the algorithm until the termination condition is met. The end point of colonial competition is a single empire whose colonies are very close to the imperialist country in terms of situation.
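A heavily simplified ICA can be sketched along the steps above. Several choices here are assumptions made to keep the example short and are not taken from the source: colonies are allocated with probability proportional to the inverse RMSE of each imperialist, revolution replaces a colony with a new random country, and the intergroup competition always moves one colony from the weakest empire to the strongest one (the original algorithm uses total empire power and a probabilistic choice). Bounds and data are synthetic.

```python
import math
import random


def rmse(params, qw, obs):
    """Objective function (Equation (7)) for the SRC Qs = a * Qw**b."""
    a, b = params
    return math.sqrt(sum((o - a * q ** b) ** 2 for q, o in zip(qw, obs)) / len(obs))


def ica_fit(qw, obs, bounds, n_countries=100, n_imperialists=20, beta=2.0,
            p_revolution=0.1, iterations=100, seed=0):
    rng = random.Random(seed)
    cost = lambda c: rmse(c, qw, obs)
    new_country = lambda: [rng.uniform(lo, hi) for lo, hi in bounds]
    clip = lambda c: [min(max(v, lo), hi) for v, (lo, hi) in zip(c, bounds)]
    # Steps 1 and 2: random countries; the lowest-cost ones become imperialists.
    countries = sorted((new_country() for _ in range(n_countries)), key=cost)
    empires = [{"imp": c, "cols": []} for c in countries[:n_imperialists]]
    # Step 3: stronger imperialists are more likely to receive each colony.
    weights = [1.0 / (cost(e["imp"]) + 1e-12) for e in empires]
    for colony in countries[n_imperialists:]:
        rng.choices(empires, weights=weights, k=1)[0]["cols"].append(colony)
    history = []
    for _ in range(iterations):
        for e in empires:
            for j, col in enumerate(e["cols"]):
                # Step 4: assimilation, move the colony toward its imperialist.
                col = [v + beta * rng.random() * (t - v) for v, t in zip(col, e["imp"])]
                # Step 5: revolution, a sudden random change of the colony.
                if rng.random() < p_revolution:
                    col = new_country()
                e["cols"][j] = clip(col)
            # Step 6: a colony that beats its imperialist takes its place.
            if e["cols"]:
                k = min(range(len(e["cols"])), key=lambda j: cost(e["cols"][j]))
                if cost(e["cols"][k]) < cost(e["imp"]):
                    e["imp"], e["cols"][k] = e["cols"][k], e["imp"]
        # Step 7 (simplified): the weakest empire loses a colony to the strongest;
        # an empire without colonies is absorbed and eliminated.
        if len(empires) > 1:
            empires.sort(key=lambda e: cost(e["imp"]))
            weakest, strongest = empires[-1], empires[0]
            if weakest["cols"]:
                strongest["cols"].append(weakest["cols"].pop())
            else:
                strongest["cols"].append(weakest["imp"])
                empires.pop()
        history.append(min(cost(e["imp"]) for e in empires))
    # Step 8: stop after a fixed number of iterations.
    best = min((e["imp"] for e in empires), key=cost)
    return best, cost(best), history


# Synthetic, noise-free data generated from a = 5, b = 1.2 (not from the study).
qw = [0.5, 1.0, 2.0, 4.0, 8.0]
obs = [5.0 * q ** 1.2 for q in qw]
bounds = [(0.0, 20.0), (0.0, 3.0)]   # hypothetical search ranges for a and b
best, best_cost, history = ica_fit(qw, obs, bounds)
```

An imperialist is only ever replaced by a strictly better colony, so the best cost across empires cannot get worse between iterations.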

In optimization algorithms, there are parameters whose values affect the performance of the algorithm, the convergence speed, and the quality of solutions [70]. In our study, the parameters of the GA, PSO, and ICA were adjusted through trial and error as follows. For GA: number of initial chromosomes (population size) = 100, crossover rate = 0.75, mutation rate = 0.1, and maximum number of iterations = 500. For PSO: number of initial particles = 100, learning parameters (R1 and R2) = 2, inertia coefficient (ω) = 0.7, and maximum number of iterations = 500. For ICA: number of initial countries = 100, number of initial imperialist countries = 20, colony assimilation coefficient = 2, revolution probability = 0.1, and maximum number of iterations = 500.

Data Separation Techniques
Considering the role of seasonal changes and river flow dynamics in sediment yield and transport at the watershed scale, data were subdivided and separated into four groups to increase the efficiency of the SRC and SVR models as follows:

1. Seasons: Data were grouped by season;

2. Discharge classes: Data were divided based on annual average discharge such that in the first category discharge was less than the average discharge; in the second category, discharge was ≥ the average but less than twice the average; and in the third category, discharge was ≥ twice the average [72];

3. High water and low water periods: Mean monthly discharge was compared to the mean annual discharge. The months in which mean discharge was ≥ the mean annual discharge were considered as the high water period, and the months in which the mean discharge was less than the mean annual discharge were considered as the low water period [73];

4. Hydrograph state: The daily hydrograph of each water year was plotted and data were classified into three series based on the rising limb, falling limb, or base flow of the hydrograph [23].

Moreover, to assess the effect of these groups on the efficiency of the models in estimating suspended sediment, results were compared with a group without data separation (group 5).
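Two of the separation rules above (discharge classes and high/low water periods) are simple threshold tests and can be sketched as follows. The function names and sample records are hypothetical, and the mean annual discharge is approximated here by the mean of the supplied records, which is an assumption.

```python
def split_by_discharge_class(records):
    """Group (qw, ssc) records into classes 1 to 3 relative to the mean discharge."""
    q_mean = sum(q for q, _ in records) / len(records)
    groups = {1: [], 2: [], 3: []}
    for q, ssc in records:
        if q < q_mean:
            groups[1].append((q, ssc))      # below the average
        elif q < 2 * q_mean:
            groups[2].append((q, ssc))      # average up to twice the average
        else:
            groups[3].append((q, ssc))      # at least twice the average
    return groups


def water_period_by_month(records):
    """Label each month 'high' or 'low' from (month, qw) records."""
    annual_mean = sum(q for _, q in records) / len(records)
    by_month = {}
    for month, q in records:
        by_month.setdefault(month, []).append(q)
    return {m: ("high" if sum(v) / len(v) >= annual_mean else "low")
            for m, v in by_month.items()}


# Hypothetical records (illustration only).
samples = [(2.0, 30.0), (4.0, 55.0), (6.0, 90.0), (12.0, 400.0), (16.0, 900.0)]
classes = split_by_discharge_class(samples)

monthly = [(1, 10.0), (1, 14.0), (7, 2.0), (7, 4.0)]
periods = water_period_by_month(monthly)
```

Each resulting group would then get its own SRC or SVR model fitted on the training records that fall in that group.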

Machine Learning (ML) Model
The SVR model has been successfully applied as an ML model in geoscience. Vapnik et al. [74] proposed a version of the support vector machine (SVM) that performs regression instead of classification, known as the SVR model. In effect, the separating hyperplane of the SVM becomes the function fitted to the data, with the same properties. The use of the structural risk minimization (SRM) principle in the SVR modeling process gives this model strong generalization ability [75]. When the function is fitted, errors smaller than a tolerance ε are ignored: estimates that fall within a margin of ε on either side of the observations are accepted as they are, whereas estimates whose error is greater than ε are penalized and must be adjusted. The SVR function can be calculated using kernel functions as follows:

$$y = \sum_{i=1}^{n} \alpha_i\,k(x, x_i) + b \qquad (10)$$

where y is the output, b is the bias term, $\alpha_i$ is the Lagrange multiplier, and $k(x, x_i)$ is the kernel function. In our study, the SVR model was tested using the radial basis function (RBF) kernel, which has performed better than other kernel functions and is currently the most widely used kernel [44,76], as follows [77]:

$$k(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right) \qquad (11)$$

Optimal values of the model parameters, namely the width of the Gaussian kernel function (σ), the cost of constraint violation (C), and the error-insensitive zone (ε), were determined by trial and error.
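The building blocks of the SVR described above can be illustrated in pure Python: the RBF kernel, the kernel expansion used for prediction, and the ε-insensitive loss. This sketch does not train a model (solving for the multipliers requires a quadratic programming solver); the Gaussian form with 2σ² in the denominator, the scalar input (only Qw), and all names and numbers are assumptions for illustration.

```python
import math


def rbf_kernel(x, xi, sigma):
    """Gaussian (RBF) kernel for scalar inputs such as normalized Qw."""
    return math.exp(-((x - xi) ** 2) / (2 * sigma ** 2))


def svr_predict(x, support_x, alphas, b, sigma):
    """Kernel expansion: weighted sum of kernels centred on the support vectors, plus bias."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alphas, support_x)) + b


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the tube of half-width eps, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical, hand-picked coefficients (not fitted to any data).
support_x = [0.0, 1.0]
alphas = [2.0, -1.0]
y0 = svr_predict(0.0, support_x, alphas, b=1.0, sigma=0.5)

inside = eps_insensitive_loss(1.00, 1.05, eps=0.1)    # error within the tube
outside = eps_insensitive_loss(1.00, 1.30, eps=0.1)   # error beyond the tube
```

A small σ makes each kernel narrow, so predictions follow individual training points closely, while a large σ gives a smoother curve; C and ε then control how strongly deviations outside the tube are penalized and how wide the tube is.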

Model Evaluation and Comparison
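Finally, to compare the results and evaluate the efficiency of the models, scatter plots were used as a graphical method and four quantitative statistics, namely RMSE, mean absolute error (MAE), the Nash-Sutcliffe (NS) model performance coefficient, and the coefficient of determination (R²), were calculated as follows [57]:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2}$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - S_i\right|$$

$$NS = 1 - \frac{\sum_{i=1}^{n}\left(O_i - S_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2}$$

$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(S_i - \bar{S}\right)\right]^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2\,\sum_{i=1}^{n}\left(S_i - \bar{S}\right)^2}$$

where $O_i$ and $S_i$ are the ith observed and estimated SSC, and $\bar{O}$ and $\bar{S}$ are the averages of the observed and estimated SSC, respectively. Higher R² values represent stronger correlation between estimated and observed values, and lower values represent weaker correlation. The RMSE ranges from 0 to +∞, with 0 indicating a perfect match between estimated and observed values. NS ranges from −∞ to 1, with NS > 0.6 being considered acceptable model performance [45].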
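The four evaluation statistics can be computed with a few lines of Python. The function names and sample values are hypothetical; the example is chosen so that a constant offset between estimates and observations leaves R² at 1 while NS drops sharply, which shows why the two indices are reported together.

```python
import math


def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nash_sutcliffe(obs, sim):
    """1 minus the ratio of squared error to the variance of the observations."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and estimated values."""
    mean_o, mean_s = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


# Hypothetical values: every estimate is exactly 1 unit too high.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
scores = (rmse(obs, sim), mae(obs, sim), nash_sutcliffe(obs, sim), r_squared(obs, sim))
```

Here the estimates are perfectly correlated with the observations, yet the systematic bias reduces NS to 0.2.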

Results of the SRC Model Based on Data Separation and Non-Separation
To estimate suspended sediment using the SRC, Qw and SSC were log-transformed prior to the analysis. Then, a regression relationship between log Qw and log SSC was established for each of the groups in Section 3.3 based on the training dataset. Finally, the anti-log was calculated and the "a" and "b" coefficients in the SRC model (i.e., Equation (2)) were determined (Table 4). After extracting the SRC equations, the efficiency and accuracy of each equation were evaluated based on the testing data in each group (Table 5). Based on these results, the SRC without data separation had the lowest estimation power (NS = 0.19), while data separation significantly improved model performance. Among the seasonal groups, the SRC had the highest estimation power in winter (NS = 0.44) and the lowest in summer. This may be related to the dry conditions and short-term heavy rains in summer, which complicate the SSC patterns. Furthermore, the results revealed that the SRC was more effective when Qw was lower than the mean Qw (NS = 0.29) and in the low water period (NS = 0.33). Overall, estimated values were in closer agreement with observed values at lower flow discharge than at higher discharge. Among the hydrograph states, the SRC had the lowest performance on the falling limb (NS = 0.13). Notably, for the same Qw, SSC differs between the rising and falling limbs and is lower on the falling limb, which complicates the SSC process and weakens model performance.

Results of Optimization of the SRC Using Classical Methods and Metaheuristic Algorithms
SSC was estimated using SRC models modified by two classic methods (namely, FAO and CF2) and three metaheuristic algorithms (namely, GA, PSO, and ICA) without data separation, with the aim of modifying only the SRC coefficients to improve efficiency. To this end, a regression relationship between log Qw and log SSC was established during the training phase and the "a" and "b" coefficients were determined; these coefficients were then modified using the previously discussed classical techniques. Optimization of the coefficients using the GA, PSO, and ICA algorithms was conducted without log-transforming the data, using RMSE as the objective function in the Matlab environment. After the SRC equations were modified, the efficiency and accuracy of each equation were evaluated based on the test data (Table 6). Overall, the SRC models modified using the metaheuristic algorithms achieved better results than the SRC models modified using the FAO and CF2 correction factors. Although NS remained low for both groups of methods, they improved the estimation power of the original SRC. The low NS values could be due to the high sensitivity of NS to peak values, as it uses squared errors [78]. The same applies to RMSE, which is based on the squares of the residuals, while MAE is less sensitive to large values [79]. Moreover, low NS values could originate from sparse data at high discharge as well as the large amount of missing SSC-Qw data. As previously noted, field data are available only once or twice monthly in Iran; hence, finding a robust and reliable model to estimate SSC accurately is a challenging task. These findings are consistent with those of Tabatabaei et al. [57], Tabatabaei and Salehpour Jam [58], Pour et al. [59], Ebrahimi et al. [60], and Altunkaynak [61], who also state that metaheuristic algorithms are more efficient than classic optimization methods.
The fit of the different SRC models to observed SSC data during the training phase shows that SRC-PSO, SRC-GA, and SRC-ICA all estimate SSC more accurately than the other optimization models (Figure 3). Overall, the SRC-PSO model had the best fit to field observations during the training and testing phases. The PSO algorithm has a better chance of moving toward areas containing better solutions because of features such as constructive cooperation and shared memory between particles [80].

Results of SVR Models with Data Separation and Non-Separation
In the SVR model we developed, the input and output layers each contained a single neuron, with Qw as the input and SSC as the output. A proper architectural structure of the SVR model, as with other ML models, improves the suspended sediment estimate [63]. Therefore, we used trial and error to achieve the optimal network design and improve model performance. Based on the lowest RMSE values, the optimal values of the model parameters (namely, σ, C, and ε) were determined for all groups (Table 7). The SVR model was first developed using the entire training dataset without any data separation and evaluated using the testing dataset. Next, it was developed on the training dataset with separated data for the various groups. Finally, the estimated SSC values were evaluated using the testing dataset (Table 8).

Determination of the Best Method of Data Separation
To conduct a general comparison of the different groups, average values of the statistical indices for seasons, discharge classes, high water and low water periods, and hydrograph state were calculated. The results revealed that in both the SRC (Figure 4) and SVR (Figure 5) models, data separated by season led to the best estimation of SSC. This was followed, in order, by high water and low water periods, discharge classes, hydrograph state, and the model without data separation. Thus, all data separation methods improved performance. Research by Sichingabula [33], Horowitz [19], Collins et al. [34], Hassan [81], Zeng et al. [16], and Jung et al. [17] also confirms that temporal separation of data increases the accuracy of estimations more than other methods of data separation.

The Most Effective Model for Estimating SSC
Based on our findings and comparing performance indices (Figures 4 and 5 and Tables 5,  6  Overall, our findings show that the SVR model presented better results in all cases, with lower error values and higher NS and R 2 than the SRC, including its optimized versions. In general, according to our results, the SVR model can be used to estimate SSC in similar seasons instead of applying a model for the entire dataset. Moreover, the findings show that optimizing SRC through metaheuristic algorithms leads to higher performance than data separation for SRC. In fact, the results indicate that the original SRC model could not adequately represent the complex process of suspended load transport in the watershed [40] because, in addition to discharge, other factors control the supply and transport of suspended sediment in the watershed, such as rainfall (intensity and volume), sediment sources from previous flooding, land use/land cover changes, and anthropogenic changes in the watershed, none of which are considered as model inputs. Similarly, Rodriguez et al. [29] noted that discharge explained only 19% of the variance in suspended sediment even though it is the only input to the SRC method. Another reason the original SRC model underperforms is its simple structure.
The comparison of observed versus estimated SSC using the best-performing model (i.e., SVR) indicates relatively good agreement (Figure 6). However, as is often the case in modelling [33], peak values were not accurately estimated. This poor performance for peak value prediction is partly due to uncertainties in the suspended sediment data and the low number of data samples at high discharge. Additionally, missing values can significantly affect the results. Lastly, many studies have shown that SSC cannot be estimated using only Q w [82,83].
In fact, in addition to the modeling tools [63], selection of acceptable input data can influence model results [64]. Our results showed that for accurate estimation of SSC, relevant variables other than Q w are necessary for building an effective input scenario. Several studies have shown that the utilization of antecedent values of Q w and Qs [63], meteorological parameters such as rainfall, temperature, and potential evapotranspiration [84,85], hydro-geomorphic variables such as the index of sediment connectivity [48], and biophysical data such as the normalized difference vegetation index (NDVI) [84,86], along with the hydrological parameters utilized in this study, can most likely improve estimates of suspended sediment production.
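To make the idea of an input scenario with antecedent discharge concrete, the sketch below builds lagged input rows of the form [Q w (t), Q w (t-1), ..., Q w (t-n)] and aligns them with the SSC targets. This was not part of the present study, which used only the current Q w; the series values are synthetic and the function name and lag count are hypothetical.

```python
def build_lagged_inputs(qw, ssc, n_lags):
    """Pair each SSC(t) with [Qw(t), Qw(t-1), ..., Qw(t-n_lags)].

    The first `n_lags` time steps are dropped because they lack
    a complete set of antecedent discharge values.
    """
    if len(qw) != len(ssc):
        raise ValueError("qw and ssc must have the same length")
    inputs, targets = [], []
    for t in range(n_lags, len(qw)):
        inputs.append([qw[t - k] for k in range(n_lags + 1)])
        targets.append(ssc[t])
    return inputs, targets

# Synthetic series, for illustration only.
qw_series = [3.0, 5.0, 9.0, 6.0, 4.0]
ssc_series = [12.0, 30.0, 80.0, 41.0, 20.0]

inputs, targets = build_lagged_inputs(qw_series, ssc_series, n_lags=2)
```

Other candidate predictors (for example rainfall or NDVI) could be appended to each row in the same way, provided they are sampled at the same time steps.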
The superior performance of the SVR as a modeling tool compared to the SRC model under similar conditions (with Q w as the input and without data separation) indicates that ML models can better capture nonlinear relationships between system inputs and outputs due to their (1) non-linear structure, (2) robustness to missing data, and (3) high flexibility [87]. Previous studies by Chiang et al. [43], Zounemat-Kermani et al. [45], Rajaee et al. [88], Muhammadi et al. [89], Kisi et al. [90], Alp and Cigizoglu [91], and Cobaner et al. [92] have also noted the superiority of ML models over traditional regression models (e.g., the SRC model) for estimating suspended sediment.
Figure 6. Observed SSC versus estimated SSC using the best models (i.e., SVR) for the testing dataset.

Conclusions
Because direct measurements of fluvial suspended sediment loads are costly and time-consuming, developing reliable and accurate models to estimate this parameter is an important but challenging task for hydrologists and river engineers, especially in regions with few hydrometric stations. Our research examined methods for estimating SSC reliably and accurately by: (1) optimizing SRC using metaheuristic algorithms (GA, PSO, and ICA) and classical approaches (FAO and CF2), (2) improving the estimation power of SRC by separating data based on season and flow discharge characteristics, and (3) using the SVR model with only flow discharge values as inputs, similar to the SRC approach. Our findings indicate that the metaheuristic optimization algorithms are more efficient than the classic correction factors. Among the metaheuristic algorithms, PSO had the highest optimization efficiency for the coefficients of the SRC model. These results show that metaheuristic algorithms can be employed instead of the classic correction coefficients and log-transformation of data in the SRC model. Moreover, seasonal separation of data improved the accuracy of the SRC more than the other methods of data separation. Overall, optimization of the SRC model using metaheuristic algorithms was more effective than data separation. The results also indicate that the SVR model had higher efficiency for estimating SSC than the SRC model optimized with metaheuristic algorithms.
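As a rough illustration of how PSO can optimize the SRC coefficients, the following minimal sketch fits a power-law rating curve SSC = a·Q w^b by minimizing RMSE directly, without log-transformation. It is a generic single-objective PSO under stated assumptions, not the configuration used in this study: the swarm size, iteration count, inertia and acceleration constants, coefficient bounds, the choice of RMSE as the error function, and the noise-free synthetic data are all hypothetical.

```python
import math
import random

def src(qw, a, b):
    """Power-law sediment rating curve: SSC = a * Qw**b."""
    return a * qw ** b

def rmse(params, qw_obs, ssc_obs):
    """Error function minimized by the swarm."""
    a, b = params
    return math.sqrt(sum((src(q, a, b) - s) ** 2
                         for q, s in zip(qw_obs, ssc_obs)) / len(qw_obs))

def pso_fit(qw_obs, ssc_obs, bounds, n_particles=30, n_iter=200,
            inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Return (best_params, best_cost, initial_best_cost)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [rmse(p, qw_obs, ssc_obs) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    initial_cost = gbest_cost
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            cost = rmse(pos[i], qw_obs, ssc_obs)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest, gbest_cost, initial_cost

# Synthetic, noise-free data generated from a = 2.0, b = 1.5 (illustration only).
qw_obs = [1.0, 2.0, 4.0, 8.0, 16.0]
ssc_obs = [src(q, 2.0, 1.5) for q in qw_obs]
bounds = [(0.1, 10.0), (0.5, 3.0)]  # assumed search ranges for a and b

best, best_cost, initial_cost = pso_fit(qw_obs, ssc_obs, bounds)
```

Because the error is evaluated in the original (untransformed) space, this kind of fit does not need a bias correction factor of the sort applied after a log-log regression.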
In general, our results emphasize the benefits of using soft computing methods to enhance the accuracy of SSC estimates. However, the SVR model predictions of SSC were not accurate when only flow discharge was used. This reveals the main weakness of the SRC method, which likewise relies on flow discharge alone. Our improvement in suspended sediment estimation is valuable for water resources planning and management. While our results are specific to the study watershed, the possibility of extending these findings to other watersheds needs to be explored. Given that the algorithms we used were single-objective, optimization was accomplished only by minimizing the error function. Thus, we suggest investigating multi-objective optimization algorithms (e.g., multi-objective particle swarm optimization (MOPSO)) with different objective functions.
Data Availability Statement: This study does not report any data.