Application of Machine Learning Model for the Prediction of Settling Velocity of Fine Sediments

Sedimentation management is one of the primary factors in achieving sustainable development of water resources. However, due to the difficulty of conducting in-situ tests and the complex nature of fine sediments, dealing with issues related to settling velocity remains a challenging task. Hence, machine learning models appear to be a suitable tool for predicting the settling velocity of fine sediments in water bodies. In this study, three different machine learning-based models, namely the radial basis function neural network (RBFNN), back propagation neural network (BPNN), and self-organizing feature map (SOFM), were developed with four hydraulic parameters: the inlet depth, particle size, and the relative x and y particle positions. Five distinct statistical measures, namely the root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), mean absolute error (MAE), mean value accounted for (MVAF), and total variance explained (TVE), were used to assess the performance of the models. The SOFM with the 25 × 25 Kohonen map showed superior results, with an RMSE of 0.001307, NSE of 0.7170, MAE of 0.000647, MVAF of 101.25%, and TVE of 71.71%.


Background and Problem Statement
The movement characteristics of sediments (i.e., coarse and fine sediments) are considered one of the most complex study areas in the field of hydrology. Shallow riverbeds were one of the main causes of flood-related disasters reported worldwide, with an estimated financial loss of almost USD1800/s between 1990 and 2020 [1]. Sedimentation issues have always been a deep concern for various parties due to the severe impacts caused by sediments. Siltation, for instance, is a critical problem that is caused by the resuspension of fine sediments deposited within the sediment bed in water [2,3]. Siltation could also induce a large number of negative impacts on humans and the environment. Examples include changes in water flow patterns and bed structure, a reduced lifetime of reservoirs, aggravation of floods, deterioration of water quality leading to a shortage in water supply, and destruction of natural habitats, which threatens aquatic biodiversity [4,5]. Economically, stakeholders would need to bear exorbitant costs on their projects, such as carrying out landfilling and water treatment activities. Fine sediments are the root cause of the siltation process [6,7]. The transport of fine sediments is known to take place in locations ranging from rivers, reservoirs, dams, and lakes to coastal areas, where the flow structures, bed morphology, and water quality are highly affected [8,9].
Given the seriousness and the range of negative impacts of sedimentation problems, many studies related to sedimentation have been conducted in the past few decades. The transport mechanism of sediments could be fundamentally described by two main processes: the horizontal advection, and the vertical sedimentation that is due to gravitational acceleration [10,11]. In the past, there were several commonly used sediment-related parameters, such as water turbidity [12], salinity [13], suspended sediment concentrations [14], and total suspended solids [15]. Although such parameters could give some basic indication of the sediment loading in the water, they were inadequate for describing the hydrodynamics of fine sediment, since the vertical flux, which is the essential component of sedimentation, was not considered [16,17].
Mathematics 2021, 9, 3141
Furthermore, the increase in complexity of fine sediment transport in water bodies was also driven by various factors which influence the relationship between water flow and fine sediment transport, including concentration, size, and surface electrostatic charge distributions of particles, bed structures, and flow rates. However, it is practically not feasible to determine precise measurements of these parameters on both temporal and spatial scales, where in-situ experiments and tests are required to be conducted [14,18].
The settling velocity, also known as terminal velocity, plays an important role in terms of interpreting hydrodynamic behaviors exhibited by fine sediments in water [9,19]. Specifically, the settling velocity is the velocity attained by fine sediments resulting from the balanced force between drag forces, gravitational pull, and other hydrodynamic forces [20]. For turbulent flows in particular, drag coefficients vary in accordance with the hydrodynamic forces, such as shear lift, buoyancy, Magnus force, and torque [21]. In this regard, the settling velocity accounts for the vertical flux in the process of sedimentation for different conditions. Thus, it is capable of providing a significant characterization of the hydrodynamics of fine sediments [22].

Machine Learning Model
There are numerous machine learning algorithms that perform classification and prediction tasks for different purposes, such as decision trees, random forests, long short-term memory networks, etc. [23,24]. As one of the pioneering machine learning models, the artificial neural network (ANN) has been broadly employed in different contexts. These include the fields of green energy [25], business/economics [26], and health care [27]. In hydrology, countless studies have also taken advantage of ANN models, such as water quality predictions [28], wastewater treatment [29], flow duration curves modelling [30], turbulent flow velocity field prediction [31], adsorption water desalination system modelling [32], suspended sediment and sediment yield [14,24,33], ocean bubbles [34], and wall slip [35,36].
Essentially, the ANN was developed as a generalization of mathematical models inspired by human cognition, which depends on the biological neural system [13]. ANNs, often referred to as black-box models, are flexible in terms of mathematical computations and topology, making them capable of performing predictions and classifications, and of modelling different types of complex and non-linear relationships between data inputs and outputs without prior interpretation of the data behavior [37]. Subsequently, diverse network architectures allowed ANN models, which are conceptually semi-parametric regression estimators, to be favored over other typical models. The basic development of an ANN is generally based on, but not limited to, the following rules: (i) input information is processed individually at multiple elements called neurons, also referred to as units, cells, or nodes; (ii) processed signals are transferred between the neurons through connective links; (iii) each connective link carries an associated weight, representing its connection strength; (iv) each neuron typically applies a non-linear transformation to its net input to determine its output.
A typical ANN (see Figure 1) usually comprises three components, namely the input layer, hidden layer, and output layer. The input layer commences the algorithm by feeding one instance of the data into the network. The dimension of the instances determines the number of inputs in the input layer. The hidden layer contains one or several layers that pass intermediate data to the output layer, which generates the final output of the network. The number of outputs is determined by the encoding of the classified or estimated output.
Apart from the network architecture, there are two kinds of network parameters, the weight, W, and the bias, B, which play an important part in the learning and predicting process. Essentially, the computational procedure involving both network parameters mimics the biological afferent and efferent neurons in transmitting signals [13]. At first, the afferent response is initiated when all the input signals are channelled into a particular neuron. As described in Equation (1), the net input value is computed as the weighted sum of the input observations, x, plus a constant bias. The efferent response takes place once the net input is obtained. Based on the pre-defined activation function within each neuron, a single output value is computed. There are different types of activation functions, but the most commonly implemented ones are the linear activation function, sigmoid/logistic activation function, and hyperbolic tangent activation function [38].
It should be noted that in the training phase, a basic ANN employs feed-forward to generate an output, and then calculates the error between the output and the target output. In the prediction phase, a basic ANN will only execute a feed-forward mechanism to achieve the ultimate result. The approach of trial and error was implemented since there was no generally proven theory or fixed rule on the decision of the network geometry (i.e., the number of neurons and hidden layers) that ensures optimum prediction and classification results [39,40].
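The afferent and efferent steps described above can be illustrated with a minimal sketch of a single neuron. It assumes the common form in which the net input is the weighted sum of the inputs plus a bias; the function names, input values, and weights below are hypothetical and serve only as an illustration, not as the configuration used in this study.

```python
import math

def net_input(x, w, b):
    """Afferent step: weighted sum of the inputs plus a constant bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def logistic(z):
    """Numerically stable sigmoid/logistic activation."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# The three activation functions named in the text.
ACTIVATIONS = {
    "linear": lambda z: z,
    "logistic": logistic,
    "tanh": math.tanh,
}

def neuron_output(x, w, b, activation="logistic"):
    """Efferent step: apply the chosen activation to the net input."""
    return ACTIVATIONS[activation](net_input(x, w, b))

# Hypothetical normalized inputs (four variables) and illustrative parameters.
x = [0.5, 0.2, 0.1, 0.9]
w = [0.4, -0.3, 0.8, 0.1]
b = 0.05

z = net_input(x, w, b)      # net input of the neuron
y = neuron_output(x, w, b)  # single-valued output after logistic activation
print(z, y)
```

In the prediction phase, only this forward computation is repeated layer by layer; the weights and biases are changed only during training.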

Research Objective
Given their ability to handle tedious computational tasks, machine learning-based models present an innovative alternative for the prediction of the settling velocity of fine sediments. The main objective of this study is to propose machine learning-based computational models for the prediction of the settling velocity of fine sediments. Three machine learning-based models were employed, along with four hydraulic parameters as input variables, to estimate a single output, the settling velocity of fine sediments. The relationship between the settling velocity and the input variables was then investigated. In addition, a comparative statistical analysis was carried out on the developed models based on their performance.

Methodology
The design of experimental procedures and data collection via advanced laboratory equipment, known as particle image velocimetry (PIV), was based on the main source of reference in the work of Kashani et al. [18]. In general, the whole-flow-field technique of the PIV records the position-over-time of the injected small tracer particles (seeds) in the flow [41], providing instantaneous velocity fields and real-time movements of fine sediments in the water.
This study emphasized the prediction of the settling velocity of fine sediments by applying three distinct machine learning-based models, namely the radial basis function neural network (RBFNN), the backpropagation neural network (BPNN), and the self-organizing feature map (SOFM). The prediction models were developed based on four hydraulic parameters as input variables, which consist of particle size (5 µm, 10 µm, 20 µm, 50 µm), inlet depth (6 cm, 7 cm, 8 cm, 9 cm, 10 cm, 10.5 cm), the relative x-position ranging between 0 cm and 50 cm, and the relative y-position ranging between 0 cm and 20 cm in the water column. The overall research methodology is shown in Figure 2.
To ensure that machine learning-based models are well-trained and, at the same time, achieve high performance with good generalization, the data set must be divided into at least two sets, namely the training set and the testing set. However, different proportions for the division of training and testing data sets have been used in different studies, as no fixed proportion could guarantee the best results. Nonetheless, a typical range of 60% to 80% for the training set, associated with 40% to 20% for the testing set, has been implemented [42][43][44].
As an initial attempt to estimate the settling velocity of fine sediments in this study, a stipulated proportion of 80%:20% (training set:testing set) was defined such that sufficient learning instances were allocated for the model training while reserving reasonable data size for model testing.
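The 80%:20% division can be sketched as follows. This is a minimal illustration that assumes the instances are shuffled at random before the cut, which the text does not state; the function name, the seed, and the placeholder records are hypothetical.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and cut it into training and testing sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Placeholder instances standing in for the measured observations.
records = list(range(100))
train, test = train_test_split(records)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so that the different models can be compared on the same testing instances.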
For the data pre-processing step, the min-max normalization technique was applied to the data before the training process was executed. The primary reason for the data normalization was to geometrically nullify the weight effects imposed by the input variables of incompatible scale units, range and interval values, and distributions into a desired compact and specific range [45]. In other words, normalization simplifies interpretation tasks in terms of the total variation contributed by the input variables, for which imbalanced weights over-influencing the learning of instances were minimized [46].
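A minimal sketch of min-max normalization is given below, assuming the target range is [0, 1] (the text only specifies a compact range). The particle sizes and inlet depths are the levels listed in the methodology; the function names are illustrative.

```python
def min_max_fit(column):
    """Return the minimum and maximum of a variable (taken from the training data)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    return lo, hi

def min_max_scale(column, lo, hi):
    """Map each value linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in column]

particle_size_um = [5, 10, 20, 50]
inlet_depth_cm = [6, 7, 8, 9, 10, 10.5]

size_scaled = min_max_scale(particle_size_um, *min_max_fit(particle_size_um))
depth_scaled = min_max_scale(inlet_depth_cm, *min_max_fit(inlet_depth_cm))
print(size_scaled)
print(depth_scaled)
```

After scaling, a variable recorded in micrometres and one recorded in centimetres occupy the same numeric range, so neither dominates the distance or weight computations merely because of its units.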

Radial Basis Function Neural Network (RBFNN)
The RBFNN takes a simple and straightforward form, and has been notably employed in different works, such as predicting suspended sediment load and water turbidity in rivers [38,47]. The RBFNN often requires a shorter training time because of its simple structure of only three layers, namely the input layer, hidden layer, and output layer. The incoming input variables from the input layer of the RBFNN were fed forward to its hidden layer. The neurons in the hidden layer process the weights they receive using the activation function. The specialty of the RBFNN lies in the application of a radial basis activation function (i.e., Gaussian) [48], as defined in Equation (2).
The parameters µ and σ² represent the mean and variance of the received weights, specifying the central tendency and spread of the Gaussian curve [15,23]. The monotonic property of the radial basis activation function, together with a good local convergence rate, is the merit of the RBFNN. The network architecture of the proposed RBFNN is illustrated in Figure 3.
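As an illustration of the Gaussian radial basis activation, the sketch below assumes the standard form exp(−(x − µ)² / (2σ²)) with the mean µ and variance σ² described above. The function name and the numeric values are hypothetical and are not taken from the fitted model.

```python
import math

def gaussian_rbf(x, mu, sigma2):
    """Gaussian radial basis activation with centre mu and variance sigma2."""
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2))

mu, sigma2 = 0.5, 0.04  # illustrative centre and spread
responses = [gaussian_rbf(x, mu, sigma2) for x in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(responses)
```

The response is largest at the centre and decays symmetrically with distance from it, which is the local behavior that distinguishes the RBFNN hidden layer from a logistic or hyperbolic tangent layer.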

Back Propagation Neural Network (BPNN)
The BPNN is a well-known network that has been extensively applied in hydrology to predict and solve various problems, especially in water resources and management [49]. It shares the same network geometry as a basic ANN by having an input layer, an output layer, and one or more hidden layers. The fundamentals of the backpropagation algorithm, in the context of control theory, were developed in the work of Kelley [50]. As the BPNN is trained with the input instances, the backpropagation learning algorithm allows the weight and bias parameters within the neurons to be tuned according to the Levenberg-Marquardt approach. In the tuning process, the network parameters are updated based on the minimization of the error functions [51,52]. In this study, the logistic activation function, as defined in Equation (3), was adopted, and the sum of squared errors between the estimated and observed settling velocity was minimized. The network architecture of the proposed BPNN is illustrated in Figure 4.
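The two ingredients named above, the logistic activation and the sum of squared errors, can be sketched as follows. The logistic function is assumed to take its standard form 1 / (1 + e^(−z)); the derivative is included because gradient-based tuning relies on it. The Levenberg-Marquardt update itself is not shown, and the velocity values are hypothetical.

```python
import math

def logistic(z):
    """Numerically stable logistic activation."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logistic_derivative(z):
    """Derivative of the logistic function, used when propagating errors backward."""
    s = logistic(z)
    return s * (1.0 - s)

def sum_squared_error(observed, predicted):
    """Objective minimized during training: sum of squared residuals."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

# Hypothetical observed and estimated settling velocities.
observed = [0.002, 0.004, 0.001]
predicted = [0.003, 0.004, 0.0005]
sse = sum_squared_error(observed, predicted)
print(sse)
```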

Self-Organizing Feature Map (SOFM)
The SOFM, with its unique network architecture, was initially proposed by Kohonen [53,54]. With the growing interest in the application of unsupervised learning algorithms, SOFM has been widely used, mostly for data classification and estimation. Studies that implement SOFM, relevant to the field of hydrology, include the characterization and survey of groundwater chemistry and groundwater levels [55,56], sediment quality assessment [4], and soil hydraulic properties [57]. However, there were hardly any recent applications of the SOFM related to fine sediment studies, specifically in estimating hydrodynamic characteristics of fine sediments. The SOFM consists of only an input and a Kohonen map. The Kohonen map is a discrete lattice structure of usually two dimensions, formed by the projection of multidimensional inputs through a non-linear vector quantization-based learning attribute. Neurons in the Kohonen map are physically arranged in a hexagonal fashion while the topological properties of the input space are preserved. The network architecture of the proposed SOFM is illustrated in Figure 5.
For each iteration step, s, the Euclidean distance was computed between the input vector, z, and the neurons in the Kohonen map. Out of the M neurons in total, the best matching unit (BMU), θ, served as the winning neuron after competing with other neurons. Thus, the BMU possesses the minimum Euclidean distance towards the input vector, as shown in Equation (4).
Once the BMU is selected, the weight vectors of the neurons located within the predefined neighborhood function (Gaussian), shown in Equation (5), are updated cooperatively. The Euclidean distance between a neuron and the BMU acts as the central mean of the Gaussian neighborhood, whereas the learning rate, δ, as defined in Equation (6), determines the corresponding spread. Both the learning rate and the neighborhood radius, η, as defined in Equation (7), decrease as the number of iterations increases. Consequently, Equation (8) shows the updating rule for the weights in each neuron within the neighborhood of the BMU.
Eventually, clusters of similar neurons can be identified as the training process matures. Individual heat maps could be produced for visualization purposes to enhance the study of the relationship between the input parameters and the settling velocity. As the neighborhood function provides close density estimation in the Kohonen map, the prediction of settling velocities was based on the asymptotic convergence to the mean value computed from the values within the BMU, V(θ), as defined in Equation (9).
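The competitive and cooperative steps described above can be sketched as a single training iteration. This is a hedged illustration of the standard Kohonen scheme rather than a reproduction of the study's equations: it assumes a rectangular 5 × 5 grid instead of the hexagonal 25 × 25 map, exponential decay for both the learning rate δ and the neighborhood radius η, and a Gaussian neighborhood whose width is set by η. All function names and numeric values are hypothetical.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def best_matching_unit(weights, z):
    """Competitive step: index of the neuron closest to the input vector z."""
    return min(range(len(weights)), key=lambda j: euclidean(weights[j], z))

def decay(initial, s, total_steps):
    """Assumed exponential decay of a parameter with the iteration step s."""
    return initial * math.exp(-s / total_steps)

def som_step(weights, coords, z, s, total_steps, delta0=0.5, eta0=2.0):
    """One iteration: find the BMU, then pull its neighbors toward z."""
    delta = decay(delta0, s, total_steps)  # learning rate at step s
    eta = decay(eta0, s, total_steps)      # neighborhood radius at step s
    bmu = best_matching_unit(weights, z)
    for j, w in enumerate(weights):
        d = euclidean(coords[j], coords[bmu])       # distance on the map grid
        h = math.exp(-(d ** 2) / (2.0 * eta ** 2))  # Gaussian neighborhood
        weights[j] = [wi + delta * h * (zi - wi) for wi, zi in zip(w, z)]
    return bmu

rng = random.Random(0)
side = 5
coords = [(r, c) for r in range(side) for c in range(side)]
weights = [[rng.random() for _ in range(4)] for _ in coords]

z = [0.2, 0.7, 0.4, 0.9]  # one normalized input vector (illustrative)
before = euclidean(weights[best_matching_unit(weights, z)], z)
bmu = som_step(weights, coords, z, s=0, total_steps=100)
after = euclidean(weights[bmu], z)
print(bmu, before, after)
```

Neurons far from the BMU on the grid receive a neighborhood value close to zero and barely move, which is how the map preserves the topology of the input space while clusters form.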

Performance Measures
For N settling velocity observations, $V_k$, and their corresponding predicted values, $\hat{V}_k$, the average value is represented by $\bar{V}$. Equations (10)–(14) show the five statistical measures applied to assess the prediction performance of the developed machine learning models.
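Three of the five measures have widely used standard definitions, which the sketch below assumes: RMSE as the square root of the mean squared residual, MAE as the mean absolute residual, and NSE as one minus the ratio of the residual sum of squares to the sum of squares about the observed mean. MVAF and TVE are not included because their exact forms are not restated here. The sample values are illustrative only.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean estimator."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / total

# Illustrative values, not data from this study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(rmse(observed, predicted), mae(observed, predicted), nse(observed, predicted))
```

Predicting the observed mean for every instance gives an NSE of exactly zero, which is why values close to 1 indicate a model that improves on the mean estimator.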

Results and Discussion
Under the method of trial and error, the optimum network parameters and geometry were decided based on the performance measures computed for each of the models. To compare and contrast the three developed machine learning-based models, the most appropriate model with the highest accuracy was determined once the best parameter settings were finalized. Ideally, the RMSE and MAE should be close to zero, indicating the minimum error obtained; the NSE should be close to 1, indicating that the model is better than applying the mean estimator; the MVAF should be close to 100%, indicating the accuracy of the average estimating performance of the model; and the TVE should be close to 100%, indicating the overall dynamics and dispersion accounted for by the model.
Due to the flexible settings of network architectures, numerous combinations of network parameters and geometry could be generated. Consequently, only selected models that provided appropriate results relevant to the performance measures were reported. Table 1 shows the results of the RBFNN model. The lowest RMSE of 0.002495 and MAE of 0.001409 were seen at the 4-17-1 setting, but the optimum values of NSE, MVAF, and TVE were 0.1435, 188.18%, and 20.43%, respectively, which were exhibited by the 4-16-1 setting. Thus, the 4-16-1 RBFNN model with 16 neurons in the hidden layer is the best architecture obtained. In addition, the corresponding comparison plot of the predicted and observed settling velocity is illustrated in Figure 6. The majority of points deviated far from the red line (i.e., predicted = observed), and the scales of the two axes differed by a factor of 10, which indicated a poor prediction by the RBFNN model. The illustration corresponds to the relatively low TVE and NSE, and to the MVAF that significantly exceeded 100%.
In Figure 7, compared with the RBFNN model, most points were located closer to the red line, with a few points scattered far away, reflecting smaller RMSE and MAE, MVAF closer to 100%, and higher NSE and TVE.
In Figure 8, with a similar scale on both axes, almost all points were located closely along the red line, although a single prediction point at the far right of the figure lies noticeably below the red line. The highest TVE was also reflected in the variation well captured by the SOFM model.
For comparison purposes, the results of related studies were reviewed. Similar error measures were adopted by Rushd et al. [52] and Cao et al. [22], who reported RMSE values of 0.066 and 0.0428 and MAE values of 0.044 and 0.0242, respectively. The SOFM in the current study outperformed the other ANN models, and produced lower RMSE and MAE. However, results from other studies cannot be directly compared due to the difference in the study scope (e.g., coarse sediment, sphericity, fluid type, etc.).
Apart from performance measures, the SOFM model enabled visualizations of the prediction space. The codes plot shown in Figure 9 presented a total of 19 clusters assigned with distinct background colors, which were clearly separated by the thick black boundary lines on the 25 × 25 Kohonen map. In each neuron, there were five different colored sectors, corresponding to the weights contributed by the particle size, inlet depth, x-position, y-position, and settling velocity (i.e., input and output variables). As a result, the size of a particular sector in the neurons was determined by the magnitude of the individual variable that is projected on the Kohonen map. In other words, the larger the magnitude of the variable, the larger the size of the sector.
Apart from performance measures, the SOFM model enabled visualization of the prediction space. The codes plot shown in Figure 9 presented a total of 19 clusters with distinct background colors, clearly separated by thick black boundary lines on the 25 × 25 Kohonen map. Each neuron contained five differently colored sectors, corresponding to the weights contributed by the particle size, inlet depth, x-position, y-position, and settling velocity (i.e., the input and output variables). The size of a particular sector in a neuron was therefore determined by the magnitude of the corresponding variable projected on the Kohonen map: the larger the magnitude of the variable, the larger the sector. For example, neurons in the top left cluster had very large red and olive-green sectors, followed by moderately large lime-green sectors, small blue sectors, and very small purple sectors. This indicates that the top left cluster was characterized by large particle size and depth, a moderate x-position value, and a low y-position value, together resulting in low settling velocity. The bottom left group of neurons, with a dark brownish-red background, is another evident example of a different cluster. Its very large olive-green sectors were accompanied by large lime-green sectors, followed by moderately large blue sectors and extremely small, barely visible red and purple sectors. This clearly characterizes a very low settling velocity associated with a very large depth value, a high x-position value, a moderately high y-position value, and a very small particle size. In general, neurons in the same cluster were located physically close together, as they shared neighborhoods with similar characteristics.
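To illustrate how each neuron comes to hold one weight per variable, the following is a minimal NumPy sketch of Kohonen map training. It is not the implementation used in this study: the function name `train_som`, the 5 × 5 grid, the linear decay schedules, and the random data are all assumptions chosen for brevity. After training, each neuron stores a five-element codebook vector, and these five values are what a codes plot draws as sector sizes.

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular Kohonen map.

    Returns a codebook of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    codebook = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every neuron, used by the neighborhood function
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3   # neighborhood radius shrinks
            # Best matching unit: neuron whose weights are closest to the sample
            dists = np.linalg.norm(codebook - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood centered on the best matching unit
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull each neuron toward the sample, weighted by the neighborhood
            codebook += lr * h[..., None] * (x - codebook)
            step += 1
    return codebook

# Hypothetical data scaled to [0, 1]; assumed column order: particle size,
# inlet depth, x-position, y-position, settling velocity
samples = np.random.default_rng(1).random((200, 5))
codebook = train_som(samples)
sector_sizes = codebook[0, 0]  # five weights of the top left neuron
print(sector_sizes)
```

Because neighboring neurons are updated together, nearby neurons end up with similar codebook vectors, which is why neurons in the same cluster appear close together on the map.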
In addition to the codes plot, the heat maps shown in Figure 10 provided a complementary visualization of the trained 25 × 25 Kohonen map by breaking down the distribution of each variable. The color filling each neuron ranged from dark blue to dark red, reflecting a color temperature from cool to hot. The heat map for the settling velocity suggested that most of its projected patterns were extremely low, with only a single red-colored neuron (9th row, 25th column). A small number of green and light blue neurons were found along the border from the top left corner to the bottom right corner. By carefully inspecting the heat maps for the settling velocity and the x- and y-positions, a correlated region with a triangular pattern was discovered at the bottom right corner. The cool colors in this triangular region of the settling velocity heat map corresponded to warmer colors in the x-position heat map and to the coolest dark blue colors in the y-position heat map. In other words, this cluster of neurons was characterized by a moderately low settling velocity, resulting from a high x-position value but a very low y-position value. In this region, the effects of particle size and depth were not obvious, since the color intensities were not consistent. However, smaller particle sizes and greater depths were usually found in this region.
After each heat map was examined and compared, it could be concluded that the projected pattern distributions of the variables were rather distinct. Nevertheless, a strong association between the heat maps of the y-position and the settling velocity was discovered. Although the two heat maps were not identical, they were highly correlated in terms of colors and the shapes of the patterns formed. The outermost border extending from the top left to the bottom right triangular region suggested that the projected patterns in this region had a moderately low settling velocity when the y-position was extremely low. In addition, the warm-colored neurons in the center region of the y-position heat map corresponded to cool-colored neurons in the same region of the settling velocity heat map. This could be interpreted as a higher y-position resulting in a low settling velocity. For the remaining dark blue neurons in the settling velocity heat map, either yellow or green neurons were found in the same region of the y-position heat map.
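The association between the y-position and settling velocity heat maps described above was identified visually. One possible way to quantify such an association, offered here only as a sketch, is to treat each heat map as a component plane of the trained codebook and compute the Pearson correlation between two planes. The function name `plane_correlation` and the synthetic 4 × 4 codebook are hypothetical; the study does not report a computed coefficient.

```python
import numpy as np

def plane_correlation(codebook, i, j):
    """Pearson correlation between component planes i and j of a codebook
    with shape (rows, cols, n_features)."""
    a = codebook[..., i].ravel()
    b = codebook[..., j].ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic example: plane 0 stands in for the y-position and plane 1 for
# the settling velocity, constructed to be perfectly inversely related
y_plane = np.linspace(0.0, 1.0, 16).reshape(4, 4)
v_plane = 1.0 - y_plane
toy_codebook = np.stack([y_plane, v_plane], axis=-1)

print(plane_correlation(toy_codebook, 0, 1))
```

A single coefficient summarizes only the linear part of the relationship, so a map on which low settling velocity coincides with both very low and high y-positions could still yield a weak coefficient despite a visible association.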
To further summarize the prediction performance of the three machine learning-based models, residual plots were constructed for the best-performing configuration of each model (see Figure 11). The residual plot is one of the most effective ways to compare the residuals, also known as the noise (i.e., error terms), with the predicted settling velocity in a comprehensive manner. Each model produced fewer than three extreme residual values. Thus, residuals exceeding 0.02 in value were not shown in the residual plot, to obtain a more meaningful analysis.
The residuals of the best RBFNN model (in red) formed a hook-shaped trend. They decreased as the predicted settling velocity increased and then deflected upwards. This is evidence of heteroscedasticity, because the spread of the residuals exhibited a non-random pattern. The overall prediction performance of the RBFNN model was poor, since most points deviated substantially from the black horizontal line (i.e., zero residual value).
The residual plot of the best BPNN model showed that the prediction results improved considerably compared with the best RBFNN model. The residuals were much closer to the zero-residual line and showed no obvious pattern, suggesting a more consistent variation in the scattered residuals (i.e., homoscedasticity). Despite the improved performance, several highly deviated residuals remained, particularly those exceeding 0.005.
Finally, the superior prediction performance was evident in the residual plot of the best SOFM model. A stronger assumption of homoscedasticity can be made based on the randomly scattered residuals. More specifically, the residuals indicated that the modelled relationship between the input hydraulic parameters and the output settling velocity was robust. Moreover, the residuals were highly concentrated around the zero-residual line, with minimal deviation.
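The residual analysis above can be sketched in a few lines. The example below assumes residuals are defined as observed minus predicted (the sign convention is not stated in this section), drops residuals larger than 0.02 in absolute value as in the residual plot, and compares the residual spread across bins of the predicted value as an informal check for heteroscedasticity. The function name `residual_summary`, the number of bins, and the synthetic data are hypothetical.

```python
import numpy as np

def residual_summary(observed, predicted, clip=0.02, n_bins=4):
    """Return kept residuals, their predictions, and the residual standard
    deviation in each bin of the sorted predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted          # assumed sign convention
    keep = np.abs(residuals) <= clip          # hide extreme residuals
    res, pred = residuals[keep], predicted[keep]
    # Sort by prediction and split into bins of roughly equal size
    order = np.argsort(pred)
    bins = np.array_split(res[order], n_bins)
    spread = [float(np.std(b)) for b in bins]
    return res, pred, spread

# Synthetic predictions with constant-variance noise and one extreme value
rng = np.random.default_rng(0)
pred_demo = np.linspace(0.001, 0.010, 40)
obs_demo = pred_demo + rng.normal(0.0, 0.0005, size=40)
obs_demo[0] += 0.05                           # extreme residual, filtered out

res, pred_kept, spread = residual_summary(obs_demo, pred_demo)
print(len(res), spread)
```

Roughly equal spreads across the bins are consistent with homoscedasticity, whereas a spread that grows or shrinks systematically with the predicted value suggests heteroscedasticity. This is a rough diagnostic rather than a formal statistical test.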

Conclusions
As this research was directed at the application of machine learning models to predict the settling velocity of fine sediments, three machine learning-based models with distinct network designs and abilities were developed. In terms of prediction performance, the SOFM provided superior results, followed by the BPNN and the RBFNN. In terms of the appropriateness of the model, the RBFNN did not achieve satisfactory results, as reflected in its poor performance measures. The BPNN could still be considered for estimating the settling velocity, although it achieved limited performance. In particular, the BPNN was capable of capturing at least 50% of the total dynamics of the settling velocities with reasonable accuracy as compared to the RBFNN model. Ultimately, the SOFM appeared to be the ideal model, as it accounted for at least 71.7% of the total variation of settling velocities (NSE of 0.7170, and TVE of 71.71%), together with the lowest overall prediction error (RMSE of 0.001307, MAE of 0.000647, and MVAF of 101.25%).
Aside from being the best estimation model in this study, the SOFM has the advantage of allowing the hydrodynamic behavior of fine particles to be examined. The codes plot and heat maps were excellent tools for visualizing the projected patterns based on the studied hydraulic parameters, even though the relationship through which the hydraulic parameters influence the settling velocity of fine sediments in water bodies was highly complex. From the various patterns and clusters discovered in the codes plot and heat maps of the SOFM, it can be seen that each hydraulic parameter plays an important role in affecting the settling velocity. After careful investigation, it can be concluded that the y-position of fine particles had the most apparent and dominant influence on the settling velocity.
In conclusion, the main objective was achieved by establishing the SOFM and BPNN as appropriate machine learning computational models for the prediction of the settling velocity of fine sediments. A comparative analysis was performed among the developed models, and, as the first attempt at fine sediment settling velocity prediction based on the incorporated hydraulic variables, the SOFM model showed superior performance. The SOFM model also succeeded in enhancing the understanding of the relationship between the hydraulic parameters and fine sediment settling velocity, with the help of the codes plot and heat maps as visualization tools. Although several ANN models have been applied in related sediment studies, such models typically do not provide direct insight into how the inputs and output are related, owing to their black-box properties. Additional experiments and other efforts are then required to investigate the influence of the input variables on the predicted output. Hence, the outcomes of this study offer a useful approach to understanding the importance of, and the relationship between, the input and output variables.
Although the SOFM model has achieved reliable prediction results, gaps in accuracy remain to be filled. It is recommended that a wider range of data for the relevant input variables be considered to improve the prediction accuracy. The results could also be enhanced by extending the current work. The existing ANN models can be improved by implementing metaheuristic algorithms to determine the optimal network parameters and architectures. Furthermore, new and more advanced machine learning models can be applied to replace the existing ANN models, such as the RBFNN and BPNN models in this study. Nevertheless, the successful results have also opened doors for other types of applications not limited to sedimentation studies, such as the transport mechanisms of solids in biogeochemical cycles and of pollutants in urban wastewater and sewage sludge.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.