Prediction of Scour Hole Geometry Downstream of Ski-Jump Spillways Using Novel Intelligent Computational Machine Learning Models

Samadi, Mehrshad; Shishegaran, Aydin; Torabi, Mina; Sheikh Khozani, Zohreh

doi:10.3390/forecast8030049

Open Access Article

by Mehrshad Samadi ¹, Aydin Shishegaran ¹,², Mina Torabi ³,⁴ and Zohreh Sheikh Khozani ⁵,*

¹ Department of Civil Engineering, Iran University of Science and Technology, Tehran 1684613114, Iran

² Department of Civil Engineering, Bauhaus Universität Weimar, 99423 Weimar, Germany

³ Department of Civil Engineering, East Tehran Branch, Islamic Azad University, Tehran 4513766731, Iran

⁴ Department of Civil Engineering, Shahed University, Tehran 3319118651, Iran

⁵ Paleoclimate Dynamics Group, Alfred Wegener Institute, Helmholtz Center for Polar and Marine Research, 27570 Bremerhaven, Germany

* Author to whom correspondence should be addressed.

Forecasting 2026, 8(3), 49; https://doi.org/10.3390/forecast8030049 (registering DOI)

Submission received: 9 March 2026 / Revised: 1 June 2026 / Accepted: 3 June 2026 / Published: 12 June 2026

(This article belongs to the Section Environmental Forecasting)


Highlights

What are the main findings?

Hybrid machine learning models were developed to predict dimensionless parameters of scour dimensions below the ski-jump bucket spillway.
SVCM, HCVCM, and GEP models were applied for hydraulic prediction.

What are the implications of the main findings?

Hybrid SVCM+GEP and HCVCM+GEP models improved prediction accuracy.
The proposed models outperformed traditional regression approaches.

Abstract

The ski-jump spillway is an energy-dissipating structure that discharges extra water beyond the dam’s capacity. The scour process occurs below spillways due to the collision of the water jet with high energy. It is critical to acquire information on scour holes to improve the dam’s safety and related components. Machine learning (ML) techniques have successfully demonstrated their effectiveness for modeling scour in hydraulic engineering. The present research considers novel approaches of ML models for estimating the scour hole geometries below ski-jump bucket spillways. This study investigates the capability of two novel feature-engineering approaches, namely Stronger Variable Creator Machine (SVCM) and High Correlated Variables Creator Machine (HCVCM), along with Gene Expression Programming (GEP) and their hybrid forms (SVCM+GEP and HCVCM+GEP), which were employed to predict normalized scour depth, scour length, and scour width below ski-jump spillways. Statistical metrics, graphical analyses, the Rank Mean (RM) method, the cross-validation approach, and $U_{95}$ index were used for the evaluation and reliability assessment of the proposed ML models. The results showed that hybrid ML models consistently outperformed individual algorithms. The results indicated that the SVCM+GEP method with $RM = 1.83$ and $1.50$ had the highest performance compared to other methods for the prediction of $\frac{D_{s}}{D_{w}}$ and $\frac{L_{s}}{D_{w}}$, respectively. In addition, the HCVCM+GEP method with $RM = 1.33$ was the best model for the prediction of $\frac{W_{s}}{D_{w}}$. In comparison with the conventional regression-based equations and previously reported ML methods, the proposed hybrid approaches improved the prediction results. In addition, the cross-validation method confirmed the robustness and generalization capability of the suggested hybrid ML models. The superior performance of the hybrid models is attributed to their ability to capture complex nonlinear interactions among hydraulic and geometric variables. The developed SVCM/HCVCM+GEP models provide accurate approaches for predicting scour parameters in hydraulic structures.

Keywords:

scour hole geometry; ski-jump spillway; machine learning models; Stronger Variable Creator Machine (SVCM); High Correlated Variables Creator Machine (HCVCM)

1. Introduction

Scour below spillways is a critical issue that causes bed erosion and threatens spillway stability. Areas downstream of spillways are prone to erosion and damage because they are typically exposed to high-speed and high-energy flows. Hydraulic engineers are always concerned with erosion and scour below spillways. Therefore, the accurate estimation of the dimensions of the scour hole is an important task in hydraulic engineering to ensure the safe and cost-effective operation of dams and spillways. The physical mechanisms of scour in turbulent three-dimensional flow, accompanied by sediment transport and erosion of the riverbed, are highly complicated. Therefore, experimental methods are commonly used to study and measure scour parameters in hydraulic structures. The scour process is highly nonlinear due to the interaction of hydraulic, geometric, and sediment-related parameters. Consequently, conventional regression-based equations may fail to capture the nonlinear interactions between the involved variables. For this reason, machine learning (ML) methods have increasingly been explored in scour modeling as an alternative to conventional regression-based equations. Different standalone and hybrid ML techniques have been successfully applied to complex problems in civil engineering [1,2].

In recent years, a broad range of standalone ML techniques have been applied to scour-related problems, including gene expression programming (GEP), model tree (MT), classification and regression trees (CART), adaptive neuro-fuzzy inference systems (ANFIS), multivariate adaptive regression splines (MARS), artificial neural networks (ANN), support vector machines (SVM), evolutionary polynomial regression (EPR), and Gaussian process regression (GPR) [3,4,5,6,7,8,9]. Various ML approaches have been increasingly employed to overcome the limitations of traditional empirical formulas for predicting scour depth downstream of ski-jump spillways. Early applications of ML methods frequently utilized laboratory and field datasets using dimensional and non-dimensional parameters for the estimation of scour depth. In the specific case of ski-jump spillways, Azamathulla et al. [10] used the height of fall and discharge intensity over the spillways to develop an ANN and an ANFIS to determine the depth of scour below ski-jump spillways in the prototype. It was discovered that the ANFIS technique produced superior results compared to traditional formulas. Agarwal et al. [11] utilized dimensional experimental data to develop a Locally Weighted Projection Regression (LWPR) model and demonstrated that LWPR produced satisfactory predictions for scour depth estimation compared to ANN. Goyal and Ojha [12] conducted a comparative evaluation of ANN, SVM, and the MT using experimental dimensional parameters. Their results showed that SVM and MT were superior to ANN for estimating scour dimensions below ski-jump bucket spillways. In addition, they highlighted the practical advantage of the MT due to its ability to produce explicit mathematical expressions that are directly usable by design engineers.

Haghiabi [13] utilized experimental data alongside non-dimensional parameters to model scour depth using MARS and ANN. The study found that MARS outperformed both ANN and empirical equations. Azamathulla et al. [14] concluded that GEP is more accurate than ANFIS, ANN, and empirical equations in estimating downstream non-dimensional scour hole dimensions of ski-jump bucket spillways. Naini [15] used different ANN types to estimate non-dimensional parameters of scour holes below ski-jump bucket spillways. Ayoubloo et al. [16] determined the scour depth with CART, MT, and SVM approaches below ski-jump spillways using experimental datasets, discovering that the CART algorithm offered a more reliable approximation of the scour depth than the MT and SVM methods. Najafzadeh et al. [17] employed various GMDH-based methods to estimate the geometry of scour holes downstream of ski-jump buckets using non-dimensional experimental datasets. They introduced the GMDH approach, which achieved good performance for modeling scour holes downstream of ski-jump buckets. Some researchers, utilizing field measurements of scour below ski-jump bucket spillways, developed a wide spectrum of ML methods for scour modeling. Guven and Azamathulla [18] modeled scour depth using field data samples by incorporating both dimensional and non-dimensional forms of discharge and head. They applied GEP and Genetic Programming (GP) for scour depth modeling, concluding that GEP was more accurate than traditional regression-based formulae and slightly superior to the GP method. Shafagh Loron et al. [19] and Fuladipanah et al. [20] employed non-dimensional forms of field data samples and demonstrated that MARS consistently outperformed Decision Tree (DT), GEP, and SVM algorithms for scour depth modeling.

Most recent studies focus on the applications of hybridizing standalone ML algorithms with metaheuristic optimization techniques to improve the predictive accuracy of scour depth. Sammen et al. [21] employed field data samples and non-dimensional parameters to evaluate a standalone ANN against an ANN optimized with Particle Swarm Optimization (PSO), Genetic Algorithm (GA), and Harris Hawks Optimization (HHO). Their comparative study concluded that the hybrid ANN-HHO model achieved the highest accuracy compared to the other ANN methods and empirical formulas. Wang et al. [22] proposed a hybrid SVR model combined with the Innovative Gunner Optimization algorithm (SVR-AIG) using non-dimensional field measurements for scour depth prediction below ski-jump spillways. Their results proved that the SVR-AIG hybrid approach is an effective strategy for the enhancement of the SVR model for scour prediction. Sun et al. [23] optimized SVR with Fruitfly Optimization Algorithms (FOAs) to accurately estimate scour hole parameters below ski-jump spillways based on experimental data. They compared the SVR-FOA results to empirical formulae, ANN, and ANFIS, finding that the proposed SVR-FOAs outperformed the others. These studies confirm that the hybridization of ML approaches can improve the performance and efficiency of standalone ML models for scour modeling.

Despite significant progress in applying ML models for scour prediction below ski-jump spillways, critical research gaps remain. First, most previous studies on scour below ski-jump spillways have focused on conventional standalone algorithms such as ANN, ANFIS, SVM, CART, and GEP, while the role of supervised feature-engineering techniques has not been investigated in this field. Second, most existing studies have focused on scour depth, while the simultaneous modeling of all three scour hole dimensions (depth, length, and width) using unified modeling approaches has received limited attention. Third, while original hydraulic and geometric parameters are commonly used for scour modeling, there is a need to examine the potential of supervised feature engineering for the automatic construction of new and more predictive input variables.

Therefore, the present research addresses these gaps by investigating the capability of the Stronger Variable Creator Machine (SVCM) and High Correlated Variables Creator Machine (HCVCM), introduced in 2021 and 2023 [24,25,26], alongside their hybrid combinations with GEP (SVCM+GEP and HCVCM+GEP) for the simultaneous estimation of normalized scour depth ($\frac{D_{s}}{D_{w}}$), scour length ($\frac{L_{s}}{D_{w}}$), and scour width ($\frac{W_{s}}{D_{w}}$) below ski-jump bucket spillways. The SVCM and HCVCM feature-engineering algorithms were proposed to strengthen the relationship between input variables and target outputs before the final predictive model is developed. Their main feature is that they introduce new input variables that can enhance the accuracy of ML methods. These methods are designed to generate transformed variables that exhibit stronger predictive relevance and reduced redundancy. SVCM expands the feature space by retaining both original and newly generated predictors, whereas HCVCM refines the feature space by replacing the original variables with decorrelated transformed variables. A comprehensive assessment of the proposed hybrid methods was conducted through statistical metrics, graphical plots, and the Rank Mean method. Finally, the cross-validation method and overall model performance ($U_{95}$) were used to assess the generalizability and reliability of the results, and the proposed hybrid ML models were compared with conventional regression-based equations and previously reported ML methods for estimating scour hole dimensions.

2. Methodology

The main goal of this study is to predict scour hole dimensions below ski-jump bucket spillways using experimental datasets. First, the dataset and dimensional parameters for scour hole modeling below ski-jump bucket spillways are presented, followed by explanations of the ML methods, including SVCM, HCVCM, GEP, and their hybrid approaches (the SVCM+GEP and HCVCM+GEP methods).

2.1. Laboratory Models, Experimental Data and Dimensional Analysis

Due to the difficulty of field study of scour below spillways, researchers have conducted experimental work for modeling scour depth. This work used a reliable and available experimental dataset collected by Azamathulla et al. [27]. They collected 95 experimental datasets of scour depth, length, and width measured at equilibrium conditions conducted on hydraulic models at the Central Water and Power Research Station (CWPRS) in Pune, India. These experiments included comprehensive scale models of ski-jump bucket designs simulating dams across the Subarnarekha, Ranganadi, and Parbati rivers. In addition, the database incorporates experimental scour data contributed by Dr. Masoud Ghodsian from Tarbiat Modarres University, Tehran, Iran.

The experiments were performed using sectional and comprehensive ski-jump spillway hydraulic models designed according to Froude similarity laws, with geometric scales ranging approximately from 1:40 to 1:100. The downstream channel consisted of erodible cohesionless sand beds, and scour dimensions were measured under equilibrium scour conditions after different discharge and reservoir-level scenarios. The experiments considered different hydraulic and geometric parameters, including discharge intensity, total head, bucket radius, lip angle, tailwater depth, and sediment size. The downstream river channel was simulated using a free-formed plunge pool filled with erodible, cohesionless sand having a median particle diameter ($D_{50}$) of 2 mm, while the riverbanks in the tested portions were assumed to be rigid and non-erodible. A primary assumption in their dimensional analysis was that the standard deviation of the bed material size had a negligible effect on the scour dimensions and could therefore be omitted. The flow conditions were varied extensively throughout the experimental phase. Specifically, four distinct discharge passes were tested, simulating 25%, 50%, 75%, and 100% of the maximum design discharge. These passes were conducted under both fully open and partially open spillway gate configurations, with a reported discharge measurement accuracy of ±2%. Each experimental run was allowed to proceed for three hours to ensure the scour hole reached a state of equilibrium.

Based on the governing hydraulic and sediment parameters, the equilibrium scour depth $D_{s}$, scour width $W_{s}$, and the distance of maximum scour depth from the spillway bucket lip $L_{s}$ downstream of a ski-jump bucket can be expressed as functions of unit discharge $q$, total head $H_{1}$, bucket radius $R$, lip angle $\phi$, tailwater depth $D_{w}$, median sediment size $D_{50}$, gravity $g$, and densities of water and sediment [1]:

$$D_{s}, L_{s}, W_{s} = f(q, H_{1}, R, \phi, D_{w}, D_{50}, g, \rho_{w}, \rho_{s}) \tag{1}$$

The authors applied the Buckingham $\pi$-theorem to derive non-dimensional functional relationships. The scour depth ($D_{s}$), scour width ($W_{s}$), and the distance of maximum scour depth from the spillway bucket lip ($L_{s}$) were normalized by the tailwater depth ($D_{w}$). The general functional relationship is expressed as [1]:

$$\frac{D_{s}}{D_{w}}, \frac{L_{s}}{D_{w}}, \frac{W_{s}}{D_{w}} = f\left(Fr, \frac{H_{1}}{D_{w}}, \frac{R}{D_{w}}, \frac{D_{50}}{D_{w}}, \phi\right) \tag{2}$$

where $q / \sqrt{g D_{w}^{3}}$ represents the Froude number, denoted as $Fr$. The density ratio $\rho_{s} / \rho_{w}$ was treated as constant and omitted. Figure 1 illustrates the cross-sectional scour downstream of a ski-jump bucket spillway.
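As a minimal sketch of the normalization in Equation (2), the following Python function computes the dimensionless groups from dimensional inputs. The function name, argument names, and sample values are illustrative assumptions and are not taken from the experimental dataset.

```python
import math


def dimensionless_groups(q, H1, R, D50, Dw, g=9.81):
    """Return the dimensionless inputs of Equation (2), apart from the lip angle.

    q   : unit discharge (m^2/s)
    H1  : total head (m)
    R   : bucket radius (m)
    D50 : median sediment size (m)
    Dw  : tailwater depth (m)
    g   : gravitational acceleration (m/s^2)
    """
    if Dw <= 0:
        raise ValueError("tailwater depth must be positive")
    return {
        "Fr": q / math.sqrt(g * Dw**3),  # Froude number as defined in the text
        "H1/Dw": H1 / Dw,
        "R/Dw": R / Dw,
        "D50/Dw": D50 / Dw,
    }


# Hypothetical values, for illustration only
groups = dimensionless_groups(q=0.05, H1=1.2, R=0.3, D50=0.002, Dw=0.15)
```

The lip angle $\phi$ is already dimensionless and is passed to the models unchanged.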

Table 1 lists the main statistical parameters of the involved variables presented in Equation (2).

2.2. Traditional Regression Relationships for Estimating Scour Hole Parameters

Using nonlinear regression analysis, Azamathulla et al. [27] proposed the following equations for estimating the normalized scour depth ($D_{s} / D_{w}$), normalized scour length ($L_{s} / D_{w}$), and normalized scour width ($W_{s} / D_{w}$):

$$\frac{D_{s}}{D_{w}} = 6.914 \, (Fr)^{0.694} \left(\frac{H_{1}}{D_{w}}\right)^{0.0815} \left(\frac{R}{D_{w}}\right)^{-0.233} \left(\frac{D_{50}}{D_{w}}\right)^{0.196} (\phi)^{0.196} \tag{3}$$

$$\frac{L_{s}}{D_{w}} = 9.85 \, (Fr)^{0.42} \left(\frac{H_{1}}{D_{w}}\right)^{0.28} \left(\frac{R}{D_{w}}\right)^{0.043} \left(\frac{D_{50}}{D_{w}}\right)^{0.037} (\phi)^{0.34661} \tag{4}$$

$$\frac{W_{s}}{D_{w}} = 5.42 \, (Fr)^{-0.015} \left(\frac{H_{1}}{D_{w}}\right)^{0.55107} \left(\frac{R}{D_{w}}\right)^{0.1396} \left(\frac{D_{50}}{D_{w}}\right)^{0.242} (\phi)^{-0.16} \tag{5}$$
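To make the regression baseline concrete, the sketch below evaluates Equations (3) to (5) for one set of dimensionless inputs. The function and argument names are illustrative. The unit of $\phi$ is not stated in this section, so the caller must supply it in the unit the original equations assume (radians is an assumption here).

```python
def regression_scour(Fr, H1_Dw, R_Dw, D50_Dw, phi):
    """Evaluate the power-law regression equations (3)-(5).

    All arguments are the dimensionless groups of Equation (2);
    phi is the lip angle (unit assumed, see the note above).
    Returns (Ds/Dw, Ls/Dw, Ws/Dw).
    """
    ds = (6.914 * Fr**0.694 * H1_Dw**0.0815 * R_Dw**-0.233
          * D50_Dw**0.196 * phi**0.196)
    ls = (9.85 * Fr**0.42 * H1_Dw**0.28 * R_Dw**0.043
          * D50_Dw**0.037 * phi**0.34661)
    ws = (5.42 * Fr**-0.015 * H1_Dw**0.55107 * R_Dw**0.1396
          * D50_Dw**0.242 * phi**-0.16)
    return ds, ls, ws


# With every group equal to 1, each equation reduces to its leading coefficient
ds, ls, ws = regression_scour(1.0, 1.0, 1.0, 1.0, 1.0)
```

Setting all groups to 1 is a quick sanity check on the transcription, since every power term then equals 1.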

2.3. SVCM and HCVCM Algorithm

Stronger Variable Creator Machine (SVCM) and High Correlated Variables Creator Machine (HCVCM) are supervised feature-engineering algorithms developed to increase the estimation accuracy of machine learning models. Their main concept is to construct new input variables from the original variables so that the resulting predictors have a stronger statistical dependence on the target variable while maintaining a small amount of redundancy between them. Both SVCM and HCVCM methods generate new variables by applying nonlinear mathematical transformations to the initial input variables, such as algebraic combinations, trigonometric functions, exponential functions, and user-defined relationships. For each new variable generated, the coefficient of determination (R²) is calculated with respect to the target variable and compared to the coefficient of determination of its parent variable. Only variables that have a higher correlation with the output than the original variables are retained, ensuring that the constructed variables provide better predictive relevance [24,25,26].

2.3.1. SVCM Algorithm

The main goal of SVCM is to increase the number of problem variables by creating additional variables that have a stronger correlation with the target parameter while remaining sufficiently independent of their parent variables. In this approach, after generating candidate variables and evaluating their R² with the output parameter, SVCM examines the correlation between a new variable and its parent variable by considering a certain threshold, which usually should be less than 0.5 [25].

Among the newly generated variables, those that simultaneously satisfy the two conditions of higher correlation with the output parameter and low correlation with the parent variable are selected to estimate the value of the target parameter. Therefore, the final input of the model can include the original variables as well as the newly selected variables. In this way, SVCM expands the space of variables predicting the target parameter to improve the flexibility of the model and the prediction accuracy, especially in complex problems [25].
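The two SVCM selection conditions described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from [25]: the transformation library `TRANSFORMS`, the function names, and the use of the squared Pearson correlation as R² are all hypothetical choices, and the 0.5 threshold is the typical value mentioned in the text.

```python
import numpy as np

# Hypothetical transformation library; the published algorithm may use others
TRANSFORMS = {
    "square": np.square,
    "exp": np.exp,
    "sin": np.sin,
    "cos": np.cos,
}


def r2(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    r = np.corrcoef(a, b)[0, 1]
    return float(r * r)


def svcm_select(X, y, parent_corr_max=0.5):
    """Return (name, column) pairs for candidates that pass both SVCM conditions."""
    selected = []
    for j in range(X.shape[1]):
        parent = X[:, j]
        parent_r2 = r2(parent, y)
        for name, fn in TRANSFORMS.items():
            candidate = fn(parent)
            if np.std(candidate) == 0:
                continue  # a constant column carries no information
            # Condition 1: stronger relation to the target than the parent
            stronger = r2(candidate, y) > parent_r2
            # Condition 2: sufficiently independent of the parent variable
            independent = abs(np.corrcoef(candidate, parent)[0, 1]) < parent_corr_max
            if stronger and independent:
                selected.append((f"{name}(x{j})", candidate))
    return selected


# Toy example: the target depends on x^2, which x itself cannot explain linearly
X_demo = np.linspace(-1.0, 1.0, 21).reshape(-1, 1)
y_demo = X_demo[:, 0] ** 2
names = [name for name, _ in svcm_select(X_demo, y_demo)]
```

In the SVCM setting the selected columns would be appended to the original inputs rather than replacing them.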

2.3.2. HCVCM Algorithm

HCVCM follows a similar variable-generation strategy but differs fundamentally in its selection and replacement mechanism. After generating candidate variables and retaining those with stronger correlation to the target than the original variables, HCVCM applies a second decorrelation filter among the selected variables. Only variables whose mutual correlation is lower than the correlation among the original variables are preserved. The resulting variables are considered improved representations of the original predictors and are used in place of them in the prediction model. Thus, HCVCM performs supervised feature transformation rather than augmentation: it replaces the original variables with a refined subset of generated variables that are both highly correlated with the output and minimally redundant. This transformation produces a more compact and informative input space, which can improve generalization and interpretability while reducing multicollinearity effects [24].
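The decorrelation filter can be sketched as a greedy selection. This is a hedged illustration rather than the procedure from [24]: the function names are hypothetical, the candidates are assumed to have already passed the first screening against the target, and the baseline is taken (as an assumption) to be the largest absolute pairwise correlation among the original columns.

```python
import numpy as np


def abs_corr(a, b):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(float(np.corrcoef(a, b)[0, 1]))


def hcvcm_filter(candidates, X, y):
    """Keep candidates that are less mutually correlated than the original inputs.

    candidates : dict mapping a name to a 1-D array (already pre-screened)
    X          : original input matrix with at least two columns
    y          : target vector
    """
    # Baseline redundancy of the original inputs (assumed definition)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    baseline = corr[np.triu_indices_from(corr, k=1)].max()

    # Visit candidates from the most to the least target-relevant
    ranked = sorted(candidates, key=lambda n: abs_corr(candidates[n], y), reverse=True)
    kept = []
    for name in ranked:
        if all(abs_corr(candidates[name], candidates[k]) < baseline for k in kept):
            kept.append(name)
    return kept


# Toy example: "b" is a linear copy of "a", so only one of the two survives
x1 = np.arange(8.0)
x2 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 6.0])
X_demo = np.column_stack([x1, x2])
a = x1**2
b = 2.0 * a + 1.0
c = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
y_demo = a + 3.0 * c
kept = hcvcm_filter({"a": a, "b": b, "c": c}, X_demo, y_demo)
```

In the HCVCM setting the kept variables would replace the original inputs in the downstream model.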

2.3.3. Relationship and Distinction Between SVCM and HCVCM

SVCM and HCVCM share the same conceptual foundation, including nonlinear supervised construction of variables with higher predictive relevance, but differ in how the generated variables are incorporated into models. SVCM retains both original and generated variables, thereby enlarging the feature space and increasing nonlinear descriptive capability. In contrast, HCVCM replaces the original variables with decorrelated generated variables, yielding a reduced and more informative predictor set. Consequently, SVCM emphasizes feature enrichment, whereas HCVCM emphasizes feature refinement and redundancy reduction. In summary, both algorithms transform weakly correlated original inputs into stronger predictors; however, SVCM achieves this by augmenting the input space, while HCVCM achieves it by restructuring and replacing the input representation. These complementary strategies enable improved predictive performance of downstream models such as regression, gene expression programming, and neuro-fuzzy systems in modeling complex prediction tasks [24,25,26].

2.4. GEP Algorithm

GEP is an evolutionary algorithm used to automatically find mathematical equations that describe the relationship between input variables and an output. It combines features of Genetic Algorithms and Genetic Programming by encoding solutions as fixed-length chromosomes that are later converted into mathematical expression trees. The algorithm starts with a random population of chromosomes, evaluates each equation using an error measure, and repeatedly improves the population through genetic operations such as mutation and crossover. Using basic mathematical functions, including +, −, ×, ÷, trigonometric and exponential functions, GEP can model complex nonlinear relationships. A key advantage of GEP is that it produces explicit equations while efficiently searching for accurate predictive models [28].
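The conversion from a linear chromosome to an expression tree can be illustrated with a small decoder. The sketch assumes the breadth-first (Karva) reading commonly used in GEP; the function set, symbol names, and example gene are illustrative and are not the settings used in this study.

```python
import operator

# Illustrative function set: symbol -> (arity, implementation)
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}


def evaluate_kexpression(kexpr, env):
    """Decode a gene level by level into a tree and evaluate it.

    kexpr : string of one-character symbols (functions and terminals)
    env   : dict mapping terminal symbols to numeric values
    """
    # Assign children breadth-first: each node takes the next unused symbols
    children = []
    next_free = 1
    i = 0
    while i < next_free:
        arity = FUNCTIONS[kexpr[i]][0] if kexpr[i] in FUNCTIONS else 0
        children.append(list(range(next_free, next_free + arity)))
        next_free += arity
        i += 1

    def value(node):
        symbol = kexpr[node]
        if symbol in FUNCTIONS:
            _, fn = FUNCTIONS[symbol]
            return fn(*(value(child) for child in children[node]))
        return env[symbol]

    return value(0)


# "+*abc" decodes to (b * c) + a; trailing symbols beyond the tree are ignored
result = evaluate_kexpression("+*abcab", {"a": 1, "b": 2, "c": 3})
```

Because unused trailing symbols are simply ignored, mutation and crossover on the fixed-length string always yield a syntactically valid expression, which is what lets the genetic operators act freely on the chromosome.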

2.5. Hybrid Algorithms (HCVCM+GEP and SVCM+GEP)

The hybrid models operate in a two-stage process in which the SVCM and HCVCM construct an enhanced feature space that is then used by GEP to evolve optimal predictive expressions. GEP is employed as a symbolic regression method to derive explicit mathematical equations relating inputs to the output parameter. The resulting hybrid models (SVCM+GEP and HCVCM+GEP) integrate supervised feature engineering with evolutionary symbolic regression, resulting in accurate predictive equations for complex problems.

3. Performance Criteria and Evaluation Methods

3.1. Statistical Metrics

This part presents the evaluation criteria for the developed models. Statistical indices are generally employed to evaluate the performance and accuracy of data-driven models. The coefficient of determination (R²), mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), BIAS, and scatter index (SI) are frequently used for assessing scour depth models. The most accurate model is the one with the lowest BIAS, SI, RMSE, MAPE, and MAE values and the highest R². The statistical indices are defined as follows:

$$R^{2} = \left(\frac{\sum_{i=1}^{N} (x_{i} - \bar{x})(y_{i} - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_{i} - \bar{x})^{2}} \sqrt{\sum_{i=1}^{N} (y_{i} - \bar{y})^{2}}}\right)^{2} \tag{6}$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_{i} - y_{i})^{2}} \tag{7}$$

$$SI = \frac{RMSE}{\bar{x}} \times 100 \tag{8}$$

$$BIAS = \bar{y} - \bar{x} \tag{9}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} |x_{i} - y_{i}| \tag{10}$$

$$MAPE = \frac{100}{N} \sum_{i=1}^{N} \frac{|x_{i} - y_{i}|}{x_{i}} \tag{11}$$

where $x_{i}$ and $y_{i}$ represent observed and estimated values, $\bar{x}$ and $\bar{y}$ represent the averages of observed and estimated values, and $N$ represents the number of data points.
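The indices in Equations (6) to (11) translate directly into a few NumPy expressions. The following is a minimal sketch; the function name and dictionary keys are illustrative, and observed values are assumed to be nonzero so that MAPE is defined.

```python
import numpy as np


def scour_metrics(observed, predicted):
    """Compute the statistical indices of Equations (6)-(11).

    observed  : measured values x_i (assumed nonzero for MAPE)
    predicted : estimated values y_i
    """
    x = np.asarray(observed, dtype=float)
    y = np.asarray(predicted, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    rmse = np.sqrt(np.mean((x - y) ** 2))
    return {
        "R2": r**2,                                 # Eq. (6)
        "RMSE": rmse,                               # Eq. (7)
        "SI": rmse / x.mean() * 100,                # Eq. (8), in percent
        "BIAS": y.mean() - x.mean(),                # Eq. (9)
        "MAE": np.mean(np.abs(x - y)),              # Eq. (10)
        "MAPE": 100 * np.mean(np.abs(x - y) / x),   # Eq. (11), in percent
    }


# A prediction that is offset by exactly +1 from every observation
metrics = scour_metrics([1, 2, 3, 4], [2, 3, 4, 5])
```

Note that a constant offset leaves R² at 1 while RMSE, MAE, and BIAS all equal the offset, which is one reason the indices are used together rather than individually.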

3.2. Rank Mean Method

In addition, for the comprehensive evaluation of the proposed models, statistical indicators for all data-driven models were considered, and the best model was selected based on the rank mean method [29,30]. The RM method includes the aggregation of ranks obtained from the values of statistical indices, presenting a comprehensive assessment metric to evaluate the efficacy of various models. The RM expression is defined as:

$$RM = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Rank}_{i} \tag{12}$$

The RM values were used to determine the best model for the prediction of scour hole dimensions. The model with the lowest RM value is considered to have the best overall performance.
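A minimal sketch of the rank aggregation in Equation (12) is given below. The function name and the toy scores are hypothetical. Two details are assumptions rather than statements from the source: BIAS is ranked by its absolute value, and ties are broken by input order.

```python
def rank_mean(scores, higher_is_better=("R2",)):
    """Average the per-metric ranks of each model (rank 1 is best).

    scores : dict mapping model name -> dict mapping metric name -> value
    """
    models = list(scores)
    metrics = list(next(iter(scores.values())))
    ranks = {model: [] for model in models}
    for metric in metrics:
        if metric in higher_is_better:
            # Larger is better, so sort in descending order
            ordered = sorted(models, key=lambda m: -scores[m][metric])
        else:
            # Error-type index: smaller magnitude is better (assumption for BIAS)
            ordered = sorted(models, key=lambda m: abs(scores[m][metric]))
        for position, model in enumerate(ordered, start=1):
            ranks[model].append(position)
    return {model: sum(r) / len(r) for model, r in ranks.items()}


# Hypothetical scores for three models, for illustration only
toy_scores = {
    "A": {"R2": 0.90, "RMSE": 0.5},
    "B": {"R2": 0.95, "RMSE": 0.6},
    "C": {"R2": 0.80, "RMSE": 0.9},
}
rm = rank_mean(toy_scores)
```

In the toy example, models A and B each win one metric and share the same RM, while C ranks last on both.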

3.3. Overall Model Performance Using $U_{95}$ Index

In addition to the conventional statistical indices, the overall performance of the proposed models was assessed using the $U_{95}$ index, which is defined as follows [31]:

$$U_{95} = \frac{1.96}{n} \sqrt{\sum_{i=1}^{n} \left(y_{i,\mathrm{Observed}} - \bar{y}_{\mathrm{Observed}}\right)^{2} + \sum_{i=1}^{n} \left(y_{i,\mathrm{Observed}} - y_{i,\mathrm{Predicted}}\right)^{2}} \tag{13}$$

In general, lower values of U₉₅ are associated with better model performance.
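The following sketch evaluates Equation (13) exactly as written above, with the spread of the observations and the squared prediction errors summed under one square root. The function name is illustrative.

```python
import numpy as np


def u95(observed, predicted):
    """Evaluate the U95 index of Equation (13) as written in the text."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    n = obs.size
    spread = np.sum((obs - obs.mean()) ** 2)  # first sum in Eq. (13)
    error = np.sum((obs - pred) ** 2)         # second sum in Eq. (13)
    return 1.96 / n * np.sqrt(spread + error)


# For a perfect prediction only the spread term remains
value = u95([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Because the spread term depends only on the observations, differences in $U_{95}$ between models evaluated on the same data come entirely from the prediction-error term.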

3.4. Cross-Validation Approach

In addition to the conventional train–test split evaluation, k-fold cross-validation was employed to further assess the generalization capability and robustness of the developed machine learning models. In k-fold cross-validation, the entire dataset is randomly divided into k approximately equal subsets, called folds. At each iteration, one fold is used as the validation set, and the remaining folds are used for model training; this procedure is repeated until each fold has been used once for validation. The final performance of each model is then obtained by averaging the evaluation metrics over all folds, which provides a more reliable estimate than a single train–test split because all samples are used for both training and validation in different iterations [32].
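The fold construction described above can be sketched with NumPy alone. The function name, the random seed, and the choice of k = 5 are assumptions made for illustration; the value of k used in the study is not stated in this section.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)  # random assignment of samples
    folds = np.array_split(shuffled, k)    # k approximately equal folds
    for i, val_idx in enumerate(folds):
        # All folds except the i-th form the training set
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx


# 95 samples (the size of the dataset) split into 5 folds of 19 samples each
splits = list(k_fold_indices(95, 5))
```

Each sample appears in exactly one validation fold, so averaging a metric over the folds uses every sample once for validation.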

4. Results and Discussion

In this section, the results of all previously explained models for the prediction of $\frac{D_{s}}{D_{w}}$, $\frac{L_{s}}{D_{w}}$, and $\frac{W_{s}}{D_{w}}$ are presented and compared. The explicit equations for the estimation of scour parameters are also presented. The dataset was randomly divided into 80% training data (76 samples) and 20% testing data (19 samples). The statistical characteristics and parameter ranges of both subsets were carefully examined to ensure consistency and similarity between the training and testing datasets. In addition, to further evaluate the robustness and stability of the developed models against possible variability caused by random data partitioning, the k-fold cross-validation approach was also conducted.

4.1. Results of Modeling $\frac{D_{s}}{D_{w}}$

The application of the proposed ML models yielded the mathematical expressions for the prediction of $\frac{D_{s}}{D_{w}}$, as listed in Table 2.

The derived explicit formulas for scour depth consist of highly complex nonlinear combinations of the governing hydraulic variables. As observed in the resulting equations, the algorithms selected a diverse combination of elementary algebraic functions (such as natural logarithm, exponential, and polynomial terms) alongside nested trigonometric functions (such as cos, sin, and tan). The trigonometric and exponential terms indicate that the physical interactions governing scour geometry are highly nonlinear in nature.

The values of statistical indices of the developed models are calculated to assess the accuracy of the proposed ML models. The values of statistical indices for the prediction of $\frac{D_{s}}{D_{w}}$ are given in Table 3. The evaluation was conducted separately on training and testing datasets, and on the total dataset, to examine both models’ fitting capability and generalization performance. In general, higher R² and lower error metrics indicate better predictive accuracy.

The SVCM model achieved R² = 0.967 for the total dataset, with MAE = 0.381, RMSE = 0.503, and SI = 14.84%. In addition, the SVCM model maintained consistent accuracy between the training (R² = 0.967) and testing (R² = 0.974) phases, confirming robust generalization capability. The bias value (−4.37 × 10⁻⁶ for all data) indicates negligible systematic error. The SVCM+GEP model exhibited the highest overall performance, with R² = 0.972 and the lowest error metrics (MAE = 0.331, RMSE = 0.469, and SI = 13.61%) for the entire dataset among all ML models. The HCVCM+GEP model also showed high predictive capability, with R² = 0.960 for the entire dataset. The standalone GEP model yielded competitive results, with R² = 0.946 and the lowest MAPE value (0.178) among the ML methods. The HCVCM model provided acceptable accuracy (R² = 0.920); however, it exhibited the lowest performance among all tested methods, with higher error metrics (MAE = 0.600, RMSE = 0.786, SI = 23.19%) and a slight performance degradation from the training to the testing phase.

To identify the best overall ML model, the rank mean (RM) method was employed. This approach aggregates the rankings obtained from individual statistical indices to provide a comprehensive assessment metric. Table 4 presents the RM values of each ML method for the prediction of $\frac{D_{s}}{D_{w}}$.

As seen in Table 4, the RM analysis confirmed that SVCM+GEP is the most reliable model (RM = 1.83), followed by SVCM (RM = 2.17), HCVCM+GEP (RM = 3.17), GEP (RM = 3.33), and HCVCM (RM = 4.50). Figure 2 presents diagrams comparing observed and estimated normalized scour depth ($\frac{D_{s}}{D_{w}}$) values for all developed ML models on the training and testing datasets. These diagrams provide a graphical assessment of each ML model’s ability to track the variation in $\frac{D_{s}}{D_{w}}$ and show the general pattern of scour depth variation predicted by the ML models. In these plots, predicted points lying closer to the observed points indicate better agreement between the measured and estimated scour depths. Overall, all ML models demonstrated a reasonable ability to reproduce the observed $\frac{D_{s}}{D_{w}}$ values, and most points predicted by each ML model are concentrated around the measured values. However, differences in performance among the models are observed.

The HCVCM model (Figure 2a) shows the greatest deviation at higher $\frac{D_{s}}{D_{w}}$ values. The SVCM model (Figure 2b) exhibits a more concentrated distribution of points near the observed values, indicating lower prediction error than HCVCM. Similarly, the GEP model (Figure 2c) shows good agreement between predicted and observed values, although slight deviations remain at higher values of $\frac{D_{s}}{D_{w}}$. The HCVCM+GEP model (Figure 2d) provides better graphical performance compared to standalone HCVCM. This confirms the benefit of hybrid models. The SVCM+GEP model (Figure 2e) exhibits the closest agreement with observed $\frac{D_{s}}{D_{w}}$ and the smallest deviation among all models. These results of the graphical analysis are consistent with the quantitative evaluations of ML models that identified SVCM+GEP as the best overall model for estimation of $\frac{D_{s}}{D_{w}}$.

Furthermore, it should be mentioned that all experimental scour measurements correspond to equilibrium conditions; therefore, they are strictly time-independent and do not constitute time-series data. The trend prediction for (

\frac{D_{s}}{D_{w}}

), as illustrated in Figure 2, was used solely to visually compare the observed and predicted values for both the training and testing datasets. This plot was presented to demonstrate the predictive agreement and generalization capability of the developed models rather than to represent time-dependent trends. Therefore, the ordering of the sample numbers in the figures does not indicate temporal evolution or sequential dependency between experiments. The sample numbering simply represents the indexing of independent experimental cases within the compiled database.

In addition, for a better graphical evaluation of the proposed methods, the DDR plots of their results are shown in Figure 3. The DDR is a graphical performance index used to evaluate the predictive accuracy of models by quantifying the relative deviation between predicted and observed values. It is defined as:

DDR = \frac{\text{Predicted value}}{\text{Observed value}} - 1

(19)
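As a small illustration of Equation (19), the DDR can be evaluated point by point as shown below. The function name `ddr` and the three sample values are illustrative assumptions, not data from the study.

```python
import numpy as np

def ddr(predicted, observed):
    """Discrepancy ratio of Equation (19): predicted / observed - 1."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return predicted / observed - 1.0

# Illustrative values only (not measurements from the database).
obs = np.array([2.0, 4.0, 5.0])
pred = np.array([2.2, 4.0, 4.5])
values = ddr(pred, obs)  # > 0: overprediction, 0: exact, < 0: underprediction
```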

Figure 3 presents the DDR values for the prediction of normalized scour depth (

\frac{D_{s}}{D_{w}}

) obtained from all the proposed ML models.

In each subplot, a DDR value of zero indicates perfect agreement between predicted and measured

\frac{D_{s}}{D_{w}}

. Positive DDR values denote overprediction of

\frac{D_{s}}{D_{w}}

. In contrast, negative DDR values denote underprediction of

\frac{D_{s}}{D_{w}}

. These diagrams allow a point-wise examination of model behavior across the dataset. When interpreted together with the statistical indices and rank mean analysis for

\frac{D_{s}}{D_{w}}

, the DDR plots provide complementary evidence on model reliability, highlighting that models with better aggregate performance such as SVCM and the hybrid SVCM+GEP are characterized by DDR values that are more tightly concentrated around zero, whereas weaker models (e.g., HCVCM) tend to exhibit a broader spread of DDR values and more pronounced deviations from the zero-discrepancy line.

As shown in Figure 3, overall, all models exhibit DDR distributions centered around zero, which indicates the absence of significant bias in the prediction of

\frac{D_{s}}{D_{w}}

. However, the dispersion differs among the models. The HCVCM model (Figure 3a) exhibited the widest DDR range and deviated considerably from zero at several points. The SVCM (Figure 3b) and GEP (Figure 3c) models displayed narrower DDR distributions and were more concentrated around zero compared to HCVCM. The hybrid HCVCM+GEP model (Figure 3d) was more concentrated around the zero line than the standalone HCVCM model. This confirms that combining HCVCM with GEP improves the prediction accuracy of

\frac{D_{s}}{D_{w}}

. The SVCM+GEP model (Figure 3e) showed the tightest concentration of DDR values around zero and the smallest spread among all ML models. Therefore, SVCM+GEP provides the most reliable prediction of

\frac{D_{s}}{D_{w}}

across the dataset. These graphical findings are consistent with the values of statistical metrics.

4.2. Results of Modeling $\frac{L_{s}}{D_{w}}$

The proposed ML models yielded the following expressions for the estimation of the

\frac{L_{s}}{D_{w}}

parameter, as listed in Table 5.

The values of statistical indices obtained from proposed data-driven models for the estimation of

\frac{L_{s}}{D_{w}}

are tabulated in Table 6.

As seen in Table 6, the SVCM+GEP model has the lowest error values among all methods (MAE = 0.842, RMSE = 1.303, SI = 10.45%, and MAPE = 0.084), whereas the HCVCM method has the lowest BIAS value. In addition, the SVCM+GEP method attains the highest R² (0.967).

The values of statistical metrics for developed ML models for predicting the normalized scour length

\frac{L_{s}}{D_{w}}

demonstrated that all approaches have relatively high accuracy. In addition, the statistical metrics indicated the clear superiority of the SVCM-based formulations, particularly the hybrid SVCM+GEP model. For the total dataset, SVCM+GEP attains the highest coefficient of determination (R² = 0.967) and the lowest error metrics (MAE = 0.842, RMSE = 1.303, SI = 10.45%, MAPE = 0.084), indicating the most accurate ML model for estimation of

\frac{L_{s}}{D_{w}}

. The standalone SVCM model also outperforms HCVCM, GEP, and HCVCM+GEP in most statistical indices. In contrast, HCVCM exhibited the weakest overall performance, with the largest MAE, RMSE, and SI values. The GEP and HCVCM+GEP models provide intermediate performance, with R² values of 0.962 and 0.933, respectively, and higher RMSE and MAPE than the SVCM-based models. For a comprehensive ranking of ML models for estimating

\frac{L_{s}}{D_{w}}

, the RM method was used to aggregate the ranks of all statistical metrics. The resulting RM values are listed in Table 7.

Table 7 indicates that the SVCM+GEP model, with RM = 1.50, has the best performance among the ML methods, followed by SVCM (RM = 2.17), while HCVCM, GEP, and HCVCM+GEP showed comparatively lower performance. For graphical evaluation, the results obtained by the developed models are compared with the observed data in Figure 4.

The graphical performance of the proposed ML models for predicting the normalized scour length (

\frac{L_{s}}{D_{w}}

) is shown in Figure 4. These plots present trend prediction diagrams comparing observed and estimated values by ML models. As observed in Figure 4, all models have acceptable performance for modeling the observed variation in

\frac{L_{s}}{D_{w}}

, and predicted trends generally follow the pattern of the measured data. The HCVCM model (Figure 4a) shows the largest deviation from the observed trend, particularly at higher values of

\frac{L_{s}}{D_{w}}

. The SVCM (Figure 4b) and GEP (Figure 4c) models have better agreement with the observed trend, and the estimation values more closely track the measured values of

\frac{L_{s}}{D_{w}}

. The hybrid HCVCM+GEP model (Figure 4d) shows a closer match to the observed trend compared with the standalone HCVCM model. Finally, the SVCM+GEP model (Figure 4e) demonstrated the closest correspondence and the smallest deviations between predicted and observed

\frac{L_{s}}{D_{w}}

values, highlighting the superior capability of this model for estimating the normalized scour length. These graphical observations are consistent with the statistical evaluation results, which identified SVCM+GEP as the most accurate model for predicting

\frac{L_{s}}{D_{w}}

.

In addition, for a better illustration of the performance of the proposed ML models for estimation of

\frac{L_{s}}{D_{w}}

, Figure 5 shows the difference between the measured values and the values predicted by each ML model using DDR values.

The HCVCM plot shows a relatively wide scatter of DDR values around zero, consistent with its higher error metrics. In contrast, the SVCM and GEP plots show tighter clustering of DDR values, consistent with their higher R² values and lower error statistics, although GEP still shows some positive deviation, which is consistent with its comparatively large positive BIAS. The hybrid models further improve DDR values. For HCVCM+GEP, the spread of DDR values is narrow relative to HCVCM. In addition, the SVCM+GEP plot exhibited the most compact DDR values around zero, indicating the superior overall performance of this hybrid model, as evidenced by its lowest error metrics and best rank mean score for the prediction of

\frac{L_{s}}{D_{w}}

.

4.3. Results of Modeling $\frac{W_{s}}{D_{w}}$

The mathematical expressions for the estimation of

\frac{W_{s}}{D_{w}}

using ML methods are listed in Table 8.

The values of statistical indices for the estimation of

\frac{W_{s}}{D_{w}}

are presented in Table 9.

The quantitative evaluation of the proposed ML models for predicting the normalized scour width

\frac{W_{s}}{D_{w}}

downstream of ski-jump spillways reveals that all ML models provide satisfactory results, as summarized in Table 9. The hybrid HCVCM+GEP model demonstrated the best performance across all datasets, achieving the highest coefficient of determination (R² = 0.981) and the lowest error metrics (MAE = 0.764, RMSE = 1.140, SI = 9.14%, MAPE = 0.084). The standalone SVCM model follows closely with accurate results (R² = 0.962, MAE = 1.022, RMSE = 1.608) and stable generalization between the training and testing phases. In contrast, the standalone GEP model shows the weakest performance (R² = 0.829, RMSE = 3.532, MAPE = 0.277). The hybrid SVCM+GEP and standalone HCVCM models occupy intermediate positions for the estimation of

\frac{W_{s}}{D_{w}}

. SVCM+GEP achieves moderate errors (MAE = 0.792, SI = 15.37%) but a higher bias (0.387), while HCVCM provides acceptable accuracy (R² = 0.892) with zero bias (0.000) for all data despite an elevated RMSE (2.724). The RM values of the ML models for the estimation of

\frac{W_{s}}{D_{w}}

are shown in Table 10.

With regard to Table 10, the HCVCM+GEP method has the lowest RM value (1.33), which identifies it as the best model for the prediction of the normalized scour width, followed by SVCM (RM = 2.33) and SVCM+GEP (RM = 2.83), whereas HCVCM and GEP show comparatively lower performance.

For graphical evaluation, Figure 6 presents the trend prediction diagrams comparing the observed and estimated normalized scour width (

\frac{W_{s}}{D_{w}}

) for the developed ML models.

The HCVCM plot (Figure 6a) shows only moderate tracking of the observed trend, particularly for extreme values of

\frac{W_{s}}{D_{w}}

, which is consistent with its higher RMSE. The SVCM plot (Figure 6b) demonstrated significantly better alignment, closely following the observed fluctuations, which confirms its strong standalone performance. The GEP plot (Figure 6c) shows underprediction of

\frac{W_{s}}{D_{w}}

across much of the data range. This is consistent with its large negative BIAS and poor statistical ranking. The HCVCM+GEP model (Figure 6d) shows excellent agreement between estimated and observed values, with minimal deviation, in line with its superior statistical metrics (highest R², lowest RMSE and MAPE), while SVCM+GEP (Figure 6e) also tracks well but shows slightly more scatter than the HCVCM+GEP model. These graphical results reinforce the quantitative finding that HCVCM+GEP is the most accurate model for the prediction of the normalized scour width. Furthermore, to better illustrate the differences between the predicted and observed

\frac{W_{s}}{D_{w}}

values, the DDR plots for the proposed models are shown in Figure 7.

Figure 7 displays the DDR plots for the prediction of normalized scour width

\frac{W_{s}}{D_{w}}

, for all ML models. The HCVCM plot (Figure 7a) shows a broad scatter of DDR values around the zero line, reflecting the model’s lower accuracy and higher RMSE. The SVCM plot (Figure 7b) shows a concentration of estimates around the zero line and indicates lower error than HCVCM. The GEP plot (Figure 7c) shows the greatest dispersion around the zero line, indicating the weakest performance in the statistical metrics. The hybrid HCVCM+GEP plot (Figure 7d) shows a highly compact, symmetric distribution of DDR values around zero with minimal outliers, indicating it is the most accurate model for estimation of

\frac{W_{s}}{D_{w}}

. The SVCM+GEP plot (Figure 7e) also shows good accuracy but with slightly more scatter than the HCVCM+GEP model. These graphical results are consistent with the rank mean analysis, which identified HCVCM+GEP as the optimal model for estimating scour width.

The overall performance of the proposed ML models for the estimation of

\frac{D_{s}}{D_{w}}

,

\frac{L_{s}}{D_{w}}

, and

\frac{W_{s}}{D_{w}}

was further evaluated using Taylor diagrams constructed from the entire dataset. These diagrams provide a comprehensive assessment of model performance by simultaneously representing the correlation coefficient, standard deviation, and centered root mean square difference relative to the observed data, which allows a direct comparison of the proposed ML models. The Taylor diagrams for the scour hole geometry of ski-jump spillways are illustrated in Figure 8.
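The three quantities plotted on a Taylor diagram are linked by the law-of-cosines identity E'² = σ_o² + σ_p² − 2σ_oσ_p r, which is why they can share one two-dimensional plot. The sketch below computes them and checks the identity numerically; the function name `taylor_statistics` and the synthetic observed and predicted series are assumptions made for illustration, not the study's data.

```python
import numpy as np

def taylor_statistics(observed, predicted):
    """Return (r, sigma_obs, sigma_pred, centered RMS difference)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    sigma_o = o.std()                      # population standard deviation
    sigma_p = p.std()
    r = np.corrcoef(o, p)[0, 1]            # Pearson correlation coefficient
    # Centered RMS difference: both series have their means removed first.
    crmsd = np.sqrt(np.mean(((p - p.mean()) - (o - o.mean())) ** 2))
    return r, sigma_o, sigma_p, crmsd

# Synthetic stand-in series (95 points, mirroring the dataset size only).
rng = np.random.default_rng(1)
obs = rng.uniform(1.0, 6.0, size=95)
pred = obs + rng.normal(0.0, 0.4, size=95)

r, s_o, s_p, e = taylor_statistics(obs, pred)
lhs = e ** 2
rhs = s_o ** 2 + s_p ** 2 - 2.0 * s_o * s_p * r   # law-of-cosines relation
```

A model plots closer to the observed reference point when its correlation is high and its standard deviation is close to that of the observations, which together imply a small centered RMS difference.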

For

\frac{D_{s}}{D_{w}}

and

\frac{L_{s}}{D_{w}}

, the SVCM+GEP model lies closest to the observed reference point, making it the most accurate ML model. For

\frac{W_{s}}{D_{w}}

, the HCVCM+GEP model was positioned nearest to the observed reference point. However, the SVCM and SVCM+GEP models also showed good performance for estimation of

\frac{W_{s}}{D_{w}}

. Overall, the Taylor diagrams confirmed that, for each scour dimension, the best-performing model is a hybrid one (SVCM+GEP or HCVCM+GEP) rather than a standalone algorithm. Specifically, SVCM+GEP provided the most accurate results for estimation of values of

\frac{D_{s}}{D_{w}}

and

\frac{L_{s}}{D_{w}}

, whereas HCVCM+GEP has the best outcome for estimation of values of

\frac{W_{s}}{D_{w}}

. These findings are consistent with the statistical and graphical evaluations presented earlier.

The primary algorithmic innovation of this study is the integration of supervised feature-engineering algorithms (SVCM and HCVCM) as a pre-processing stage for GEP. Scour downstream of a ski-jump spillway is a highly nonlinear physical process governed by the simultaneous interaction of jet momentum (represented by the Froude number, Fr), geometric boundary constraints (

\frac{H_{1}}{D_{w}}

,

\frac{R}{D_{w}}

), and sediment resistance (

\frac{D_{50}}{D_{w}}

). Traditional empirical regression formulas force these variables into rigid, pre-determined power-law structures, which often fail to capture the nonlinear development of scour depth. The physical rationale for applying SVCM and HCVCM is that they systematically explore nonlinear mathematical transformations of these raw dimensionless parameters. By automatically discovering and filtering new composite variables, SVCM and HCVCM can uncover hidden physical interactions, such as the coupled effect of high-velocity jet impact and sediment size, before passing them to the GEP algorithm. This adaptability means that the subsequent GEP model operates on a physically enriched, highly correlated feature space.
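The generate-and-filter idea behind this kind of feature engineering can be illustrated generically as below. This is not the SVCM or HCVCM algorithm itself: the candidate transformations (pairwise products and ratios), the correlation-based filter, the function names, the column names, and the synthetic data are all assumptions chosen only to show the principle.

```python
import itertools
import numpy as np

def candidate_features(X, names):
    """Build simple composite variables: pairwise products and ratios."""
    feats = {}
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        feats[f"{names[i]}*{names[j]}"] = X[:, i] * X[:, j]
        feats[f"{names[i]}/{names[j]}"] = X[:, i] / X[:, j]
    return feats

def filter_by_correlation(feats, y, top_k=3):
    """Keep the top_k candidates by absolute Pearson correlation with y."""
    scored = {k: abs(np.corrcoef(v, y)[0, 1]) for k, v in feats.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

# Synthetic, strictly positive stand-ins for three dimensionless inputs.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 2.0, size=(95, 3))
names = ["Fr", "H1_Dw", "R_Dw"]

# Synthetic target built from a coupled term plus a little noise.
y = X[:, 0] * X[:, 1] + 0.01 * rng.normal(size=95)

selected = filter_by_correlation(candidate_features(X, names), y)
```

In this toy case the product `Fr*H1_Dw` is recovered as the most correlated composite variable, which is the kind of coupled term a downstream symbolic-regression step could then use directly.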

5. Comparison of the Developed ML Models with Previous Studies

In this section, the performance of the best proposed hybrid ML models is compared with that of regression-based and ML methods previously reported for predicting scour hole dimensions below ski-jump spillways.

5.1. Comparison with the Traditional Regression Approach

To assess the efficacy of the proposed ML models against the traditional nonlinear regression method developed by Azamathulla et al. [27] for estimating scour hole dimensions downstream of ski-jump spillways, Table 11, Table 12 and Table 13 present a comparative statistical analysis of the normalized depth (

\frac{D_{s}}{D_{w}}

), length (

\frac{L_{s}}{D_{w}}

), and width (

\frac{W_{s}}{D_{w}}

) of the scour hole, respectively, using the optimal ML model for each parameter.

As seen in Table 11, for the prediction of normalized scour depth (

\frac{D_{s}}{D_{w}}

), the SVCM+GEP model demonstrates a substantial performance advantage over the traditional method. This approach achieved an R² of 0.972 compared to 0.878 for the regression method, representing a 10.7% improvement in values of R². In addition, the SVCM+GEP model reduced the RMSE values from 1.275 to 0.469, a 63.2% reduction, and decreased the MAE by 50.5% (from 0.669 to 0.331). The values of SI also improved markedly, dropping from 31.37% to 13.61%, a 56.6% decrease, indicating much tighter clustering of predictions around observed values.

Regarding the normalized scour length (

\frac{L_{s}}{D_{w}}

), the SVCM+GEP model again outperformed the formulation of Azamathulla et al. [27] (Table 12). This model increased

R^{2}

from 0.928 to 0.967, a 4.2% enhancement, while reducing RMSE from 2.046 to 1.303, corresponding to a 36.3% improvement in predictive accuracy. The MAE decreased by 24.3% (from 1.113 to 0.842), and the SI fell from 16.41% to 10.45%, a 36.3% reduction in relative error variability.

As seen in Table 13, the HCVCM+GEP model is compared to the traditional approach for the normalized scour width (

\frac{W_{s}}{D_{w}}

). The ML model improved R² from 0.836 to 0.981, a 17.3% increase. The error metrics also decreased considerably: RMSE by 68.0% (from 3.559 to 1.140) and MAE by 54.6% (from 1.682 to 0.764). Furthermore, the SI was reduced from 28.84% to 9.14%, a reduction of about 68.3%. These results demonstrate that the proposed hybrid ML methods offer significantly higher accuracy and reliability than traditional nonlinear regression.
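The percentages quoted in this subsection are relative changes with respect to the regression value. The short sketch below reproduces the figures reported above for the normalized scour depth; the helper name `relative_change` is an assumption, while the input numbers are those stated in the text.

```python
def relative_change(reference, proposed):
    """Percentage change of the proposed value relative to the reference."""
    return abs(proposed - reference) / reference * 100.0

# Values quoted in the text for D_s/D_w (regression vs. SVCM+GEP).
r2_gain = relative_change(0.878, 0.972)     # about 10.7 %
rmse_drop = relative_change(1.275, 0.469)   # about 63.2 %
mae_drop = relative_change(0.669, 0.331)    # about 50.5 %
si_drop = relative_change(31.37, 13.61)     # about 56.6 %
```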

5.2. Comparison with Previously Developed ML Methods

Table 14, Table 15 and Table 16 present a comparative evaluation between the most accurate hybrid models developed in the present study, including SVCM+GEP for

\frac{D_{s}}{D_{w}}

and

\frac{L_{s}}{D_{w}}

, and HCVCM+GEP for

\frac{W_{s}}{D_{w}}

, and several previously reported ML algorithms used for estimating scour parameters below ski-jump spillways. The comparison includes GMDH-based models (GMDH-BP, GMDH-PSO, and GMDH-GP), ANFIS, Feed-Forward Backpropagation Neural Network (FFBP-NN), Radial Basis Function Neural Network (RBF-NN), and GP. It is based on three statistical criteria reported in the previous studies, namely

R^{2}

, RMSE, and MAPE values. Table 14 presents the comparative results for the estimation of

\frac{D_{s}}{D_{w}}

.

As observed in Table 14, the SVCM+GEP model achieved the highest coefficient of determination (R² = 0.97) and the lowest root mean square error (RMSE = 0.42) and mean absolute percentage error (MAPE = 0.16) among all compared ML models. Relative to the GMDH-BP model, which was the best-performing ML model in a previous study, the proposed SVCM+GEP improved R² by approximately 3.2%, reduced RMSE by about 33.3%, and achieved a considerably lower MAPE. Table 15 presents the comparison for the prediction of normalized scour length

(\frac{L_{s}}{D_{w}})

.

As observed from Table 15, the SVCM+GEP model again achieved the highest R² value and the lowest MAPE among all compared ML methods. Although the GMDH-BP and GMDH-GP models achieved a marginally lower RMSE, SVCM+GEP outperformed them in terms of overall fit (R²) and relative error (MAPE). On balance, SVCM+GEP therefore remains the most accurate ML model for predicting

\frac{L_{s}}{D_{w}}

compared to previously developed ML approaches. Finally, Table 16 compares the accuracy of ML methods for the approximation of

\frac{W_{s}}{D_{w}}

.

As seen in Table 16, the proposed HCVCM+GEP model achieved the highest R² (0.96) and the lowest RMSE (1.32) and MAPE (0.07) compared to all ML algorithms. Compared to the GMDH-BP model as the best prior ML method for the estimation of

\frac{W_{s}}{D_{w}}

, the HCVCM+GEP model improved R² by approximately 2.1%, reduced RMSE by about 12.6%, and reduced MAPE by approximately 79.4%. Overall, the comparative analysis indicated that combining supervised feature-engineering algorithms (SVCM and HCVCM) with white-box symbolic regression (GEP) enhances predictive reliability. The proposed hybrid frameworks overcome limitations of conventional standalone algorithms by transforming the input space to extract highly correlated variables, which makes this combination an effective strategy for improving the estimation of scour parameters in hydraulic engineering problems.

It is worth mentioning that the obtained results are generally consistent with previous studies reporting the superiority of nonlinear and hybrid machine learning techniques over traditional empirical equations for scour prediction downstream of ski-jump spillways. Earlier investigations employing ANN, ANFIS, SVM, MARS, CART, GEP, and GMDH models have consistently demonstrated that ML methods can effectively capture the complex nonlinear interactions governing scour processes. The present study further confirms that data-driven approaches significantly improve scour prediction accuracy compared with conventional regression-based formulas. In addition, while previous studies have primarily focused on standalone ML algorithms or optimization-based hybridization techniques, including the use of evolutionary algorithms for hyperparameter tuning, the present work introduces supervised feature-engineering techniques (SVCM and HCVCM) prior to symbolic regression modeling. The improved performance of the proposed hybrid models indicates that explicitly constructing transformed composite variables with stronger predictive relevance can substantially enhance both the predictive accuracy and the physical interpretability of ML methods for hydraulic scour modeling.

6. The Overall Model Performance

As an overall measure of model performance and prediction error, the

U_{95}

index was used. Based on the definition of

U_{95}

, the values of

U_{95}

of the standalone and hybrid ML models for the prediction of each normalized scour parameter, including

\frac{D_{s}}{D_{w}}

,

\frac{L_{s}}{D_{w}}

, and

\frac{W_{s}}{D_{w}}

, were calculated by Equations (30)–(32) as follows:

U_{95} = (\frac{1.96}{n}) \sqrt{\sum_{i = 1}^{n} {({(\frac{D_{s}}{D_{w}})}_{i (Observed)} - {\bar{(\frac{D_{s}}{D_{w}})}}_{(Observed)})}^{2} + \sum_{i = 1}^{n} {({(\frac{D_{s}}{D_{w}})}_{i (Observed)} - {(\frac{D_{s}}{D_{w}})}_{i (Predicted)})}^{2}}

(30)

U_{95} = (\frac{1.96}{n}) \sqrt{\sum_{i = 1}^{n} {({(\frac{L_{s}}{D_{w}})}_{i (Observed)} - {\bar{(\frac{L_{s}}{D_{w}})}}_{(Observed)})}^{2} + \sum_{i = 1}^{n} {({(\frac{L_{s}}{D_{w}})}_{i (Observed)} - {(\frac{L_{s}}{D_{w}})}_{i (Predicted)})}^{2}}

(31)

U_{95} = (\frac{1.96}{n}) \sqrt{\sum_{i = 1}^{n} {({(\frac{W_{s}}{D_{w}})}_{i (Observed)} - {\bar{(\frac{W_{s}}{D_{w}})}}_{(Observed)})}^{2} + \sum_{i = 1}^{n} {({(\frac{W_{s}}{D_{w}})}_{i (Observed)} - {(\frac{W_{s}}{D_{w}})}_{i (Predicted)})}^{2}}

(32)
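Equations (30)–(32) share one form and differ only in the scour parameter, so a single function covers all three. The sketch below follows the equations as written; the function name `u95` and the four-point example are illustrative assumptions, not data from the study.

```python
import numpy as np

def u95(observed, predicted):
    """U95 as written in Equations (30)-(32)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    n = o.size
    spread = np.sum((o - o.mean()) ** 2)   # observed values about their mean
    residual = np.sum((o - p) ** 2)        # observed minus predicted
    return (1.96 / n) * np.sqrt(spread + residual)

# Illustrative four-point example (not measurements from the database).
obs = np.array([1.0, 2.0, 3.0, 4.0])
u_perfect = u95(obs, obs)          # residual term is zero
u_shifted = u95(obs, obs + 1.0)    # every prediction is off by one unit
```

Because the first sum depends only on the observed data, it is identical for every model evaluated on the same dataset; differences in U95 between models therefore come entirely from the residual term.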

Table 17 presents the calculated

U_{95}

values for all proposed ML models in predicting

(\frac{D_{s}}{D_{w}})

,

(\frac{L_{s}}{D_{w}})

, and

(\frac{W_{s}}{D_{w}})

.

As shown in Table 17, for the estimation of normalized scour depth

(\frac{D_{s}}{D_{w}})

, the

U_{95}

index showed that the hybrid SVCM+GEP model yielded the lowest

U_{95}

value of 0.5662 among all proposed models. The standalone SVCM model ranked second with

U_{95}

= 0.5673, followed closely by HCVCM+GEP (

U_{95}

= 0.5693) and GEP (

U_{95}

= 0.5734). The standalone HCVCM model produced the highest

U_{95}

value of 0.5802, consistent with its comparatively larger prediction errors reported in the statistical evaluation. The differences in

U_{95}

values indicate that the integration of SVCM with GEP produces a more dependable model for scour depth prediction; because the observed-data variance term in Equation (30) is the same for every model, the lower value reflects smaller prediction residuals. Regarding the normalized scour length

(\frac{L_{s}}{D_{w}})

, the SVCM+GEP model again achieved the lowest

U_{95}

value of 1.4442, confirming its superior predictive reliability for this parameter. The standalone SVCM model followed with

U_{95}

= 1.4498, while GEP and HCVCM+GEP produced slightly higher values of 1.4656 and 1.4673, respectively. The HCVCM model exhibited the highest

U_{95}

of 1.5009, suggesting greater prediction instability for scour length. These results fully corroborate the rank mean (RM) analysis, in which SVCM+GEP was also identified as the best-performing model for

\frac{L_{s}}{D_{w}}

prediction. For the normalized scour width

(\frac{W_{s}}{D_{w}})

, the HCVCM+GEP model demonstrated the best performance by achieving the lowest

U_{95}

value of 1.6839. The standalone SVCM model ranked second (

U_{95}

= 1.6992), followed by SVCM+GEP (

U_{95}

= 1.7122) and HCVCM (

U_{95}

= 1.7558). The GEP model produced the highest

U_{95}

of 1.8131, indicating the lowest accuracy for scour width estimation. This result is consistent with the statistical evaluation, which also identified HCVCM+GEP as the best-performing model for

\frac{W_{s}}{D_{w}}

, with the highest

R^{2}

and the lowest RMSE and SI values.

Overall, the

U_{95}

values confirmed that, for all three scour parameters, the most reliable predictions were obtained by a hybrid model. In particular, SVCM+GEP delivered the lowest

U_{95}

values for

\frac{D_{s}}{D_{w}}

and

\frac{L_{s}}{D_{w}}

, while HCVCM+GEP presented the lowest

U_{95}

values for

\frac{W_{s}}{D_{w}}

. These results demonstrate that the proposed hybridization of supervised feature-engineering methods with GEP-based symbolic regression improves prediction accuracy and is suitable for practical engineering applications in hydraulic structure design.

7. Five-Fold Cross-Validation Results

The 5-fold CV results were used as a complementary validation tool alongside the statistical indices, RM ranking, and

U_{95}

analysis to provide a comprehensive assessment of each model’s reliability and generalization capability for scour hole geometry estimation. In addition to the conventional train–test split evaluation, cross-validation analysis was employed to further examine the generalization capability and robustness of the best-performing hybrid machine learning models for predicting scour hole dimensions below ski-jump spillways. Cross-validation is considered a more reliable validation strategy than a single train–test split because all available data samples are used for both training and validation in different iterations, thereby reducing the dependence of the results on one particular data partition. In the present study, the total dataset of 95 experimental observations was randomly divided into five equal folds of 19 samples each. This approach was adopted for the prediction of normalized scour depth

(\frac{D_{s}}{D_{w}})

, normalized scour length

(\frac{L_{s}}{D_{w}})

, and normalized scour width

(\frac{W_{s}}{D_{w}})

. The statistical indices, including R² and RMSE, were calculated for each fold. The results of the 5-fold cross-validation are shown in Figure 9.
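The fold construction described above (95 samples, five folds of 19) can be sketched as follows. The data are synthetic, an ordinary least-squares fit stands in for the hybrid models, and R² is computed as 1 − SSE/SST; all of these, along with the function names, are assumptions of the sketch rather than details taken from the study.

```python
import numpy as np

def five_fold_indices(n_samples=95, n_folds=5, seed=0):
    """Randomly partition sample indices into equal-sized folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def r2_rmse(observed, predicted):
    """R^2 (as 1 - SSE/SST) and RMSE for one validation fold."""
    residual = observed - predicted
    rmse = np.sqrt(np.mean(residual ** 2))
    sst = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - np.sum(residual ** 2) / sst, rmse

# Synthetic stand-in data: 95 samples, 3 inputs, nearly linear target.
rng = np.random.default_rng(42)
X = rng.uniform(0.5, 2.0, size=(95, 3))
y = 1.5 * X[:, 0] + 0.8 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(0, 0.05, 95)

folds = five_fold_indices()
scores = []
for k, test_idx in enumerate(folds):
    # Train on the other four folds, validate on fold k.
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
    A_train = np.column_stack([X[train_idx], np.ones(train_idx.size)])
    coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
    A_test = np.column_stack([X[test_idx], np.ones(test_idx.size)])
    scores.append(r2_rmse(y[test_idx], A_test @ coef))
```

Each sample is used for validation exactly once, so the spread of the five (R², RMSE) pairs indicates how sensitive a model is to the choice of data partition.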

As observed in Figure 9, for the prediction of

(\frac{D_{s}}{D_{w}})

, the SVCM+GEP model yielded R² values ranging from 0.926 to 0.966 and RMSE values ranging from 0.487 to 0.546 over the five folds. A similar pattern was observed for the prediction of

(\frac{L_{s}}{D_{w}})

, where the SVCM+GEP model achieved R² values between 0.923 and 0.961 and RMSE values between 1.311 and 1.569 over all folds. For the prediction of

(\frac{W_{s}}{D_{w}})

, the HCVCM+GEP model generated R² values ranging from 0.930 to 0.978 and RMSE values ranging from 1.154 to 1.360. In summary, the 5-fold cross-validation results confirmed the generalization ability of the proposed hybrid SVCM+GEP and HCVCM+GEP models for estimating scour hole geometry below ski-jump spillways and indicated that their superior performance was not dependent on a single train–test split.

8. Summary and Conclusions

Accurate scour hole estimation below spillways has always been an important issue in hydraulic engineering. This study introduced novel hybrid ML approaches for the prediction of dimensionless parameters of scour dimensions below the ski-jump bucket spillway using SVCM, HCVCM, GEP, and two hybrid models (SVCM+GEP and HCVCM+GEP). The novel algorithms, namely SVCM and HCVCM, were used to generate new variables to improve accuracy in the estimation of scour hole dimensions. The performance of the ML methods was quantified using multiple statistical indices, the RM method, U95 index, and the cross-validation approach, supported by trend prediction diagrams, DDR plots, and Taylor diagrams. For the estimation of the values of

\frac{D_{s}}{D_{w}}

, the SVCM+GEP has the best accuracy with RMSE = 0.469 and the lowest RM value (1.83), followed by the standalone SVCM and HCVCM+GEP models. In addition, the SVCM+GEP approach again has the highest accuracy for the estimation of

\frac{L_{s}}{D_{w}}

with RMSE = 1.303 and RM = 1.50. The HCVCM+GEP model is the most accurate for the estimation of

\frac{W_{s}}{D_{w}}

, with RMSE = 1.140 and the smallest RM (1.33). For further evaluation of the proposed ML methods, the best ML model for the estimation of each scour hole parameter was compared against nonlinear regression equations and previously reported ML methods. These comparisons highlighted the efficiency of the proposed hybrid ML methods for the estimation of scour geometry. The comprehensive evaluation confirms that hybrid ML approaches combining SVCM/HCVCM with GEP could provide better prediction results than standalone ML models, traditional regression-based equations, and prior ML methods. In the present study, in addition to the standard train/test procedures, cross-validation was employed to ensure the reliability and generalizability of the developed models. The consistency between the statistical performance obtained from the train/test procedure and the cross-validation results indicates that the developed models are not overly sensitive to specific data distributions and possess acceptable generalizability.

It is worth noting that the superior performance of the SVCM+GEP model is linked to its automated assessment of feature importance. By systematically generating and filtering composite variables based on their correlation with the target, SVCM inherently identifies the most dominant physical interactions. The presence of specific coupled variables in the final explicit equations reflects the underlying nonlinear physics of the scour process, effectively serving as an embedded sensitivity and feature selection analysis. The 5-fold cross-validation further confirmed the robustness and generalization ability of the proposed hybrid models. These findings confirm that the hybridization of supervised feature-engineering methods, including SVCM and HCVCM, with GEP can improve accuracy compared with earlier ML algorithms in scour parameter estimation. Further research is recommended to examine the applicability of the SVCM and HCVCM methods to other complex modeling problems in hydraulic engineering.

Author Contributions

Conceptualization, M.S., A.S., M.T., and Z.S.K.; methodology, M.S. and A.S.; software, M.S., A.S., and M.T.; validation, M.S., A.S., M.T., and Z.S.K.; formal analysis, M.S. and Z.S.K.; investigation, M.S., A.S., M.T., and Z.S.K.; resources, M.S., A.S., M.T., and Z.S.K.; data curation, M.S.; writing—original draft preparation, M.S. and A.S.; writing—review and editing, M.S., A.S., M.T., and Z.S.K.; visualization, M.S., A.S., and M.T.; supervision, M.S. and Z.S.K.; project administration, M.S. and Z.S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The complete dataset used in this study is publicly available and was fully reported in the previously published work of Azmathullah et al. [27].

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Nomenclature

Physical and Geometric Parameters
$D_{s}$	Equilibrium Scour Depth
$L_{s}$	Distance of Maximum Scour Depth from the Spillway Bucket Lip (Scour Length)
$W_{s}$	Scour Width
$D_{w}$	Tailwater Depth
$q$	Unit Discharge
$H_{1}$	Total Head
$R$	Bucket Radius
$D_{50}$	Median Sediment Size
$g$	Acceleration Due to Gravity
$ϕ$	Bucket Lip Angle
$ρ_{w}$	Density of Water
$ρ_{s}$	Density of Sediment
$Fr$	Froude Number
Dimensionless Variables
$\frac{D_{s}}{D_{w}}$	Normalized Scour Depth
$\frac{L_{s}}{D_{w}}$	Normalized Scour Length
$\frac{W_{s}}{D_{w}}$	Normalized Scour Width
$\frac{H_{1}}{D_{w}}$	Normalized Total Head
$\frac{R}{D_{w}}$	Normalized Bucket Radius
$\frac{D_{50}}{D_{w}}$	Normalized Sediment Size
Machine Learning Algorithms
SVCM	Stronger Variable Creator Machine
HCVCM	High Correlated Variables Creator Machine
GEP	Gene Expression Programming
CART	Classification and Regression Trees
MARS	Multivariate Adaptive Regression Spline
GMDH	Group Method of Data Handling
ANN	Artificial Neural Network
ANFIS	Adaptive Neuro-Fuzzy Inference System
SVR-FOA	Support Vector Regression optimized by Fruit-fly Optimization Algorithm
Statistical Evaluation Metrics
R²	Coefficient of Determination
RMSE	Root Mean Square Error
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
SI	Scatter Index
BIAS	Bias
DDR	Data Discrepancy Ratio
RM	Rank Mean

References

1. Boudhaouia, A.; Wira, P. A real-time data analysis platform for short-term water consumption forecasting with machine learning. Forecasting 2021, 3, 682–694.
2. She, X.; Jia, Y.; Li, R.; Xu, J.; Yang, Y.; Cao, W.; Xiao, L.; Zhao, W. Machine Learning-Based Prediction of External Pressure in High-Speed Rail Tunnels: Model Optimization and Comparison. Forecasting 2025, 7, 33.
3. Samadi, M.; Afshar, M.H.; Jabbari, E.; Sarkardeh, H. Application of multivariate adaptive regression splines and classification and regression trees to estimate wave-induced scour depth around pile groups. Iran. J. Sci. Technol. Trans. Civ. Eng. 2020, 44, 447–459.
4. Ebtehaj, I.; Bonakdari, H.; Moradi, F.; Gharabaghi, B.; Sheikh Khozani, Z. An integrated framework of Extreme Learning Machines for predicting scour at pile groups in clear water condition. Coast. Eng. 2018, 135, 1–15.
5. Azma, A.; Liu, Y.; Eftekhari, M.; Zhang, D. Comparison of hybrid deep learning models for estimation of the time-dependent scour depth downstream of river training structures. Phys. Fluids 2024, 36, 101911.
6. Azma, A.; Borthwick, A.G.; Liu, Y.; Zhang, D.; Afaridegan, E. Comparative assessment of white-box and black-box machine learning models for predicting free-falling jet scour depth. Eng. Appl. Comput. Fluid Mech. 2026, 20, 2638090.
7. Najafzadeh, M.; Oliveto, G. More reliable predictions of clear-water scour depth at pile groups by robust artificial intelligence techniques while preserving physical consistency. Soft Comput. 2021, 25, 5723–5746.
8. Najafzadeh, M.; Oliveto, G. Scour Propagation Rates around Offshore Pipelines Exposed to Currents by Applying Data-Driven Models. Water 2022, 14, 493.
9. Samadi, M.; Jabbari, E.; Azamathulla, H.M.; Mojallal, M. Estimation of scour depth below free overfall spillways using multivariate adaptive regression splines and artificial neural networks. Eng. Appl. Comput. Fluid Mech. 2015, 9, 291–300.
10. Azamathulla, H.M.; Deo, M.C.; Deolalikar, P.B. Alternative neural networks to estimate the scour below spillways. Adv. Eng. Softw. 2008, 39, 689–698.
11. Agarwal, M.; Goyal, M.; Deo, M.C. Locally weighted projection regression for predicting hydraulic parameters. Civ. Eng. Environ. Syst. 2010, 27, 71–80.
12. Goyal, M.K.; Ojha, C.S.P. Estimation of scour downstream of a ski-jump bucket using support vector and M5 model tree. Water Resour. Manag. 2011, 25, 2177–2195.
13. Haghiabi, A.H. Estimation of scour downstream of a ski-jump bucket using the multivariate adaptive regression splines. Sci. Iran. 2017, 24, 1789–1801.
14. Azamathulla, H.M.; Ab Ghani, A.; Azazi Zakaria, N. Prediction of scour below flip bucket using soft computing techniques. In AIP Conference Proceedings; American Institute of Physics: College Park, MD, USA, 2010; Volume 1233, pp. 1588–1593.
15. Naini, S. Evaluation of RBF, GR and FFBP neural networks for prediction of geometrical dimensions of scour hole below ski-jump spillway. In International Conference on Environmental and Computer Science, Singapore; IACSIT Press: Singapore, 2011; Volume 19, pp. 89–93.
16. Ayoubloo, M.K.; Azamathulla, H.M.; Ahmad, Z.; Ghani, A.A.; Mahjoobi, J.; Rasekh, A. Prediction of scour depth in downstream of ski-jump spillways using soft computing techniques. Int. J. Comput. Appl. 2011, 33, 92–97.
17. Najafzadeh, M.; Barani, G.A.; Hessami-Kermani, M.R. Group method of data handling to predict scour at downstream of a ski-jump bucket spillway. Earth Sci. Inform. 2014, 7, 231–248.
18. Guven, A.; Azamathulla, H.M. Gene-expression programming for flip-bucket spillway scour. Water Sci. Technol. 2012, 65, 1982–1987.
19. Shafagh Loron, R.; Samadi, M.; Shamsai, A. Predictive explicit expressions from data-driven models for estimation of scour depth below ski-jump bucket spillways. Water Supply 2023, 23, 304–316.
20. Fuladipanah, M.; Azamathulla, H.M.; Tota-Maharaj, K.; Mandala, V.; Chadee, A. Precise forecasting of scour depth downstream of flip bucket spillway through data-driven models. Results Eng. 2023, 20, 101604.
21. Sammen, S.S.; Ghorbani, M.A.; Malik, A.; Tikhamarine, Y.; AmirRahmani, M.; Al-Ansari, N.; Chau, K.W. Enhanced artificial neural network with Harris hawks optimization for predicting scour depth downstream of ski-jump spillway. Appl. Sci. 2020, 10, 5160.
22. Wang, L.; Zhang, G.; Yin, X.; Zhang, H.; Kashani, M.H.; Roshni, T.; Meshram, S.G. Hybrid model of support vector regression and innovative gunner optimization algorithm for estimating ski-jump spillway scour depth. Appl. Water Sci. 2023, 13, 11.
23. Sun, X.; Bi, Y.; Karami, H.; Naini, S.; Band, S.S.; Mosavi, A. Hybrid model of support vector regression and fruitfly optimization algorithm for predicting ski-jump spillway scour geometry. Eng. Appl. Comput. Fluid Mech. 2021, 15, 272–291.
24. Shishegaran, A.; Varaee, H.; Rabczuk, T.; Shishegaran, G. High correlated variables creator machine: Prediction of the compressive strength of concrete. Comput. Struct. 2021, 247, 106479.
25. Shishegaran, A.; Saeedi, M.; Mirvalad, S.; Korayem, A.H. Computational predictions for estimating the performance of flexural and compressive strength of epoxy resin-based artificial stones. Eng. Comput. 2023, 39, 347–372.
26. Shishegaran, A.; Samadi, M.; Torabi, M. Novel computational methods to predict the compressive strength of hydrothermally solidified clay. Front. Struct. Civ. Eng. 2025, 19, 2054–2072.
27. Azmathullah, H.M.; Deo, M.C.; Deolalikar, P.B. Neural networks for estimation of scour downstream of a ski-jump bucket. J. Hydraul. Eng. 2005, 131, 898–908.
28. Samadi, M.; Sarkardeh, H.; Jabbari, E. Prediction of the dynamic pressure distribution in hydraulic structures using soft computing methods. Soft Comput. 2021, 25, 3873–3888.
29. Ahmed, A.N.; Van Lam, T.; Hung, N.D.; Van Thieu, N.; Kisi, O.; El-Shafie, A. A comprehensive comparison of recent developed meta-heuristic algorithms for streamflow time series forecasting problem. Appl. Soft Comput. 2021, 105, 107282.
30. Essam, Y.; Huang, Y.F.; Birima, A.H.; Ahmed, A.N.; El-Shafie, A. Predicting suspended sediment load in Peninsular Malaysia using support vector machine and deep learning algorithms. Sci. Rep. 2022, 12, 302.
31. Saberi-Movahed, F.; Najafzadeh, M.; Mehrpooya, A. Receiving more accurate predictions for longitudinal dispersion coefficients in water pipelines: Training group method of data handling using extreme learning machine conceptions. Water Resour. Manag. 2020, 34, 529–561.
32. Samadi, M.; Torabi, M.; Ghorbani, M.K.; Hamidifar, H.; Nikoo, M.R. The application of data-driven models for estimating the discharge coefficients of labyrinth gates. Eng. Appl. Artif. Intell. 2026, 171, 114277.

Figure 1. The schematic view of the cross-section of the scour process downstream of the ski-jump spillway.

Figure 2. The comparison of the observed $\frac{D_{s}}{D_{w}}$ with the results of (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 3. The DDR values for prediction of $\frac{D_{s}}{D_{w}}$ resulting from (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 4. The comparison of the observed $\frac{L_{s}}{D_{w}}$ with the results of (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 5. The DDR values for prediction of $\frac{L_{s}}{D_{w}}$ resulting from (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 6. The comparison of the observed $\frac{W_{s}}{D_{w}}$ with the results of (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 7. The DDR values for prediction of $\frac{W_{s}}{D_{w}}$ resulting from (a) HCVCM, (b) SVCM, (c) GEP, (d) HCVCM+GEP, and (e) SVCM+GEP.

Figure 8. Taylor diagram comparing the overall performance of HCVCM, SVCM, GEP, HCVCM+GEP, and SVCM+GEP models using all datasets: (a) $\frac{D_{s}}{D_{w}}$, (b) $\frac{L_{s}}{D_{w}}$, and (c) $\frac{W_{s}}{D_{w}}$.

Figure 9. The results of 5-fold cross-validation for prediction of (a) $\frac{D_{s}}{D_{w}}$; (b) $\frac{L_{s}}{D_{w}}$; and (c) $\frac{W_{s}}{D_{w}}$.

Table 1. The summary of statistical parameters of involved variables for scour modeling below ski-jump spillways [27].

Parameter	$Fr$	$\frac{H_{1}}{D_{w}}$	$\frac{R}{D_{w}}$	$ϕ$	$\frac{D_{50}}{D_{w}}$	$\frac{D_{s}}{D_{w}}$	$\frac{L_{s}}{D_{w}}$	$\frac{W_{s}}{D_{w}}$
Minimum	0.026	2.791	1.000	0.126	0.008	0.288	2.454	3.208
Maximum	4.634	37.760	13.533	0.780	0.280	12.542	32.690	54.333
Average	0.945	7.827	3.341	0.544	0.112	3.358	12.366	12.382
Standard deviation	0.896	5.373	2.163	0.094	0.096	2.794	7.136	8.349

Table 2. Mathematical expressions derived from the proposed ML models for estimating $\frac{D_{s}}{D_{w}}$.

Formula	Approach	No.
$\begin{matrix} \frac{D_{s}}{D_{w}} = & 7.424 + 0.363 \cos(\ln(\frac{R}{D_{w}})) + 0.500 \ln(\frac{H_{1}}{D_{w}}) \ln(\frac{R}{D_{w}}) \cos(\ln(\frac{R}{D_{w}})) \\ - 0.051 \frac{\exp(3.945 \cos^{2}(Fr))}{\cos(Fr)} - 3.863 \cos(Fr) - 0.885 \ln(\frac{H_{1}}{D_{w}}) \cos(\frac{D_{50}}{D_{w}}) \end{matrix}$	HCVCM+GEP	(14)
$\begin{matrix} \frac{D_{s}}{D_{w}} = & ϕ^{5} + 6.277 Fr + 2.875 \frac{D_{50}}{D_{w}} - 0.036 \tan(Fr) \\ - 0.175 Fr \frac{R}{D_{w}} - 0.596 Fr \cos(Fr) - 0.946 Fr \ln(\frac{H_{1}}{D_{w}}) \end{matrix}$	SVCM+GEP	(15)
$\begin{matrix} \frac{D_{s}}{D_{w}} = & 0.157 + 14.520 \frac{D_{50}}{D_{w}} + 2.803 Fr + 25.364 Fr \frac{D_{50}}{D_{w}} + 0.275 Fr \frac{R}{D_{w}} \\ - 3.548 \frac{R}{D_{w}} \frac{D_{50}}{D_{w}} - 1.604 Fr - 95.497 \frac{D_{50}^{2}}{D_{w}} \end{matrix}$	GEP	(16)
$\begin{matrix} \frac{D_{s}}{D_{w}} = & - 5.009 \cos(Fr) - 3.136 \times 10^{-2} \cos(\frac{H_{1}}{D_{w}}) - 0.618 \ln(\frac{H_{1}}{D_{w}}) - 8.423 \ln(\frac{R}{D_{w}}) \\ - 3.376 ϕ^{3} + 2.318 ϕ^{5} - 1.549 \times 10^{4} \cos(\frac{D_{50}}{D_{w}}) - 7.683 \times 10^{3} {(\frac{D_{50}}{D_{w}})}^{2} + 1.551 \times 10^{4} \end{matrix}$	HCVCM	(17)
$\begin{matrix} \frac{D_{s}}{D_{w}} = & 2.049 Fr - 3.077 \cos(Fr) - 0.284 (\frac{H_{1}}{D_{w}}) - 0.049 \cos(\frac{H_{1}}{D_{w}}) + 1.488 \ln(\frac{H_{1}}{D_{w}}) \\ - 0.568 (\frac{R}{D_{w}}) + 0.913 \ln(\frac{R}{D_{w}}) + 39.067 ϕ + 252.309 \cos(ϕ) + 134.645 ϕ^{3} \\ - 60.402 ϕ^{5} + 17.349 (\frac{D_{50}}{D_{w}}) + 9722.012 \cos(\frac{D_{50}}{D_{w}}) + 4776.357 {(\frac{D_{50}}{D_{w}})}^{2} - 9975.190 \end{matrix}$	SVCM	(18)
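Because the hybrid models are explicit equations, they can be evaluated directly. The sketch below codes Equation (15) (SVCM+GEP) as a plain function and evaluates it at the average input values listed in Table 1. The function and argument names are hypothetical, and two points are assumptions rather than statements from the paper: that the trigonometric functions take their arguments in radians, and that the bucket lip angle is expressed in radians (the 0.126 to 0.780 range in Table 1 suggests this).

```python
import math


def ds_over_dw_svcm_gep(fr, h1_dw, r_dw, phi, d50_dw):
    """Normalized scour depth from Equation (15); phi assumed to be in radians."""
    return (
        phi ** 5
        + 6.277 * fr
        + 2.875 * d50_dw
        - 0.036 * math.tan(fr)
        - 0.175 * fr * r_dw
        - 0.596 * fr * math.cos(fr)
        - 0.946 * fr * math.log(h1_dw)
    )


# Inputs set to the "Average" row of Table 1 (illustrative operating point only).
estimate = ds_over_dw_svcm_gep(fr=0.945, h1_dw=7.827, r_dw=3.341,
                               phi=0.544, d50_dw=0.112)
print(round(estimate, 3))
```

The result should only be trusted inside the parameter ranges of Table 1, since the equation was fitted to that dataset.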

Table 3. The values of statistical indices of the proposed methods for the prediction of $\frac{D_{s}}{D_{w}}$.

Method	Category	R²	MAE	RMSE	SI (%)	BIAS	MAPE
HCVCM	Training	0.923	0.584	0.792	23.47	0.003	0.335
	Testing	0.899	0.664	0.761	22.06	−0.014	0.442
	All data	0.920	0.600	0.786	23.19	$- 1.68 \times 10^{- 6}$	0.356
SVCM	Training	0.967	0.397	0.522	15.46	0.017	0.215
	Testing	0.974	0.316	0.419	12.13	−0.067	0.171
	All data	0.967	0.381	0.503	14.84	$- 4.37 \times 10^{- 6}$	0.207
GEP	Training	0.942	0.413	0.696	20.63	0.022	0.192
	Testing	0.969	0.312	0.422	12.22	−0.024	0.124
	All data	0.946	0.393	0.651	19.12	0.013	0.178
HCVCM+GEP	Training	0.957	0.389	0.594	17.59	0.033	0.240
	Testing	0.981	0.292	0.367	10.62	−0.044	0.141
	All data	0.960	0.369	0.556	16.13	0.018	0.220
SVCM+GEP	Training	0.974	0.335	0.480	14.22	0.081	0.198
	Testing	0.973	0.316	0.423	12.26	−0.040	0.160
	All data	0.972	0.331	0.469	13.61	0.057	0.190
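The indices reported in the table can be computed along the following lines. The definitions used here are common conventions and are assumptions on my part, not a reproduction of the formulas in Section 3.1: R² as the squared Pearson correlation, SI as RMSE divided by the mean observed value, BIAS as the mean of predicted minus observed, and MAPE as a fraction rather than a percentage. The function name and sample values are hypothetical.

```python
import numpy as np


def scour_metrics(observed, predicted):
    """Compute R2, MAE, RMSE, SI (%), BIAS, and MAPE under the assumed definitions."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs  # sign convention is an assumption
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return {
        "R2": float(np.corrcoef(obs, pred)[0, 1] ** 2),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": rmse,
        "SI (%)": 100.0 * rmse / float(np.mean(obs)),
        "BIAS": float(np.mean(err)),
        "MAPE": float(np.mean(np.abs(err) / np.abs(obs))),
    }


# Made-up values for illustration; these are not data from the study.
metrics = scour_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```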

Table 4. The RM values of the proposed models for $\frac{D_{s}}{D_{w}}$ using all datasets.

Method	Rank (R²)	Rank (MAE)	Rank (RMSE)	Rank (SI)	Rank (BIAS)	Rank (MAPE)	RM
HCVCM	5	5	5	5	2	5	4.50
SVCM	2	3	2	2	1	3	2.17
GEP	4	4	4	4	3	1	3.33
HCVCM+GEP	3	2	3	3	4	4	3.17
SVCM+GEP	1	1	1	1	5	2	1.83
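The RM column is consistent with the arithmetic mean of the six per-metric ranks in each row, which the short sketch below reproduces from the ranks in Table 4. Treating RM as a simple unweighted mean is an inference from the tabulated numbers, and the variable names are hypothetical.

```python
# Per-metric ranks from Table 4, ordered as R2, MAE, RMSE, SI, BIAS, MAPE.
ranks = {
    "HCVCM": [5, 5, 5, 5, 2, 5],
    "SVCM": [2, 3, 2, 2, 1, 3],
    "GEP": [4, 4, 4, 4, 3, 1],
    "HCVCM+GEP": [3, 2, 3, 3, 4, 4],
    "SVCM+GEP": [1, 1, 1, 1, 5, 2],
}

# Rank Mean: unweighted average of the ranks (lower is better).
rank_mean = {model: round(sum(r) / len(r), 2) for model, r in ranks.items()}
best_model = min(rank_mean, key=rank_mean.get)

print(rank_mean)
print(best_model)
```

For example, SVCM+GEP gives (1 + 1 + 1 + 1 + 5 + 2) / 6 ≈ 1.83, the smallest value in the table.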

Table 5. Mathematical expressions derived from the proposed ML models for estimating $\frac{L_{s}}{D_{w}}$.

Formula	Approach	No.
$\begin{matrix} \frac{L_{s}}{D_{w}} = & 71.757 \cos(Fr) + 1475.858 \cos(Fr) ϕ^{3} + 242.478 \ln(\frac{R}{D_{w}}) \cos(ϕ) + 0.835 \cos^{2}(Fr) \\ - 246.843 - 4739.572 ϕ^{5} - 67.277 \cos(Fr) \cos(ϕ) - 2530.603 \ln(\frac{R}{D_{w}}) ϕ^{3} \end{matrix}$	HCVCM+GEP	(20)
$\begin{matrix} \frac{L_{s}}{D_{w}} = & 1.625 + 1.857 \frac{H_{1}}{D_{w}} + \cos(\frac{H_{1}}{D_{w}}) \ln(\frac{R}{D_{w}}) + 7.366 Fr ϕ^{3} \cos(ϕ) \\ + 1.857 \frac{H_{1}}{D_{w}} \cos(\frac{H_{1}}{D_{w}}) \ln(\frac{R}{D_{w}}) + \frac{- 0.034 - \ln(\frac{R}{D_{w}})}{Fr} - \ln(\frac{R}{D_{w}}) - Fr^{2} \end{matrix}$	SVCM+GEP	(21)
$\begin{matrix} \frac{L_{s}}{D_{w}} = & 5.253 + 2.767 Fr + 0.378 \frac{H_{1}}{D_{w}} + 2.008 \log(Fr) \\ + 8.667 Fr \frac{D_{50}}{D_{w}} + 1.349 \frac{R}{D_{w}} ϕ - 0.263 Fr \frac{R}{D_{w}} \end{matrix}$	GEP	(22)
$\begin{matrix} \frac{L_{s}}{D_{w}} = & 9.648 \ln(\frac{H_{1}}{D_{w}}) + 27.737 ϕ^{3} - 55.348 ϕ^{5} + 231.995 \cos(\frac{D_{50}}{D_{w}}) + 152.532 \exp(\frac{D_{50}}{D_{w}}) \\ - 1143.051 {(\frac{D_{50}}{D_{w}})}^{2} + 2656.497 {(\frac{D_{50}}{D_{w}})}^{3} + 488.875 {(\frac{D_{50}}{D_{w}})}^{5} - 397.027 \end{matrix}$	HCVCM	(23)
$\begin{matrix} \frac{L_{s}}{D_{w}} = & 5.013 Fr - 0.411 (\frac{H_{1}}{D_{w}}) + 5.840 \ln(\frac{H_{1}}{D_{w}}) + 0.565 (\frac{R}{D_{w}}) + 18.560 ϕ \\ - 45.837 ϕ^{3} + 38.006 ϕ^{5} + 21.237 (\frac{D_{50}}{D_{w}}) + 29.454 \cos(\frac{D_{50}}{D_{w}}) - 3.797 \exp(\frac{1}{D_{w}}) \\ - 108.732 {(\frac{D_{50}}{D_{w}})}^{2} + 334.796 {(\frac{D_{50}}{D_{w}})}^{3} + 70.653 {(\frac{D_{50}}{D_{w}})}^{5} - 33.095 \end{matrix}$	SVCM	(24)

Table 6. Comparison between the proposed methods for the prediction of $\frac{L_{s}}{D_{w}}$.

Method	Category	R²	MAE	RMSE	SI (%)	BIAS	MAPE
HCVCM	Training	0.871	2.006	2.526	20.30	−0.034	0.204
	Testing	0.937	1.413	1.905	15.14	0.136	0.206
	All data	0.883	1.888	2.414	19.36	$2.20 \times 10^{- 5}$	0.205
SVCM	Training	0.959	1.067	1.425	11.45	−0.036	0.116
	Testing	0.960	1.123	1.542	12.25	0.147	0.162
	All data	0.958	1.078	1.449	11.62	$1.93 \times 10^{- 4}$	0.126
GEP	Training	0.963	1.392	1.766	14.20	0.793	0.184
	Testing	0.966	1.485	1.929	15.33	0.341	0.219
	All data	0.962	1.411	1.800	14.43	0.703	0.191
HCVCM+GEP	Training	0.933	1.285	1.831	14.72	0.193	0.158
	Testing	0.954	1.323	1.840	14.63	−0.144	0.192
	All data	0.933	1.292	1.833	14.70	0.126	0.165
SVCM+GEP	Training	0.972	0.779	1.176	9.46	−0.062	0.075
	Testing	0.950	1.092	1.719	13.66	−0.508	0.120
	All data	0.967	0.842	1.303	10.45	−0.151	0.084

Table 7. The RM values of the proposed models for $\frac{L_{s}}{D_{w}}$ using all datasets.

Method	Rank (R²)	Rank (MAE)	Rank (RMSE)	Rank (SI)	Rank (BIAS)	Rank (MAPE)	RM
HCVCM	5	5	5	5	1	5	4.33
SVCM	3	2	2	2	2	2	2.17
GEP	2	4	3	3	5	4	3.50
HCVCM+GEP	4	3	4	4	3	3	3.50
SVCM+GEP	1	1	1	1	4	1	1.50

Table 8. Mathematical expressions derived from the proposed ML models for estimating $\frac{W_{s}}{D_{w}}$.

Formula	Approach	No.
$\begin{matrix} \frac{W_{s}}{D_{w}} = & 466.873 + 1.390 \cos(Fr) - 463.412 ϕ^{3} - 1.051 \cos(Fr) ϕ^{5} \\ - 2.713 \cos(Fr) \sin(\cos(0.663 + 685.258 \ln(\frac{H_{1}}{D_{w}}) + 8.522 \cos(ϕ))) \end{matrix}$	HCVCM+GEP	(25)
$\begin{matrix} \frac{W_{s}}{D_{w}} = & | Fr + 188.024 \cos(ϕ) + 36.629 (\frac{R}{D_{w}}) + 355.192 \ln(\frac{R}{D_{w}}) \cos(ϕ) \\ - 34.090 - 1.515 \cos(Fr) - 16.058 ϕ^{3} | - 3.008 \end{matrix}$	SVCM+GEP	(26)
$\begin{matrix} \frac{W_{s}}{D_{w}} = & 1.9137 + 25.178 (\frac{D_{50}}{D_{w}}) + 0.942 (\frac{H_{1}}{D_{w}}) \\ + 37.690 (\frac{D_{50}}{D_{w}}) \sin(\tan(\tan(1.984 + \log(34.891 \frac{D_{50}}{D_{w}} - 0.0669)))) - 1.561 (\frac{D_{50}}{D_{w}}) \\ - 0.694 (\frac{H_{1}}{D_{w}}) \sin(\tan(\tan(1.984 + \log(34.891 \frac{D_{50}}{D_{w}} - 0.067)))) - 1.561 (\frac{D_{50}}{D_{w}}) \end{matrix}$	GEP	(27)
$\begin{matrix} \frac{W_{s}}{D_{w}} = & 0.249 \cos(ϕ) - 481.513 \ln(ϕ) - 7.126 ϕ^{3} - 391.250 ϕ^{5} + 215.127 \cos(\frac{D_{50}}{D_{w}}) \\ - 4.663 \times 10^{5} \exp(\frac{D_{50}}{D_{w}}) + 32.827 {(\frac{D_{50}}{D_{w}})}^{2} - 2.342 \times 10^{5} {(\frac{D_{50}}{D_{w}})}^{3} \\ - 9122.996 {(\frac{D_{50}}{D_{w}})}^{5} - 4.667 \times 10^{5} \end{matrix}$	HCVCM	(28)
$\begin{matrix} \frac{W_{s}}{D_{w}} = & 0.051 Fr + 1.050 (\frac{H_{1}}{D_{w}}) - 0.216 (\frac{R}{D_{w}}) + 0.078 ϕ - 2661.171 \cos(ϕ) \\ - 8737.933 \ln(ϕ) + 182.321 ϕ^{3} - 3612.250 ϕ^{5} + 1374.113 (\frac{D_{50}}{D_{w}}) \\ - 1.614 \times 10^{7} \cos(\frac{D_{50}}{D_{w}}) - 1.888 \times 10^{7} \exp(\frac{D_{50}}{D_{w}}) + 1.614 \times 10^{7} {(\frac{D_{50}}{D_{w}})}^{2} \\ - 1.751 \times 10^{7} {(\frac{D_{50}}{D_{w}})}^{3} - 2.659 \times 10^{6} {(\frac{D_{50}}{D_{w}})}^{5} + 2.753 \times 10^{6} \end{matrix}$	SVCM	(29)

Table 9. Comparison between the proposed methods for the prediction of $\frac{W_{s}}{D_{w}}$.

Method	Category	R²	MAE	RMSE	SI (%)	BIAS	MAPE
HCVCM	Training	0.928	2.019	2.481	19.83	−0.351	0.198
	Testing	0.873	2.193	3.534	28.61	1.405	0.173
	All data	0.892	2.053	2.724	21.83	0.000	0.193
SVCM	Training	0.965	1.043	1.659	13.26	−0.066	0.122
	Testing	0.955	0.938	1.386	11.22	0.302	0.094
	All data	0.962	1.022	1.608	12.89	0.007	0.116
GEP	Training	0.821	2.778	3.825	30.57	−0.773	0.308
	Testing	0.934	1.612	1.960	15.87	−0.776	0.153
	All data	0.829	2.545	3.532	28.30	−0.774	0.277
HCVCM+GEP	Training	0.985	0.750	1.093	8.73	−0.104	0.087
	Testing	0.955	0.817	1.315	10.64	−0.304	0.072
	All data	0.981	0.764	1.140	9.14	−0.144	0.084
SVCM+GEP	Training	0.951	0.786	1.992	15.92	0.389	0.117
	Testing	0.934	0.813	1.587	12.85	0.380	0.103
	All data	0.949	0.792	1.918	15.37	0.387	0.114

Table 10. The RM values of the proposed models for $\frac{W_{s}}{D_{w}}$ using all datasets.

Method	Rank (R²)	Rank (MAE)	Rank (RMSE)	Rank (SI)	Rank (BIAS)	Rank (MAPE)	RM
HCVCM	4	4	4	4	1	4	3.50
SVCM	2	3	2	2	2	3	2.33
GEP	5	5	5	5	5	5	5.00
HCVCM+GEP	1	1	1	1	3	1	1.33
SVCM+GEP	3	2	3	3	4	2	2.83

Table 11. The results of $\frac{D_{s}}{D_{w}}$ for all datasets compared to the traditional method.

Method	R²	MAE	RMSE	SI (%)	BIAS	MAPE
SVCM+GEP	0.972	0.331	0.469	13.61	0.057	0.190
Azmathullah et al. [27]	0.878	0.669	1.275	31.37	−0.250	0.232

Table 12. The results of $\frac{L_{s}}{D_{w}}$ for all datasets compared to the traditional method.

Method	R²	MAE	RMSE	SI (%)	BIAS	MAPE
SVCM+GEP	0.967	0.842	1.303	10.45	−0.151	0.084
Azmathullah et al. [27]	0.928	1.113	2.046	16.41	−0.049	0.089

Table 13. The results of $\frac{W_{s}}{D_{w}}$ for all datasets compared to the traditional method.

Method	R²	MAE	RMSE	SI (%)	BIAS	MAPE
HCVCM+GEP	0.981	0.764	1.140	9.14	−0.144	0.084
Azmathullah et al. [27]	0.836	1.682	3.559	28.84	−0.723	0.147
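To put the comparisons in Tables 11 to 13 on a common scale, the relative RMSE reduction of the best hybrid model over the traditional regression equation can be computed from the tabulated values. The sketch below does this arithmetic; the percentage-reduction measure and the variable names are my own choice, not a metric reported in the paper.

```python
# (hybrid RMSE, traditional-regression RMSE) taken from Tables 11, 12, and 13.
rmse_pairs = {
    "Ds/Dw": (0.469, 1.275),  # SVCM+GEP vs. Azmathullah et al. [27]
    "Ls/Dw": (1.303, 2.046),  # SVCM+GEP vs. Azmathullah et al. [27]
    "Ws/Dw": (1.140, 3.559),  # HCVCM+GEP vs. Azmathullah et al. [27]
}

# Relative reduction in percent: 100 * (reference - hybrid) / reference.
rmse_reduction = {
    target: 100.0 * (reference - hybrid) / reference
    for target, (hybrid, reference) in rmse_pairs.items()
}

for target, value in rmse_reduction.items():
    print(f"{target}: {value:.1f}% lower RMSE")
```

On these numbers, the hybrid models lower RMSE by roughly 63%, 36%, and 68% for scour depth, length, and width, respectively.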

Table 14. Comparison of the best hybrid model (SVCM+GEP) in the present study for the estimation of $\frac{D_{s}}{D_{w}}$ with previous ML models by Najafzadeh et al. [17].

Model	R²	RMSE	MAPE
SVCM+GEP	0.97	0.42	0.16
GMDH-BP	0.94	0.63	0.96
GMDH-PSO	0.94	0.67	1.07
GMDH-GP	0.90	0.84	1.21
ANFIS	0.85	0.93	1.37
FFBP-NN	0.86	0.70	0.95
RBF-NN	0.88	0.75	0.89
GP	0.88	0.81	1.08

Table 15. Comparison of the best hybrid model (SVCM+GEP) in the present study for the estimation of $\frac{L_{s}}{D_{w}}$ with previous ML models by Najafzadeh et al. [17].

Model	R²	RMSE	MAPE
SVCM+GEP	0.95	1.72	0.12
GMDH-BP	0.94	1.64	0.38
GMDH-PSO	0.86	3.86	1.91
GMDH-GP	0.94	1.64	0.38
ANFIS	0.90	1.91	0.53
FFBP-NN	0.92	1.83	0.46
RBF-NN	0.86	2.27	0.55
GP	0.90	2.41	0.67

Table 16. Comparison of the best hybrid model (HCVCM+GEP) in the present study for the estimation of $\frac{W_{s}}{D_{w}}$ with previous ML models by Najafzadeh et al. [17].

Model	R²	RMSE	MAPE
HCVCM+GEP	0.96	1.32	0.07
GMDH-BP	0.94	1.51	0.34
GMDH-PSO	0.90	1.82	0.56
GMDH-GP	0.81	1.51	0.34
ANFIS	0.92	1.50	0.31
FFBP-NN	0.85	2.17	0.48
RBF-NN	0.86	1.96	0.37
GP	0.88	2.12	0.64

Table 17. The values of $U_{95}$ for the proposed ML methods in the present study.

ML Model	$D_{s} / D_{w}$	$L_{s} / D_{w}$	$W_{s} / D_{w}$
HCVCM	0.5802	1.5009	1.7558
SVCM	0.5673	1.4498	1.6992
GEP	0.5734	1.4656	1.8131
HCVCM+GEP	0.5693	1.4673	1.6839
SVCM+GEP	0.5662	1.4442	1.7122
