To meet the objective of this study (i.e., investigating the optimization capability of the abovementioned metaheuristic algorithms), the algorithms were coupled with the ANN. The aim was to let each algorithm find the most appropriate matrix of weights and biases for the ANN. To this end, an ANN with one hidden layer containing six neurons (determined by a trial-and-error process) was first proposed as the base model. Thus, regarding the number of input/output parameters, the considered MLP took the form 7 × 6 × 1. Note that, in the present study, the activation functions of the hidden and output neurons were set to "tangent-sigmoid (i.e., Tansig)" and "purelin", respectively. Next, this network was hybridized with the WOA, LCA, MFO, and ACO algorithms to create the WOA–ANN, LCA–ANN, MFO–ANN, and ACO–ANN neural ensembles.
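The 7 × 6 × 1 topology above can be sketched as a forward pass in which all weights and biases are packed into a single flat vector, since that is the representation a metaheuristic evolves. This is a minimal illustrative sketch, not the authors' implementation; the function name `mlp_forward` and the parameter layout are assumptions.

```python
import numpy as np

# Dimensions of the MLP described in the text: 7 inputs, 6 hidden neurons, 1 output.
N_IN, N_HID, N_OUT = 7, 6, 1
# Total trainable parameters: hidden weights + hidden biases + output weights + output bias.
N_PARAMS = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT  # 7*6 + 6 + 6*1 + 1 = 55

def mlp_forward(params, X):
    """Forward pass of the 7-6-1 MLP with Tansig hidden and purelin output.

    `params` is the flat vector a metaheuristic algorithm would evolve.
    """
    i = 0
    W1 = params[i:i + N_IN * N_HID].reshape(N_IN, N_HID); i += N_IN * N_HID
    b1 = params[i:i + N_HID]; i += N_HID
    W2 = params[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT); i += N_HID * N_OUT
    b2 = params[i:i + N_OUT]
    hidden = np.tanh(X @ W1 + b1)   # Tansig activation
    return hidden @ W2 + b2         # purelin (linear) output
```

Packing the 55 parameters into one vector lets any population-based optimizer treat ANN training as a continuous search problem.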

#### 4.1. Hybridizing the ANN Using Metaheuristic Algorithms

After creating the ensembles, a population-based trial-and-error process was carried out to find the best-fitted complexity of the metaheuristic algorithms. To do so, all four networks were tested with nine population sizes: 10, 25, 50, 75, 100, 200, 300, 400, and 500. Each model performed 1000 repetitions to minimize the error. In this process, the root-mean-square error (RMSE) was set as the objective function (OF) to measure the training error in each iteration. This function is expressed in Equation (2):

$$\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Y_{i\,\mathrm{observed}}-Y_{i\,\mathrm{predicted}}\right)^{2}} \qquad (2)$$

where $N$ is the number of data, and $Y_{i\,\mathrm{observed}}$ and $Y_{i\,\mathrm{predicted}}$ stand for the observed and predicted stability values, respectively.

Figure 3a shows the obtained RMSEs for the tested population sizes, along with the convergence curve of the most accurate configuration.
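Equation (2) can be implemented directly as the objective function handed to the optimizers; a minimal sketch (the function name `rmse` is ours):

```python
import numpy as np

def rmse(y_observed, y_predicted):
    """Root-mean-square error, the objective function of Equation (2)."""
    y_observed = np.asarray(y_observed, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return float(np.sqrt(np.mean((y_observed - y_predicted) ** 2)))
```

Each metaheuristic would call this once per candidate solution per iteration, so 1000 repetitions over a population of, say, 400 implies 400,000 objective evaluations per run.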

As can be seen, all four models exhibited an acceptable error in analyzing the relationship between the stability condition and its influential parameters. In detail, the smallest error was obtained for the WOA–ANN with a population size of 400 (RMSE = 0.307658318), LCA–ANN with a population size of 200 (RMSE = 0.312263011), MFO–ANN with a population size of 50 (RMSE = 0.298588793), and ACO–ANN with a population size of 10 (RMSE = 0.274504799).

#### 4.3. Accuracy Assessment of the Predictive Models

In this part, the results of the best-fitted models (i.e., with elite population sizes) are evaluated to examine their simulation capability. As is known, the results of the training phase address the learning quality of the model, and the testing results indicate the generalization capability for unseen conditions of the problem.

In the training phase, the calculated values of RMSE and MAE for the typical ANN were 0.3465 and 0.3055, respectively. Both of these indices decreased considerably after applying the WOA (0.3076 and 0.2555), LCA (0.3122 and 0.2592), MFO (0.2985 and 0.2430), and ACO (0.2745 and 0.1783) optimization techniques. Also, in terms of the AUROC, the accuracy of the ANN increased from 0.956 to 0.969, 0.964, 0.969, and 0.965, respectively. Overall, it can be deduced that the applied algorithms improve the learning capability of the ANN.
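The second criterion, the mean absolute error (MAE), and the relative error reduction implied by the figures above can be sketched as follows; the helper names `mae` and `improvement` are illustrative, not from the source:

```python
import numpy as np

def mae(y_obs, y_pred):
    """Mean absolute error, the second accuracy criterion used above."""
    return float(np.mean(np.abs(np.asarray(y_obs, dtype=float)
                                - np.asarray(y_pred, dtype=float))))

def improvement(base, hybrid):
    """Percent reduction in an error index relative to the base ANN."""
    return 100.0 * (base - hybrid) / base

# With the training RMSEs quoted above (ANN = 0.3465, ACO-ANN = 0.2745):
# improvement(0.3465, 0.2745) is roughly a 20.8% reduction in error.
```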

Figure 4 displays the predicted and actual stability values for the ensemble models. The output ranges were [−0.196124259, 1.163826771], [−0.285459666, 1.165811194], [−0.280854543, 1.220819059], and [−0.323683705, 1.197618633], respectively.

Similar to the first phase, all the neural-metaheuristic ensembles surpassed the ANN in the testing phase, which means the algorithms performed efficiently in adjusting the computational weights and biases of this tool. In detail, the RMSE was reduced from 0.3465 to 0.3076, 0.3122, 0.2985, and 0.2745. As for the MAE, it fell from 0.3055 to 0.2555, 0.2592, 0.2430, and 0.1783. The differences between the actual and predicted stability values (labeled as errors) are illustrated in Figure 5, along with the histogram of the errors. The products of the WOA–ANN, LCA–ANN, MFO–ANN, and ACO–ANN vary in the ranges [−0.19260775, 1.121547514], [−0.221848351, 1.183820947], [−0.176990489, 1.103579629], and [−0.072023941, 1.206028442], respectively.

Moreover, the ROC curves for the predictions of the ensemble models are shown in Figure 6. The calculated areas under the curves indicate more than 90% accuracy for all five models. However, the AUROCs of the hybrid ensembles were considerably higher than that of the unreinforced ANN (AUROC = 0.930).
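For reference, the AUROC can be computed without plotting the full ROC curve via the rank-sum (Mann-Whitney) identity: it equals the probability that a randomly chosen stable case receives a higher score than a randomly chosen unstable one. A minimal sketch (the function name `auroc` is ours):

```python
def auroc(labels, scores):
    """AUROC via the rank-sum identity: fraction of positive/negative
    score pairs in which the positive case is ranked higher (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUROC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why values above 0.93 for all models indicate strong discrimination.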

So far, all the criteria used have confirmed that the metaheuristic algorithms can develop a more powerful ANN compared to the BP learning method. In this section, the results of the WOA–ANN, LCA–ANN, MFO–ANN, and ACO–ANN tools are evaluated to compare the efficiency of the algorithms. A score-based system was developed to rank the models and determine the most accurate one. As Tables 2 and 3 denote, each model received three scores based on the calculated RMSE, MAE, and AUROC. Then, the summation of these scores determined the model producing the most consistent outputs in each phase. According to Table 3, the ACO-based model obtained the highest scores in terms of all accuracy criteria except for the training AUROC (0.965), which was second to the WOA and MFO (0.969). Therefore, it attained the highest overall scores in both the training and testing phases, followed by the MFO, which closely surpassed the WOA, while the LCA featured the lowest rank.
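A score-based ranking of this kind can be sketched as follows. Note the assumptions: the exact point scheme of Tables 2 and 3 is not reproduced here, so this sketch awards 1 to n points per criterion (n = best), uses only the training-phase figures quoted above, and breaks ties arbitrarily (by insertion order), so the WOA/MFO AUROC tie is not split the way the paper's tables may split it.

```python
# Training-phase figures quoted in the text (hypothetical reconstruction of the tables).
models = {
    "WOA-ANN": {"RMSE": 0.3076, "MAE": 0.2555, "AUROC": 0.969},
    "LCA-ANN": {"RMSE": 0.3122, "MAE": 0.2592, "AUROC": 0.964},
    "MFO-ANN": {"RMSE": 0.2985, "MAE": 0.2430, "AUROC": 0.969},
    "ACO-ANN": {"RMSE": 0.2745, "MAE": 0.1783, "AUROC": 0.965},
}

def score_models(models):
    """Award 1..n points per criterion (worst model first, best last) and sum them."""
    totals = {name: 0 for name in models}
    for metric, higher_is_better in (("RMSE", False), ("MAE", False), ("AUROC", True)):
        # Sort so the worst model comes first and therefore receives the fewest points.
        ranked = sorted(models, key=lambda m: models[m][metric],
                        reverse=not higher_is_better)
        for points, name in enumerate(ranked, start=1):
            totals[name] += points
    return totals
```

Summing ranks across criteria is a simple consensus scheme; it rewards models that are consistently good rather than excellent on one index only.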

Moreover, in comparison with the HHO and DA applied to the same data by Moayedi et al. [23], it was deduced that the methods of the current study present a more accurate analysis and approximation of bearing capacity. In detail, the RMSE and MAE obtained for the DA–MLP (superior to the HHO–MLP) were 0.3421 and 0.2904, which are larger than the results obtained for our WOA–ANN, MFO–ANN, and ACO–ANN. Also, the best AUROC of this study was higher than that of both models in the mentioned reference (0.944 vs. 0.942).