Development of Two Novel Hybrid Prediction Models Estimating Ultimate Bearing Capacity of the Shallow Circular Footing

In the present work, we employed an artificial neural network (ANN) optimized with two algorithms, namely the imperialist competition algorithm (ICA) and particle swarm optimization (PSO), to address the problem of the bearing capacity of shallow circular footing systems. Many studies have shown that ANNs are valuable techniques for estimating the bearing capacity of soils. However, most ANN training methods have some drawbacks. This study focuses on the application of two well-known hybrid models, ICA–ANN and PSO–ANN, to the estimation of the bearing capacity of circular footings resting on layered soils. To provide the training and testing datasets for the predictive network models, extensive finite element (FE) modelling (a database of 2810 training datasets and 703 testing datasets) was performed on 16 soil layer sets (weaker soil resting on stronger soil and vice versa). Note that all the independent variables of the ICA and PSO algorithms were optimized using a trial and error method. The inputs include the upper layer thickness/foundation width (h/B) ratio, the footing width (B), the top and bottom soil layer properties (six of the most critical soil characteristics), and the vertical settlement of the circular footing (s), whereas the output is the ultimate bearing capacity of the circular footing (Fult). In terms of the coefficient of determination (R2) and root mean square error (RMSE), values of (0.979, 0.076) and (0.984, 0.066) were obtained for the training dataset and values of (0.978, 0.075) and (0.983, 0.066) for the testing dataset of the proposed PSO–ANN and ICA–ANN prediction models, respectively. This demonstrates a higher reliability of the presented PSO–ANN model for predicting the ultimate bearing capacity of circular footings located on double-layered sandy soils.


Introduction
Recently, artificial neural networks (ANNs) have been suggested to support the estimation of the ultimate bearing capacity of circular footings in single homogeneous soil environments as well as other engineering materials [1][2][3][4]. The study of the Fult of foundations (the maximum applied stress corresponding to a particular settlement, i.e., 0.10 of the foundation width) is essential to geotechnical engineering and soil mechanics. In addition, complex geological structures, e.g., shallow footings resting on multilayer (or non-homogeneous) soil, are important to consider. Traditional approaches are based on complex solutions, i.e., limit equilibrium considerations [5][6][7][8][9] or extensive experimental approaches [10][11][12][13][14]. In many cases, the suggested solutions illustrate how the top layer thickness (or its ratio to the footing width) affects the Fult of a shallow footing placed on two or more soil layers. In this regard, the settlement and bearing capacity of shallow footings have been shown to depend on several key factors, namely the foundation soil parameters, the shape of the footing, and the number of soil layers beneath the footing. Additional soil layers under the footing increase the complexity of the problem. Different equations have been proposed to compute the soil bearing capacity (e.g., at a particular settlement) of circular and square footings [15][16][17]. The main purpose of predicting the ultimate bearing capacity of a circular footing is to reduce the likelihood of high settlement once the real stresses are applied. The most effective parameters in predicting a correction amount for the bearing capacity are (i) the soil parameters (or layered soils beneath the footing), (ii) the footing shape (e.g., strip, circular, and rectangular), and (iii) the arrangement of the soil layers [18].
It is important to note that the soil properties (e.g., dilation angle, unit weight, internal friction angle, cohesion, Poisson's ratio, and elastic modulus) influence the stresses on the footing. In general, the ultimate bearing capacity of a circular footing is taken as the applied stress at the maximum settlement ratio (footing settlement/footing width, or S/B), which is 10% of the footing width [5,19,20]. Many factors can affect the bearing capacity of a shallow footing, such as multilayer soil and geological conditions, the failure model considered in the calculation, the location of the stronger soil (i.e., the arrangement of the soil layers), the type of soil, and the footing width [21]. To provide a reliable prediction of the ultimate bearing capacity (Fult) of shallow footings placed on multi-layered soils, many scholars, such as Haghbin [22], Lotfizadeh and Kamalian [23], Meyerhof and Hanna [10], and Ahmadi and Kouchaki [24], have suggested novel formulas.
In this study, to forecast the Fult of the circular footing, 84 ANN models and 58 hybrid models (in which an optimization algorithm helps the ANN reach a more efficient outcome) based on (i) the imperialist competition algorithm (ICA) and (ii) particle swarm optimization (PSO) were designed. The hybrid ICA-ANN and PSO-ANN models presented here have not previously been applied to the engineering problem considered in this study: no existing study has used the proposed hybrid models to predict the Fult of a circular footing resting on multilayer soil. Thus, this study aims to optimize the ANN with two well-developed optimization algorithms in order to obtain a reliable approximation of the bearing capacity of circular footings resting on layered soils.

Artificial Neural Network
In this study, three artificial intelligence systems were used to estimate the ultimate bearing capacity of the circular footing, namely (i) a conventional feedforward backpropagation ANN, (ii) the hybrid PSO-ANN, and (iii) the hybrid ICA-ANN. McCulloch and Pitts [25] first proposed the ANN. Then, in 1949, the first technique for training ANNs was suggested [26], comprising several rules based on observations and hypotheses of a neurophysiological nature. Numerous other researchers have investigated the development of simple nonlinear mathematical models based on the biological neuron [27,28]. These studies have led to a large number of network structures (i.e., topologies) and learning algorithms [29][30][31][32][33][34][35]. ANN-based models are trained on a training dataset and then evaluated on a randomly selected testing dataset (i.e., less than 30% of the whole dataset) [36][37][38]. More details about the ANN algorithm are presented in Figure 1.

Particle Swarm Optimization (PSO)
Eberhart and Kennedy [39] introduced the particle swarm optimization (PSO) algorithm. Since then, many scholars, including Huang and Dun [40] and Wan et al. [41], have used it in different optimization problems. The algorithm commonly uses less memory and learns faster than other optimization algorithms such as the genetic algorithm. Figure 2 shows a simplified outline of the PSO algorithm. First, all particles are initialized. The fitness of every particle is then assessed so that the best position of each particle can be recorded. Next, the global best particle is selected among all particles, and the particle velocities and positions are updated. The termination criteria are then checked: the algorithm ends when they are met; otherwise, the fitness is evaluated again with the updated positions, and a new best position is selected for each particle. In many studies (Zhang et al. [42], Yuan and Moayedi [43], Yuan and Moayedi [44], Xi et al. [45], and Moayedi et al. [46]), a hybrid learning approach combining PSO with the ANN algorithm (called PSO-ANN in this study) was suggested (Figure 3).
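The loop described above can be illustrated with a minimal global-best PSO sketch in Python. It is not the implementation used in this study: the function name `pso`, the default parameter values, the search bounds, and the sphere test function are all assumptions chosen for illustration.

```python
import random


def pso(cost, dim, swarm_size=30, iterations=100, w=0.7, c1=1.5, c2=1.5,
        bounds=(-1.0, 1.0), seed=0):
    """Minimal global-best PSO (minimization); all defaults are illustrative."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialize every particle with a random position and zero velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(swarm_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iterations):  # termination criterion: fixed iteration count
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:  # new personal best
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:  # new global best
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def sphere(p):
    """Toy cost function with its minimum (zero) at the origin."""
    return sum(v * v for v in p)


best, best_cost = pso(sphere, dim=3)
```

In a PSO-ANN setting, `cost` would be the network error as a function of the flattened weights and biases, so each particle represents one candidate network.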


Imperialist Competition Algorithm (ICA)
The imperialist competition algorithm (ICA) was first developed by [47] and later applied to different engineering subjects by other researchers (e.g., Mosallanezhad and Moayedi [48]). Its process is very similar to that of many other evolutionary algorithms, such as PSO and the genetic algorithm (GA). The first step in ICA is to generate candidate solutions (the initial population), which are called countries. The countries are separated into two main groups, namely imperialists and colonies: the imperialists are the strongest countries, whereas the colonies are the remaining countries (Figure 4). The colonies are then distributed among the imperialists to form empires, in proportion to the relative strength of the imperialists. Each of the empires then competes with the others to govern more colonies and thereby expand its territory. In each round of this imperialist competition, stronger empires take possession of colonies belonging to weaker empires. The process stops once the pre-defined termination criteria are satisfied. The ICA algorithm and its steps are described in detail in other studies. The algorithm of ICA is presented in Figure 5.
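The steps above can be sketched in a deliberately simplified form. This is not the authors' implementation: the function name `ica`, the assimilation coefficient `beta`, the colony weighting `xi`, the search bounds, and the sphere test function are assumptions. The revolution step found in full ICA variants is omitted, and the competition step always transfers a colony to the strongest empire rather than drawing the winner at random.

```python
import random


def ica(cost, dim, n_countries=50, n_imperialists=5, decades=100,
        beta=2.0, xi=0.1, bounds=(-1.0, 1.0), seed=0):
    """Simplified imperialist competition algorithm (minimization)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial population of countries, sorted from strongest (lowest cost).
    countries = sorted(
        ([rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_countries)),
        key=cost,
    )
    empires = [{"imp": c, "colonies": []} for c in countries[:n_imperialists]]

    # Hand out colonies with probability proportional to imperialist power.
    imp_costs = [cost(e["imp"]) for e in empires]
    power = [max(imp_costs) - c + 1e-12 for c in imp_costs]
    for colony in countries[n_imperialists:]:
        rng.choices(empires, weights=power)[0]["colonies"].append(colony)

    def total_cost(empire):
        # Empire cost: imperialist cost plus a fraction of the mean colony cost.
        cols = empire["colonies"]
        mean_cols = sum(cost(c) for c in cols) / len(cols) if cols else 0.0
        return cost(empire["imp"]) + xi * mean_cols

    for _ in range(decades):
        for empire in empires:
            for j, colony in enumerate(empire["colonies"]):
                # Assimilation: move the colony toward its imperialist.
                moved = [x + beta * rng.random() * (m - x)
                         for x, m in zip(colony, empire["imp"])]
                empire["colonies"][j] = moved
                # A colony that beats its imperialist swaps roles with it.
                if cost(moved) < cost(empire["imp"]):
                    empire["colonies"][j], empire["imp"] = empire["imp"], moved

        # Competition: the weakest empire loses its weakest colony.
        weakest = max(empires, key=total_cost)
        strongest = min(empires, key=total_cost)
        if weakest is not strongest:
            if weakest["colonies"]:
                cols = weakest["colonies"]
                worst = max(range(len(cols)), key=lambda j: cost(cols[j]))
                strongest["colonies"].append(cols.pop(worst))
            else:
                # An empire without colonies collapses into the strongest one.
                empires = [e for e in empires if e is not weakest]
                strongest["colonies"].append(weakest["imp"])

    best = min((e["imp"] for e in empires), key=cost)
    return best, cost(best)


def sphere(p):
    """Toy cost function with its minimum (zero) at the origin."""
    return sum(v * v for v in p)


best, best_cost = ica(sphere, dim=3)
```

Here `n_countries`, `n_imperialists`, and `decades` play the roles of the numbers of countries, imperialists, and decades that are tuned for the ICA-ANN model.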
In the literature, many works have sought to enhance the efficiency of ANN-based algorithms by means of novel optimization algorithms. For engineering problems, algorithms such as the genetic algorithm, ICA, and PSO can be selected as the optimization approach. The ANN training procedure can produce undesirable solutions because backpropagation is a local search method, so the training algorithm may converge to a local minimum. Optimization algorithms can therefore be used to find appropriate weights and biases for the network and to enhance its efficiency. Because such algorithms search for a global minimum, in contrast to backpropagation-based training, they have a higher probability of converging to a more appropriate solution. Hence, hybrid methods such as PSO-ANN and ICA-ANN address the weakness of the ANN in finding the global minimum by improving its search characteristics through fitness and cost functions.
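One common way to couple an optimizer with an ANN is to flatten all weights and biases into a single vector and let the optimizer minimize the prediction error as a function of that vector. The sketch below assumes a single hidden layer with a tanh activation (mathematically the same function as the Tansig transfer function) and a linear output neuron; the helper names `n_weights`, `predict`, and `make_cost`, the vector layout, and the toy data are hypothetical.

```python
import math


def n_weights(n_inputs, n_hidden):
    """Number of weights and biases in an n_inputs x n_hidden x 1 network."""
    return n_hidden * (n_inputs + 1) + n_hidden + 1


def predict(vector, x, n_hidden):
    """Evaluate the network encoded by `vector` on one input sample `x`.

    Assumed layout: for each hidden neuron, its input weights followed by
    its bias; then the output weights; then the output bias.
    """
    n_inputs = len(x)
    out = vector[-1]  # output bias
    for h in range(n_hidden):
        start = h * (n_inputs + 1)
        z = vector[start + n_inputs]  # hidden bias
        z += sum(w * xi for w, xi in zip(vector[start:start + n_inputs], x))
        out += vector[n_hidden * (n_inputs + 1) + h] * math.tanh(z)
    return out


def make_cost(samples, targets, n_hidden):
    """Return a cost function (mean squared error) over weight vectors."""
    def cost(vector):
        errors = [(predict(vector, x, n_hidden) - t) ** 2
                  for x, t in zip(samples, targets)]
        return sum(errors) / len(errors)
    return cost


# Toy data, purely illustrative.
samples = [[0.0, 1.0], [1.0, 0.0]]
targets = [1.0, 3.0]
cost = make_cost(samples, targets, n_hidden=2)
zero_cost = cost([0.0] * n_weights(2, 2))
```

Any population-based optimizer that accepts a `cost(vector)` callable and a dimension of `n_weights(n_inputs, n_hidden)` can then search the weight space directly, without gradient information.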


FEM Simulation
In this study, eight different soil types with significant diversity in their main attributes were used. These attributes cover most of the usual kinds of sand. In the modelling, internal friction and dilation angles in the ranges of 32-42 and 3.4-11.5 degrees, respectively, were selected. Moreover, the elastic modulus, Poisson's ratio, and unit weight varied over the ranges 17500-65000 kN/m2, 0.333-0.249, and 19-21.1 kN/m3, respectively. The ranges of the soil properties used as network inputs, namely friction angle, Poisson's ratio, elastic modulus, and unit weight, are summarized graphically in Figure 6. To determine the applied stresses beneath the footing (Fult), an axisymmetric FEM model of a circular foundation with a width of about 1.0 m placed on two-layer soil was used (Figure 7). In practical civil engineering projects, the soil layers beneath foundations are commonly not homogeneous, and there are many cases where (i) a weaker soil layer rests on a stronger soil layer or (ii) a stronger soil layer is placed on a soil layer with much weaker physical characteristics. Plaxis 2D (a commercial finite element software) was employed to predict the influence of the soil layer properties on the ultimate applied stresses. Based on several recommendations (e.g., Mosallanezhad and Moayedi [50] and Hou et al. [51]), the most effective factors influencing the maximum bearing capacity of the soil are the soil properties (for example, the soil layer thickness beneath the footing), the maximum expected settlement (s), friction angle ϕ, unit weight γ, elastic modulus E, Poisson's ratio υ, the type of design analysis, and dilation angle ψ (Table 1). Note that, in the present work, cohesion was set to zero for estimating the Fult in distinct layered sandy soils.
In addition, a zero value for cohesion means that the soil does not have any cohesive strength, which represents a sandy soil state. Values of 0, 0.2, 0.4, 0.8, and 1.0 were selected for the ratio of upper layer thickness to foundation width (h/B).
Table 1. FEM plan used for the input layers' preparation.

Ground Conditions | Foundation Diameter (B) (m) | Upper Layer Thickness/Foundation Width (h/B)
Soil layer S1 on soil layer (S2, S3, S4, S5, S6, S7, S8) | 1.0 | 0, 0.2, 0.4, 0.8, 1.0
Soil layer S8 on soil layer (S1, S2, S3, S4, S5, S6, S7) | 1.0 | 0, 0.2, 0.4, 0.8, 1.0

As stated earlier, to generate the best structure for the proposed hybrid models, the database employed to train the models was obtained from a total of 3513 full-scale finite element simulations. The dataset is for a circular footing with a radius of 1.0 m resting on two-layered soil. Note that the vertical stress values recorded up to the maximum S/B ratio are labelled Fult. Based on many similar works (e.g., Anvari and Shooshpasha [52] and Noorzad and Manavirad [53]), the soil layer thickness beneath the footing, maximum expected settlement (s), friction angle ϕ, unit weight γ, elastic modulus E, Poisson's ratio υ, type of design analysis, and dilation angle ψ were selected as the inputs for building the proposed models. An example of the database obtained from the FEM simulations, together with the effective factors influencing Fult (the model output in the ANN algorithm), is shown in Table 2.
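As an illustration of how a stress value at the limiting settlement ratio could be read off a simulated load-settlement curve, the following sketch interpolates linearly between curve points at a settlement of 10% of the footing width. The function name `stress_at_settlement_ratio` and the sample curve are hypothetical, and this is not a description of how Plaxis 2D post-processes its results.

```python
def stress_at_settlement_ratio(settlements, stresses, width, ratio=0.10):
    """Linearly interpolate the applied stress at settlement = ratio * width.

    `settlements` must be sorted in increasing order and expressed in the
    same length unit as `width`.
    """
    target = ratio * width
    points = list(zip(settlements, stresses))
    for (s0, q0), (s1, q1) in zip(points, points[1:]):
        if s0 <= target <= s1:
            if s1 == s0:
                return q0
            return q0 + (q1 - q0) * (target - s0) / (s1 - s0)
    raise ValueError("target settlement lies outside the simulated curve")


# Hypothetical curve: settlement in m, stress in kN/m2.
curve_s = [0.0, 0.05, 0.15]
curve_q = [0.0, 100.0, 300.0]
f_ult = stress_at_settlement_ratio(curve_s, curve_q, width=1.0)
```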

Model Assessment
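The normalized data were employed in this study. Moreover, two well-known statistical criteria, the coefficient of determination (R2) and the root mean square error (RMSE), were used to quantify the error of the predicted network results. Equations (1) and (2) formulate these indices:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(Y_{i,\mathrm{predicted}} - Y_{i,\mathrm{observed}}\right)^2}{\sum_{i=1}^{N} \left(Y_{i,\mathrm{observed}} - \bar{Y}_{\mathrm{observed}}\right)^2} \qquad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(Y_{i,\mathrm{observed}} - Y_{i,\mathrm{predicted}}\right)^2} \qquad (2)$$

where Y i,observed and Y i,predicted are the real and estimated values of the bearing capacity, respectively. Also, the term N indicates the number of instances, and Ȳ observed denotes the average of the observed capacities.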
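Both indices are straightforward to compute. The sketch below assumes the standard definitions of R2 (one minus the ratio of the residual to the total sum of squares) and RMSE; the function names and the sample values are illustrative and are not taken from the study's database.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r2_value = r_squared(observed, predicted)
rmse_value = rmse(observed, predicted)
```

A perfect prediction gives R2 = 1 and RMSE = 0, so a better model has a higher R2 and a lower RMSE.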

ANN Network Optimization
Estimating the maximum applied stress on circular footings resting on 16 different layered soil states is the most important objective of this research. As the first step in optimizing the ANN, the database was divided into training and testing datasets comprising around 80% and 20% of the total dataset, respectively. To derive the most appropriate predictive network, many works (e.g., Moayedi et al. [46] and Chakraborty and Goswami [12]) proposed this proportion, which here corresponds to 2810 training datasets (80%) and 703 testing datasets (20%). The most appropriate ANN structure can be obtained through a large number of trial and error runs in which the number of hidden layers and the number of neurons are varied [54]. Accordingly, around 84 ANN-Tansig models were built, and the efficiency of each network was measured to find the best performer. Figures 8 and 9 show the mean results of six ANN iterations in terms of R2 and RMSE. To rank the outcomes of the ANN iterations, two distinct ranking systems, total ranking and color intensity, were utilized. According to the mean network performance over the training and testing datasets (i.e., after six iterations), model number 14, with a total ranking value of 28, would be selected as the best of the 84 ANN networks built. However, after analyzing the overall network efficiency (as shown in Figures 8 and 9), better network performance was obtained with eight hidden neurons.
This indicates that the final ANN structure designated for this approach should be 14×8×1; that is, because the accuracy and network efficiency change only trivially beyond this point (for the training and testing databases shown in Figures 8 and 9), the optimum number of nodes in the single hidden layer was set to eight. This choice is also a simplification that makes the proposed model more practical.
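The 80/20 partition can be reproduced with a simple random split. The sketch below is an assumption about how such a split might be done (the shuffling procedure and seed are not reported in the text); with 3513 records it yields the 2810 and 703 dataset sizes quoted above.

```python
import random


def split_dataset(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into training and testing subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# Record indices stand in for the FEM simulation results.
train, test = split_dataset(range(3513))
```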

Hybrid PSO-ANN and ICA-ANN Models in Predicting Fult
Two hybrid models, ICA-ANN and PSO-ANN, were employed, and the more suitable predictive model was chosen between them. To find the optimum parameters of both models, several parametric studies were carried out. Before running a parametric investigation of the hybrid model parameters, the optimal state of the ANN model needs to be determined. The parametric investigation of the ANN model was performed through a series of trial and error runs (discussed in Section 4.1), and it was found that an ANN model with eight hidden neurons (an architecture of 12 × 8 × 1) performed better. Hence, this architecture was adopted for both hybrid intelligent systems (i.e., ICA-ANN and PSO-ANN).
As previously mentioned, determining the most appropriate structure of the hybrid PSO-ANN model requires a parametric investigation based on trial and error. Several parameters, such as the velocity coefficients (C1 and C2), the inertia weight, the number of iterations, and the number of particles, can considerably affect the network efficiency of the PSO approach. Many scholars have suggested that C1 and C2 equal to 2 and an inertia weight (Iw) of 0.25 provide acceptable network performance. To determine the appropriate swarm size, different PSO models were generated with swarm sizes of 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, and 500 (with Iw = 0.25 and C1 = C2 = 2). The best efficiency was obtained with a swarm size of 400, for which R2 and RMSE were (0.9607 and 0.076) for the training dataset and (0.978 and 0.075) for the testing dataset. Figure 10a shows how the performance (here measured by MSE) of the PSO-ANN models varies with population size. As observed, the model with a swarm size of 400 has the lowest RMSE, which demonstrates its advantage over the other models. A similar trial and error approach was used to select appropriate values of C1 and C2: around 12 models with distinct C1 and C2 were designed, and their efficiencies were ranked using the total ranking and CER approaches. The best efficiency, with a total ranking value of 69, was obtained when C1 and C2 were (1.33 and 0.67) or (1.5 and 1.5), respectively. For the testing datasets, R2 and RMSE were 0.9607 and 0.076, respectively. Finally, five models with Iw values of 0.2, 0.4, 0.6, 0.8, and 1.0 were built to choose an appropriate value of Iw.
For Iw values of 0.2, 0.4, 0.6, 0.8, and 1.0, the total ranks achieved were 28, 12, 18, 6, and 26, respectively. This demonstrates the advantage of the PSO-ANN model with an Iw of 0.2.
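A total ranking of this kind can be sketched as follows, assuming that each candidate model receives a score per index (the best model gets the highest score) and that the scores are summed over the training and testing R2 and RMSE. The exact scoring rules of the study are not given in the text, so the function `total_ranking`, the set of indices, the tie handling, and the sample values are assumptions.

```python
def total_ranking(models):
    """Sum per-index rank scores; a higher total means a better model.

    `models` maps a model name to a dict with the keys listed below.
    Ties are broken by input order in this sketch.
    """
    higher_is_better = {
        "r2_train": True,
        "rmse_train": False,
        "r2_test": True,
        "rmse_test": False,
    }
    totals = {name: 0 for name in models}
    for metric, high_good in higher_is_better.items():
        # Order from worst to best so the best model gets the largest score.
        ordered = sorted(models, key=lambda name: models[name][metric],
                         reverse=not high_good)
        for score, name in enumerate(ordered, start=1):
            totals[name] += score
    return totals


# Hypothetical results for three candidate settings.
candidates = {
    "A": {"r2_train": 0.97, "rmse_train": 0.08, "r2_test": 0.96, "rmse_test": 0.09},
    "B": {"r2_train": 0.98, "rmse_train": 0.07, "r2_test": 0.98, "rmse_test": 0.07},
    "C": {"r2_train": 0.95, "rmse_train": 0.10, "r2_test": 0.97, "rmse_test": 0.08},
}
totals = total_ranking(candidates)
best_model = max(totals, key=totals.get)
```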
To obtain the most appropriate predictive result from the hybrid ICA-ANN approach, its influential parameters must be optimized. To identify the best configuration, two distinct ranking methods (CER and total ranking [49]) were used. As in the earlier parametric studies (i.e., selecting the ANN architecture and the PSO-ANN model), this was done through a series of trial and error runs. As stated earlier, in ICA, the number of countries (Nc), the number of decades (Nd), and the number of imperialists (Ni) are recommended as the parameters with the greatest influence on performance. To determine the proper Nc, several models were created with Nc values of 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, and 500, using Ni = 5 and Nd = 200. With a total rank of 68, Nc = 300 provided the highest efficiency. Then, to find the best value of Ni, 12 models with Ni values of 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, and 60 were constructed and assessed using R2 and RMSE. The results show that the statistical indexes do not change for Ni greater than 5. Therefore, Ni = 5 was taken as the best number of imperialists. In the final step of the ICA-ANN modelling, using Ni = 5 and Nc = 300, several ICA models were constructed with Nd values of 50, 100, 200, 300, 400, and 500 (results shown in Figure 10b). These models were also assessed according to their efficiency indexes (i.e., R2 and RMSE). It can be clearly seen that increasing the number of nodes brings the predicted and measured outputs closer together. According to the obtained results, the model with Nd = 300 performed best. The variation of network performance with population size for the ICA-ANN and PSO-ANN approaches is shown in Figures 11 and 12, respectively.

Conclusions
In the present paper, the outputs of a total of 3513 FEM simulations were used to assess the applicability of the proposed methods. The learning approach was found to be acceptable for all of the selected predictive models. According to the outcomes, the hybrid PSO-ANN model, which combines the ANN with PSO, proved to be a suitable and more trustworthy model; nevertheless, all of the suggested models gave suitable results in estimating the bearing capacity of shallow circular footing systems. In comparison with the other two methods, the PSO-ANN algorithm had a higher performance for the training and testing sets in terms of all statistical indexes, such as RMSE and R2. This can be observed clearly from the high-performance outcomes of the training network. R2 and RMSE values of (0.979 and 0.076) and (0.978 and 0.075) were obtained for the training and testing databases, respectively, of the optimal PSO-ANN predictive model. Likewise, for ICA-ANN, the R2 and RMSE values for the training and testing datasets were (0.984 and 0.066) and (0.983 and 0.066), respectively. This proved the superiority of the PSO-ANN method in predicting Fult.
