Prediction of Bearing Capacity of the Square Concrete-Filled Steel Tube Columns: An Application of Metaheuristic-Based Neural Network Models

During the design and construction of buildings, the materials employed can substantially affect structural performance. In composite columns, the properties and performance of concrete and steel significantly influence the behavior of the structure under various loading conditions. In this study, two metaheuristic algorithms, particle swarm optimization (PSO) and the imperialist competitive algorithm (ICA), were combined with an artificial neural network (ANN) model to predict the bearing capacity of square concrete-filled steel tube (SCFST) columns. To achieve this objective and investigate the performance of the optimization algorithms on the ANN, one of the most extensive datasets of pure SCFST columns (149 data samples) was used in the modeling process. Detailed predictive modeling of the metaheuristic-based models was conducted through several parametric investigations, and the optimum parameters were determined. Furthermore, the capability of these hybrid models was assessed using robust statistical metrics. The results indicated that PSO is stronger than ICA in finding the optimum weights and biases of the ANN for predicting the bearing capacity of SCFST columns. Therefore, the bearing capacity of each column can be predicted well using the developed metaheuristic-based ANN model.


Introduction
Among the concrete-filled steel tube (CFST) columns, the circular CFST (CCFST) and the square CFST (SCFST) columns have a wider range of applications and are used more often in construction than the other types, as these shapes are more suitable for concrete confinement. However, the confining action can be less efficient in an SCFST column because of its corners [1,2]. Nevertheless, in current global practice, SCFST columns are also applied in the main lateral resistance systems of unbraced and braced building structures, and they may also be used for retrofitting in seismic-prone zones. In addition to columns, this type of square infilled tube can be used for beams, caissons, and piers for deep foundations [3,4].
Some research states that concrete confinement is not efficient enough in SCFST columns because of rigidity loss in these types of columns. In fact, in these columns, only the concrete around the center and corners of the section is effectively confined. Tran et al. [34] predicted the bearing capacity of square CFST columns using the ANN technique. In their study, 300 experimental data samples were collected for training and testing. A trial-and-error method was applied to determine the optimum model in terms of the highest coefficient of determination (R²) and the lowest mean square error (MSE). Furthermore, several design codes were adopted to assess the performance of the model. The results showed that the ANN model was more accurate than the existing formulas. After validating the ANN technique, several curves were generated to analyze the behavior of SCFST columns under compressive loading. In another study [35], short square CFST columns were considered, and a comprehensive dataset was obtained from axial compression tests. In that research, a support vector machine (SVM) and PSO were combined into a new hybrid model called PSVM (SVM optimized by PSO) to predict the bearing capacity of SCFST columns. The reliability of the new model was verified against the experimental results. Le [36] proposed a model based on Gaussian process regression (GPR) to predict the axial load that SCFST columns can withstand under compression, and reported a high level of accuracy for the proposed GPR model.
Furthermore, in another study [37], a methodology based on gene expression programming (GEP) was proposed to develop equations for the bearing capacity of SCFST columns subjected to axial compression. For this purpose, six GEP-based equations were proposed. The results indicated that the proposed formulations outperformed the current codes and correlations. A radial basis function neural network was also applied to predict the bearing capacity of SCFST columns [38]. In that study, FFA and other optimization algorithms were applied as well. A database of 300 experimental tests was collected from the literature to train the model, and several comparative criteria were used to assess its accuracy. The outcomes revealed that the new predictive model could provide higher accuracy than other similar techniques.
In structural engineering, it is essential to study the bearing capacity of CFST columns, and specifically of SCFST columns, since the interaction between the steel tube and the concrete core in these columns is complicated, and their structural performance is often nonlinear as a result. Furthermore, experimental laboratory tests are generally expensive and time-consuming. In recent years, artificial intelligence (AI) methods have become more popular, as they are usually faster and less complex than finite element analysis (FEA). The prediction accuracy of a finite element (FE) simulation is strongly influenced by input elements that normally cannot be simulated thoroughly [39]. Therefore, AI and machine learning (ML) methods could be a suitable alternative for predicting the bearing capacity of SCFST columns. Because little research has developed AI/ML techniques to predict the ultimate bearing capacity of square CFST columns, this study proposes a novel technique that combines ANN with metaheuristic algorithms (i.e., PSO and ICA) to predict the ultimate axial load of these columns. The step-by-step modeling procedure is explained, and the results obtained are compared to select the best metaheuristic-based ANN model.

Artificial Neural Network
The artificial neural network (ANN) is a method that takes advantage of a biology-based computational model resembling the rational reactions of a human brain. ANN is a methodology for recognizing sophisticated relationships between different variables, resulting in computational models for one or several outputs [40–42]. Basically, an ANN model encompasses three fundamental components: the activation function, the pattern of connections, and the learning rules [43]. Depending on the problem under consideration, these components must be specified so that the network can be trained through its weights [44]. One of the most common neural networks is the multilayer perceptron (MLP), which consists of a layer of input variables, one or more hidden layers of processing neurons, and a layer of output variables. All of these layers are connected in sequential order, and the later layers usually include one or more neurons together with numerical operators. Essentially, a feedforward pass carries signals from the input layer to the output layer through the hidden layers. To extract the features of the input variables, the signals are first processed by the hidden neurons. The extracted features are then transferred to the neurons in the output layer to generate a proper model [45,46].
In recent years, various learning techniques have been suggested to improve the capability of the MLP. Among them, backpropagation (BP), which is based on gradient descent, has been considered one of the more effective [47]. This technique passes the input signals between the nodes of sequential layers. The net input of each node, net_j, is calculated as a weighted sum of its inputs (Equation (1)), in which n is the number of inputs, x_i is the input signal, and w_i is the weight of each node; the threshold of each node is given by the parameter θ. Activation functions, such as the sigmoid, linear, and step functions, then transform the net input; this is the training step. In the next step, the actual output is compared with the predicted one, and the discrepancy between the two is determined [48]. Finally, the calculated errors travel back through the network to update the individual weights. Figure 1 shows a numerical model of an artificial neuron. During the training stage, the network behavior is assessed through suitable statistical functions such as the root mean square error (RMSE) [44]. The weights continue to be updated until the RMSE falls below a predefined level. The number of data samples is a significant factor in this technique, as too few samples may lead to overfitting in the training process [49].
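The expression for the net input is not reproduced in the text above. As a minimal sketch, assuming the common form net_j = Σ w_i·x_i − θ (the sign convention for θ is an assumption) followed by a sigmoid activation, a single artificial neuron could look like this. The function names are illustrative and do not come from the study.

```python
import math

def net_input(x, w, theta):
    """Weighted sum of the inputs minus the node threshold (assumed sign convention)."""
    return sum(wi * xi for wi, xi in zip(w, x)) - theta

def sigmoid(z):
    """Logistic activation, one of the activation functions named in the text."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(x, w, theta):
    """Output of one neuron: activation applied to the net input."""
    return sigmoid(net_input(x, w, theta))

# Hypothetical input signals and weights, for illustration only.
x = [0.5, 0.2, 0.9]
w = [0.4, -0.6, 0.1]
print(neuron_output(x, w, theta=0.1))
```

Backpropagation would then adjust `w` and `theta` in the direction that reduces the error between this output and the measured value.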

Particle Swarm Optimization
In 1997 [50], Kennedy and Eberhart first proposed an intelligent computer-based technique for optimization, which was later called particle swarm optimization (PSO). This methodology mimics the natural behavior of some creatures, such as birds, fish, bees, and ants [51]. Several related algorithms exist, such as the ant colony algorithm (ACO). While some similarities exist between ACO, PSO, and the genetic algorithm (GA), PSO is less complex. PSO relies on real-valued randomness and on communication among the swarm particles. In the PSO technique, entities called particles are distributed over the search space of the objective function. The main concept of the algorithm is to move the particles to their optimal positions. The movement of each particle has a deterministic component and a stochastic component: each particle is attracted toward the current global best (p*) and toward its own best location (x_i*). The particle then looks for a more suitable location than its previous one. At any time t during the iterations, each of the n particles has a current best location. Finally, the particles converge toward the global best, at which point the algorithm ends. Figure 2 illustrates the movements of the particles. As evident from the figure, x_i* is the current best for particle i, and p* ≈ min{f(x_i)}, (i = 1, 2, ..., n), is the global best.

According to Equation (2), the velocity of each particle is updated as a function of the best position found by the swarm and the best position found by the particle itself. Equation (3) then gives the successive positions of the particles.
In these equations, the vector pbest is the best position of the particle itself, and the vector gbest represents the global best position among all particles. Poli et al. [52] stated that Equation (2) could be adjusted by adding a new parameter, the inertia weight (w). The inertia weight specifies the contribution of each particle's previous velocity to its current velocity (Equation (4)). The flowchart of the PSO algorithm is illustrated in Figure 2.
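Equations (2) to (4) are not reproduced in the text above. The sketch below assumes the widely used form of the update, v_new = w·v + C1·r1·(pbest − x) + C2·r2·(gbest − x) followed by x_new = x + v_new, where r1 and r2 are uniform random numbers; whether the study includes r1 and r2 is not stated, so treat them as an assumption. The function name `pso_minimize`, the search bounds, and the default swarm settings are illustrative.

```python
import random

def pso_minimize(f, dim, swarm_size=30, iterations=100,
                 c1=2.0, c2=2.0, w=0.25, bounds=(-1.0, 1.0), seed=0):
    """Minimize f over dim-dimensional vectors with a basic inertia-weight PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to own best + pull to global best.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update.
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val

def sphere(p):
    """Toy objective with its minimum at the origin."""
    return sum(c * c for c in p)

best, best_val = pso_minimize(sphere, dim=2)
print(best_val)
```

In the hybrid models, `f` would be the network error as a function of the weights and biases rather than this toy objective.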

Imperialist Competitive Algorithm
For engineering problems, the ICA, created by Atashpaz-Gargari and Lucas [53], is one of the most effective optimization strategies. Like other techniques, ICA begins by creating a random initial population of so-called countries. After N_country countries are created, those with the lowest cost-function values are selected as the imperialists (N_imp). The remaining countries are the colonies (N_col). The colonies are distributed among the empires according to the power of each empire [54,55]. Therefore, the more powerful an imperialist is (the lower its RMSE), the more colonies it can absorb. ICA comprises three main operators: revolution, assimilation, and competition. During assimilation and revolution, a colony may reach a position better than that of its imperialist, in which case it takes over control of that empire from the previous imperialist.
In the competition step, each empire has a chance, depending on its power, to take possession of at least one colony of the weakest empire. If the most powerful empire has not yet prevailed after a given number of iterations, called decades, the process is repeated until a specified termination condition is satisfied, such as an acceptable RMSE or a maximum number of decades. Note that the number of decades (N_decade) in ICA is comparable to the number of iterations in other algorithms [56,57]. The flowchart of ICA is shown in Figure 3.
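The initialization step described above can be sketched as follows. This is an illustrative simplification, not the procedure of [53]: colonies are assigned by weighted random draws, and the power of an imperialist is taken as the gap between its cost and the cost of the weakest imperialist. The function name `init_empires` and the small offset are assumptions of this sketch.

```python
import random

def init_empires(costs, n_imp, seed=0):
    """Split countries into imperialists and colonies.

    costs : one cost value (e.g. RMSE) per country; lower is better.
    Returns a dict mapping each imperialist index to its list of colony indices.
    """
    rng = random.Random(seed)
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    imperialists, colonies = order[:n_imp], order[n_imp:]
    worst = max(costs[i] for i in imperialists)
    # Power grows as cost falls; the tiny offset only keeps all weights
    # strictly positive (an assumption of this sketch).
    power = [worst - costs[i] + 1e-9 for i in imperialists]
    empires = {i: [] for i in imperialists}
    for c in colonies:
        owner = rng.choices(imperialists, weights=power, k=1)[0]
        empires[owner].append(c)
    return empires

# Hypothetical costs for six countries, two of which become imperialists.
empires = init_empires([0.9, 0.2, 0.5, 0.7, 0.3, 0.8], n_imp=2)
print(empires)
```

Assimilation, revolution, and competition would then move colonies and reassign them between empires over successive decades.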

Metaheuristic-Based ANN Models
A problem with using ANN in prediction studies is that repeated runs give different results with varying levels of performance. This is due to two basic shortcomings of ANN: a slow learning rate and a tendency to get trapped in local minima [58,59]. A possible solution is to optimize the weights and biases of the ANN and thereby obtain more consistent results. This optimization can be performed by metaheuristic algorithms such as PSO and ICA. These algorithms and their effective parameters should be designed to obtain the best optimization outcome. For example, the number of countries or the swarm size should be chosen from the ranges reported in the literature. Of course, for some of the influential parameters the results of the hybrid models may not differ significantly. The flowcharts of the hybrid models used in this study to predict the bearing capacity of the SCFST columns are presented in Figures 4 and 5. As can be seen from these figures, a population size (i.e., the number of particles or countries) is selected together with the other effective parameters of the optimization algorithm, and the hybrid system is trained. Then, the error indicators of the hybrid system are measured based on the optimized weights and biases of the ANN. Because the goal is to achieve the lowest possible system error, and different values of the optimization parameters result in different errors, each effective parameter should be chosen by trial and error. It is worth mentioning that the base model should first be designed using the ANN alone. These hybrid models have been applied to obtain more stable results in different areas of civil engineering [60,61].
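To make the coupling concrete, the sketch below shows one way an optimizer could evaluate a candidate solution: the weights and biases of a 5-9-1 network (the architecture reported later in the paper) are packed into one flat vector, and the cost is the RMSE of the resulting predictions. The tanh hidden activation, the linear output, and the packing order are assumptions of this sketch, not details taken from the study.

```python
import math

N_IN, N_HID = 5, 9
# 45 input-to-hidden weights, 9 hidden biases, 9 hidden-to-output weights, 1 output bias.
N_PARAMS = N_IN * N_HID + N_HID + N_HID + 1

def predict(theta, x):
    """Forward pass of a 5-9-1 network for one input row x."""
    w1 = theta[:N_IN * N_HID]
    b1 = theta[N_IN * N_HID:N_IN * N_HID + N_HID]
    w2 = theta[N_IN * N_HID + N_HID:N_IN * N_HID + 2 * N_HID]
    b2 = theta[-1]
    hidden = [math.tanh(sum(w1[j * N_IN + i] * x[i] for i in range(N_IN)) + b1[j])
              for j in range(N_HID)]
    return sum(w2[j] * hidden[j] for j in range(N_HID)) + b2

def rmse_cost(theta, X, y):
    """Cost seen by the optimizer: RMSE of the network defined by theta."""
    errors = [(predict(theta, row) - target) ** 2 for row, target in zip(X, y)]
    return math.sqrt(sum(errors) / len(errors))

# With all parameters at zero the network predicts 0 for every row.
X = [[1.0] * N_IN for _ in range(4)]
y = [1.0, 1.0, 1.0, 1.0]
print(N_PARAMS, rmse_cost([0.0] * N_PARAMS, X, y))
```

Each particle (PSO) or country (ICA) would then be one such vector of `N_PARAMS` values, and `rmse_cost` would play the role of the objective function.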

Background
For many years, the types and shapes of columns have developed from typical reinforced concrete columns and steel columns into steel-reinforced concrete columns and various types of composite columns, such as CFST columns, encased columns, and concrete columns reinforced with concrete-filled steel tubes. Developments in CFST have grown rapidly over the past decades. Technical publications on this topic led the Architectural Institute of Japan (AIJ) to establish a standard for composite structures of circular steel tubes and concrete, released in 1967. Japan and China later conducted many investigations that laid the foundation for CFST. Then, in 1993, a research plan on composite and hybrid structures was launched as the fifth phase of the US-Japan cooperative research program; it included a study on CFST column systems, whose findings shaped the current design recommendations for such systems.
CFST columns demonstrate higher fire resistance and strength than bare steel columns [62]. In addition, the concrete infill plays a significant role in the structural behavior of these columns. High-performance concrete has superior characteristics compared with normal concrete (NC), for example, improved ductility, strength, and self-consolidating features. Using high-performance concrete, including lightweight concrete (LWC) and engineered cementitious composite (ECC), could improve the ductility and strength of concrete-filled tube composite columns. Several studies have investigated tube columns filled with self-consolidating concrete (SCC) and NC [63]. However, the current study collects comprehensive data from the literature on square CFST columns with various concrete compressive strengths under pure axial compression. The dataset is restricted in this way to achieve better predictions from ML/AI techniques. Figure 6 shows a schematic of the use of ML/AI techniques for square CFST columns.

Data Source
This study collects an extensive dataset from the literature. More than a hundred articles were studied, and twenty-three of them were selected for the dataset; the others were excluded because they covered SCFST columns under different loading conditions or with additional reinforcement inside the column, either of which could strongly affect the ultimate bearing capacity. At first, 217 data points were obtained from the selected articles [13,14,61,63–76]. Based on previous experimental tests and experience [69,70,76,77] and on earlier data analyses for CFST columns [26,31,35,78–80], several essential parameters can affect the analysis and results: the concrete compressive strength (f_c), the width or diameter of the column (B or D), the length of the column (L), the thickness of the column (t), the yield strength of the steel tube (f_y), the slenderness ratio (L/D or L/B), and the diameter- or width-to-thickness ratio (D/t or B/t). However, five factors were considered the most critical for the ultimate bearing capacity (P_exp) of SCFST columns: the concrete compressive strength (f_c), the column's width (B), the column's length (L), the column's thickness (t), and the yield strength (f_y) of the steel tube. The minimum, maximum, average, and standard deviation values of the first batch of collected data are shown in Table 1.
The initial raw data were first analyzed empirically and by trial and error. For this purpose, the distribution of each input parameter was plotted, and the parameter with the greatest impact on the output (the bearing capacity) was identified. Then, by eliminating the data for parameters with widely dispersed values, it was determined which parameters had the greatest impact on R². This preliminary analysis revealed that the first batch of collected data covered a wide range of values, which can reduce the accuracy of machine learning techniques. This was most critical for the width of the columns. As shown in Figure 7, when all data were considered, the initial R² was 0.68, whereas after filtering the column width to less than 260 mm and 150 mm, R² increased to 0.92 and 0.96, respectively, a considerable improvement. R² is one of the significant factors to consider in the training and testing stages of ML/AI: the closer R² is to 1, the better the results from training and testing.
Therefore, it was decided to sort the data based on the preliminary results. The updated dataset consisted of 149 pure data points, whose values were closer to each other so that more accurate results could be achieved. Finally, the ranges considered in this study are given in Table 2.
In addition, a correlation matrix was used to show the correlation between the independent and dependent variables. A correlation matrix reports a pairwise measure, such as R², for every pair of variables. However, only linear correlations between two variables can be evaluated with this approach, so it may not capture nonlinear relationships. Figure 8 illustrates the correlations between the variables with their adjusted R² values, together with the distribution of each parameter. In general, the relationships between the variables are not strong. The highest correlation between input parameters is that between t (mm) and B (mm), with Adj R² = 0.592, followed by that between L (mm) and B (mm), with Adj R² = 0.531. In terms of input-output relationships, f_y received the highest Adj R² (0.518) for predicting P_exp, followed by f_c with Adj R² = 0.195. These simple regression analyses suggest that a multi-input model is needed to predict P_exp with a high level of accuracy.
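The pairwise values in a correlation matrix of this kind can be computed as sketched below. The sketch assumes that R² is the squared Pearson correlation of a simple linear regression and that the adjusted value uses the standard correction with p predictors; the text does not state how the reported values were calculated. The sample numbers are hypothetical and are not taken from the dataset.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def adjusted_r_squared(r2, n, p=1):
    """Standard adjustment of R² for n samples and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical width (mm) and capacity (kN) values, for illustration only.
width = [100.0, 120.0, 150.0, 200.0, 250.0]
capacity = [800.0, 1100.0, 1500.0, 2600.0, 3900.0]
r2 = r_squared(width, capacity)
print(r2, adjusted_r_squared(r2, n=len(width)))
```

Because the measure is linear, a low pairwise value does not rule out a strong nonlinear relationship, which is one motivation for the ANN-based models.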

CFST Prediction
The most important parameters of the ANN system should be determined before beginning the modeling steps. However, the data samples must be normalized before this can be carried out. Liou et al. [81] recommended that the following equation be used to normalize datasets at the outset of modeling in order to simplify the process:

X_normalize = (X − X_minimum)/(X_maximum − X_minimum)

where X_normalize, X_minimum, and X_maximum are the normalized data sample, the minimum of each data sample, and the maximum of each data sample, respectively.
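As a short illustration of this normalization, the following sketch rescales one variable to the interval [0, 1]. The function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a sequence to [0, 1] using (X - X_min) / (X_max - X_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant variable")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column widths in mm.
print(min_max_normalize([100.0, 150.0, 200.0]))  # [0.0, 0.5, 1.0]
```

Each input variable and the output would be normalized separately, so that variables with large numerical ranges do not dominate training.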
ANN models with just one hidden layer [30,34,44] or multiple hidden layers [82,83] have been presented by a number of researchers to solve various problems [45,46]. To predict the bearing capacity of the SCFST columns, networks with one to three hidden layers were examined. This parametric investigation (PI) revealed that one hidden layer provided more accurate predictions than the other options. ANN performance is also affected by the number of hidden neurons, which should be determined through a separate PI. Preliminary research indicated that hidden neuron numbers in the range of 1 to 11 should be considered, and their root mean square error (RMSE) values were examined. These PIs showed that nine hidden neurons produce bearing capacity values closest to the measured values, so this number was chosen for the optimal ANN model. Based on these findings, a model with five input variables, nine hidden neurons, and one output neuron is introduced as the best ANN model, and all further hybrid modeling in this study uses this model as a reference. The training and testing datasets were selected at random and represent 80% and 20% of the data samples, respectively. This means that 119 and 30 samples were used for training and testing, respectively.
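A random 80/20 split of 149 samples can be sketched as follows. The function name and the fixed seed are illustrative; the study does not describe how its random selection was implemented.

```python
import random

def train_test_split(n_samples, train_fraction=0.8, seed=0):
    """Return shuffled index lists for the training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = train_test_split(149)
print(len(train_idx), len(test_idx))  # 119 30
```

With 149 samples, 80% rounds to 119 training samples, leaving 30 for testing.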
A major step in the PSO-ANN modeling process is to choose an appropriate swarm size and number of iterations simultaneously during the initial stage. Through another PI, swarm sizes in the range of 50 to 500 were examined (50, 100, 150, 200, 250, 300, 350, 400, 450, and 500), and the maximum number of iterations was set to 500. Thus, 10 PSO-ANN prediction models were developed to forecast the bearing capacity of the SCFST columns, and their RMSE results are shown in Figure 9. As evident from the figure, the RMSE values of all models dropped sharply during the first iterations. After that, the changes became progressively smaller until a constant value was attained. The lowest error was obtained with a swarm size of 150. Furthermore, the RMSE reached a constant value after 350 iterations. As a result, the swarm size and the number of iterations used to predict the bearing capacity of the SCFST columns in this work were set at 150 and 350, respectively. It is worth noting that these models were built using C1 = C2 = 2 and w = 0.25.
In the second step, the C1 and C2 parameters were determined. A PI was built similarly to the previous phase, using a variety of C1 and C2 values to examine which ones were the most appropriate for the model. To do this, the PSO-ANN models were built using the following parameters: (C1 = C2 = 2.5), (C1 = C2 = 2), (C1 = C2 = 1.75), (C1 = C2 = 1.5), (C1 = 2 and C2 = 1.5), and (C1 = 1.5 and C2 = 2). R² was used as the criterion for assessing the prediction performance of the models, and the results are shown in Figure 10. The best model was obtained with C1 = C2 = 2, as its training and testing R² values are the highest among the six models shown in Figure 10. Consequently, both C1 and C2 were set to 2 and applied in the last modeling step, which was responsible for determining the "w" value.
The accuracy level of the PSO-ANN models is also affected by the "w" value, which can significantly influence these models [84]. As a result, in the four PSO-ANN models shown in Figure 11, the "w" value was set to 0.25, 0.5, 0.75, and 1. Once again, R² was selected as the performance criterion in this PI. As evident from the figure, the PSO-ANN model with w = 0.25 showed the best ability to fit and predict the bearing capacity of the SCFST columns. Therefore, as a summary for the best PSO-ANN model, the values of 150, 350, 2, 2, and 0.25 were obtained for swarm size, iteration number, C1, C2, and w, respectively. This model will be further discussed in the next section.
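To make the roles of the swarm size, the number of iterations, C1, C2, and w concrete, the following is a minimal sketch of a basic global-best PSO. It is not the implementation used in this study: the position clamping, the zero initial velocities, and the `sphere` test objective are assumptions made for illustration. In the PSO-ANN setting, the objective would be the RMSE of the network for a given vector of weights and biases.

```python
import random


def pso_minimize(objective, dim, swarm_size=150, iterations=350,
                 c1=2.0, c2=2.0, w=0.25, bound=1.0, seed=0):
    """Minimize `objective` over [-bound, bound]^dim with a global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]      # personal best values
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]                        # best value per iteration
    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + cognitive (C1) term + social (C2) term.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box (an assumption).
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Simple stand-in objective; the study would use the ANN's RMSE."""
    return sum(v * v for v in x)


# Small demonstration run; the defaults mirror the values selected above.
best, best_val, history = pso_minimize(sphere, dim=4,
                                       swarm_size=30, iterations=50)
```

The `history` list plays the role of the convergence curves discussed for the swarm-size investigation: the best error never increases and flattens once the swarm stops finding better positions.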
The procedures used by the ICA-ANN approach to model the bearing capacity of the SCFST columns are discussed herein. As previously stated, three factors, namely Ndecade, Nimp, and Ncountry, have a substantial impact on the performance of the ICA-ANN. Designing these parameters and finding their optimal values through various PIs is therefore necessary to achieve the desired results. The first PI was carried out to select Ndecade and Ncountry at the same time. To this end, the first PI was structured similarly to the preceding section and based on previously conducted research [14,50]. In order to have a fair comparison with the PSO-ANN model, the same values used for the swarm size were used for Ncountry. Figure 12 shows the RMSE obtained for the different Ncountry values as a function of Ndecade in predicting the bearing capacity of the SCFST columns. As the figure illustrates, most Ncountry values led to final RMSE values in the range of 0.11 to 0.14. The lowest RMSE was obtained when Ncountry was set to 450. A further finding is that once Ndecade reached about 250, no further drop in RMSE was observed for almost all Ncountry values. Therefore, these values were selected for Ncountry and Ndecade.
Figure 11. Results of w in PSO-ANN models.
It is now necessary to run another PI to identify the appropriate value of Nimp. In line with previous studies, this was accomplished by varying Nimp from 5 to 10 [19,85]. The results of this PI, based on R², are shown in Figure 13. Although the obtained results are close to each other, Nimp = 5 gave more accurate predictions of the bearing capacity of the SCFST columns. Its R² values are 0.855 and 0.873 for the training and testing phases, respectively. Therefore, this model was considered the best ICA-ANN model since there are no more parameters to design. This model and its findings are discussed in further detail in the next section.

Results and Discussion
The findings of the metaheuristic-based ANN techniques in predicting the bearing capacity of the SCFST columns are addressed in this section. As previously stated, R² and RMSE were used to evaluate the models during their development. Another statistical index, namely variance accounted for (VAF), was also calculated for the best metaheuristic-based models. The statistical indices used in this study have been widely applied in other predictive studies [21,[86][87][88][89][90][91][92][93]. It is important to note that this study aims to compare the ability of two metaheuristic-based ANN models to predict the bearing capacity of the SCFST columns. According to previous studies [5,7,16,94], such metaheuristic-based ANN models can achieve higher performance and closer predictions than the ANN model alone. Table 3 presents the statistical indices for the training, testing, and complete datasets of the PSO-ANN and ICA-ANN models in estimating the bearing capacity of the SCFST columns. The results clearly show that PSO is the more successful algorithm in finding the optimum weights and biases of the ANN. The PSO-ANN model performs better in terms of all three statistical indices (R², VAF, and RMSE).
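For reference, the three indices can be computed as in the sketch below. The definitions are the commonly used ones and are an assumption here, since the formulas are not restated in this section: R² is taken as the coefficient of determination and VAF as one minus the ratio of the residual variance to the variance of the measured values, expressed in percent. The sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition of R²)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def _variance(values):
    """Population variance of a list of numbers."""
    mean_v = sum(values) / len(values)
    return sum((v - mean_v) ** 2 for v in values) / len(values)


def vaf(y_true, y_pred):
    """Variance accounted for, in percent."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1.0 - _variance(residuals) / _variance(y_true)) * 100.0


# Hypothetical normalized values with a constant offset of 0.5.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
scores = (rmse(measured, predicted),
          r_squared(measured, predicted),
          vaf(measured, predicted))
```

The constant-offset example shows why the indices are reported together: VAF ignores a uniform bias in the predictions, whereas RMSE and R² penalize it.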

To better understand these metaheuristic-based ANN models and their capacity for forecasting the bearing capacity of the SCFST columns, the measured and predicted (normalized) values for the PSO-ANN and ICA-ANN models are displayed in Figures 14 and 15, respectively. The PSO-ANN model provides a stronger correlation between the measured and estimated bearing capacities of the SCFST columns. This model performs well during the training and testing stages and, consequently, for all data samples. From the predicted and measured bearing capacities shown in Figures 14 and 15, it is evident that PSO has significant potential for optimizing the weights and biases of the ANN. When the weights and biases of the ANN are adequately optimized, the performance of the PSO-ANN model can be considerably greater than that of the ICA-ANN model. In terms of system error, the PSO method clearly outperforms the ICA approach. Based on the above, the PSO-ANN model is the preferred choice, with RMSE values of 0.077 and 0.059 and VAF values of 90.549% and 93.497% for the training and testing datasets, respectively. The developed PSO-ANN technique was found to be more powerful and adaptable than the developed ICA-ANN technique in predicting the bearing capacity of the SCFST columns.
In addition, to better show the capability of the proposed models, the results were compared with those obtained from the Eurocode 4 (EC4) and ACI Code standards [7,95]. Table 4 shows 30 data samples (randomly taken from the whole dataset) and their experimental bearing capacities, together with the bearing capacities predicted by the PSO-ANN, ICA-ANN, EC4, and ACI. As clearly indicated, the predictions of the two metaheuristic-based ANN models are much closer to the measured values than the results obtained from the standard equations. It is worth noting that, of the 30 PSO-ANN predictions, 23 differ from the measured values by 10% or less, while the discrepancy for the remaining samples is still less than 15%. For the ICA-ANN, the difference in the predicted bearing capacities is less than 20%. For some of the samples, for example, No. 6, 8, 9, 11, 12, 23, 26, and 31, the difference between the values obtained by the PSO-ANN and ICA-ANN methods is very small. By contrast, the bearing capacities obtained from the standard formulas differ from the measured values by 11% to 56%, and in most cases the discrepancy exceeds 20%. Therefore, the proposed metaheuristic-based models are well suited to predicting the bearing capacity of SCFST columns of the same type.

Sensitivity Analysis
In order to determine the impact of the input variables (i.e., fc, B, L, T, and fy) on Pexp, the mutual information (MI) method was used to analyze the importance of each variable. The MI method is a filtering method used to capture arbitrary relationships (both linear and nonlinear) between each independent variable and the target, so that an estimate of the mutual information between each independent variable and the target can be obtained [96]. The estimate lies in the range [0, 1]: when it is 0, the two variables are independent, and when it is 1, the two variables are perfectly correlated. In other words, the closer the estimate is to 1, the stronger the correlation between the two variables. The relevance of the five inputs to Pexp is shown in Figure 16. fy and T showed the strongest correlation with Pexp, with correlation indices of 0.457 and 0.432, respectively, followed by B and fc, whose correlation indices with Pexp are 0.255 and 0.233, respectively. L showed an insignificant correlation with Pexp, with a low correlation index of 0.061.
Figure 16. Importance of input variables.
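The idea of an MI score bounded between 0 and 1 can be illustrated with a simple histogram-based estimate. This is a sketch under stated assumptions and not the estimator used in this study: the equal-width binning, the number of bins, and the normalization by the smaller of the two entropies are all illustrative choices, and the data are synthetic.

```python
import math
from collections import Counter


def _bin(values, n_bins):
    """Assign each value to one of `n_bins` equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def _entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(x, y, n_bins=4):
    """Histogram MI estimate scaled to [0, 1] by the smaller entropy."""
    bx, by = _bin(x, n_bins), _bin(y, n_bins)
    hx, hy = _entropy(bx), _entropy(by)
    hxy = _entropy(list(zip(bx, by)))
    mi = hx + hy - hxy          # I(X;Y) = H(X) + H(Y) - H(X,Y)
    denom = min(hx, hy)
    return mi / denom if denom > 0 else 0.0


# Synthetic checks: a perfectly dependent pair and an independent grid.
x_dep = [float(i) for i in range(16)]
y_dep = [2.0 * v for v in x_dep]
x_ind = [float(i % 4) for i in range(16)]
y_ind = [float(i // 4) for i in range(16)]
score_dep = normalized_mutual_information(x_dep, y_dep)
score_ind = normalized_mutual_information(x_ind, y_ind)
```

Applied to each input column against Pexp, a score of this kind would rank the inputs in the same spirit as the importance values reported above, although the numerical values depend on the estimator and binning chosen.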

Limitations and Future Investigations
Model generalization is one of the common limitations in concrete technology and civil engineering studies. In this study, only square CFST columns were considered, and therefore 149 data samples were used for modeling purposes. The proposed models are able to predict the bearing capacity of the SCFST columns provided that the input parameters of new data fall within the range of the input parameters used here. Previous researchers pioneered the concept of combining theories and empirical formulas with ML/AI methods, a concept that has since been refined.
The civil engineering community would benefit from more investigation into this topic, since a pure ML/AI model is not appealing enough to be employed in practice. The ability to incorporate theories and empirical formulas into the data preparation stage for a specific problem would be of the utmost importance to civil engineers at all levels. It is important to emphasize that civil engineers often do not have sufficient knowledge of computer science or ML/AI models. The well-known theories and formulas in this field of study may be used to create a new database, which may result in improved model performance and prediction accuracy.

Conclusions
This research considered one of the most comprehensive databases of SCFST columns, and the bearing capacity of these columns was estimated using a series of analyses and computations. To accomplish this, the most critical parameters of the ICA-ANN and PSO-ANN models were designed using a comprehensive modeling procedure. Then, their predictive ability for the bearing capacities of SCFST columns was evaluated using a variety of statistical evaluation indicators. In addition, the bearing capacities of SCFST columns