Chloride Permeability Coefficient Prediction of Rubber Concrete Based on an Improved Machine Learning Technique: Modelling and Performance Evaluation

The addition of rubber to concrete improves resistance to chloride ion attacks. Therefore, rapidly determining the chloride permeability coefficient (DCI) of rubber concrete (RC) can help promote its application in coastal areas. Most current methods for determining the DCI of RC are traditional ones, which cannot account for multi-factorial effects and suffer from low prediction accuracy. Machine learning (ML) techniques have good non-linear learning capabilities and can consider the effects of multiple factors, compared with traditional methods. However, ML models easily fall into local optima due to the influence of their parameters. Therefore, a mixed whale optimization algorithm (MWOA) was developed in this paper to optimize ML models. The main strategies are to introduce Tent mapping to expand the search range of the algorithm, to use an adaptive t-distribution dimension-by-dimension variation strategy to perturb the individual with optimal fitness and thereby improve the algorithm's ability to jump out of local optima, and to introduce adaptive weights and an adaptive probability threshold to enhance the adaptive capacity of the algorithm. For this purpose, data were collected from the published literature. Three machine learning models, Extreme Learning Machine (ELM), Random Forest (RF), and Elman Neural Network (ELMAN), were built to predict the DCI of RC, and the three models were optimized using the MWOA. The calculations show that the MWOA is effective, with the optimized ELM, RF, and ELMAN models improving the prediction accuracy by 54.4%, 62.9%, and 36.4%, respectively, compared with the initial models. The MWOA-ELM model was found to be the optimal model after a comparative analysis. The accuracy of the multiple linear regression model (MLR) and the traditional mathematical model was calculated to be 87.15% and 85.03%, respectively, which is lower than that of the MWOA-ELM model.
This indicates that the ML model that is optimized using the improved whale optimization algorithm has better predictive ability than traditional models, providing a new option for predicting the DCI of RC.


Introduction
Concrete is one of the most widely used building materials today [1,2]. With the economic boom, the disposal of used tire rubber is becoming a significant issue for urban development [3]. Developing concrete from tire rubber is considered to be a viable technical solution contributing to resource and environmental protection [4-10]. With the rapid development of marine technology, concrete structures have been widely used in coastal projects, which has led to widespread interest in the structural durability of concrete [11-13]. The structural durability of concrete plays a vital role in sustainable development. Chloride ion attack is one of the main factors affecting the durability of concrete structures [14-17]. Chloride ion attack can cause structural failure of concrete, which can lead to many problems. Rubber is a hydrophobic material and has a high resistance to permeation. Chloride ions are transported in concrete using water as a medium, so adding rubber can improve the chloride penetration resistance of concrete. In this paper, three ML models were built to predict the DCI of RC, while the WOA was improved to form a new mixed whale optimization algorithm (MWOA). The MWOA can improve the accuracy of machine learning models, and the optimized models also have advantages over the traditional models. Thus, data were first collected from the published literature to create a database for analysis. Second, using the three models to predict the DCI of RC, the three models were optimized with the MWOA. Third, we evaluated the model performance and found the optimal model. Fourth, we conducted sensitivity analysis for the models. Fifth, the representative prediction result of the optimal model was compared with the actual values. Finally, the accuracy of the optimal model was compared with that of the traditional models.

Database Description and Analysis of Variables
Since the experimental data were collected from the published literature, processing is required to allow the model to learn better. This study collected 88 sets of RC mix ratio data [22,54-59]. Three ML algorithms using nine input variables were used: (1) measurement method, (2) cement content, (3) water reducing agent content, (4) water content, (5) water to ash ratio, (6) fine aggregate content, (7) coarse aggregate content, (8) rubber size, (9) rubber content. The DCI is the only output variable. Due to the different methods used to measure the DCI of RC, this study distinguishes between the measurement methods. Two measurement methods are included in the collected literature: the rapid chloride permeability test (RCPT) method and the rapid chloride migration test (RCM) method. Therefore, the RCPT method is defined as 1, and the RCM method is defined as 2. Considering the different types and sizes of rubber, this study distinguishes between rubbers by size under different measurement methods. The RCPT literature includes two rubber sizes, 0-1 mm and 1-3 mm; therefore, the two rubbers were recorded as 1 and 2 by size. The RCM method literature includes five types of rubber with dimensions of 0.063-0.6 mm, 0.25 mm, 0.6-0.7 mm, 1-2 mm, and 4-10 mm; therefore, the five types of rubber are noted as 1, 2, 3, 4, and 5 by size. The RC samples selected for this study had a 28-day curing period. Cement substitution materials were not used as an input variable in this study, as they are rarely added in the published literature. Figure 1 represents the hotspot plot of the correlation coefficients between the variables. As seen from the graph, the correlation coefficients between the variables are all less than 0.8. Some researchers suggest that the correlation between variables should be less than 0.8 to reduce the risk of multiple collinearity [60,61]. Figure 2 represents the frequency distribution histograms of the input and output parameters. The statistical analysis of each variable is shown in Table 1. Stdd denotes the overall sample deviation, and Stde indicates the sample deviation.


Whale Optimization Algorithm
Humpback whales are herd animals because they can only hunt small fish and prawns. They have developed a unique way of hunting known as bubble net hunting, which is where the term WOA comes from [46]. The algorithm is divided into three main parts: encirclement predation, prey predation, and prey search. The specific process of the WOA is as follows: In this process, humpback whales search for prey based on the positions of each other in the population. Since the location of the best target has yet to be discovered in the search space, the WOA assumes that the current best candidate is the best target. Once the position of the best target is determined, the other whales approach the best target and update their positions. The mathematical expressions for this process are as follows [62]:

D = |C · X*(v) − X(v)|
X(v + 1) = X*(v) − A · D

where v indicates the number of iterations; A and C are vector coefficients; X*(v) is the location of the best target; X(v) is the current location; D is the process quantity. The expressions for calculating A and C are as follows [62]:

A = 2a · r1 − a
C = 2 · r2
a = 2 − 2v/Vmax

where Vmax indicates the maximum number of iterations; r1 and r2 are both random numbers in the range [0, 1]; a decreases gradually from 2 to 0.

Prey Predation
The bubble net foraging method is a unique hunting method for humpback whales. The WOA is a simulation of the spiral bubble net foraging strategy for optimization. A total of two methods were designed to simulate this behavior: (1) Shrinkage envelope mechanism: This is achieved by reducing the value of a in Equation (3). Other targets will move closer to the best target when the best target is identified. The current position (X, Y) is gradually contracted to the optimal target position (X * , Y * ).
(2) Spiral update mechanism: The distance between any whale (X, Y) and the optimal target position (X*, Y*) is first calculated, and a spiral update equation is then created to simulate the whale's hunting motion. The main expressions are as follows [62]:

D' = |X*(v) − X(v)|
X(v + 1) = D' · e^(bl) · cos(2πl) + X*(v)

where b is a constant that defines the shape of the logarithmic spiral; l denotes a random number in [−1, 1]; D' denotes the distance between the best target and the whale. These two mechanisms co-occur, each with a probability of 50%. The combined update is expressed as follows [62]:

X(v + 1) = X*(v) − A · D, if p < 0.5
X(v + 1) = D' · e^(bl) · cos(2πl) + X*(v), if p ≥ 0.5

where p is a random number in [0, 1].

Prey Search
Whale populations also search for prey at random, and the mathematical expressions for this process are as follows [62]:

D = |C · X_rand(v) − X(v)|
X(v + 1) = X_rand(v) − A · D

where X_rand(v) indicates the position of a randomly selected whale in the current population. The WOA, like other intelligent algorithms, suffers from the problem of falling into local extrema. Therefore, improvements to the WOA are needed.
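As a concrete illustration, the three phases described above (encircling the best target, the spiral bubble-net update, and the random prey search) can be sketched in Python with NumPy. This is a minimal sketch of the standard WOA, not the improved MWOA; the function name, population size, iteration count, bounds, and b = 1 are illustrative assumptions.

```python
import numpy as np

def woa_minimize(f, dim, n_whales=30, v_max=200, lb=-10.0, ub=10.0, seed=0):
    """Minimal sketch of the standard WOA update rules (b = 1)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_whales, dim))          # whale positions
    fit = np.apply_along_axis(f, 1, X)
    best = X[fit.argmin()].copy()                     # best target X*
    for v in range(v_max):
        a = 2.0 * (1.0 - v / v_max)                   # a decreases from 2 to 0
        for i in range(n_whales):
            r1, r2 = rng.random(), rng.random()
            A, C = 2 * a * r1 - a, 2 * r2
            if rng.random() < 0.5:                    # shrinking encirclement / search
                if abs(A) < 1:                        # encircle the best target
                    D = np.abs(C * best - X[i])
                    X[i] = best - A * D
                else:                                 # prey search: follow a random whale
                    Xr = X[rng.integers(n_whales)]
                    D = np.abs(C * Xr - X[i])
                    X[i] = Xr - A * D
            else:                                     # spiral bubble-net update
                l = rng.uniform(-1, 1)
                D = np.abs(best - X[i])
                X[i] = D * np.exp(l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)
        fit = np.apply_along_axis(f, 1, X)
        if fit.min() < f(best):                       # keep the best target found so far
            best = X[fit.argmin()].copy()
    return best, f(best)
```

On a simple sphere function, this loop collapses the population onto the optimum as a shrinks, which is exactly the premature-convergence behavior the MWOA strategies below are designed to counteract.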

Tent Chaotic Mapping Initializes Populations
Chaos is a complex, non-linear state that exhibits irregularity and randomness [63]. Therefore, chaotic mapping can be used to improve the algorithm's performance. The two commonly used chaotic mapping sequence models are Logistic and Tent. Compared to Logistic mappings, Tent mappings have a more uniform distribution, allowing the algorithm to have a wider search range [50]. Therefore, Tent chaotic mapping is used to initialize the population in this study. The expression is as follows [51]:

x(n+1) = 2x(n), 0 ≤ x(n) < 0.5
x(n+1) = 2(1 − x(n)), 0.5 ≤ x(n) ≤ 1

The expression after the Bernoulli shift transformation is as follows [51]:

x(n+1) = 2x(n) mod 1
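A minimal sketch of Tent-map population initialization, assuming the common Tent parameter of 0.5, per-dimension random chaotic seeds, and a linear mapping of the chaotic values into the search bounds (the function name and signature are illustrative):

```python
import numpy as np

def tent_init(n_whales, dim, lb, ub, seed=0):
    """Initialize a population with the Tent chaotic map, then scale
    the chaotic sequence into the search bounds [lb, ub]."""
    rng = np.random.default_rng(seed)
    z = rng.random(dim)                      # random chaotic seeds per dimension
    pop = np.empty((n_whales, dim))
    for i in range(n_whales):
        # Tent map: z -> 2z if z < 0.5, else 2(1 - z)
        z = np.where(z < 0.5, 2 * z, 2 * (1 - z))
        pop[i] = lb + z * (ub - lb)          # map chaos values into [lb, ub]
    return pop
```

Because consecutive Tent iterates spread nearly uniformly over [0, 1], the initial whales cover the search space more evenly than plain uniform sampling of a small population.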

Adaptive Adjustment of Weight
The inertia weight is a crucial parameter in the WOA. Appropriate weight values can improve the algorithm's performance, since the original WOA did not consider that the prey would guide the whale during position updates in the iterative process. Therefore, an adaptive weight formula is established in this paper. The specific expression is as follows [52]: where t indicates the current number of iterations; x_i^upper and x_i^lower denote the upper and lower bounds of x_i, respectively; d_1 and d_2 represent constants; P_worst and P_best denote the worst and best positions of the current population, respectively. Thus, Equations (1) and (7) can be improved as:

X(v + 1) = D' · e^(bl) · cos(2πl) + w · X*(v) (15)

With the introduction of the adaptive weight strategy, the algorithm can adaptively change the size of the weights according to the current distribution of the whale population. At the beginning of the iterations, if the whale population falls into a local optimum and the difference between the best and worst solutions is not significant, the value of d_2 · (x_i^upper − x_i^lower)/t is not affected by the population distribution. At this point, a large weight value can still be obtained, which prevents the algorithm from being confined to a small search range early in the iterations. As the iterations increase, the value of d_2 · (x_i^upper − x_i^lower)/t decreases, and its effect on the weight decreases. If the algorithm has not yet obtained an optimal solution, d_1 · (P_iworst − P_ibest) can play the dominant role in the weight and allows the algorithm to search for the optimal solution in larger steps. The adjustment of these two components makes the inertia weight highly adaptive and strengthens the algorithm's optimization search capability.

Adaptive Adjustment of the Search Strategy
To prevent the algorithm from falling into the local optimum, a probability threshold Q is introduced to update the expression of the random search. The expression for Q is as follows [52]: where f̄ indicates the average fitness of the current population; f_min indicates the current best fitness value; f_max indicates the current worst fitness value. For each whale, a random number q ∈ [0, 1] is compared with Q. If q < Q, a randomly selected individual whale updates its position according to Equation (17), and the other individuals remain unchanged [52]. Otherwise, the individuals update their positions according to Equation (10). This allows the algorithm to generate a set of random solutions globally with a greater probability in the early iterations, reducing the likelihood of a decline in population diversity and enhancing the global search capability of the algorithm.
where r is a random number between [0, 1]; X_max and X_min are the maximum and minimum values of X_rand, respectively.

Adaptive t-Distribution Dimension-by-Dimensional Variation Strategy
Population diversity declines in the later iterations of the WOA, which makes the algorithm prone to falling into the local optimum. Therefore, this study introduces an adaptive t-distribution dimension-by-dimension variation strategy to perturb the individual with optimal fitness and improve the ability of the algorithm to jump out of the local optimum. Depending on the degree of freedom n, the t-distribution curves show different patterns. When n → ∞, t(n) → N(0, 1); when n = 1, t(n) = C(0, 1), where N(0, 1) is a Gaussian distribution and C(0, 1) is a Cauchy distribution. This shows that the two boundary special cases of the t-distribution are the Gaussian and Cauchy distributions [64]. The dimension-by-dimension variation is calculated as follows [53]: where iter indicates the current number of iterations; t(iter) denotes a t-distribution with degree-of-freedom parameter iter. The flow chart for the MWOA is shown in Figure 3.
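A sketch of the mutation idea: sampling a t-distributed step whose degrees of freedom equal the current iteration count makes early perturbations Cauchy-like (heavy-tailed, good for escaping local optima) and late ones Gaussian-like (narrow, good for fine search). The step scaling by the current coordinate and the clipping to the bounds are illustrative assumptions, not the paper's exact mutation operator.

```python
import numpy as np

def t_mutate_best(best, iter_no, lb, ub, rng=None):
    """Perturb the best individual dimension by dimension with a
    t(iter_no)-distributed step."""
    if rng is None:
        rng = np.random.default_rng()
    mutant = best.copy()
    for d in range(best.size):               # dimension-by-dimension variation
        step = rng.standard_t(df=iter_no)    # t-distribution sample, df = iteration
        mutant[d] = np.clip(best[d] + step * best[d], lb, ub)
    return mutant
```

In the MWOA loop, the mutant would replace the best individual only if its fitness improves, so the perturbation can never lose the best solution found so far.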


Extreme Learning Machine
The Extreme Learning Machine is a neural network learning algorithm proposed by Professor Guangbin Huang in 2004 [65]. The ELM evolved from the feedforward neural network: it randomizes the input weights, biases, and the number of hidden layer neurons, and then obtains the output weights by least squares without iterative training of the entire network [34,35]. ELM is widely used in various fields such as pattern recognition, image processing, signal processing, combinatorial optimization, and prediction [66-69]. The structure of the ELM is shown in Figure 4. The primary calculation process for ELM is as follows: Assume that the number of hidden layer neurons is N and the activation function is h(x); the standard single hidden layer feedforward neural network (SLFN) expression is as follows [70]:

Σ(i=1..N) β_i h(k_i · x_j + b_i) = y_j, j = 1, …, n

where k_i = [k_i1, k_i2, k_i3, …, k_in]^T denotes the vector of weights connecting the i-th hidden neuron to the input neurons; b_i denotes the bias of the i-th hidden neuron; β_i = [β_i1, β_i2, β_i3, …, β_in]^T denotes the weights connecting the i-th hidden layer neuron to the output neurons; k_i · x_j denotes the inner product of k_i and x_j. The activation function is usually Sigmoid, RBF, or Sine; in this study, the activation function is Sigmoid.
A standard SLFN with N hidden layer neurons and activation function h(x) can approximate these samples with zero error, meaning that Σ(j=1..n) ||y_j − t_j|| = 0. Thus, the following expression exists [70]:

Σ(i=1..N) β_i h(k_i · x_j + b_i) = t_j, j = 1, …, n

The above n equations can be written compactly as Hβ = T [70]. H is called the hidden layer output matrix of the neural network [71,72]. The i-th column of H is the output vector of the i-th hidden neuron with respect to the inputs x_1, x_2, x_3, …, x_n. When the input layer weights and the hidden layer biases are determined, the hidden layer output matrix H can be obtained from the input samples. So, the problem finally reduces to finding the least squares solution of Hβ = T [70]:

β̂ = H† T

where H† is the Moore-Penrose generalized inverse of the matrix H [73]. Random input weights and hidden layer biases can lead to problems such as blind iterations and accuracy degradation [36]. Therefore, this study introduces the MWOA into the ELM model to optimize the input weights and hidden layer biases to improve the model's accuracy.
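The ELM training pipeline above (random input weights and biases, sigmoid hidden layer, output weights by pseudoinverse) can be sketched in a few lines of NumPy. The function names and the uniform [−1, 1] weight initialization are illustrative assumptions; 28 hidden neurons matches the count used later in this study.

```python
import numpy as np

def elm_train(X, T, n_hidden=28, seed=0):
    """Basic ELM: random input weights/biases, sigmoid activation,
    output weights beta = H† T by Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    K = rng.uniform(-1, 1, (n_hidden, X.shape[1]))   # input weights k_i
    b = rng.uniform(-1, 1, n_hidden)                 # hidden biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ K.T + b)))         # sigmoid hidden output matrix H
    beta = np.linalg.pinv(H) @ T                     # least squares, no iteration
    return K, b, beta

def elm_predict(X, K, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ K.T + b)))
    return H @ beta
```

In the MWOA-ELM model, the entries of K and b are the decision variables that the MWOA searches over, while beta is still obtained in closed form for each candidate.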

Random Forest Model
The RF model is one of the most commonly used regression and classification models, proposed by Leo Breiman in 2001 [74]. The main idea is to draw n samples at random from the original data set N to form a new training set and train a decision tree on it; m decision trees built in this way make up the random forest [75,76]. Meanwhile, the predicted value is decided by the voting of these m trees [77]. The RF regression model can be explained mathematically, the leading theory being that X is the independent variable (input data) and Y is the dependent variable (output data). Assuming that the distributions of (X, Y) are independent, the randomly generated training set is Q, and the predicted outcome is G(X), the mean squared generalization error is [78]: Assuming that there are h decision trees, the average of the predicted values {G(Q, X_h)} of the h decision trees is the prediction of the RF regression. If h → ∞, then the following equation holds [78]: where E_{X,Y}[Y − E_Q G(X, Q_h)]² denotes the generalization error, noted as M. When h is infinite, the average generalization error of a single tree is noted as M*. The expression for M* is as follows [78]: where Q satisfies the following expression [78]: where ρ denotes the residual weighted correlation coefficient. The final RF regression function is as follows [78]: Since the number of forests and leaves in the RF model significantly impacts the model's performance, and since they are chosen with randomness and limitations, this study introduces the MWOA to optimize these two parameters. The structure of the RF model is shown in Figure 5.
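For illustration, an RF regression of the kind described above can be built with scikit-learn. It is an assumption here that the two hyperparameters the paper tunes with the MWOA correspond to `n_estimators` (number of trees) and `min_samples_leaf` (leaf-size control); the synthetic data and values below are illustrative, not the paper's database.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in: nine input variables (as in the database) and a
# made-up target playing the role of D_CI.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (120, 9))
y = 2 * X[:, 0] + X[:, 1] ** 2

# The two hyperparameters below are the ones an optimizer such as the
# MWOA would search over; the values here are hand-picked assumptions.
rf = RandomForestRegressor(n_estimators=100, min_samples_leaf=2, random_state=0)
rf.fit(X, y)
pred = rf.predict(X)
```

In the MWOA-RF model, the optimizer would re-fit the forest for each candidate hyperparameter pair and score it by cross-validated error instead of the training fit used here.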


ELMAN Neural Network
The Elman neural network was proposed by Elman in 1990 [79]. ELMAN is a multi-layer dynamic recurrent neural network that can approximate nonlinear functions well and is therefore used in many industries [39,80,81]. Like artificial neural networks, ELMAN has an input, a hidden, and an output layer. The difference is that ELMAN has a unique storage layer. This storage layer, which acts as a delay operator, can store the output values of the neurons in the previous hidden layer, giving the network a memory function and improving the network's ability to process dynamic information. The structure of ELMAN is shown in Figure 6. The expression for ELMAN at the moment t is as follows [82]: where ω1_{i,j} denotes the weight connecting node i in the input layer and node j in the hidden layer; ω2_{i,j} denotes the weight connecting node i and node j in the storage layer connection; ω3_{i,j} denotes the weight connecting node i in the hidden layer and node j in the output layer; x_j(k), c_i(k), and y_j(k) denote the output vectors of the hidden layer, the storage layer, and the output layer, respectively. f and g denote the transfer functions of the hidden layer and the output layer, respectively. The transfer function for this study is tanh.
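A minimal sketch of the Elman forward pass described above: at each time step the context (storage) layer feeds the previous hidden state back into the hidden layer. Training is omitted, tanh is used for both layers as in this study, and the function name and weight shapes are illustrative assumptions.

```python
import numpy as np

def elman_forward(x_seq, W1, W2, W3, n_hidden):
    """Forward pass of a basic Elman network over an input sequence.
    W1: input->hidden, W2: context->hidden, W3: hidden->output."""
    c = np.zeros(n_hidden)                  # context/storage layer, initially zero
    outputs = []
    for x in x_seq:
        h = np.tanh(W1 @ x + W2 @ c)        # hidden layer: input plus delayed context
        y = np.tanh(W3 @ h)                 # output layer (tanh transfer)
        c = h.copy()                        # store hidden output for the next step
        outputs.append(y)
    return np.array(outputs)
```

In the MWOA-ELMAN model, the entries of W1, W2, and W3 (plus biases) are the quantities the MWOA optimizes.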

ELMAN calculates the number of hidden layer neurons in the same way as an ANN. The main expression is as follows [31]:

N_hidden = sqrt(m + n) + a

where m is the number of nodes in the input layer; n is the number of nodes in the output layer; a ∈ (1, 10). ELMAN's predictive performance is influenced by its weights and biases. Therefore, the optimal weights and biases are found by optimizing the ELMAN neural network using the MWOA.
The flow chart of the research process is shown in Figure 7.


Evaluation Indicators for the Three Models
This study uses root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and coefficient of determination (R²) to assess the performance of the model. R² is the metric used to evaluate the accuracy of the model's predictions [83,84]. The closer the R² value is to 1 and the closer the MAE is to 0, the more accurate the model will be. These four evaluation indicators are expressed as follows [85,86]:

RMSE = sqrt((1/N) Σ(k=1..N) (q_0 − q_t)²)
MAE = (1/N) Σ(k=1..N) |q_0 − q_t|
MAPE = (1/N) Σ(k=1..N) |(q_0 − q_t)/q_0|
R² = [Σ(k=1..N) (q_0 − q̄_0)(q_t − q̄_t)]² / [Σ(k=1..N) (q_0 − q̄_0)² · Σ(k=1..N) (q_t − q̄_t)²]

where N indicates the number of samples; q_0 indicates the actual value; q̄_0 indicates the average of the actual values; q_t indicates the output value; q̄_t indicates the average of the output values; k = 1:N.
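The four indicators translate directly into NumPy; a minimal sketch (the function name is illustrative, MAPE assumes nonzero actual values, and R² is taken in the squared-correlation form that uses both sample means):

```python
import numpy as np

def metrics(actual, pred):
    """Return RMSE, MAE, MAPE, and R^2 for a set of predictions."""
    a, p = np.asarray(actual, float), np.asarray(pred, float)
    rmse = np.sqrt(np.mean((a - p) ** 2))            # root mean square error
    mae = np.mean(np.abs(a - p))                     # mean absolute error
    mape = np.mean(np.abs((a - p) / a))              # mean absolute percentage error
    da, dp = a - a.mean(), p - p.mean()
    r2 = np.sum(da * dp) ** 2 / (np.sum(da ** 2) * np.sum(dp ** 2))
    return rmse, mae, mape, r2
```

For a perfect prediction the errors are 0 and R² is 1, which matches the "closer to 1 / closer to 0" reading given above.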

Results of the Three Models
The objective of the computational analysis was to predict the DCI of the RC using three ML models (MWOA-ELM, MWOA-RF, and MWOA-ELMAN). The models optimized by the WOA and the conventional models were also built for comparison. A cross-validation operation is used in the calculation process, and each result is the average of a 5-fold cross-validation. This was done to make the results more realistic and to avoid chance. The data set is divided into five groups by the 5-fold cross-validation operation. For each training session, one group was used as the testing set and the remaining four groups were used as the training set. This resulted in three folds with 70 training samples and 18 testing samples, and two folds with 71 training samples and 17 testing samples. The data splits were kept consistent for each model during training and testing by programming. The models constructed and the computational results are described in detail in the following subsections.
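The fold sizes follow directly from splitting 88 samples five ways; a sketch with scikit-learn's `KFold` (the random seed is an illustrative assumption) reproduces them:

```python
import numpy as np
from sklearn.model_selection import KFold

# 88 samples split 5 ways: three folds test on 18 (train on 70),
# two folds test on 17 (train on 71).
X = np.arange(88).reshape(-1, 1)
fold_sizes = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    fold_sizes.append((len(train_idx), len(test_idx)))
```

Fixing `random_state` (or reusing the index arrays) is what keeps the splits identical across the three models, as the text requires.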

MWOA-ELM Model Result
Like neural network models, ELM models need to determine the number of hidden layer neurons. In this study, the number of neurons in the hidden layer of the ELM model was calculated by the corresponding program and determined by the trial-and-error method to be 28. The parameter settings for the MWOA-ELM model in this study are shown in Table 2. The parameter settings for the WOA-ELM model are the same as the MWOA-ELM model. The average results of the three ELM models under 5-fold cross-validation are presented in Table 3. It was clear that the MWOA-ELM model performs the best. Its test set R 2 improved from 0.6458 to 0.9971, while the other error metrics RMSE, MAE, and MAPE were all the lowest among the three ELM models. Figure 8a,b represent the Taylor diagrams for the training and testing sets of the three ELM models [87]. As can be seen from the graph, the MWOA was effective, as reflected by the fact that the MWOA-ELM model was closest to the optimal reference point for each indicator.


MWOA-RF Model Result
The MWOA is introduced to find the optimal number of forests and leaves for the RF model. The parameter settings for the MWOA-RF model in this study are shown in Table 4. The parameter settings for the WOA-RF model are the same as for the MWOA-RF model. The results of the 5-fold cross-validation of the three random forest models are presented in Table 5. On the test set, the MWOA-RF model had the highest R² of 0.9341 and the lowest values of 1.0164, 0.6533, and 0.0962 for RMSE, MAE, and MAPE, respectively. Figure 9a,b show the Taylor diagrams for the three RF models on the training and testing sets. It was clear that the MWOA-RF model was closest to the optimal reference point among the three evaluation indicators of the Taylor diagram. Therefore, the MWOA effectively increased the probability of the RF model finding the optimal number of forests and leaves.


MWOA-ELMAN Model Result
A three-layer feed-forward MWOA-ELMAN model was established, and the optimal number of neurons was obtained by Equation (32) and trial-and-error method as 13. The MWOA-ELMAN model parameter settings for this study are shown in Table 6. The parameter settings for the WOA-ELMAN model are the same as the MWOA-ELMAN model. The 5-fold cross-validation results for the three ELMAN models are presented in Table 7. Similar to the pattern of the first two models, the MWOA-ELMAN model has the best results. Figure 10 represents the Taylor diagrams for the training and testing sets. From Figure 10, the results of the MWOA-ELMAN model were closest to the optimal reference point, indicating that the algorithm is effective in optimizing the weights and improving the model's prediction accuracy.


Comparative Analysis of the Three Models
In Section 5, the MWOA was proven to improve the generalization of the three models (ELM, RF, and ELMAN). It is also possible to show that the three optimized models are highly accurate (it has been shown in the literature that a model is highly accurate when its R² is greater than 0.9 [88]). This indicates that ML techniques can meet the required prediction accuracy. However, a comparative analysis is necessary to obtain the optimal model. Figure 11 represents the metric radar plots for the training and testing sets of the MWOA-ELM, MWOA-RF, and MWOA-ELMAN models. The figure shows that the MWOA-ELM model outperforms the other two models on the training and testing sets, with the highest R² and lowest RMSE, MAE, and MAPE. Figure 12 represents the Taylor diagrams for the training and testing sets of the three models. The MWOA-ELM model performs the best, with the lowest error metrics on the Taylor diagram, while being closest to the best reference point. Table 8 shows the average of the 5-fold cross-validation results for each of the three models. The MWOA-ELM model performed even better during testing than during training. In contrast, the prediction accuracy of both the MWOA-RF and MWOA-ELMAN models decreased, with the MWOA-RF model decreasing the most (by about 5.6%), indicating that the MWOA-ELM model is very stable. Figure 13 represents box plots of the training and testing sets for the three models. The bars indicate the mean value of each model. The training and prediction results of the models can be seen more visually in the box plots: the MWOA-ELM model is not necessarily the best on the training set but has the lowest mean value. On the testing set, it is the best performer and relatively stable, with all R² around 0.99. This suggests that the MWOA-ELM is the best model. It also demonstrates that, with the introduction of cross-validation, the calculations are realistic and chance is avoided.


Sensitivity Analysis
Sensitivity factor analysis (SA) is an effective method for measuring the influence of model input parameters on output parameters. Sensitivity factor analysis provides feedback on the importance of the model input parameters. Therefore, this study uses the cosine amplitude method (CAM) [89] to perform sensitivity factor analysis on the three models and an experimental model. The expression is as follows:

R_ij = Σ(k=1..n) (x_ik · x_jk) / sqrt(Σ(k=1..n) x_ik² · Σ(k=1..n) x_jk²)

where x_i denotes the input parameters; x_j denotes the output parameters; n denotes the number of data; R_ij denotes the strength of the relationship. Figure 14 represents the strength factors of the relationship between each variable and the DCI. It can be seen that the three models show similar sensitivity to the experimental model, justifying the developed models. As seen from the graph, the measurement method has the most significant effect on the DCI of RC, followed by FA, while the impact of WR is the least.
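The CAM strength factor is a one-line normalized inner product; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def cam_strength(x_i, x_j):
    """Cosine amplitude method:
    R_ij = sum(x_i * x_j) / sqrt(sum(x_i^2) * sum(x_j^2))."""
    x_i, x_j = np.asarray(x_i, float), np.asarray(x_j, float)
    return np.sum(x_i * x_j) / np.sqrt(np.sum(x_i ** 2) * np.sum(x_j ** 2))
```

R_ij is 1 for perfectly proportional vectors and 0 for orthogonal ones, so ranking the input variables by R_ij against the output gives the importance ordering shown in Figure 14.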

Prediction of Typical Machine Learning Model
In Section 6.1, the MWOA-ELM model was proven to be the best model. Therefore, the typical predictions from the MWOA-ELM model are shown in this section. Figure 15 depicts the regression results for the training and testing sets. It is important to emphasize that the MWOA-ELM model has strong predictive power. Its indicator values for the training and testing sets were R2 = 0.9928, RMSE = 0.3243, MAE = 0.2219, MAPE = 0.0287 and R2 = 0.9987, RMSE = 0.1336, MAE = 0.0979, MAPE = 0.0187, respectively. Figure 16 shows the predicted values of the MWOA-ELM model compared to the actual values, with the error values included. The comparison shows that the predicted values of the DCI of RC are consistent with the experimental values. It is worth noting that the errors on both the training and testing sets are small, which indicates that the MWOA-ELM model can predict the DCI of RC well. The above results suggest that predicting the DCI of RC using the MWOA-ELM model is feasible, which may contribute to developing a numerical tool for determining durability indicators for RC. The application of intelligent algorithms is equally effective. In the future, increasing the amount of data and the number of input variables could further improve the ability of the MWOA-ELM model to predict the DCI of RC. The weight matrix for the MWOA-ELM model's typical prediction result is shown in Appendix A.
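The ELM underlying the MWOA-ELM model follows the usual scheme of randomly assigned hidden-layer weights with output weights solved in closed form by least squares. As an illustration only (hidden-layer size, activation function, and data below are assumptions for the sketch, not the authors' configuration; MWOA would additionally tune these random parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=20):
    """Basic ELM training: random input weights/biases, tanh hidden layer,
    output weights solved by least squares (Moore-Penrose pseudo-inverse)."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights (fixed)
    b = rng.normal(size=n_hidden)                 # random hidden biases (fixed)
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # output weight matrix
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass through the trained ELM."""
    return np.tanh(X @ W + b) @ beta
```

In the MWOA-ELM setting, the optimizer searches over the otherwise random W and b so the model avoids poor random initializations, which is the local-optimum issue the paper targets.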

Forecast Comparison
The overall results of the representative predictions of the MWOA-ELM model are shown in Section 6.3. However, it is also necessary to show the forecasting results for the individual data sources, as this provides a more intuitive view of the model's predictions. Therefore, this section shows the prediction results of the MWOA-ELM model under the different methods separately. Figure 17 represents the prediction results of the MWOA-ELM model for the literature [54,55]. As can be judged from Figure 17, the model's predictions are in general agreement with the experimental results. Figure 18 represents the prediction results of the model for the literature [57-59]. It can be seen from Figure 18 that the model predicts better results for the literature [58,59]. The predicted curves for the literature [57] showed some deviations, but the overall trend was consistent; the errors remain acceptable in terms of the overall results of Section 6.3. Figure 19 represents the prediction results of the model for the literature [56]. From Figure 19, it can be observed that both prediction curves show some deviations. This may be due to the algorithm falling into a local optimum when optimizing this part of the data, leaving the model insufficiently trained. However, in general, the errors are still acceptable. Figure 20 reflects the predictions for the literature [22]. From Figure 20, the prediction trends are consistent and the errors are relatively small. The above conclusions indicate that using the MWOA-ELM model to predict the DCI of RC is feasible. The model can still be further optimized, which includes developing more powerful algorithms and increasing the amount of data.

Comparative Analysis with Other Models
To further verify the MWOA-ELM model's reliability, this study introduces a multiple linear regression (MRL) model for comparison [90]. Since the MRL model is similar to the ML model in that it also studies the effects of multi-factor interactions, it is compared with the MWOA-ELM model. Figure 21 represents the regression analysis results of the MRL model. The results of the evaluation indicators for the MWOA-ELM model (results under five-fold cross-validation) and the MRL model are shown in Table 9. Obviously, MWOA-ELM is superior to the MRL model.

Comparison with other models is equally necessary. Ye [91] developed a mathematical model to predict the DCI of RC. The input variables of that mathematical model are the water-cement ratio, the rubber admixture, and the rubber size; therefore, the input variables of the MWOA-ELM model were replaced in the same way, since keeping the same input variables makes for a better comparison. The data were obtained from three randomly selected papers to avoid complex calculation [22,54,55]. Figure 22 represents the results of the regression analysis for the two models. The results of the evaluation indicators for the two models are shown in Table 10. From Figure 22, the regression analysis result of the MWOA-ELM model is better, with an R2 of 0.991, higher than the mathematical model's 0.8053. Similar results are seen in the other error evaluation indicators in Table 10. This indicates that the MWOA-ELM model has better prediction and generalization ability.
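The MRL baseline is an ordinary least-squares fit across all input factors. A generic sketch of such a multi-factor linear model (function names and data are hypothetical, not the study's dataset):

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares multiple linear regression with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])     # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # [intercept, slope_1, ...]
    return coef

def predict_mlr(X, coef):
    """Apply the fitted linear model to new inputs."""
    return np.column_stack([np.ones(len(X)), X]) @ coef
```

Because such a model is linear in its inputs, it cannot capture the non-linear interactions the ELM learns, which is consistent with its lower R2 reported here.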

Conclusions and Future Prospect
This study used ML techniques to predict the DCI of RC. Three models, ELM, RF, and ELMAN, were developed and investigated, and the established MWOA was used to optimize them. Four metrics, R2, RMSE, MAE, and MAPE, were used to evaluate the performance of the models. According to the prediction results, the MWOA-optimized ELM, RF, and ELMAN models successfully predicted the DCI of RC. At the same time, they had the highest R2 and lowest errors compared to the unoptimized models, indicating that the established algorithm is valid. Comparing the three optimized models, the MWOA-ELM model performs the best. The three models were shown to have similar sensitivity to the experimental model by the CAM method, which justifies the developed models. Comparing the typical prediction results of the MWOA-ELM model with the actual values shows that the predictions generally agree with the experiments, while the error is within a reasonable range. Comparisons with the MRL model and a published mathematical model show that the MWOA-ELM model performs the best. This suggests that the MWOA-ELM model can accurately predict the DCI of RC.
In summary, this study successfully used ML techniques to predict the D CI of RC while demonstrating the proposed MWOA is valid. This provides a new option for determining the D CI of RC. However, observation of the results revealed that the proposed method could be further optimized to better understand RC's chloride permeation process. This includes