Predicting the Geopolymerization Process of Fly-Ash-Based Geopolymer Using Machine Learning

The geopolymerization process affects the fresh and hardened properties of fly-ash (FA)-based geopolymers. Predicting geopolymerization parameters such as the dissolution peak time (DPT), dissolution peak heat (DPH), geopolymerization peak time (GPT), and geopolymerization peak heat (GPH) is therefore very important for the mixture optimization of FA-based geopolymers. In this study, machine learning (ML) models including the backpropagation neural network (BPNN), support vector regression (SVR), random forest (RF), k-nearest neighbor (kNN), logistic regression (LR), and multiple linear regression (MLR) were used to predict these geopolymerization parameters and to explain the influence of composition on the geopolymerization of FA-based geopolymers. Results show that RF was the most stable ML model and had the best predictive performance on the test sets of GPT, GPH, DPT, and DPH, with correlation coefficients of 0.88, 0.95, 0.92, and 0.95, respectively. Variable importance and sensitivity were analyzed using SHapley Additive exPlanations (SHAP). The results indicate that temperature is the most significant input variable affecting DPT, DPH, and GPH, with SHAP values of 0.09, 4.83, and 1.03, respectively. For GPT, the SHAP value of temperature is 6.89, slightly lower than that of the liquid-to-fly-ash mass ratio (LFR, 6.95), yet it is still a significantly important input variable. The alkaline solution concentration and the molar ratio were also important and contributed negatively to DPT and DPH, respectively. In addition, both GPT and GPH were sensitive to the liquid-to-fly-ash mass ratio, which at low values promotes the extent of geopolymerization and shortens the geopolymerization time. The results of this study pave the way for automatic mixture optimization of FA-based geopolymers.


Introduction
Cement is a vital component of concrete and the most widely used manmade material on the planet [1]. However, cement production accounts for as much as 7% of global CO2 emissions [2] and 12-15% of the world's industrial energy use [3]. To protect the environment, eco-friendly binder materials that can replace cement in concrete are urgently needed. Geopolymer is one of the most important alternatives to cement. Geopolymers are developed from industrial wastes containing pozzolanic minerals. Therefore, geopolymers have a large number of advantages such as lower cost, higher mechanical strength, excellent fire and corrosion resistance, and lower energy consumption [4]. In addition, geopolymers have a reduced carbon footprint compared with cement [5]. CO2 emissions may be reduced by 60-80% if cement is replaced by geopolymer in construction materials [6].
Geopolymers are synthesized from a variety of feedstocks with different chemical compositions. According to the calcium oxide content of the raw materials, geopolymers are generally divided into low-calcium and high-calcium types. Metakaolin and fly ash are representative raw materials of low-calcium geopolymers, which usually need to be synthesized with a highly alkaline hydroxide or alkali metal silicate solution [7]. Various models for predicting geopolymer properties have been proposed in the literature. In recent years, machine learning (ML) has been considered a powerful tool for predicting the performance of cementitious materials [20]. Due to the excellent fitting capability of ML, the output can be accurately predicted when the input variables are highly uncorrelated, reducing the need for time-consuming and expensive experiments [21]. Therefore, this study adopted ML models to evaluate the geopolymerization parameters (DPT, DPH, GPT, GPH). However, there is no single best ML model for every dataset, as per the no-free-lunch theorem [22]. Therefore, evaluating the performance of different ML models is of vital importance to select the model with the best prediction performance. It should also be noted that for the ML models to have robust performance, their hyperparameters need to be optimized. Metaheuristics have the advantages of simple coding and high computational efficiency, and can be used as optimization algorithms for hyperparameter tuning [23]. In recent years, an individual-intelligence-based beetle antennae search (BAS) algorithm has been proposed [24]. The BAS algorithm has simple code, is easy to implement, and its computation time is very short. At the same time, owing to its specific step-size strategy, it can avoid falling into local optima [25]. Therefore, this paper uses the BAS algorithm to optimize the hyperparameters.
We aim to accurately predict the geopolymerization parameters and to interpret the influence of geopolymer composition on the geopolymerization process. This paper adopts machine learning methods to model the geopolymerization parameters of FA-based geopolymers. In the following parts, Section 2 describes the dataset, the construction of the ML models, and the principle of the BAS algorithm. Section 3 then summarizes the results of hyperparameter tuning and compares the performance of the different ML algorithms. Finally, the best ML model is used to study the importance and sensitivity of the input variables. The RF model can be used to effectively evaluate the effects of the mixture parameters (molar ratio, alkaline solution concentration, liquid-to-fly-ash mass ratio) and temperature on the geopolymerization process of FA-based geopolymers. This model can further reduce trial and error in mixture design and can be used to study the sensitivity of the mixture design parameters. The information obtained from this model can serve as a guide for the mixture optimization of FA-based geopolymers.

Dataset
The geopolymerization parameters of FA-based geopolymers were characterized by the isothermal calorimetry method [16]. In a typical calorimetric curve, the first exothermic peak (the dissolution peak), with high intensity, appears very soon after the fly ash first contacts the alkali activator, indicating a very high dissolution rate of the fly ash. After this, the dissolution peak drops suddenly, while the mixture continues to release heat, and a second flat and wide exothermic peak appears, which is called the geopolymerization peak [26]. This peak provides information on the strength development of the geopolymer. The intensity and appearance time of these peaks on the calorimetric curve are very important to the fresh and hardened properties of geopolymers. Therefore, it is important to predict the dissolution peak time (DPT), dissolution peak heat (DPH), geopolymerization peak time (GPT), and geopolymerization peak heat (GPH) during geopolymerization for mixture optimization [26].
In the present research, the dataset was derived from previous literature [26,27]. The dataset includes 72 experimental results for each of DPT, DPH, GPT, and GPH. In this dataset, fly ash with a specific gravity of 2.52 and a fineness of 419.6 m²/kg was used as the precursor material. The chemical analysis of the fly ash is given in Table 1. NaOH and Na2SiO3 were used as activators. By adjusting the content of NaOH in the NaOH-Na2SiO3 solution, the SiO2/Na2O molar ratio of the designed alkali solution was set to 1.0, 1.5, and 2.0. The alkaline solution was diluted with deionized water to adjust its concentration to 15%, 20%, and 25% (by mass). Fly ash and the nine alkali solutions were mixed according to the liquid-to-fly-ash ratio (L/F = 0.33, 0.40, 0.50, 0.60). The mixture design is shown in Table 2. The mixture parameters include the molar ratio (MR), the alkaline solution concentration (ASC), and the liquid-to-fly-ash mass ratio (LFR). The curing temperature (T) also has a significant effect on the geopolymerization process. Therefore, the mixture parameters MR, ASC, and LFR and the environmental parameter T are used as input variables. The higher the molar ratio, the higher the concentration of free silicate ions in the alkali solution, which may form geopolymerization products on the surface of the FA particles in a short time and inhibit the further dissolution of the FA [28]. The liquid-to-fly-ash ratio has an important effect on the degree and time of geopolymerization. The concentration of the alkaline solution strongly influences the dissolution of the FA. Temperature affects the kinetic energy of the FA-based geopolymer system, which is conducive to the breaking and formation of solute molecular bonds [29].
The correlation coefficients between each pair of input variables are shown in the correlation coefficient matrix (Figure 1). The Pearson correlation coefficient between any two different variables was less than 0.5, which indicates that there was no multicollinearity problem between these input variables [30]. The statistical description of the variables is shown in Table 3.

Backpropagation Neural Network (BPNN)

The backpropagation neural network (BPNN) has been widely employed to solve engineering problems [31]. The architecture of a BPNN is composed of an input layer, an output layer, and several hidden layers. The algorithm compares the differences between the predicted and actual outputs during the training process to calculate the errors, which are then propagated backward to adjust the weights and thresholds. The prediction accuracy of the BPNN can be significantly improved by this backpropagation process. Figure 2 presents a representative structure of the BPNN model. In this figure, X is the input neuron, H is the hidden neuron, and Y is the output neuron.
The following equation can be employed to determine the output of a neuron in the BPNN from its inputs:

O = f(∑_j w_j x_j + b)

in which O denotes the neuron output; f is an activation function restricting the amplitude of the output; w_j denotes the weight of the input x_j; and b represents the bias of the neuron. The sigmoid function is usually used as the activation function:

f(x) = 1/(1 + e^(−x))

The mean square error (MSE) can be employed as the threshold to terminate the training process:

MSE = (1/n) ∑_{i=1}^{n} (y_i − ŷ_i)²

where y_i is the actual value of the output and ŷ_i is the predicted value of the output.
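To make the relations above concrete, here is a minimal sketch in Python/NumPy of a single BPNN neuron with a sigmoid activation and the MSE stopping criterion; all values are illustrative, not from the study:

```python
import numpy as np

def sigmoid(z):
    # Activation function restricting the output amplitude to (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b):
    # O = f(sum_j w_j * x_j + b)
    return sigmoid(np.dot(w, x) + b)

def mse(y, y_hat):
    # Mean square error used as the threshold to stop training
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean((y - y_hat) ** 2)

# Toy neuron with 4 inputs (e.g., scaled MR, ASC, LFR, T)
x = np.array([0.5, -1.0, 2.0, 0.1])
w = np.array([0.1, 0.4, -0.2, 0.3])
o = neuron_output(x, w, b=0.3)
```

In a full BPNN, the MSE computed on the training set drives the backward pass that adjusts w and b layer by layer.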

Support Vector Regression (SVR)
Support vector regression (SVR) is a popular supervised machine learning algorithm [32]. This algorithm models the correlation between the input and output parameters by mapping the data from the sample space into a higher-dimensional feature space. SVR maximizes the separation of the training data and minimizes the upper bound of the errors. It is broadly employed because it has a fast learning speed, an outstanding generalization capability, and is insensitive to noise. Figure 3 plots a schematic diagram of an SVR model. In this figure, the blue stars represent the data points in the dataset.

Generally, the following equation is applied to describe the SVR model:

f(x) = w·φ(x) + b

where x is an l-dimensional input variable; w represents the weight vector; φ(x) denotes a nonlinear mapping function; and b is the bias value. The ε-insensitive loss function L for the SVR algorithm is defined as follows:

L(y_i, f(x_i)) = 0 if |y_i − f(x_i)| ≤ ε, otherwise |y_i − f(x_i)| − ε

where y_i is the actual value of the output of x_i; f(x_i) is the predicted value of the output of x_i; and ε is the maximum tolerated error. It should be noted that only the points situated on or outside the ε-tube are employed as support vectors to build f(x). The problem can be re-expressed to minimize the structural risk, and is transformed into a convex optimization problem by introducing the slack variables ξ_i and ξ_i*:

min (1/2)‖w‖² + C ∑_{i=1}^{n} (ξ_i + ξ_i*)
s.t. y_i − w·φ(x_i) − b ≤ ε + ξ_i;  w·φ(x_i) + b − y_i ≤ ε + ξ_i*;  ξ_i, ξ_i* ≥ 0

in which C is a penalty parameter that determines the trade-off between the degree of punishment of the data outside the tube and the flatness of f(x).
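The ε-insensitive loss defined above can be sketched directly (Python/NumPy; the residuals and tolerance are illustrative values, not from the study):

```python
import numpy as np

def eps_insensitive_loss(y, f_x, eps=0.1):
    # L = 0 inside the eps-tube, |y - f(x)| - eps outside it
    r = np.abs(np.asarray(y, float) - np.asarray(f_x, float))
    return np.where(r <= eps, 0.0, r - eps)

# Residuals of 0.05 and 0.30 against a tolerance of eps = 0.1:
losses = eps_insensitive_loss([1.05, 1.30], [1.0, 1.0], eps=0.1)
# the first point lies inside the tube, the second outside it
```

Only points with nonzero loss (outside the tube) become support vectors, which is what keeps the SVR solution sparse.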
The constrained problem can be solved by introducing Lagrange multipliers. The Karush-Kuhn-Tucker (KKT) conditions must be satisfied when the objective function is differentiable and strong duality holds. In addition, at the optimal solution, the partial derivatives of the Lagrangian with respect to the primal variables must be zero. Substituting these conditions into the Lagrangian yields the dual problem, and the SVR model can then be restated as:

f(x) = ∑_{i=1}^{n} (α_i − α_i*) K(x_i, x) + b

where α_i and α_i* are the Lagrange multipliers and K(x_i, x) = φ(x_i)·φ(x) is the kernel function.

Random Forest (RF)
Random Forest (RF) is a supervised machine learning algorithm made up of decision trees that uses both bagging and feature randomness [33]. The bagging method repeatedly samples the original training set with replacement and produces many new training sets. The weak tree models are then trained on these newly generated training sets. Feature randomness is the selection of a random subset of characteristics at each node of the tree.
A typical flowchart of the RF structure is plotted in Figure 4. The construction procedure of the RF model can be summarized as follows: assume X = {x_1, x_2, . . . , x_n} is the original training set with n samples and its label set is Y = {y_1, y_2, . . . , y_n}. The data in the original training set are then randomly sampled with replacement to generate N new training sets {X_1, X_2, . . . , X_N} with labels {Y_1, Y_2, . . . , Y_N}.
The N decision trees {t 1 , t 2 , . . . , t N } are grown on newly generated training sets by randomly selecting characteristics at every node of the tree without pruning.
The plurality voting rule (for regression, the average of the tree outputs) is used to aggregate the outcomes of all decision trees.
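The bagging-plus-feature-randomness procedure can be sketched with one-split regression trees, a deliberately minimal stand-in for the full unpruned trees used in RF (Python/NumPy; the data and all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def fit_stump(X, y, n_feat=2):
    # Grow a one-split tree, considering only a random subset of features
    feats = rng.choice(X.shape[1], size=n_feat, replace=False)
    best = (np.inf, None, None, float(y.mean()), float(y.mean()))
    for j in feats:
        for thr in np.unique(X[:, j])[:-1]:   # candidate split thresholds
            left = X[:, j] <= thr
            sse = ((y[left] - y[left].mean()) ** 2).sum() \
                + ((y[~left] - y[~left].mean()) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, thr, float(y[left].mean()), float(y[~left].mean()))
    return best[1:]                            # (feature, threshold, left mean, right mean)

def predict_stump(tree, X):
    j, thr, left_val, right_val = tree
    if j is None:                              # degenerate tree: constant prediction
        return np.full(len(X), left_val)
    return np.where(X[:, j] <= thr, left_val, right_val)

def fit_forest(X, y, n_trees=50):
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), len(X))  # bootstrap sample (bagging)
        trees.append(fit_stump(X[idx], y[idx]))
    return trees

def forest_predict(trees, X):
    # Regression forests aggregate by averaging the tree outputs
    return np.mean([predict_stump(t, X) for t in trees], axis=0)

# Toy data: 72 samples, 4 inputs, output driven mainly by the third feature
X = rng.uniform(size=(72, 4))
y = 3.0 * X[:, 2] + 0.1 * rng.standard_normal(72)
trees = fit_forest(X, y)
preds = forest_predict(trees, X)
```

A production RF grows full trees and tunes n_trees and the feature-subset size; the bootstrap sampling and per-node feature restriction shown here are the two ingredients that decorrelate the trees.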

k-Nearest Neighbor (kNN)
The k-nearest neighbor (kNN) algorithm is also widely used for solving regression and classification problems. This algorithm determines the label of new data by comparing the labels of the k closest samples [34]. The Minkowski metric is generally employed to define the distance between input vectors x_i and x_j:

d(x_i, x_j) = (∑_{l=1}^{m} |x_i^(l) − x_j^(l)|^p)^(1/p)

The predicted value ŷ is then the mean of the outputs of the k nearest neighbors:

ŷ = (1/k) ∑_{i=1}^{k} y_i

where y_i is the output of the ith nearest neighbor of x.
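A compact sketch of kNN regression with the Minkowski metric (Python/NumPy; the tiny one-feature dataset is purely illustrative):

```python
import numpy as np

def minkowski(a, b, p=2):
    # Minkowski distance; p = 2 gives the Euclidean metric
    diff = np.abs(np.asarray(a, float) - np.asarray(b, float))
    return float((diff ** p).sum() ** (1.0 / p))

def knn_predict(X_train, y_train, x, k=3, p=2):
    # Predicted value = mean output of the k nearest training samples
    d = np.array([minkowski(xi, x, p) for xi in X_train])
    nearest = np.argsort(d)[:k]
    return float(np.asarray(y_train, float)[nearest].mean())

X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([0.0, 1.0, 2.0, 10.0])
pred = knn_predict(X_train, y_train, np.array([1.1]), k=3)  # neighbors: 1.0, 2.0, 0.0
```

In practice the inputs are scaled first, since the Minkowski distance is sensitive to the units of each feature.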

Logistic Regression
Logistic regression (LR) is a popular ML model that utilizes the logistic function to model the dependent variable. An LR model with multiple predictors can be represented by the following equation:

p = 1 / (1 + e^(−(b_0 + b_1 x_1 + . . . + b_n x_n)))

where p is the dependent variable expressed as a probability; b_0, . . . , b_n denote the constant coefficients of the LR model; and x_1, . . . , x_n represent the independent variables.
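The LR equation above can be evaluated directly (Python/NumPy; the coefficient values are illustrative):

```python
import numpy as np

def lr_probability(x, b):
    # p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn)))
    z = b[0] + float(np.dot(b[1:], x))
    return 1.0 / (1.0 + np.exp(-z))

# With all coefficients zero the model is maximally uncertain: p = 0.5
p = lr_probability(np.array([1.0, 2.0]), np.array([0.0, 0.0, 0.0]))
```

Fitting b_0, . . . , b_n is done by maximum likelihood rather than least squares, which is what distinguishes LR from MLR below.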

Multiple Linear Regression
Multiple linear regression (MLR) uses the following linear equation to model the input-output correlation:

Y = β_0 + β_1 x_1 + β_2 x_2 + . . . + β_n x_n

where x_i (i = 1, 2, . . . , n) are the components of the n-dimensional feature vector X; β_i are the regression coefficients; and Y is the output variable.
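The MLR coefficients are found by ordinary least squares; a sketch with synthetic, noiseless data generated from known coefficients (Python/NumPy; all values are illustrative):

```python
import numpy as np

# Illustrative data generated from a known linear relation
rng = np.random.default_rng(0)
X = rng.uniform(size=(72, 4))                     # 72 samples, 4 features
beta_true = np.array([2.0, 0.5, -1.2, 2.0, 0.8])  # [intercept, beta1..beta4]
Y = beta_true[0] + X @ beta_true[1:]

# Least-squares fit of Y = beta0 + beta1*x1 + ... + betan*xn
A = np.column_stack([np.ones(len(X)), X])         # prepend an intercept column
beta_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
```

With noiseless linear data the fit recovers the generating coefficients exactly, which is a quick sanity check on the design matrix construction.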

Optimization Algorithm
In this study, the beetle antennae search (BAS) algorithm is employed to tune the hyperparameters of the ML models. This algorithm is inspired by the beetle's foraging behavior: a beetle searches for food using its two antennae, and its movement direction depends on the concentration of the odor sensed on each side. Assume that b is a random vector representing the searching direction, defined as:

b = rnd(k, 1)/‖rnd(k, 1)‖

where rnd is a uniform random function and k represents the dimension of the searching space. The positions of the two antennae can be written as:

x_r^i = x^i + (d^i/2)b,  x_l^i = x^i − (d^i/2)b

where x^i is the beetle position at the ith time instant (i = 1, 2, . . .); d^i is the distance between the antennae at the ith time instant; and x_r^i and x_l^i represent the positions of the right and left antennae at the ith time instant, respectively.
The beetle's position vector is updated according to the following equation:

x^(i+1) = x^i − δ^i b sign(f(x_r^i) − f(x_l^i))

where δ^i is the step size at time i and f(x) is the objective function. The following updating strategy for the antennae length and step size can be applied to avoid being trapped in local optima:

d^i = 0.95 d^(i−1) + 0.01,  δ^i = 0.95 δ^(i−1)

The flow chart of tuning the ML models using BAS is shown in Figure 5.
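The update rules above can be sketched end to end; a shifted sphere function stands in for the real hyperparameter objective, and all step-size constants follow the update strategy stated above (Python/NumPy; illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

def bas_minimize(f, x0, n_iter=200, d=1.0, delta=1.0):
    """Minimal beetle antennae search following the update rules above."""
    x = np.asarray(x0, dtype=float)
    best_x, best_f = x.copy(), f(x)
    for _ in range(n_iter):
        b = rng.standard_normal(x.size)
        b /= np.linalg.norm(b)                      # random unit search direction
        xr = x + d * b / 2.0                        # right antenna position
        xl = x - d * b / 2.0                        # left antenna position
        x = x - delta * b * np.sign(f(xr) - f(xl))  # step toward the better-smelling side
        fx = f(x)
        if fx < best_f:                             # keep the best position seen so far
            best_x, best_f = x.copy(), fx
        d = 0.95 * d + 0.01                         # antennae-length update
        delta *= 0.95                               # step-size decay
    return best_x, best_f

# Toy objective standing in for the cross-validation RMSE of an ML model
sphere = lambda v: float(np.sum((v - 1.0) ** 2))
x_opt, f_opt = bas_minimize(sphere, np.array([3.0, -2.0]))
```

In the study's setting, f would be the 10-fold cross-validation RMSE evaluated at a candidate hyperparameter vector x.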

K-Fold Cross Validation
When the dataset is small, the overfitting problem easily occurs. To prevent this problem, K-fold cross-validation (CV) was introduced [35], with the value of k set to 10. Specifically, the dataset is randomly divided into a training set (the external training set) containing 80% of the instances and a test set containing 20% of the instances. The external training set is further divided into 10 subsets. In each fold, the BAS algorithm searches for the optimal hyperparameters on nine subsets and then calculates the root mean square error (RMSE) on the remaining validation subset, as shown in Figure 6. This process is repeated 10 times, each time using a different subset as the validation set. After the 10 rounds of cross-validation, the 10 groups of hyperparameters are averaged to obtain the final result.
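The fold bookkeeping described above can be sketched as follows (Python/NumPy; the external-training-set size of 58, roughly 80% of the 72 instances, is an illustrative assumption):

```python
import numpy as np

def kfold_splits(n, k=10, seed=0):
    # Shuffle the indices, then cut them into k (near-)equal validation folds
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

n_external = 58            # ~80% of the 72 instances forms the external training set
folds = kfold_splits(n_external, k=10)
for i, val_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # ... tune hyperparameters (e.g., with BAS) on train_idx,
    #     then compute the RMSE on val_idx ...
```

Each index appears in exactly one validation fold, so every instance is used for validation once and for training nine times.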

Evaluation of the Predicted Results
The predictions of the ML models for the geopolymerization parameters were assessed using the root-mean-square error (RMSE) and the correlation coefficient (R).
RMSE measures the deviation between the predicted and target values. It is given by:

RMSE = √( (1/n) ∑_{i=1}^{n} (y_i − y_i*)² )

where y_i denotes the actual value; y_i* denotes the predicted value; and n is the number of instances. R expresses the strength of the correlation between predicted and observed values. It is expressed as:

R = ∑_{i=1}^{n} (y_i − ȳ)(y_i* − ȳ*) / √( ∑_{i=1}^{n} (y_i − ȳ)² ∑_{i=1}^{n} (y_i* − ȳ*)² )

where ȳ* is the average value of the predicted results; ȳ is the average value of the actual results; and n is the number of instances.
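Both metrics are one-liners to implement (Python/NumPy; the example values are illustrative):

```python
import numpy as np

def rmse(y, y_pred):
    # Root-mean-square error between actual and predicted values
    y, y_pred = np.asarray(y, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))

def corr_r(y, y_pred):
    # Pearson correlation coefficient between observed and predicted values
    y, y_pred = np.asarray(y, float), np.asarray(y_pred, float)
    dy, dp = y - y.mean(), y_pred - y_pred.mean()
    return float(np.sum(dy * dp) / np.sqrt(np.sum(dy ** 2) * np.sum(dp ** 2)))

y_true = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
```

Note that R only measures linear association: a model can have R close to 1 while being systematically biased, which is why the two metrics are reported together.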
Figure 6. Process of adjusting the hyperparameters.


Evaluation of the Hyperparameter Tuning
As mentioned previously, BPNN, SVR, RF, kNN, LR, and MLR were used to predict the DPT, DPH, GPT, and GPH. These algorithms were trained on the training set containing 80% of the dataset samples and tested on the testing set containing 20% of the data. First, this study used BAS combined with 10-fold cross-validation to obtain the optimized hyperparameters. Then, the 10 obtained optimal hyperparameter sets were averaged to obtain the final result. Figure 7 shows the change in RMSE with iterations for the ML models during hyperparameter tuning on the fold with the lowest RMSE. It can be seen that all the RMSE curves converge within the first 50 iterations, which indicates that BAS performs well in finding the optimal hyperparameters of the ML models. Different ML algorithms presented different RMSE-change patterns in terms of convergence velocity and final RMSE values. For all predicted target variables, kNN took the shortest time to reach convergence due to its simple structure and hyperparameters compared with the other ML algorithms. It can also be seen that SVR reached convergence with the lowest RMSE on the DPH, GPT, and GPH sets, while on the DPT dataset, although BPNN took the longest time to achieve convergence, it had the lowest RMSE after convergence. It is interesting to note that the RMSE of SVR and kNN decreased only marginally when tuning hyperparameters on the GPT dataset, which means that the initial values of the hyperparameters were close to the optimal values. The obtained optimal hyperparameters of each model can be found in Table 4.

Prediction Performance of ML Models
In this section, the performance of each computational model is evaluated by calculating the RMSE and R values on the test set, as shown in Figure 8. It can be clearly observed that RF has better prediction performance than the other models in terms of the lowest RMSE (0.06, 2.3, 7.5, and 0.65 for DPT, DPH, GPT, and GPH, respectively) and the highest R (0.92, 0.95, 0.88, and 0.95 for DPT, DPH, GPT, and GPH, respectively). A possible reason is that the RF model creates an uncorrelated forest of decision trees using both bagging and feature randomness, which helps to decrease the model's variance compared with single models such as BPNN, SVR, and kNN. It can also be noted that for the prediction of DPT, DPH, GPT, and GPH, the methods with the worst predictive performance in terms of R and RMSE are LR (0.67 and 0.16, respectively), BPNN (0.78 and 4.4, respectively), MLR (0.35 and 22.2, respectively), and LR (0.62 and 0.67, respectively). This indicates that simple regression models such as LR and MLR are unable to capture the highly nonlinear relationship between geopolymer composition and geopolymerization parameters. Figure 9 shows the relationship between the predicted and actual values on both the training and test sets of RF. It can be seen that most of the values lie near the ideal fitting line (R = 1) except for a few outliers, which may be caused by insufficient training data in these regions. Therefore, RF is recommended as the best model to predict the geopolymerization parameters of FA-based geopolymers. Ling et al. [36] also used artificial neural network (ANN) models to predict GPH. The R value of their model is 0.959, which is similar to the R value of this study (0.95). However, the other geopolymerization parameters, including GPT, DPT, and DPH, were not considered in that study. Compared with previous studies, both the prediction of the parameters and the important variables affecting the geopolymerization process are comprehensively explained in this study.


Importance of the Input Variables
ML models such as RF have a non-linear and highly complex architecture, and therefore they tend to behave as black-box models. Tree-based models are in principle interpretable due to their hierarchical structure, but visualizing them may not be easy to interpret. To address the interpretability problem of RF models, a useful model-agnostic tool, SHapley Additive exPlanations (SHAP), was introduced to explain the highly complex RF algorithms with a large number of parameters [37]. For RF, a SHAP approximation method called TreeExplainer was used in this study.
TreeExplainer is a package for explaining and interpreting the predictions of tree-based ML models. The basic idea is to decompose each prediction into feature contribution components. For a dataset with n features, each prediction on the dataset is calculated as:

y_i = y_base + ∑_{k=1}^{n} f(x_ik)

where y_i is the prediction value for each sample; y_base is the baseline, representing the mean of all output variables; and f(x_ik) is the contribution of the kth feature of sample i. Figure 10 shows the mean SHAP values of the different input features for the four geopolymerization parameters, with the features sorted in descending order of importance.
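For a linear model with independent features, the SHAP contributions have a closed form, w_k(x_ik − x̄_k), which makes the additive decomposition above easy to verify by hand; a toy sketch (Python/NumPy; the weights and data are illustrative stand-ins, not the study's RF model):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(72, 4))          # 72 mixtures, 4 inputs (stand-ins for MR, ASC, LFR, T)
w = np.array([0.5, -1.2, 2.0, 0.8])   # hypothetical linear "model" weights
y = X @ w                             # model predictions

y_base = y.mean()                     # baseline: mean of all outputs
phi = (X - X.mean(axis=0)) * w        # f(x_ik): per-feature contributions

# Local accuracy: every prediction = base value + sum of its contributions
additive_ok = np.allclose(y_base + phi.sum(axis=1), y)

# Global importance (as in Figure 10): mean absolute SHAP value per feature
mean_abs_shap = np.abs(phi).mean(axis=0)
```

TreeExplainer computes the analogous decomposition exactly for tree ensembles, where the contributions no longer have such a simple closed form.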
It is observed that T is the most important input parameter affecting DPT, DPH, and GPH, with SHAP values of 0.09, 4.83, and 1.03, respectively. For GPT, the SHAP value of T is 6.89, slightly lower than that of LFR (6.95); yet it is still a significantly important input variable. This may be because a higher temperature increases the kinetic energy of the system, which promotes the bond breaking and formation of solute molecules [38].

Figure 11 shows the influences of the top features on the output of the model. A single dot represents an instance on each feature row. The SHAP value determines the horizontal position of the points, which are "stacked" along each feature row to show density. The features are arranged in descending order of importance. The sensitivity of the two most important input variables to the outputs was analyzed with the trained ML model (see the blue lines in Figure 12). In detail, the variable to be analyzed was varied while the other variables were fixed at their mean values, and the output was then predicted using the trained ML model. It can be seen that low temperatures corresponded to higher SHAP values for DPT (Figure 11a) and GPT (Figure 11c), and to lower SHAP values for DPH (Figure 11b) and GPH (Figure 11d). Furthermore, DPT (Figure 12(a1)) and GPT (Figure 12(c2)) decreased, while DPH (Figure 12(b1)) and GPH (Figure 12(d1)) increased with increasing temperature from 23 °C to 50 °C.
This indicates that the dissolution rate and degree of the fly ash and the extent of geopolymerization increase with temperature, which may be because a higher temperature increases the average kinetic energy of the reactant molecules. ASC is the second most important variable for DPT. Lower ASC values caused higher SHAP values for DPT (Figure 11a), and DPT decreased from 0.4 to 0.15 h with increasing ASC from 15% to 25% (Figure 12(a2)). This suggests that a lower ASC delays the dissolution of fly ash in the alkaline solution.
The possible reason was that during the dissolution and hydrolysis of the aluminosilicate raw mineral materials, the concentrations of the resultant [Al(OH)4]−, [SiO(OH)3]−, and [SiO2(OH)2]2− species increased with increasing ASC [28]. MR ranks second in the variable importance for DPH (Figure 11b). Higher MR values resulted in lower SHAP values, corresponding to lower DPH. It is observed from Figure 11(b2) that DPH reduced from 25 mW to nearly 0 as MR increased from 1.0 to 1.8. This may be due to the high concentration of free silicate ions in high-MR alkaline solutions, which form geopolymerization products on the surface of fly ash and inhibit its early dissolution [28]. It can also be noted that LFR was a very important variable for GPT (the most important) and GPH (the second most important), as shown in Figure 11c,d, respectively. Higher LFR corresponded to higher GPT and lower GPH. As LFR increased from 0.34 to 0.6, GPT increased from 10 h to 24 h (Figure 11(c1)) and GPH decreased from 4 mW to 0.5 mW (Figure 11(d2)). A possible explanation for this might be that in high-LFR systems, the reactivity of the outer layer of the FA spherical particles is lower than that of the inner particles trapped in larger particles, inhibiting the geopolymerization of FA [28].

Figure 11. SHAP violin plot for the geopolymerization process using RF: (a) DPT, (b) DPH, (c) GPT and (d) GPH. (a1,a2) represent DPT change with T and ASC, respectively; (b1,b2) represent DPH change with T and MR, respectively; (c1,c2) represent GPT change with LFR and T, respectively; (d1,d2) represent GPH change with T and LFR, respectively.
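The one-at-a-time sensitivity procedure described above — vary one input over its range while holding the others at their mean values, then predict with the trained model — can be sketched as follows. This is a minimal illustration on synthetic data: the feature names and ranges merely mirror the paper's inputs (T, ASC, MR, LFR), and the toy target only loosely mimics the reported trend that DPT falls as temperature rises; nothing here reproduces the authors' actual dataset or fitted model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in dataset; columns echo the paper's inputs.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.uniform(23, 50, 200),     # T, curing temperature (°C)
    rng.uniform(15, 25, 200),     # ASC, alkaline solution concentration (%)
    rng.uniform(1.0, 1.8, 200),   # MR, mole ratio
    rng.uniform(0.34, 0.6, 200),  # LFR, liquid-to-fly-ash mass ratio
])
# Toy target: DPT-like quantity that decreases with T, increases with ASC noise aside.
y = 0.5 - 0.005 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 0.01, 200)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

def one_at_a_time(model, X, feature, grid):
    """Vary one feature over `grid`, holding the others at their column means."""
    probe = np.tile(X.mean(axis=0), (len(grid), 1))
    probe[:, feature] = grid
    return model.predict(probe)

# Sensitivity curve of the DPT-like output to temperature (feature 0).
t_grid = np.linspace(23, 50, 20)
dpt_curve = one_at_a_time(model, X, feature=0, grid=t_grid)
```

The blue sensitivity lines in Figure 11 correspond to curves of this kind, one per output and per top-ranked input.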

Sensitivity Analysis
In addition to the global analysis of the proposed approach, the local analysis of individual samples (the first and last sample in each dataset) is shown in Figures 12 and 13, respectively. The base value is the average of all objective parameters, and f(x) represents the estimated value of the objective variable for the sample. The difference between f(x) and the base value represents the feature contribution [39]. From Figure 12, it is observed that for the first sample, the highest positive SHAP values for DPT, DPH, GPT, and GPH were achieved by T (0.097), MR (0.395), MR (5.870), and LFR (0.593), respectively, while the highest negative SHAP values were achieved by MR (−0.051), T (−5.846), LFR (−12.588), and T (−1.041), respectively. However, the contribution of these input variables to the outputs changed with the variation of their values. For example, in Figure 13, when the temperature became 50 °C, it had the most significant negative contribution to DPT (SHAP = −0.0735) and the most significant positive contribution to DPH (SHAP = 3.843), as given in Figure 13a,b, respectively. It is worth noting that the results of the SHAP analysis may become more accurate as the dataset grows, and hence future work should focus on incorporating more data and different input variables.
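The local interpretation above rests on SHAP's efficiency property: f(x) minus the base value equals the sum of the per-feature contributions. A brute-force Shapley computation makes this additivity explicit. The stand-in model below is a fixed linear function of (T, ASC, MR, LFR) with illustrative coefficients (not fitted values from the paper), and the coalition value function replaces absent features with the dataset means — a common simplification of what SHAP estimates.

```python
from itertools import combinations
from math import factorial
import numpy as np

# Toy linear model standing in for the trained RF; coefficients are illustrative.
w = np.array([-0.005, 0.01, -0.02, 0.03])
b = 0.5
predict = lambda z: b + w @ z

x = np.array([50.0, 20.0, 1.4, 0.45])     # sample to explain (T, ASC, MR, LFR)
mean = np.array([36.5, 20.0, 1.4, 0.47])  # hypothetical dataset feature means

def v(S):
    """Value of coalition S: predict with features in S at x, others at the mean."""
    z = mean.copy()
    for i in S:
        z[i] = x[i]
    return predict(z)

n = len(x)
phi = np.zeros(n)  # Shapley value (feature contribution) of each input
for i in range(n):
    others = [j for j in range(n) if j != i]
    for size in range(n):
        for S in combinations(others, size):
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi[i] += weight * (v(S + (i,)) - v(S))

base = v(())      # "base value": prediction with every feature at its mean
fx = predict(x)   # f(x) for this sample
# Efficiency property: f(x) - base equals the sum of the feature contributions.
assert abs(base + phi.sum() - fx) < 1e-9
```

For this linear stand-in, each contribution collapses to phi_i = w_i (x_i − mean_i); the RF used in the paper yields more complex, interaction-dependent contributions, but the same additivity holds.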

Implications for Research
Previous studies have also reported the dissolution and geopolymerization of FA-based geopolymers. However, it should be noted that the proposed RF model can more effectively evaluate the effects of the mixing parameters (mole ratio, alkaline solution concentration, and liquid-to-fly-ash mass ratio) and temperature on the geopolymerization process of FA-based geopolymer. Further use of the RF model can reduce the trial and error in mix design, and the sensitivity of the mix design parameters can be studied. The information obtained from this model can serve as a guide for the mixture optimization of FA-based geopolymer. The RF model is also recommended for predicting the hardened and fresh properties of geopolymer mortars or concretes, because it achieved good results in predicting the geopolymerization process of FA-based geopolymers.

Figure 13. Local interpretation of the 72nd sample for DPT (a), DPH (b), GPT (c), and GPH (d).
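As a sketch of how a trained model could replace trial-and-error mix design, the snippet below enumerates candidate mixes on a coarse grid and asks a surrogate RF for the mix with the smallest predicted GPT. The surrogate is fitted to synthetic data whose trends (GPT falls with T, grows with LFR) only loosely echo the findings above; the variable names and ranges are assumptions, not the authors' model or data.

```python
from itertools import product
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic training data: (T, ASC, MR, LFR) -> GPT-like target in hours.
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.uniform(23, 50, 300), rng.uniform(15, 25, 300),
    rng.uniform(1.0, 1.8, 300), rng.uniform(0.34, 0.6, 300),
])
y = 30 - 0.3 * X[:, 0] + 40 * (X[:, 3] - 0.34) + rng.normal(0, 0.5, 300)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Enumerate candidate mixes on a coarse grid and pick the one the model
# predicts will geopolymerize fastest (smallest GPT).
grid = list(product(np.linspace(23, 50, 4), np.linspace(15, 25, 3),
                    np.linspace(1.0, 1.8, 3), np.linspace(0.34, 0.6, 4)))
pred = model.predict(np.array(grid))
best = grid[int(np.argmin(pred))]  # (T, ASC, MR, LFR) of the best candidate
```

With the synthetic trends above, the grid search gravitates toward high T and low LFR, mirroring the qualitative guidance of the SHAP analysis; in practice the candidate grid would be constrained to practically achievable mix ranges.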


Conclusions
In the present research, the ML approaches were employed to predict the geopolymerization process of FA-based geopolymer, and the importance and sensitivity of input variables to the outputs were analyzed. The main findings were summarized as follows:

• The RF model had the best performance for the prediction of GPT, GPH, DPT, and DPH compared with the other ML models, owing to its use of bagging and feature randomness. Therefore, ensemble learning models are recommended for predicting the geopolymerization parameters of FA-based geopolymer;
• SHAP analysis shows that temperature had the greatest influence on the geopolymerization of FA-based geopolymer. Control of temperature may not only significantly affect the geopolymerization process but might also affect the hardened characteristics of FA-based geopolymer; and
• Elevated temperature accelerated the geopolymerization rate and promoted the geopolymerization extent. ASC and MR were also important and contributed negatively to DPT and DPH, respectively. LFR was important to both GPT and GPH. Lower LFR can promote the geopolymerization extent and shorten the geopolymerization time.
The above results can effectively guide the optimization design of FA-based geopolymer. There are some limitations in the prediction of the geopolymerization parameters in this study. The prediction model is based on a dataset containing only the geopolymerization parameters of FA-based geopolymers; for other types of geopolymers, it is necessary to retrain the model and test its generalization. Furthermore, future work should collect larger and more diverse datasets to further improve the generalization ability of the model.