Prediction of Compressive Strength of Fly Ash Based Concrete Using Individual and Ensemble Algorithm

Machine learning techniques are widely used for predicting the mechanical properties of concrete. This study compares individual algorithms with ensemble approaches such as bagging. The bagging model is optimized by building 20 sub-models to identify the most accurate one. Variables such as cement content, fine and coarse aggregate, water, water-to-binder ratio, fly ash, and superplasticizer are used for modeling. Model performance is evaluated with statistical indicators, namely the mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE). The individual algorithms show moderately biased results, whereas the ensemble model performs better, with R² = 0.911, compared to the decision tree (DT) and gene expression programming (GEP). K-fold cross-validation, assessed by R², MAE, MSE, and RMSE, confirms the model's accuracy. Statistical checks reveal that the decision tree with the ensemble approach provides 25%, 121%, and 49% improvements in MAE, MSE, and RMSE, respectively, between the target and predicted responses.


Introduction
Carbon dioxide produced by the cement industry has a severe adverse effect on the environment [1]. The excessive use of cement in modern construction around the world produces greenhouse gases (GHG) [2]. Moreover, vast amounts of gases are emitted during the production of cement due to the burning of natural resources and fossil fuels [3]. Annually, 4 billion tons of Portland cement (PC) are produced, and approximately one ton of cement generates one ton of CO₂ [4]. This huge amount of carbon dioxide is a serious threat to the environment. Reports show a 1.6-percentage-point increase (from 3.4% to 5%) in global CO₂ discharge between 2000 and 2006. The cement industry contributes 18% of industrial greenhouse gases (GHG) to the environment, through direct process-related activity, energy-related combustion, and the remaining use of electricity, which is termed indirect energy [5]. To overcome this issue, replacing cement with an alternative binder is of great research interest [6]. Supplementary cementitious materials (SCMs) can be used for many purposes, especially in the concrete industry, and their utilization in concrete has a beneficial effect. Data-driven models, in turn, have shown better predictive performance than empirical relations. Similarly, Onyari et al. [30] report robust performance of an ANN in predicting the flexural and compressive strength of modified mortar. The examples mentioned above show the strong response of individual algorithms.
Recently, ensemble modeling has been perceived as an opportunity to enhance overall model efficiency, since it combines weak learners into strong predictive learners [31]. Feng et al. [32] used ensemble techniques to predict the failure mode classification and bearing capacity of reinforced concrete (RC) columns. Both models gave robust performance; however, bearing capacity showed better correlation than failure mode classification. Bui et al. [33] employed a modified firefly algorithm with an ANN on high-performance concrete (HPC) and reported better model performance. Moreover, Salamai et al. [34] reported good accuracy of R² = 0.9867 using the RF algorithm. In turn, Cai et al. [35] used various supervised ensemble algorithms to predict chloride penetration in RC structures in a marine environment; the ensemble models outclassed the individual algorithms. Hacer et al. [36] presented a comparative assessment of bagging as an ensemble approach for high-performance concrete mix slump flow and found ensemble models with bagging superior to standalone approaches. Halil et al. [37] predicted the strength of HPC with three ensemble modeling approaches, used the decision tree as a base learner for the other models, and found that the hybrid model outperformed the several proposed models with R² = 0.9368. Kermani et al. [38] assessed five soft-computing base learners, both tree-based and network-based, for predicting concrete corrosion in sewers and reported that the RF ensemble learner gave the best result, with R² = 0.872. These ensemble approaches enhance the robustness of the overall models.
Taking the above into consideration, ensemble learning models appear to have more favorable features and give better results than individual learning models. The difference between individual and ensemble models is illustrated in Figure 1.

Research Significance
The aim of this study is to apply machine-learning algorithms with ensemble modeling, implemented in Anaconda Python, to predict the compressive strength of fly-ash-based concrete. A decision tree with a bagging algorithm is used, and optimization is performed by building 20 sub-models to obtain a strong outcome. A comparison is made between the individual algorithm, the ensemble algorithm, and gene expression programming to identify the best model. Moreover, K-fold cross-validation and statistical checks are applied to evaluate model performance.

Data Description
The efficiency of the model depends entirely on the variables and the number of data samples used. The parameters used for model preparation to predict the strength of concrete were taken from the published literature [39] and are summarized in Appendix A. Eight variables concerning the composition of the concrete mixture were analyzed: cement, fine and coarse aggregate, superplasticizer, water, waste material (fly ash), age, and the water-to-binder ratio. The overall distribution in terms of relative frequency is illustrated in Figure 2. The range of each parameter, with minimum and maximum values, is illustrated in Figure 3. Descriptive statistics of the variables with respect to strength are listed in Table 1.
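Descriptive statistics of the kind listed in Table 1 can be reproduced with pandas. The snippet below is a minimal sketch with assumed column names and a few illustrative rows, not the actual 270-sample dataset of [39]:

```python
import pandas as pd

# Tiny illustrative sample; column names and values are assumptions,
# not the real mix designs used in the study.
df = pd.DataFrame({
    "cement": [300, 350, 400],     # kg/m^3
    "fly_ash": [100, 80, 60],      # kg/m^3
    "water": [180, 175, 170],      # kg/m^3
    "strength": [35.2, 41.8, 48.5] # MPa
})

# Min, max, mean, and standard deviation per variable, as in Table 1.
stats = df.describe().T[["min", "max", "mean", "std"]]
print(stats)
```

The full dataset would be loaded from a file (e.g. with `pd.read_csv`) instead of being defined inline.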

Methodology
Individual and ensemble modeling techniques that can predict material properties in a limited time are of great interest. The accuracy of the predictions relative to the actual values is typically expressed by the R² value, which ranges from 0 to 1; a high R² value indicates satisfactory results of the selected technique. This study uses three approaches to predict the compressive strength of concrete with waste material: a decision tree, the decision tree with an ensemble bagging algorithm (learning rate of 0.9), and gene expression programming. These techniques were selected because of their popularity among available algorithms. The overall machine learning methodology is illustrated in Figure 4.

Decision Tree
The decision tree is a supervised learning technique used for regression problems but is also commonly used for classification [40]. When classes exist inside the tree it acts as a classifier; when there is no class, the regression technique predicts the outcome from independent variables [37]. A decision tree is a tree-structured model in which the inner nodes reflect the attributes of the database, the branches indicate the decision rules, and every leaf node constitutes an outcome. The decision tree consists of two node types: decision nodes, which have multiple branches and the capability to make decisions, and leaf nodes, which have no branches and represent the output of those decisions. It is called a decision tree because, like a tree, it starts from a root node and spreads into a number of branches [41]. The decision tree splits the data samples at various points. The algorithm computes the error between the target and predicted value at every candidate split point; the variable with the least value of the fitness function is selected as the split point, and the same procedure is repeated recursively.
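The splitting behavior described above can be sketched with scikit-learn's DecisionTreeRegressor, which greedily selects the split minimizing squared error at each node. The data below is synthetic and the hyperparameters are assumptions, not the study's settings:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the concrete data (the real study uses the
# eight mix variables from [39]); values here are illustrative only.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))            # e.g. scaled cement, water, age
y = 30 * X[:, 0] - 10 * X[:, 1] + 20 * X[:, 2]  # assumed strength relation

# At each node the tree picks the split with the least squared error,
# mirroring the "least fitness-function value" rule described above.
tree = DecisionTreeRegressor(max_depth=6, random_state=0).fit(X, y)
score = tree.score(X, y)  # in-sample R^2
print(score)
```

Limiting `max_depth` is one common way to keep the recursive splitting from overfitting the training samples.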

Ensemble Bagging Approach
The ensemble technique is a machine learning concept in which numerous models are trained with a similar learning algorithm [42]. Ensembles form a substantial group of methods known as multi-classifiers, in which hundreds or thousands of learners with a common goal are combined to solve a problem. Bagging is a parallel ensemble method that reduces the variance of the prediction model by producing supplementary data in the training stage. This data is produced by random sampling with replacement from the original dataset, so some observations may be repeated in each new training set; in bagging, every sample has an equal chance of appearing in a new dataset. The predictive power is not enhanced simply by increasing the size of the training set; rather, the variance is narrowed by tuning the prediction toward the expected outcome. The resampled datasets are used to train separate models, and the ensemble averages the predictions from these various models; in regression, the prediction is the mean of the predictions taken from the different models [43]. The decision tree with bagging is tuned with 20 sub-models to obtain the optimized value that gives a robust output.
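A minimal sketch of this procedure, assuming scikit-learn's BaggingRegressor (whose default base learner is a decision tree) and synthetic data in place of the actual concrete mixes:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

# Illustrative data only; the study's inputs are the eight mix variables.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 4))
y = 25 * X[:, 0] + 15 * X[:, 3] + rng.normal(0, 2, 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# 20 trees, each trained on a bootstrap resample (sampling with
# replacement) and averaged for the final prediction, analogous to
# the 20 sub-models tuned in the study.
bag = BaggingRegressor(n_estimators=20, bootstrap=True,
                       random_state=1).fit(X_tr, y_tr)
single = DecisionTreeRegressor(random_state=1).fit(X_tr, y_tr)
print(bag.score(X_te, y_te), single.score(X_te, y_te))
```

On noisy data, the averaged ensemble typically shows a higher test R² than the single fully grown tree, illustrating the variance reduction discussed above.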

Gene Expression Programming
Gene expression programming (GEP) is a computer programming-based algorithm used to develop different models [44]. GEP, initially introduced by Ferreira [45], is considered a natural development of genetic programming (GP). Many of the genetic operators used in genetic algorithms (GAs) can also be used in GEP with a few recommended changes. GEP has five main components: the function set, terminal set, fitness function, control variables, and termination condition. GEP encodes problems as fixed-length character strings, which are then expressed as tree-like structures of different sizes, known as GEP expression trees (ETs). Individual chromosomes are selected according to fitness by roulette-wheel sampling with elitism and copied into the next generation [23]; this ensures the survival and replication of the best individual into the next generation. Variation in the population is introduced by applying one or more genetic operators (mutation, crossover, or rotation) to the chromosomes. Among the advantages of GEP, the creation of genetic diversity is remarkably simple because the genetic operators work at the chromosome level. The multi-genic nature of GEP also permits the evolution of more complicated and complex programs composed of numerous subprograms. GEP genes, along with the function set and terminal set, play a vital role in the process [46].
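The roulette-wheel selection with elitism described above can be sketched in plain Python. Chromosomes and fitness values here are illustrative placeholders; the full GEP cycle (expression trees, mutation, crossover, rotation) is omitted:

```python
import random

def next_generation(population, fitness, rng=random):
    """Fitness-proportionate (roulette-wheel) selection with elitism."""
    # Elitism: the best individual always survives into the next generation.
    best = max(population, key=fitness)
    survivors = [best]
    total = sum(fitness(c) for c in population)
    # Roulette wheel: selection probability proportional to fitness.
    while len(survivors) < len(population):
        pick = rng.uniform(0, total)
        acc = 0.0
        for c in population:
            acc += fitness(c)
            if acc >= pick:
                survivors.append(c)
                break
    return survivors

# Placeholder chromosomes with assumed fitness values.
pop = ["chrom_a", "chrom_b", "chrom_c", "chrom_d"]
fit = {"chrom_a": 1.0, "chrom_b": 4.0, "chrom_c": 2.0, "chrom_d": 3.0}.get
new_pop = next_generation(pop, fit)
print(new_pop[0])  # → "chrom_b", the elite individual
```

The remaining slots are filled stochastically, so fitter chromosomes such as "chrom_b" and "chrom_d" tend to appear more often in the new population.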

K-Fold Cross-Validation and Statistical Measures
The model performance in terms of bias and variance is checked by employing K-fold cross-validation. The data is divided into 10 stratified groups, randomly distributing the data into training and test sets. Each round takes one part of the overall data as the test sample and the remaining parts as the training set, as illustrated in Figure 5. The overall efficiency under cross-validation is then assessed by averaging the error measures over the 10 rounds. Similarly, the models are also evaluated with statistical indicators [23]. Three indicators are used in the current study (Equations (1)-(3)):

MAE = (1/n) Σᵢ |y_pred,i − y_ref,i|   (1)

MSE = (1/n) Σᵢ (y_pred,i − y_ref,i)²   (2)

RMSE = √[(1/n) Σᵢ (y_pred,i − y_ref,i)²]   (3)

where n is the total number of data samples, y_ref,i are the reference values in the data sample, and y_pred,i are the values predicted by the models.
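Equations (1)-(3) and the 10-fold scheme can be sketched as follows; the data is synthetic and the choice of estimator is an assumption for illustration:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Equations (1)-(3) for target / predicted vectors.
def mae(y_ref, y_pred):
    return np.mean(np.abs(y_pred - y_ref))

def mse(y_ref, y_pred):
    return np.mean((y_pred - y_ref) ** 2)

def rmse(y_ref, y_pred):
    return np.sqrt(mse(y_ref, y_pred))

y_ref = np.array([30.0, 40.0, 50.0])
y_pred = np.array([32.0, 38.0, 51.0])
print(mae(y_ref, y_pred), mse(y_ref, y_pred), rmse(y_ref, y_pred))
# ≈ 1.667, 3.0, 1.732

# 10-fold cross-validation as in the study: train on 9 folds, test
# on the remaining one, averaged over the 10 rounds (synthetic data).
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(270, 4))
y = 20 * X[:, 0] + 10 * X[:, 1] + rng.normal(0, 1, 270)
cv = KFold(n_splits=10, shuffle=True, random_state=2)
scores = cross_val_score(DecisionTreeRegressor(random_state=2), X, y,
                         cv=cv, scoring="r2")
print(scores.mean())
```

The mean of `scores` corresponds to the averaged fold-wise R² reported for the cross-validation results.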

Decision Tree/Ensemble Model
The prediction of concrete strength by the decision tree yields a strong relationship between target and output strength, as depicted in Figure 6. The individual model gives a good response with low variance, as illustrated in Figure 6a. However, the decision tree with bagging performs more precisely than the individual one, as illustrated in Figure 6d. This is because model efficiency increases when several data subsets are used to train the best model from weak base learners [47]. The ensemble model is optimized by making 20 sub-models, as depicted in Figure 6c. Sub-model zero represents the individual decision tree model, with R² = 0.812. With the ensemble approach there is a significant enhancement in the overall response: the sub-models give an average score of about R² = 0.904 across the 20 models, and the 12th sub-model gives the best result, R² = 0.911, as depicted in Figure 6c. The model comparison in terms of errors is depicted in Figure 6b.

Gene Expression Programming
The GEP model yields a robust relationship between target and predicted values, as illustrated in Figure 7: the R² obtained by GEP is close to 1. Figure 7b presents the error distribution of the testing set, which shows small errors. Similarly, the predicted values show low errors relative to the target values, with minimum, maximum, and average errors of 0.00 MPa, 26.20 MPa, and 3.48 MPa, respectively. Table 2 presents detailed results from the models.

Evaluation of the Model by K-Fold and Statistical Checks
Cross-validation is a statistical practice used to estimate the actual performance of machine learning models. Knowing the performance of the selected models requires a validation technique that establishes the accuracy of the models on the data. The k-fold validation test requires shuffling the dataset randomly and splitting it into k groups. In the described study, the experimental samples are divided equally into 10 subsets: nine of the ten subsets are used for training, while the remaining subset is used for validation of the model. The process is repeated 10 times, and the average accuracy of the 10 repetitions is taken. It is widely acknowledged that the 10-fold cross-validation method represents the model performance well [48].
K-fold cross-validation checks the decrease in bias and variance for the test set. The results of cross-validation are evaluated by the correlation coefficient (R²), mean absolute error (MAE), mean square error (MSE), and root mean square error (RMSE), as illustrated in Figure 8. The ensemble model shows fewer errors and better R² than GEP. The average R² for the ensemble model is 0.905, with minimum and maximum values of 0.84 and 0.96, as depicted in Figure 8a, whereas the GEP model shows an average R² of 0.873 over ten folds, with minimum and maximum correlations of 0.76 and 0.95, as shown in Figure 8b. Each model shows low validation errors. For the ensemble model, the mean values of MAE, MSE, and RMSE come to 6.43 MPa, 6.66 MPa, and 2.55 MPa, respectively. The GEP model shows the same trend, with mean values of 7.30 MPa, 9.60 MPa, and 3.06 MPa for MAE, MSE, and RMSE, respectively (see Figure 8b). Table 3 presents the validation results of both models.

A statistical check is also applied to evaluate the models with regard to the testing results. The statistical check is an indicator that shows the model response in prediction, as shown in Table 4. The models show very low errors; however, the ensemble model shows a 25% error reduction for MAE compared to the individual and GEP models. Similarly, the bagging approach confirms the robust performance of the model: the MSE and RMSE of the strong learner show 121% and 49% enhancement in the predictions, with reduced errors between the target and predicted outcomes, as shown in Table 4. Moreover, permutation feature importance is computed in Python to check the influence of the variables on strength, as depicted in Figure 9. These variables have a vital influence on the prediction of the compressive strength of concrete. Concrete age, cement, and the water-to-cement ratio have a significant influence on the model, whereas water, filler material (fly ash), superplasticizer, fine aggregate, and coarse aggregate have moderate influence. Thus, every parameter is important in forecasting the strength properties; however, cement, age, and the water-to-cement ratio should be given particular attention when casting specimens.
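A permutation-feature-importance check of the kind reported in Figure 9 can be sketched with scikit-learn; the variable names, coefficients, and data below are assumptions for illustration only:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.inspection import permutation_importance

# Synthetic stand-in where "cement" and "age" dominate, loosely echoing
# Figure 9; names and coefficients are assumptions, not the study's data.
rng = np.random.default_rng(3)
names = ["cement", "age", "water", "fly_ash"]
X = rng.uniform(0, 1, size=(300, 4))
y = 30 * X[:, 0] + 25 * X[:, 1] + 5 * X[:, 2] + 2 * X[:, 3]

model = BaggingRegressor(n_estimators=20, random_state=3).fit(X, y)
# Shuffle each column in turn and measure the drop in the score;
# a large drop means the variable matters for the prediction.
result = permutation_importance(model, X, y, n_repeats=10, random_state=3)
for name, score in sorted(zip(names, result.importances_mean),
                          key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

Variables whose permutation degrades the score most are ranked as most influential, which is how importance plots like Figure 9 are typically produced.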

Limitation and Future Work
Although a thorough analysis based on a large number of data points was conducted and an extensive machine learning algorithm with evaluation was implemented, the limitations of this work should be mentioned. The modeling approach described in the paper can be enhanced by using other appropriate methods. A clear limitation is the number of data points, equal to 270. The study is also limited to predicting only one of the various mechanical properties of concrete: the tensile strength, durability, corrosion, toughness, and abrasion behavior of concrete are not considered in this work. Other algorithm-based techniques, such as artificial neural networks (ANN), support vector machines (SVM), gradient boosting, and AdaBoost, may also be applied to the same dataset for a better understanding. However, this research work does not focus only on algorithm-based techniques but also involves the programming-based GEP, which indicates its wide scope.
Since concrete is the most widely used material on earth after water, it is further recommended that properties of this material other than its compressive strength be incorporated. Machine learning techniques should also be used to predict environmental effects on concrete properties. To achieve high accuracy between actual and predicted results, a multi-stage genetic programming approach may also be used. It is also recommended that the models be run for concrete modified with different fibers, such as jute, glass, polypropylene, nylon, and steel fibers.


Conclusions
This study describes the supervised machine learning approaches with ensemble modeling and gene expression programming to predict concrete strength. The following points are drawn from the analysis:

1. A decision tree with ensemble modeling gives a robust performance compared to the individual decision tree and gene expression programming. A correlation coefficient of R² = 0.911 is reported for DT with bagging.

2. Optimization of the decision tree with bagging is done by making twenty sub-models. A notable enhancement is observed for the twelfth sub-model, which shows R² = 0.911 compared to the individual model with R² = 0.812.

3. Validation scores are computed with different indicators. Both models (DT with bagging and GEP) anticipate the testing results well.

4. Statistical checks reveal that the decision tree with bagging enhances model accuracy by minimizing the error between targeted and predicted values.
To summarize, all applied algorithms have a significant effect on model quality by predicting the target response accurately. As described in the paper, machine learning approaches can save experimental time by predicting the outcome from extensive data gathered in the laboratory and from published papers. This can help the scientific community to predict properties and responses in advance.

Data Availability Statement:
The data presented in this article is available within the article.

Conflicts of Interest:
The authors declare no conflict of interest.