Environmentally Friendly Concrete Compressive Strength Prediction Using Hybrid Machine Learning

Abstract: To reduce the adverse effects of concrete on the environment, eco-friendly and green alternatives are required. Geopolymers, for example, can be an economically and environmentally sustainable alternative to portland cement: they use alumina-silicate waste materials as a cementitious binder and are synthesized by activating alumina-silicate minerals with alkali. This paper employs a three-step machine learning (ML) approach to estimate the compressive strength of geopolymer concrete, using CatBoost regressors, extra trees regressors, and gradient boosting regressors. In addition to 84 experiments from the literature, 63 geopolymer concrete samples were produced and tested by the authors. ML models were built in Python from the resulting 147 green concrete samples and four input variables, with hyperparameters tuned by grid search, and three of these models were combined using a blending technique. Model performance was evaluated using several metric indices. Both the individual and the hybrid models predict the compressive strength of geopolymer concrete with high accuracy; the hybrid model, however, improves the prediction accuracy by 13%.

Geopolymers are inorganic polymers produced by activating alumina-silicate minerals with alkali [13]. By using alumina-silicate waste materials as a cementitious binder, geopolymer is an environmentally friendly and economical alternative to traditional ordinary portland cement (OPC). Fly ash-slag geopolymer mortar develops strength according to the chemical composition of the raw materials, and evaluating molar ratios is a good method for studying the chemical components of geopolymers [14].
Thermal coal plants produce fly ash (FA), the unburned residue carried out of the boiler's burning zone by the released gases [15]. FA is collected by electrostatic or mechanical separators [16]. Each year, more than 375 million tons of FA are produced throughout the world, with disposal costs ranging from $20 to $40 per ton [17]. Much of this waste is disposed of in landfills in suburban areas [18]. The environment is adversely affected by dumping tons of FA without any treatment [19]. Hazardous substances contained in FA, including silica, alumina, and oxides such as ferric oxide (Fe2O3), cause water, soil, and air pollution; consequently, human health and the environment are also adversely affected [20]. A safe and sustainable environment requires good waste management [21], and the whole ecological cycle will be affected if FA is not properly disposed of.
After water, concrete is the most commonly consumed material and is used for construction worldwide [22,23]. Approximately three tons of concrete are produced for every human being [24], and global production is estimated at around 25 billion tons per year [25]. According to current statistics, more than 2 billion tons of cement are produced annually around the world, a figure expected to rise by 25 percent in the next decade [26]. Cement manufacturing, however, has adverse environmental effects. In recent years, a number of researchers have used gene expression programming (GEP) to estimate various mechanical properties of concrete. Experimental and literature-based data have been used to predict the compressive strength of sugar cane bagasse ash (SCBA) concrete [27]. The same authors also suggested a GEP-based formula to estimate the axial capacity of concrete-filled steel tubes (CFST) from just 277 examples. Nour et al. [28] likewise used GEP algorithms to determine the compressive strength of CFST containing recycled aggregate.
Construction materials such as portland cement (PC) are commonly used throughout the world [29]. Despite its many benefits, PC production emits approximately 7% of the overall carbon dioxide emitted by humans [30]. It has been estimated that approximately 50% of the GHG emissions associated with cement production are caused by calcination (the decomposition of CaCO3 into CaO, which releases CO2), and the remaining 50% by the energy used during the process [31]. Each year, the building industry produces approximately four billion tons of PC [32], and annual usage is estimated to reach around 6 billion tons within the next four decades [33]. In response, it has become essential to develop new binders that require less energy to produce and result in fewer greenhouse gas emissions [34].
Researchers have been investigating the role of artificial intelligence (AI) and machine learning (ML) methods in developing reliable, accurate, and consistent models for solving structural engineering problems. Wu and Li [35] used a hybrid particle swarm optimization-support vector machine (PSO-SVM) model for damage degree evaluation. Fan et al. [36] used an artificial neural network (ANN) to predict carbon prices with a multi-layer perceptron model, which proved more accurate and better fitting than many simpler models. Wu and Zhou [37] employed a hybrid support vector regression-particle swarm optimization (SVR-PSO) model, combining the SVR and PSO algorithms, for the prediction and feature analysis of the punching shear strength of two-way reinforced concrete slabs. Wu and Zhou [38] showed that a hybrid ML model can accurately predict the splitting tensile strength of sustainable high-performance concrete. Using 681 data records, Han et al. [39] employed three ML models to predict the compressive strength of high-strength concrete. Wu and Zhou [40] applied a hybrid ML model combining the SVR model with a grid search (GS) optimization algorithm to predict the compressive strength of sustainable concrete.
Zhu et al. [41] applied a least squares SVM (LSSVM) model to forecast energy prices, which are nonstationary and nonlinear; its performance was superior to that of autoregressive integrated moving average (ARIMA) and ANN models. A hybrid model combining ANN and SVM, developed by Patel et al. [42], had the best overall prediction performance. Moreover, Dou et al. [43] pointed out that the long short-term memory (LSTM) model has advantages over the SVM method in prediction. A hybrid model that incorporates both a statistical model and an AI model can also provide relatively better performance.

Dataset
This paper uses a set of 84 data points (shown in Table A1) available in the literature, together with 63 samples of green concrete designed, prepared, and tested by the authors [44,45]. A detailed investigation was conducted to develop a fly ash-based geopolymer concrete mix design method. The following parameters were chosen based on considerations of workability and compressive strength.
A geopolymer's activation process is highly dependent on the amount and fineness of the fly ash (FA). Previous studies have shown that geopolymer concrete strength increases with increasing fly ash quantity and fineness [46,47], and with an early period of heating, finer particles give higher workability and strength. For this reason, the proportioning procedure for geopolymer concrete is developed based on the quantity and fineness of the fly ash. In the production of silicon and ferrosilicon alloys, quartz is reduced with coal to form a by-product known as silica fume (SF) [48]. Silica fume is an extremely effective pozzolanic material as a result of its fineness and silica content. It improves several properties of concrete, such as compressive strength, bond strength, and abrasion resistance, and reduces permeability, which also helps prevent the reinforcing steel from corroding [49]. Ureolytic Bacillus species produce calcite that reduces concrete pores, increasing strength and durability [50].

Machine Learning
ML systems learn and improve without being explicitly programmed. ML algorithms are designed to learn from observations, giving systems the ability to gather data and use those data to learn further; the systems then make decisions based on the patterns they find in the collected data. The most important step for an ML algorithm is training, during which the models find patterns in the prepared data and make predictions. A model thus learns to accomplish its task from data and improves as training proceeds. In this paper, 80% of the dataset was randomly selected for training and the remaining 20% for testing. Figure 1 shows the research methodology. The purpose of this section is to provide a brief introduction to the theory behind the three ML algorithms used in this study, which were implemented in Python: a CatBoost regressor, an extra trees regressor, and a gradient boosting regressor. The grid search method was used to perform hyperparameter tuning for the ML models; it takes a list of values for each hyperparameter and evaluates the model for every combination of values in the lists.
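The following minimal sketch illustrates this split-and-tune workflow. The file name green_concrete.csv, the column name strength, and the hyperparameter grid are illustrative assumptions, not the authors' actual data file or tuned values.

```python
# Sketch of the 80/20 split and grid-search tuning described above.
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor

df = pd.read_csv("green_concrete.csv")           # 147 samples, 4 input variables (assumed layout)
X, y = df.drop(columns="strength"), df["strength"]

# Random 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Grid search evaluates the model for every combination of the listed values.
param_grid = {
    "n_estimators": [100, 300, 500],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
}
search = GridSearchCV(GradientBoostingRegressor(), param_grid,
                      scoring="neg_root_mean_squared_error", cv=10)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```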

CatBoost Regressor
CatBoost is a new type of gradient boosting technique [51] and a powerful ML method. It has been applied in a number of fields due to its good performance, such as short-term weather forecasting [52], Kickstarter campaign prediction [53], driving style recognition [54], and diabetes prediction [55]. Additionally, CatBoost is increasingly used to estimate crop evapotranspiration.
In CatBoost, model overfitting is handled by Bayesian estimators, which deal with the categorical and ordered features of the decision trees. CatBoost ranks the developed models using two feature-importance measures: prediction value change (PVC) and loss function change (LFC). In PVC, a change in a feature value is evaluated together with the corresponding change in the prediction; CatBoost-based ML models use PVC as the default method. With LFC, models are ranked according to the change in the loss function across a range of models.
Consider a set of input features F, a numeric factor β_i, and a prediction step P. The input features given to the ML model are defined in Equation (1) [56]:

$$F = \{f_1, f_2, f_3, \ldots, f_n\} \quad (1)$$

$$P_i = \beta_i F_j \quad (2)$$

In Equation (2), F_j represents a specific feature from the given feature set, P_i represents the prediction value, and β_i represents the numeric factor. Substituting a modified numeric factor gives Equation (3):

$$P_{i+1} = \beta_{i+1} F_j \quad (3)$$

where P_{i+1} indicates the prediction value upon changing the numeric factor, and β_{i+1} indicates the modified numeric factor. This particular feature becomes important when a change in the numeric factor changes the prediction value, as shown in Equation (4):

$$P_{i=0} = P_i = P_{i+1} \quad (4)$$
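As a rough illustration, CatBoost's Python API exposes both of the ranking modes described above. The hyperparameter values below are illustrative, and X_train/y_train are assumed to come from the earlier 80/20 split sketch.

```python
# Sketch of CatBoost's two feature-ranking modes: prediction value change
# (PVC, the default) and loss function change (LFC).
from catboost import CatBoostRegressor, Pool

train_pool = Pool(X_train, y_train)   # training split from the earlier sketch
model = CatBoostRegressor(iterations=500, depth=6, learning_rate=0.05,
                          verbose=False)
model.fit(train_pool)

pvc = model.get_feature_importance(type="PredictionValuesChange")
lfc = model.get_feature_importance(train_pool, type="LossFunctionChange")
print(dict(zip(model.feature_names_, pvc)))
```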

Extra Trees Regressor
Geurts et al. [57] presented an approach called extra trees regression (ETR), which evolved from the random forest (RF) model. ETR constructs unpruned decision trees or regression trees using the conventional top-down method [58].
Bootstrapping and bagging are utilized by the random forest (RF) model to perform regression. Each decision tree is grown using a random training dataset sample as part of the bootstrapping step. Once the ensemble has been achieved, the bagging step is used to divide the nodes in the decision tree. During this step, a number of random subsets of training data are selected. The best subset and its value are selected during the decision-making process [59].
As Breiman [60] described it, the RF model is a series of decision trees, wherein the predicting tree is the tree of results and the predicting vector is the uniform independent distribution vector assigned before the tree is expanded. To construct a forest, all trees are combined and their outputs averaged; in the standard form used by RF-type ensembles (the equation itself did not survive extraction, so this is the usual formulation):

$$\hat{y} = \frac{1}{K} \sum_{k=1}^{K} T_k(x)$$

where T_k is the k-th tree and K is the number of trees. ETR and RF systems differ in two important ways. First, the ETR selects the cut points at random and divides the nodes accordingly. Second, it minimizes bias by growing the trees on the entire learning sample [57]. Two parameters govern the split process in the ETR approach: k, the number of features sampled randomly at each node, and n_min, the minimum number of samples required to split a node. Together, k and n_min determine the strength of the attribute selection and of the averaging of the output noise. Tuning these parameters improves the model's precision and reduces overfitting [61,62]. The ETR structure is shown in Figure 2.
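A minimal scikit-learn sketch of an ETR follows; here max_features and min_samples_split play the roles of k and n_min, and the chosen values are illustrative rather than the authors' settings.

```python
# Sketch of an extra trees regressor with the two split parameters above.
from sklearn.ensemble import ExtraTreesRegressor

etr = ExtraTreesRegressor(
    n_estimators=300,        # number of unpruned trees in the ensemble
    max_features="sqrt",     # k: features sampled randomly at each node
    min_samples_split=2,     # n_min: minimum samples required to split a node
    bootstrap=False,         # ETR grows each tree on the whole learning sample
    random_state=42,
)
etr.fit(X_train, y_train)              # training split from the earlier sketch
print(etr.score(X_test, y_test))       # R^2 on the held-out 20%
```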

Gradient Boosting Regressor
Gradient boosting, an ensemble technique for regression and classification, was introduced by Friedman in 1999 [63]. Boosting methods can be used for a variety of applications; here, gradient boosting is used for regression. As shown in Figure 3, at every iteration a randomly selected subset of the training set is checked against the base model. Randomly subsampling the training data in this way prevents overfitting and improves gradient boosting performance, and by fitting a smaller fraction of the training data at each iteration, the regression model also runs faster. Gradient boosting regression requires tuning two parameters: the number of trees and the shrinkage rate. The number of trees refers to the number of trees to be grown, and it is important that it is not set too low; the shrinkage parameter, sometimes referred to as the learning rate, is applied to each tree in the expansion.
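A minimal scikit-learn sketch of these tuning parameters follows; the values shown are illustrative, not the tuned values reported in the paper.

```python
# Sketch of a gradient boosting regressor with the parameters discussed above:
# number of trees, shrinkage (learning rate), and random subsampling.
from sklearn.ensemble import GradientBoostingRegressor

gbr = GradientBoostingRegressor(
    n_estimators=500,    # number of trees to be grown
    learning_rate=0.05,  # shrinkage applied to each tree in the expansion
    subsample=0.8,       # fit each tree on a random 80% of the training data
    random_state=42,
)
gbr.fit(X_train, y_train)          # training split from the earlier sketch
print(gbr.score(X_test, y_test))
```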

Hybrid Model
Blending is an ensemble ML technique that combines the predictions produced by multiple ensemble members using another ML model. Blending is thus closely related to stacking, a framework for stacked generalization. A stacking model uses two or more baseline models, termed level-0 models, combined with a meta-model, termed the level-1 model, that combines the predictions from the bases. The meta-model is trained on sample data held out from the base models.
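A sketch of this level-0/level-1 arrangement using scikit-learn's StackingRegressor is shown below; the choice of a linear meta-model is an assumption for illustration, as the paper does not specify its meta-model here.

```python
# Sketch of blending/stacking: three level-0 models feed a level-1 meta-model.
from catboost import CatBoostRegressor
from sklearn.ensemble import (ExtraTreesRegressor, GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.linear_model import LinearRegression

hybrid = StackingRegressor(
    estimators=[
        ("catboost", CatBoostRegressor(verbose=False)),
        ("extra_trees", ExtraTreesRegressor(random_state=42)),
        ("gbr", GradientBoostingRegressor(random_state=42)),
    ],
    final_estimator=LinearRegression(),  # level-1 meta-model (assumed)
    cv=10,  # out-of-fold predictions are used to train the meta-model
)
hybrid.fit(X_train, y_train)             # split from the earlier sketch
print(hybrid.score(X_test, y_test))
```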

Cross-Validation Using K-Fold
Cross-validation evaluates ML models by resampling on a limited sample of data. A single parameter, k, determines how many groups a given data sample is split into; the procedure is therefore referred to as k-fold cross-validation. A specific value for k can replace k in the model's reference, e.g., k = 10 becomes 10-fold cross-validation. Cross-validation is primarily used to estimate the skill of an ML model on unseen data: the model makes predictions on data not used during training, which estimates how it will perform in general when predicting from new data. The method is popular because it is simple and generally yields less biased, less optimistic estimates of model skill than alternatives such as a single train/test split. Each observation in the data sample is assigned to a particular group and remains within that group during the analysis; each sample appears in the hold-out set exactly once and is used for training in the remaining k − 1 folds.
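A minimal sketch with k = 10, as used in this paper, might look as follows; the estimator and random seed are illustrative.

```python
# Sketch of 10-fold cross-validation: each observation is held out exactly once.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.ensemble import GradientBoostingRegressor

cv = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(GradientBoostingRegressor(random_state=42),
                         X_train, y_train, cv=cv,
                         scoring="neg_root_mean_squared_error")
print("RMSE per fold:", np.round(-scores, 2), "mean:", round(-scores.mean(), 2))
```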

Feature Scaling
Scaling feature values is an important step before creating an ML model and is one of the most important techniques in ML. The goal of feature scaling is to bring the values of all columns onto a common scale: one column may contain values ranging from 0 to 1 while another ranges from 1000 to 10,000, and such vast differences in scale make it difficult to combine the values as features during modeling. This factor can separate a weak ML model from a strong one. Scaling can be carried out in several ways, including standardization and normalization; in this paper, the values in the dataset were scaled to the range 0 to 1.
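A minimal sketch of this 0-to-1 scaling with scikit-learn's MinMaxScaler follows; fitting the scaler on the training split only is a common precaution, though the paper does not state its exact procedure.

```python
# Sketch of min-max scaling to the range 0-1.
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max from training data
X_test_scaled = scaler.transform(X_test)        # apply the same scale to the test set
```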

Experiment and Results
To describe how well an ML model performs in making predictions, its accuracy must be evaluated. Several metrics are commonly used to evaluate the performance of regression models, including the MSE, MAE, RMSE [65,66], and R².

• The mean absolute error (MAE) is the average of the absolute differences between the original and predicted values over the dataset:

$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$$

• The mean squared error (MSE) is calculated by averaging the squared differences over the dataset:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

• The root mean squared error (RMSE) is the square root of the MSE:

$$\text{RMSE} = \sqrt{\text{MSE}}$$

• The coefficient of determination R² compares the prediction errors with the spread of the data about its mean:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$$

where ŷ is the predicted value of y and ȳ is the mean value of y.

• As with the standard MSE, the RMSLE [66,68] measures squared differences, but of the logarithms of the values rather than of the values themselves.

• The MAPE (mean absolute percentage error) is the mean of the absolute prediction errors expressed as percentages of the observed values.
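For reference, these indices can be computed as in the following sketch; y_test and the fitted hybrid model are assumed to come from the earlier sketches, and mean_absolute_percentage_error requires scikit-learn ≥ 0.24.

```python
# Sketch of computing the reported error indices with scikit-learn.
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_squared_log_error,
                             mean_absolute_percentage_error, r2_score)

y_pred = hybrid.predict(X_test)                 # hybrid model from the blending sketch
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
rmsle = np.sqrt(mean_squared_log_error(y_test, y_pred))  # needs non-negative values
mape = mean_absolute_percentage_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE={mae:.2f} MSE={mse:.2f} RMSE={rmse:.2f} "
      f"RMSLE={rmsle:.3f} MAPE={mape:.3f} R2={r2:.3f}")
```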
The accuracy of the model is defined by a combination of R and these error indices. Some disciplines, such as economics and health informatics, have thresholds for the MAE, MSE, and similar values (e.g., a minimum blood pressure level), but there is no general rule for their acceptable ranges [69-71]; in general, lower is better, and R² is sufficient when its value approaches one. Higher R values together with lower statistical indices such as the RMSE and MAE indicate a more precise model. Table 1 displays the error indices in each fold, sorted by RMSE value, for the three ML models; according to Table 1, the CatBoost regressor performs best. Using a higher iteration rate, the three models can be dynamically tuned to find more optimal hyperparameters. Tables 2-4 show the 10 folds fitted for each of the 120 iterations, resulting in 1200 fits in total, and Figure 4 summarizes the results of Tables 2-5. According to Table 5, combining the three methods yields a hybrid model that improves on the individual methods by 13%.
What follows are the results for the combined model. The residuals of the hybrid model are shown in Figure 5, where blue points represent the training dataset and green points the testing dataset. The R² of the hybrid model equals 0.99, as shown in Figure 6.

In least-squares regression analysis, Cook's distance is commonly used to estimate the influence of a data point [72]. Outliers can be removed from a dataset using a variety of techniques, and Cook's distance is often used when analyzing regression data: the leverage and residual of each observation are taken into account, and Cook's distance indicates how much the model changes when the i-th observation is removed from the regression. It is therefore a measure of how strongly the fitted values are influenced by a data point. By default, any data point with a Cook's distance exceeding 4/n (where n is the total number of data points) is considered an outlier. Figure 7 shows the Cook's distance for the hybrid model; a computational sketch follows.
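The sketch below shows this screening with statsmodels; applying it to the training split is an illustrative choice, not necessarily the authors' procedure.

```python
# Sketch of the Cook's distance outlier screen with the 4/n threshold.
import numpy as np
import statsmodels.api as sm

X_design = sm.add_constant(np.asarray(X_train, dtype=float))
ols = sm.OLS(np.asarray(y_train, dtype=float), X_design).fit()
cooks_d, _ = ols.get_influence().cooks_distance   # distance per observation

n = len(cooks_d)
outliers = np.where(cooks_d > 4 / n)[0]           # default 4/n rule of thumb
print(f"{len(outliers)} potential outliers out of {n} points:", outliers)
```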
An optimal loss function plotted against a validation dataset, with the same parameter set used on the training dataset, produces a learning curve in ML. This tool determines whether the estimator suffers more from variance or from bias errors when more training data are added to the model: if both the validation and training scores are too low, the model will not benefit much from further training data as the training set grows [73]. Figure 8 shows the learning curve for the hybrid model.

In manifold learning, the low-dimensional spaces reflect the parameters while the high-dimensional spaces are the features. Manifold learning is the process of uncovering these manifold structures in datasets, and it achieves dimensionality reduction through a nonlinear method. High-dimensional data can be visualized using t-SNE [1], which converts the similarities between data points into joint probabilities and minimizes the Kullback-Leibler divergence between the joint probabilities of the low-dimensional embedding and the high-dimensional data. Because the cost function of t-SNE is non-convex, different initializations can produce different results [73]. As indicated in Figure 9, data with high dimensions are excluded from this study.

In this section, by using three ML methods and combining them, an estimation model with high accuracy was presented to predict the strength of green concrete. The strength of this study is the high accuracy of the model; its weakness is that an engineer must be familiar with programming and ML algorithms to estimate concrete strength. In future work, formulation-based methods could provide a relationship that users can apply directly to obtain the strength of green concrete.

Conclusions
The compressive strength of green concrete was predicted using three ML methods: a CatBoost regressor, an extra trees regressor, and a gradient boosting regressor, evaluated on 147 samples. All of the models predicted the compressive strength of the geopolymer concrete with high accuracy and were evaluated using several statistical indices. Because the data sample was limited, the ML models were cross-validated: the data were split into groups according to the single parameter k, hence the name k-fold cross-validation, and a value of k = 10 was chosen in this paper, i.e., 10-fold cross-validation. The CatBoost regressor, extra trees regressor, and gradient boosting regressor models have average RMSEs of 2.63, 2.75, and 2.73, respectively. All three models were combined with blending, an ensemble ML method. The hybrid model is 13% more accurate in all statistical indices than the individual models. Additionally, the hybrid model was examined using other statistical concepts such as Cook's distance, learning curves, and manifold learning.

Data Availability Statement: The dataset used during the current study is available from the corresponding author on reasonable request.

Conflicts of Interest: The authors declare no conflict of interest.