Predicting Compressive Strength of Blast Furnace Slag and Fly Ash Based Sustainable Concrete Using Machine Learning Techniques: An Application of Advanced Decision-Making Approaches

Abstract: The utilization of industrial waste materials such as Blast Furnace Slag (BFS) and Fly Ash (F. Ash) provides an effective alternative strategy for producing eco-friendly and sustainable concrete. However, strength testing is a time-consuming process, and the use of machine learning (ML) techniques to predict concrete strength can help speed up the procedure. In this study, artificial neural networks (ANNs) and decision trees (DTs) were used for predicting the compressive strength of the concrete. A total of 1030 data records with eight factors (OPC, F. Ash, BFS, water, days, SP, FA, and CA) were used as input variables for the prediction of concrete compressive strength (response) by training and testing the individual models. The reliability and accuracy of the developed models were evaluated using statistical measures such as R², RMSE, MAD and SSE. Both models showed a strong correlation and high accuracy between the predicted and actual compressive strength (CS) for the eight factors. The DT model gave a significant relation to the CS, with R² values of 0.943 and 0.836 in training and testing, respectively. Hence, the ANNs and DT models can be utilized to predict the compressive strength of high-performance concrete and to support long-term sustainability. This study will help in the development of prediction models for composite materials for buildings.


Introduction
The construction industry is one of the world's largest producers of CO₂ emissions. As a result, it has a tremendous impact on the environment [1]. Each year, these carbon emissions arise mostly from the manufacturing and usage of cement in building projects across the globe [2]. Cement consumption is estimated to be nearly 4 billion tons (BT) per year, with each ton of ordinary cement producing roughly one ton of CO₂ emissions [3]. Cement production accounts for 7% of all CO₂ pollution in the environment [3]. Therefore, supplementary cementitious materials (SCMs) need to be employed as a cement replacement in the construction sector to achieve worldwide sustainable development. Fly Ash is a fine powder produced during the combustion of pulverized coal in power stations, while blast furnace slag (BFS) is a by-product obtained from the processing of iron ore; these are the most easily available SCMs in the world. Modern industry also produces a considerable amount of waste material, which has a negative impact on the environment. Among the different industrial by-products, concrete researchers are particularly interested in Fly Ash and BFS. Making use of such materials will help to reduce pollution while also providing a cost-effective approach for making concrete. Furthermore, the incorporation of BFS and Fly Ash as a partial cement substitute could greatly improve concrete properties such as compressive strength (f_c), permeability, and durability [4][5][6][7][8]. As a result, determining the BFS and Fly Ash proportions for the design mix is critical, especially in terms of improving concrete strength.
Several research studies have been conducted to estimate the BFS concentration in a concrete design mixture. According to Oner and Akyuz [9], the optimal content of BFS is about 55-59% of the overall OPC content. Shariq et al. [10] investigated the influence of three BFS dosages ranging from 20 to 60% on the compressive strength (CS) of concrete at three different water-to-binder ratios. For all W/C ratios, the compressive strength of concrete incorporating 40% BFS was superior to that of specimens made with 20% and 60% BFS. Siddique and Kaur [11] found that replacing 20% of OPC with BFS in high-temperature constructions is feasible. Another researcher [12] found that concrete containing 60% BFS achieved the maximum CS value after 28 days. However, Majhi et al. [13] achieved the optimum CS of concrete with 40% BFS substitution.
Moreover, numerous experimental studies have been conducted to determine the BFS or Fly Ash substitution proportion in the concrete mixture. Gehlot [14] studied the strength of concrete samples made with a combination of BFS and F. Ash at six BFS/Fly Ash wt% ratios (including 0/0, 10/20, 20/10, 30/0, and 0/30). It was observed that the higher the BFS to Fly Ash wt% ratio, the better the CS of the concrete. Similarly, the concrete compressive strength was studied using various ratios of Fly Ash and BFS in another study [15]. The scatter of experimental results and the variation of the mix proportions can have a significant impact on the reliability of concrete strength estimation. As a result, a new technique is needed to reduce the time and money spent on the large number of tests. There is a need to introduce and employ a universal predictive model with a high predictive accuracy.
Nowadays, Artificial Intelligence (AI) has become a widely used approach for addressing a variety of challenges in engineering and science [16][17][18][19][20]. AI methods have been built to forecast various concrete characteristics: for example, the shear resistance of RC beams [21,22], corrosion factors in PCC sewers [23], concrete crack width [24], the maximum strength of RC beams [25], the compressive strength of recycled concrete aggregate (RCA) [26], the strength of concrete containing SF [27], BFS [28,29] and Fly Ash [30,31], and the strength of geopolymer concrete (GPC) [32]. Artificial neural networks (ANNs) are currently among the most effective and powerful AI models for simulating complicated technical situations [33,34]. An ANNs model can solve complex, non-linear problems, particularly those in which the correlation between multiple input and output variables is difficult to establish directly. For instance, Bilim et al. [35] developed an ANNs model to estimate the CS of concrete using 225 datasets. The researchers used six design variables: OPC content, ground granulated blast furnace slag (GBFS) content, water/binder ratio, superplasticizer (SP), aggregate ratio, and curing age. An optimum R-square (R²) value of 0.96 was achieved for this ANNs model. Chopra et al. [36] suggested an ANNs model with one hidden layer of 50 neurons to estimate the CS of concrete samples made with both BFS and Fly Ash, based on 204 test data. A coefficient of determination (R²) of 0.92, which is used to assess the accuracy of such an ANNs model, was achieved. Furthermore, Yeh [37] developed an ANNs model on 990 datasets for forecasting the CS of concrete incorporating various percentage ratios of BFS and F. Ash. Similarly, other researchers estimated the CS of concrete using an ANNs model and found an optimum R-square value of 0.92, which showed the high accuracy of such a model [37]. However, the efficiency of an ANNs model depends considerably on the range of the dataset and on the network design, namely the number of hidden layers and the number of neurons (nodes) in each hidden layer. As a result, determining the number of neurons and hidden layers is important for improving ANN effectiveness.
The novelty and research significance of this study lie in exploring the effects of different factors on the compressive strength of concrete made with BFS and Fly Ash using advanced machine learning algorithms such as Artificial Neural Networks (ANNs) and Decision Tree (DT). The suggested ML algorithms were selected based on their reputation and their high precision and accuracy in different scientific fields, such as the biomedical and construction fields. The main objective of this study is the performance and comparative analysis of compressive strength prediction using supervised machine learning algorithms. The performance of the adopted models was compared based on the coefficient of determination (R²) and the root mean square error (RMSE) of each algorithm. The purpose of this study is to better comprehend the effect of model parameters and the reliability of the results from different machine learning techniques. The outcomes of the experimental research are also presented along with a comparative assessment of both collective and single ML approaches. Statistical analyses and k-fold cross-validation were also used to assess each model's accuracy.

Supervised ML Algorithms
In supervised learning, the model is presented with a labeled dataset: each observation consists of a feature vector together with its desired output. Such predictive models generate an inferred function that maps feature vectors to result labels. Decision trees, random forest, AdaBoost, regression analysis, artificial neural networks, and k-nearest neighbors are some of the most widely used and effective supervised machine learning approaches.

Unsupervised ML Algorithms
Unsupervised learning uses datasets that have very little or no information on the result labels. The goal of such models is to find links between the variables and/or to discover latent factors. Independent component analysis, hierarchical clustering, bi-clustering, and k-means clustering are some examples of unsupervised learning approaches, as shown in Figure 1.

Artificial neural networks (ANNs), decision trees (DTs), deep learning (DL), genetic programming (GP), and the maximum margin classifier or support vector machine (SVM) are all examples of machine learning (ML) approaches that are now frequently utilized to address engineering and science challenges [38]. To overcome the limitations of conventional strategies, SVM is designed so that it can cope with non-linear regression problems more effectively [39]. Because of its good generalization ability, it can produce a superior optimized solution rather than a local optimum. In contrast, decision trees (DTs) and random forest (RF) have a tree-like structure; they forecast results using roots and nodes [40]. A DT employs the whole database containing the target variable, whereas RF builds a number of trees, each choosing features at random. The predictions of the individual trees are then aggregated, for example by majority vote, resulting in a reliable estimation. GP is one of the most modern ML-based algorithms and simulates Darwinian evolution [41]; it describes the non-linear interaction as an expression tree.
ML techniques are frequently used to retrieve unknown trends, statistics, and relationships from large datasets, making use of database systems along with ML and statistical analysis. Simulations and predictions are completed using two methods [42]. One is the traditional strategy, which is based on a single specific model, while the other is the ensemble strategy [43]. According to previous studies, ensemble approaches appear to be more accurate than ordinary single ML algorithms [44]. Ensemble learning algorithms train so-called weak learners on the training data before integrating them into a strong learner. ML techniques have been employed for many decades to forecast the outcome of numerous metrics, and their application in civil engineering has increased dramatically in recent years. The basic concept of ML is similar to that of classical regression methods, although its non-linear functions can fit the data more precisely than linear ones.
Several prediction models (such as ANNs, DL, DT, GEP, RF, and SVM) are often utilized nowadays. Jesika et al. [45] employed eleven models for the prediction of shear strength in RC beams made with steel fiber. To forecast the mechanical behavior of SF concrete, Behnood et al. [4] employed ANNs based on a multi-objective gray wolves (MOGW) optimizer. For the forecasting of concrete strength, Kadir et al. [46] applied DT, LR, ANNs, and SVR modeling. The mechanical strengths of waste concrete were predicted using an ANNs approach by Getahun et al. [47]. Ling et al. [48] used SVM to estimate concrete strength in coastal environments and compared it with the results of ANNs and DT algorithms. Zaher et al. [49] forecast the mechanical properties of foamed concrete by using several ML algorithms. Another researcher [50] also employed an ML model for assessing the durability properties of RC structures. Suguru et al. [51] used machine learning to construct an autonomous crack detector for civil infrastructure; learning data were extracted from images of concrete specimens, and deep learning (DL) was utilized to detect cracks. Wassim et al. [52] assessed the performance and accuracy of such models. Miao et al. [53] also employed ANNs, MLR and SVM to forecast the bond strength in fiber-based RC structures. In light of this literature, two ML algorithms were selected in this study based on their high precision and popularity in predicting the compressive strength of concrete. In addition, the authors plan to test more machine learning models to find the most accurate and effective model for identifying mix design processes.

Materials and Methods
Machine learning (ML) is a promising artificial intelligence (AI) concept that is widely used in Civil Engineering for the accurate prediction of material behavior. The present study used ANNs and decision tree (DT)-based machine learning approaches to predict the compressive strength (CS) of concrete, as shown in Figure 2. These methods are among the most efficient data analysis algorithms, and they were selected because they have already been utilized in several past studies. The following sections give an overview of the ANNs and DT-based ML techniques employed in this study.
Machine learning is an efficient prediction technique in terms of computing and time consumption. It reduces error margins to nearly negligible levels compared to conventional models. The present study developed an empirical model to study the effects of different parameters on the compressive strength of slag and Fly Ash-based concrete. For this purpose, different machine learning models were compared to identify the best predictive model or fit among them. Two machine learning models, artificial neural networks and decision tree, have been applied. The next sections describe each adopted machine learning approach in turn.

Decision Tree (DT) Model
A decision tree is a type of machine learning model with a tree-like structure composed of branches and nodes. Inner nodes have outgoing edges, while leaves do not. At each inner node, the data used for regression or classification are divided into subsets according to a specific splitting function, whose parameters are determined during the training process. The DT algorithm induces the tree structure from the given cases and arrives at the optimum decision tree by minimizing a fitness function. Because the records contain no classes, a regression-based DT model was employed in this research to predict the target parameter from the different input variables. The data are split repeatedly on individual parameters. At each split node, the algorithm computes the variation between the predicted and experimental readings for the pre-specified weighting factor. The errors in the separated nodes are compared across the variables, and the split point is determined by the factor with the lowest fitness value. This process is repeated recursively to improve the model efficiency.
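The split-selection step described above can be illustrated with a minimal sketch. It assumes, as an illustration only, that the fitness function is the summed squared error of the two child nodes around their means, which is a common choice for regression trees but is not confirmed by the source. The function names and the toy data are hypothetical.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, total_sse) for the split of one feature that
    minimizes the summed SSE of the two child nodes, or None if all
    feature values are identical and no split is possible."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        left_x, right_x = pairs[i - 1][0], pairs[i][0]
        if left_x == right_x:
            continue  # identical feature values cannot be separated
        threshold = (left_x + right_x) / 2.0
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical toy data: curing age (days) vs. compressive strength (MPa).
age = [3.0, 7.0, 14.0, 28.0, 56.0, 90.0]
strength = [12.0, 14.0, 30.0, 32.0, 33.0, 31.0]
print(best_split(age, strength))
```

A full regression tree would apply `best_split` to every input variable at every node, keep the variable with the lowest error, and recurse on the two resulting subsets until a stopping rule is met.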

Artificial Neural Networks (ANNs) Model
An artificial neural network consists of neurons, also known as perceptrons. The model is non-linear and uses multiple input variables to predict one or more response parameters. It is built from multi-layer perceptrons, and this multi-layered structure gives ANNs the ability to approximate complex relationships. In the forward pass, the inputs are multiplied by weights, a bias is added at every layer, and the prediction is computed. The predicted responses are then compared with the measured ones, and the loss is estimated; various loss functions can be used depending on the performance criteria. In the backward pass, back-propagation computes the partial derivatives of the loss with respect to the weights for the individual inputs, and the model's weights are revised via a learning algorithm such as gradient descent. The ANNs structure has three layers (input, intermediate and output layers). In this study, eight parameters were used as input variables to study the effect of each parameter on the compressive strength of concrete modified with BFS and Fly Ash, which is the output response, as shown in Figure 3.
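The forward pass through the three-layer structure can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random weights, the scaling of the eight inputs to the range 0 to 1, and the example mix are all assumptions, and the helper names are hypothetical.

```python
import math
import random
from typing import List


def sigmoid(z: float) -> float:
    """Logistic activation, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[float]:
    """One fully connected layer: weights[j][i] links input i to neuron j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs: List[float], w_hidden: List[List[float]],
            b_hidden: List[float], w_out: List[float], b_out: float) -> float:
    """Input layer -> one hidden layer -> single output neuron."""
    hidden = dense(inputs, w_hidden, b_hidden)
    return dense(hidden, [w_out], [b_out])[0]


# Assumed sizes: eight inputs (the mix factors) and eight hidden neurons.
rng = random.Random(0)
n_inputs, n_hidden = 8, 8
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

# Hypothetical mix, each factor already scaled to [0, 1].
mix_scaled = [0.5, 0.2, 0.1, 0.4, 0.3, 0.6, 0.5, 0.08]
print(forward(mix_scaled, w_hidden, b_hidden, w_out, b_out))
```

Because the output neuron is also a sigmoid in this sketch, the prediction lies between 0 and 1, so the compressive strength would have to be scaled to that range for training and scaled back afterwards.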

Activation Function and Learning Algorithms
The artificial neural network (ANN) is a statistical prediction mechanism whose design is derived from the structure of the human brain. The framework is comprised of neurons, which are its functional units. The neurons are interconnected by weights, which are normally chosen at random initially. During the learning phase, the weights are adjusted over a number of iterations to ultimately create a network that can predict the response with acceptable precision [54]. As a result, the desired outcome in a trained ANNs model is obtained by processing the input data with the adjusted weights. The network is refined over time by computing the errors between the predicted and expected outputs, which implies that the precision of the predictive model increases as training proceeds. In this study, non-linear activation functions such as the sigmoid are utilized due to their superior reliability. The main objective of training the ANNs model is to reduce the error function, which is commonly the mean square error or MSE [55].
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - a_i\right)^2$$
where "N" represents the number of input records, while the output and target output are represented by "t_i" and "a_i", respectively. The most common technique for training ANNs is the back-propagation approach. For improved curve fitting and productivity, the back-propagation method was selected as the training algorithm to predict the CS of the concrete in this study. The RMSE was minimized during neural network optimization. The sigmoid transfer function was used for the hidden and output layers. The eight neurons in the hidden layer were chosen by a trial-and-error approach, and it has been suggested in the literature that using fewer neurons improves data precision [56].
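To make the training objective concrete, the sketch below minimizes the mean square error by gradient descent for a deliberately reduced case: a single sigmoid neuron with one input, rather than the full network with eight hidden neurons. The learning rate, epoch count, and toy data are assumptions chosen only for illustration.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def mse(targets: List[float], outputs: List[float]) -> float:
    """Mean square error between target and predicted outputs."""
    return sum((t - a) ** 2 for t, a in zip(targets, outputs)) / len(targets)


def train_neuron(xs: List[float], ts: List[float],
                 lr: float = 0.5, epochs: int = 2000) -> Tuple[float, float]:
    """Fit a = sigmoid(w * x + b) by batch gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, t in zip(xs, ts):
            a = sigmoid(w * x + b)
            # Chain rule: dMSE/dz = (2/n) * (a - t) * sigmoid'(z),
            # where sigmoid'(z) = a * (1 - a).
            delta = 2.0 * (a - t) * a * (1.0 - a) / n
            grad_w += delta * x
            grad_b += delta
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical scaled data: one input and one target, both in [0, 1].
xs = [0.0, 0.5, 1.0]
ts = [0.2, 0.5, 0.8]
before = mse(ts, [sigmoid(0.0) for _ in xs])
w, b = train_neuron(xs, ts)
after = mse(ts, [sigmoid(w * x + b) for x in xs])
print(before, after)
```

In a multi-layer network, back-propagation applies the same chain-rule step layer by layer, passing each layer's error term back to the layer before it.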

Material Properties and Description of Dataset
The dataset used for predicting the strength of concrete developed with waste materials was extracted from previous research papers based on experimental testing [37,[57][58][59][60][61]. It consists of eight parameters: ordinary Portland cement (OPC), blast furnace slag (BFS), Fly Ash (F. Ash), water, superplasticizer (SP), coarse aggregates (CA), fine aggregates (FA) and curing days (Age). These were used to predict the compressive strength of sustainable concrete across the full dataset of 1030 records. Cement of type A and coarse aggregate sizes up to 10 mm were utilized. Two waste materials were employed: F. Ash collected from a power station and BFS from the local steel manufacturing industry. A naphthalene formaldehyde superplasticizer was used. Some concrete specimens were removed from the collected data during the analysis because of coarse aggregates larger than 20 mm, specific curing conditions, and certain other factors. Moreover, previous studies used samples of various sizes and shapes, which were all converted to equivalent 150 mm (15 cm) cylinder values using standard procedures [37,62]. The frequency distribution of each parameter, along with the concrete compressive strength, is presented in Figure 4.
The performance of the models was significantly influenced by the independent factors. The variables used to train the models to estimate the compressive strength of high-strength concrete were also beneficial in model prediction. The factors have a significant impact on the output in terms of improved efficiency and performance. ML is an efficient modeling approach that predicts multiple characteristics by artificially learning the relevant contribution of each factor. The statistical analysis of all the factors used for modeling is presented in Table 1. Statistical metrics such as the standard deviation, coefficient of variation, standard error (SE) of the mean, mean, minimum and maximum of all factors were evaluated. According to [63], the ratio of the number of data records to the number of independent factors should be at least three, and a ratio above five is recommended for effective model prediction. The ratio is considerably greater in this study, which used 1030 compressive strength records with eight input factors (1030/8 ≈ 129). Data collection is the processing step that determines the SFC's properties before a model is developed. The validity and reliability of the ML-based empirical models were determined using 80% of the data for training and 20% for validation. The training set is used to develop the model, while the established model is assessed using the testing and validation sets.
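The 80/20 partition described above can be sketched as a shuffled split. The fixed random seed and the use of integer record indices in place of the actual mix records are assumptions made so the example is reproducible; the source does not state how the records were shuffled or selected.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split(records: Sequence[int], train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle the records and cut them into training and testing subsets."""
    rng = random.Random(seed)  # assumed seed, for reproducibility only
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Record indices stand in for the 1030 mix records of the dataset.
records = list(range(1030))
train, test = train_test_split(records)
print(len(train), len(test))   # 824 206

# Ratio of data records to input variables discussed in the text.
print(len(records) / 8)        # 128.75
```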

Evaluation of Models Using Statistical Metrics
The reliability and efficiency of the developed models can be assessed using statistical measures such as the R-square value and the root mean square error (RMSE). The R-square value, also called the coefficient of determination, is preferred for judging model prediction. Owing to advancements in machine learning, several modeling approaches have been explored to develop forecasting models for the final concrete strength. ANNs and DT regression models were constructed here to promote the use of sustainable waste materials, i.e., slag and Fly Ash, in concrete production at an industrial scale. The compressive strength was then used to evaluate both models and identify the one that predicts the result with minimal or no variance. In this study, data analysis and the estimation of error indicators are used to verify the models. These measurements give a lot of information about the errors in a model. The performance of the proposed models is also assessed using the R-square and RMSE.
The RMSE is the square root of the mean squared difference between the estimated and actual test values; it is the standard deviation of the prediction errors. Because the errors are squared before averaging, this measure gives higher weight to big deviations, such as outliers. When estimating the output response for multiple input factors, the RMSE measures the mean standard error of the models. The algorithm is effective if the RMSE is minimal. The RMSE value of 0.5 signifies that it is inadequate to anticipate the data accurately. Equation (1) is used to estimate the RMSE of the models.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(t_i - a_i\right)^2} \qquad (1)$$
Here N, t_i and a_i have the same meaning as in the MSE. These statistical metrics help check the validity and accuracy of the predicted models. R² readings around 0.65 to 0.75 reflect positive outcomes, whereas R² values < 0.50 represent poor outcomes. R-square can be estimated using Equation (2).
$$R^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SS}_y} \qquad (2)$$
In Equation (2), SSE stands for the sum of square errors, while SSy is the overall variation.
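Both metrics can be computed in a few lines. The sketch below assumes the standard definitions, with SSy taken as the sum of squared deviations of the measured values from their mean; the strength values are hypothetical.

```python
import math
from typing import List


def rmse(actual: List[float], predicted: List[float]) -> float:
    """Root mean square error between measured and predicted values."""
    n = len(actual)
    return math.sqrt(sum((t - a) ** 2 for t, a in zip(actual, predicted)) / n)


def r_squared(actual: List[float], predicted: List[float]) -> float:
    """Coefficient of determination, 1 - SSE / SSy."""
    mean = sum(actual) / len(actual)
    sse = sum((t - a) ** 2 for t, a in zip(actual, predicted))
    ss_y = sum((t - mean) ** 2 for t in actual)
    return 1.0 - sse / ss_y


# Hypothetical measured vs. predicted compressive strengths (MPa).
actual = [30.0, 40.0, 50.0]
predicted = [32.0, 38.0, 50.0]
print(rmse(actual, predicted))       # sqrt(8/3), about 1.633
print(r_squared(actual, predicted))  # 1 - 8/200 = 0.96
```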

Outcomes of DT Model
As demonstrated in Figures 5 and 6, the prediction of concrete strength using supervised machine learning gave successful and accurate results. The developed DT model showed a strong correlation, with R-square values of 0.943 and 0.836 during training and testing, respectively. RMSE values of 4.00 and 6.54 were achieved during the training and testing phases. The training set, comprising 82% of the overall dataset, had a mean squared error (MSE) of 16.02, while the remaining 18% of the dataset, used for testing, had an MSE of 42.82. Similarly, the mean absolute deviation (MAD) values were 3.03 and 4.817 for training and testing, respectively, and the mean absolute percent error (MAPE) values were 0.1143 and 0.1804. The statistical summary of the DT model is presented in Table 2. The R-squared values vs. the number of nodes are presented in Figure 5. The figure demonstrates that the optimal R² values of 0.98 and 0.843 were observed at 284 nodes during the training and testing of the DT prediction of CS (MPa). The DT model developed a strong relationship between CS (MPa) and the other factors. The R-squared values of 0.943 and 0.836 achieved at 90 nodes were quite close to the optimum R² values at 284 nodes (reference node). The scatterplot of response fits vs. actual values using the DT model is presented in Figure 6. The graph demonstrates a strong relationship between the actual and predicted values, with minimal error, using DT modeling. The tree diagram produced during decision tree modeling is presented in Figure 7.
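As a quick consistency check on the reported DT errors, the RMSE should equal the square root of the MSE in each phase. Using only the values quoted above:

$$\sqrt{16.02} \approx 4.0025 \approx 4.00 \quad \text{(training)}, \qquad \sqrt{42.82} \approx 6.5437 \approx 6.54 \quad \text{(testing)}$$

Both results agree with the stated RMSE values of 4.00 and 6.54 to two decimal places.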

Outcomes of ANNs Model
The ANNs model is a supervised machine learning approach, and its application to concrete shows a significant correlation between predicted and actual concrete strength. The accuracy and validity of the ANNs model were checked based on the values of R², RMSE and MAD in Table 3. The ANNs model showed R² values of 0.873 and 0.848 in the training and testing phases, respectively. The overall dataset consists of 1030 samples; of these, 686 samples were used for training the ANNs model, while the remaining 344 were used for testing and validation. Similarly, the RMSE values of the predicted CS using the ANNs model were 5.90 and 6.60 during the training and testing stages. The mean absolute deviation (MAD) values were 4.549 and 5.086 in the training and testing of the ANNs model for CS. However, a considerable increase in R² was seen using the ANNs approach, as presented in Figure 8. The predicted and actual values of CS for concrete made with waste materials also showed a positive correlation in both the training and testing phases. Cross-validation is therefore advisable as a useful method for precisely measuring a model's prediction performance and setting model parameters. As a result, improved accuracy was attained for the predicted CS values. Statistical parameter estimations using ANNs modeling are shown in Table 4.
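The 686/344 partition of the 1030 samples can be reproduced with a seeded shuffle of the sample indices. The sketch below is an assumption about how such a split could be made: the study does not state whether the split was random or which seed was used, and the helper name `split_indices` is hypothetical.

```python
import random


def split_indices(n_samples, n_train, seed=0):
    """Shuffle sample indices reproducibly and split them into train and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # seeded, so the split is repeatable
    return indices[:n_train], indices[n_train:]


# Sample counts taken from the text; the seed is an arbitrary assumption
train_idx, test_idx = split_indices(1030, 686)
```

Fixing the seed keeps the training and testing sets identical between runs, which makes the reported metrics comparable across models.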

Prediction Profiler
The prediction profiler helps in understanding the contribution of each parameter to concrete compressive strength. The relationships established between the eight input variables and the output response are presented as a prediction profiler in Figure 9. The target optimization value of CS is 44.48 MPa, which is shown on the top left side of the prediction profiler along with the combination of the eight factors. The compressive strength of concrete increased as the OPC content changed from 102 to 540 kg/m³, and the optimum OPC content was 281.17 kg/m³. Similarly, the BFS and Fly Ash contents ranged from 0 to 359 kg/m³ and from 0 to 200 kg/m³, respectively, with optimum values of 73.9 and 54.19 kg/m³. On the other hand, the water content had a negative influence on the compressive strength. A positive correlation was observed between the curing age, ranging between 1 and 365 days, and the CS of the concrete. A slight increase in CS (MPa) was observed for SP, CA, and FA.
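The idea behind a prediction profiler, varying one factor across its range while the other factors are held fixed, can be sketched as follows. This is a generic illustration and not the profiler used in the study. The `toy_model` function, its coefficients and the baseline water content of 180 kg/m³ are invented placeholders, whereas the OPC range and the optimum OPC content come from the text.

```python
def profile_factor(model, baseline, factor, low, high, steps=5):
    """Vary one factor from low to high while holding the others at baseline."""
    trace = []
    for i in range(steps):
        value = low + (high - low) * i / (steps - 1)   # requires steps >= 2
        mix = dict(baseline, **{factor: value})        # copy; baseline is not modified
        trace.append((value, model(mix)))
    return trace


def toy_model(mix):
    # Hypothetical stand-in for a fitted model; the coefficients are invented
    return 0.1 * mix["opc"] - 0.2 * mix["water"] + 50.0


# OPC optimum from the text; the water value is an assumed placeholder
baseline = {"opc": 281.17, "water": 180.0}
trace = profile_factor(toy_model, baseline, "opc", 102.0, 540.0)
```

Each entry of `trace` pairs an OPC content with the strength predicted by the stand-in model, which is the information a single panel of a profiler plot displays.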

Interaction Profiler
The interaction profiler plot of the eight factors with the corresponding compressive strength (CS) is presented in Figure 10. The minimum and maximum ranges of each predicted parameter are represented by red and blue trend lines, respectively. The interaction profiler serves as a visual aid to describe the effects of different factors on the CS of the concrete. The rows of the plot are explained in order, each read from left to right. The first row of Figure 10 describes the effects of the various parameters on OPC and CS. It shows a positive correlation between CS and the OPC content, which increased from 102 to 540 kg/m³; the other factors showed a slight increase, except for the water variable, which had an indirect effect on CS and OPC. The 2nd row presents the interaction plot of CS and BFS with the other variables and shows a positive relationship between CS and BFS content ranging between 0 and 359 kg/m³. However, the CS of BFS-based concrete decreased as the water content increased from 121 to 247 kg/m³. The 3rd row shows an interaction plot of CS and BFS in combination with other factors. It shows a significant increase in CS in the presence of the OPC, BFS and days factors, whereas only a slight increase in CS was attained for the SP, CA, and FA factors. The change in water content had a negative effect on the combined CS and BFS interaction plot. The 4th row of the plot describes the direct effect of water on CS and the other factors. A significant increasing trend is observed in the 5th row, which shows a direct relation between three factors: OPC, BFS and curing days. However, there is a declining trend of CS and SP with an increase in water ranging between 121 and 247 kg/m³. The incorporation of CA and FA had a strong direct effect on the OPC, BFS and days factors, but only a slight increase was seen for the Fly Ash, CA and FA factors. Similarly, the last row of the interaction plot shows that increasing the curing age from 1 to 365 days significantly improved the CS of the concrete, whereas water has an inverse relation to CS, with varying trends.

Relative Variable Importance
In this study, the eight input variables (OPC, BFS, Fly Ash, water, SP, days, sand, and coarse aggregates) were used to carry out the sensitivity analysis. The effect of each input variable is shown in Figure 11. OPC (kg/m³) and age (days) were the most sensitive parameters for compressive strength. Fly Ash and fine aggregates (FA), by contrast, had the least sensitivity effect and played only a small role in the development of both models.

Comparative Analysis of Statistical Metrics and K-Fold
To assess the reduction in bias and variance for the testing dataset, conventional K-fold cross-validation was used. The statistical measures, namely the coefficient of determination (R²), root mean square error (RMSE), mean square error (MSE) and mean absolute deviation (MAD), were evaluated and compared for both models, as presented in Figure 12. The predictions of both machine learning approaches display only small swings in their statistical results. Of the two, the DT model has lower errors and a higher R² value in training. This may be due to the increasing number of nodes during DT modeling. However, the validation values of both models were close to each other. R² values of 0.943 and 0.836 were achieved during training and testing of the DT model. For the ANNs model, R² was 0.873 and 0.848 in training and validation, respectively. Each individual model showed low validation errors. The RMSE, MSE, MAD and MAPE values for the DT model are 6.54 MPa, 42.82 MPa², 4.82 MPa and 0.1804, respectively, as per the testing (validation) results. The RMSE, MSE, and MAD values for the ANNs model are 6.60 MPa, 43.57 MPa², and 5.086 MPa, respectively, as per the validation results. The statistical analysis of both models is also described in Tables 2 and 3. When compared against each other, the DT and ANNs models both showed low variance in their predicted output. The R² value is directly linked to these error measures: the lower the error rates, the higher the R² value of the predicted model.
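K-fold cross-validation, as referred to above, partitions the samples into k folds and uses each fold once for validation while the remaining folds are used for training. The sketch below shows only the index bookkeeping. The value k = 10 and the seed are assumptions, since the number of folds is not stated here, and the generator name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 1030 samples as in the dataset; k = 10 is an assumed value
splits = list(k_fold_indices(1030, k=10))
```

A model would be fitted on each `train` list and scored on the matching `validation` list; averaging the k scores gives a performance estimate that depends less on any single split.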

To conclude, this research demonstrates that multiple machine learning algorithms can be used to predict outcomes. It gives engineers better knowledge for choosing model parameters and regressors to execute the algorithm and produce precise predictions. The machine learning techniques used in this study produce a significant correlation between the desired and predicted outcomes. The value of higher-precision models in civil engineering is demonstrated by their use in predicting concrete parameters, which is otherwise a time-consuming process. Finally, the precise formulations of the proposed models will help to promote the utilization of waste materials such as BFS and Fly Ash in future construction works instead of dumping them as industrial waste.

Limitations and Future Recommendations
This research studied two machine learning techniques, ANNs and decision tree, for predicting the compressive strength of high-strength concrete. However, it has certain limitations. The reported model performance is limited to the data provided, and the database used in this research had a maximum size of 1030 records. Moreover, this study was limited to predicting compressive strength and did not consider the flexural strength or the corrosion performance of the concrete under elevated temperature conditions. An appropriate dataset and validation are required for the modeling of engineering parameters in various civil engineering applications. This work used a diverse dataset comprising eight independent factors; however, the dataset and input factors can be extended to further improve the performance and accuracy of the models.
Moreover, additional factors such as extreme temperature, moisture, and other weather conditions should be included as input parameters to assess the model's efficiency in predicting the desired results. As climatic factors have a major effect on concrete characteristics, other ML techniques such as genetic algorithms, ANFIS and k-nearest neighbors should be used to investigate their effects under various scenarios.

Conclusions
Concrete producers all around the world employ mix design methodologies to produce a mix design that meets the needs of their clients. It is vital for field engineers to understand how a mix proportion was developed in order to properly modify the mix design based on field constraints, comprehend fresh properties, and achieve quality standards. However, identifying the process used to develop a mix design can be challenging because most design methodologies offer identical mix proportions for specific required features. This research used two machine learning approaches, ANNs and DT, to tackle the prediction problem by studying eight important parameters. The following conclusions are derived from our study:

• The individual ANNs and DT models showed a significant correlation between the predicted and actual values of CS, with R² values of 0.848 and 0.836, respectively, in the testing phase. R² values of 0.873 and 0.943 were attained during the training phase of the ANNs and DT models, respectively.

• The increased R² values and reduced RMSE and MAD showed the higher precision and accuracy of both predictive models. These statistical measures showed satisfactory outcomes for both models, i.e., ANNs and decision tree.

• Sensitivity analysis demonstrated that OPC (kg/m³), age (curing) and water content are the key factors in the creation of a model for the CS of concrete. However, other factors such as Fly Ash and fine aggregates (FA) had the least effect on the CS in the created model.

• Decision tree and ANNs are supervised learning techniques that produced significant correlations between the predicted and observed values. The DT proved effective according to the K-fold cross-validation technique, which was used to verify the prediction performance.

• The precise formulations and models will help to promote the utilization of waste materials such as BFS and Fly Ash in future construction works instead of dumping them as industrial waste. This research supports long-term sustainability by reducing energy use, disposal waste, and carbon emissions.
Future work will comprise training and validating ML models employing a detailed experimental setup. In addition, the author plans to test more machine learning models to find the most accurate and effective model for identifying mix design processes.

Figure 5. R-Squared Values vs. Number of Terminal Nodes.

Figure 6. Predicted and Actual Values of CS During the Training and Testing Phase.

Figure 7. Decision Tree Diagram using DT Model for Training Dataset.

Figure 8. Actual by predicted plot (a); Residual by predicted plot (b) in training and validation phase of ANNs model.

Figure 10. Interaction Profiler of CS vs. Eight Different Factors.

Figure 12. Comparative Analysis of Prediction Using DT and ANNs models.

Table 1. Descriptive Statistical Analysis of the Dataset.

Table 2. Statistical Model Summary of Training and Testing Dataset Using DT Modeling.

Table 3. Predicted Statistical Measures with Both Training and Validation Values.

Table 4. Statistical parameter estimations using ANNs modeling.