Application of Soft-Computing Methods to Evaluate the Compressive Strength of Self-Compacting Concrete

This research examined machine learning (ML) techniques for predicting the compressive strength (CS) of self-compacting concrete (SCC). Multilayer perceptron (MLP), bagging regressor (BR), and support vector machine (SVM) were utilized for the analysis. A total of 169 data points were retrieved from various published articles. The data set comprised 11 input parameters, namely cement, limestone, fly ash, ground granulated blast-furnace slag, silica fume, rice husk ash, coarse aggregate, fine aggregate, superplasticizers, water, and viscosity modifying admixtures, and one output, the compressive strength of SCC. The research results determined that the BR technique outperformed both the SVM and MLP models in predicting the CS of SCC: the coefficient of determination (R2) for the BR model was 0.95, whereas the R2 values for SVM and MLP were 0.90 and 0.86, respectively. In addition, a k-fold cross-validation approach was adopted to check the accuracy of the employed models. The statistical measures mean absolute percent error (MAPE), mean absolute error (MAE), and root mean square error (RMSE) confirmed the validity of the models. Using sensitivity analysis, the influence of the input factors on the predicted CS of SCC was also explored. This analysis reveals that the highest contributing parameter towards the CS of SCC was cement, with 16.2%, while rice husk ash contributed the least, with 4.25%, among all the input variables.


Introduction
Self-compacting concrete (SCC), a type of high-performance concrete (HPC) with a superior ability to deform and resist segregation, was developed for the first time in Japan in 1986 [1]. SCC has been utilized in Japan for major office buildings as well as innovative types of extruded tunnels combined with steel fibers [2]. The utilization of SCC reduced the construction site noise level and its impact on the environment. SCC is better than regular concrete for many reasons, including (1) eliminating the need for vibration; (2) lowering construction duration and labor costs; (3) minimizing noise pollution; (4) enhancing the filling volume of highly crowded structural elements; (5) improving the transition zone between the cement paste and reinforcement or aggregate; and (6) limiting concrete's permeability and increasing its durability [3,4]. The introduction of SCC allows for the exploitation of replacement materials, industrial waste, and other secondary resources, such as mineral admixtures, and generates interest in carrying out this process [5][6][7].
In general, the quality of SCC is determined by its compressive strength (CS), which provides a basic indication of concrete quality because it is linked to the structure of the hardened mixture [8,9]. Typically, the compressive strength of SCC is determined by costly and time-consuming physical trials, so work productivity is extremely low [10]. On account of its complicated composition, SCC requires a suitable mix design procedure in order to achieve its desired qualities [11]. For the selected design procedure, the materials used must be balanced with at least one mineral and one or more chemical additives [12]. The difficulty of improving grain size dispersal and particle packing for stronger cohesion in SCC is met by seeking the optimal balance between the coarse and fine components and the admixtures [13][14][15]. For this reason, technological advancements make it possible to solve engineering challenges at a lower cost by employing empirical regression, simulation techniques, and machine learning algorithms [16][17][18]. These approaches enable the forecasting of the CS of SCC based on the proportions of the different components in the mixture, such as aggregate, cement, superplasticizers, and water [19][20][21].
In recent decades, machine learning (ML) approaches have emerged as an appealing modelling tool applicable to a broad array of scientific fields, including materials engineering [22][23][24][25][26][27]. Existing data sets can be used to build an appropriate surrogate model for predetermined model parameters, hence eliminating the need for costly and time-consuming trials [28]. Considering this, the use of ML techniques to anticipate the CS of concrete materials has surged in recent years [29][30][31][32][33][34][35]. These methods can be utilized for a number of applications, including regression, classification, correlation, and clustering [36][37][38][39][40]. With the advancement of ML approaches, it is consequently straightforward to investigate the CS of SCC along with the concrete's other properties [41,42]. Thus, to investigate the strength properties of SCC, Asteris et al. [43] employed the artificial neural network (ANN) algorithm from among the ML techniques. The study was based on the prediction of the 28-day CS of SCC within a limited time period. Awoyera et al. [44] investigated the predictive accuracy of ANN and GEP approaches for the strength properties of SCC. It was reported that both ANN and GEP successfully anticipated the required properties of SCC.
The purpose of this research is to investigate and evaluate the prediction capabilities of three distinct machine learning techniques for the CS of superplasticized self-compacting concrete (SCC). This research is novel in that it predicts the CS of SCC on the chosen data set by employing both an ensemble machine learning method (bagging regressor) and individual machine learning approaches (SVM, MLP). This research involves the descriptive analysis of the variables, the application of Python code for running the employed models, statistical checks of the models' legitimacy, a validation approach for validating the models, and a sensitivity analysis to check the impact that the variables have on the predictive outcome. This study has the potential to make a significant contribution to the construction industry's utilization of novel tools and approaches for investigating the various properties of construction materials in a manner that is economical, takes a limited amount of time, and does not require any physical effort in the laboratory.

Research Significance
This study presents the implementation of individual machine learning algorithms in addition to an ensemble machine learning approach in order to estimate the compressive strength of self-compacting concrete (SCC). To execute the models used for prediction, the Anaconda Navigator software was programmed with the Python programming language. Twenty bagging sub-models were trained on the data and then tuned to achieve the maximum R2 value. In addition, the test data were confirmed by employing k-fold cross-validation in conjunction with R2, MAPE, MAE, and RMSE. Moreover, statistical model performance indices (MAPE, MAE, and RMSE) were utilized to contrast the individual models with the ensemble model. Furthermore, a comparative study of the obtained results with the results of similar published articles was also carried out in order to better identify an accurate model for forecasting the concrete's strength. In addition, a sensitivity analysis was included in the research to analyze the contribution level of each input parameter toward the strength prediction of SCC and to ensure that the study was as accurate as possible.

Materials and Methods
Python coding (attached in the Supplementary Data) in the Anaconda Navigator software plays a vital role and was used for running all the employed models. The data set of self-compacting concrete (SCC) used for running the models to anticipate the compressive strength (CS) was retrieved from the literature [45][46][47][48][49][50][51][52][53][54][55][56][57][58][59][60][61][62]. A total of 169 data points (attached in the Supplementary Data) was used for running the selected models. The software splits 70% of the data for training the model and 30% for testing it, while the k-fold cross-validation approach was adopted to validate the models. To reduce the complexity of the data, a data preprocessing method was adopted. Data preprocessing addresses one of the most crucial challenges in the well-known knowledge discovery from data procedure; it covers data reduction strategies that aim to reduce the data's complexity by recognizing and deleting irrelevant and noisy data items. The models were analyzed using regression and error distribution processes. Eleven input variables, including cement, limestone powder, coarse aggregate, fly ash, water, fine aggregate, GGBS, silica fume, RHA, superplasticizers, and VMA, were introduced for a single outcome, the compressive strength. The selection of these parameters was based on the importance of their effect on the concrete material, and the influence of all the input parameters on the predicted CS of SCC was assessed through sensitivity analysis. A descriptive statistical analysis was also incorporated for these parameters, as listed in Table 1. A validation method was also adopted to evaluate the precision level of the employed models.
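The 70/30 split described above can be sketched with scikit-learn; the array below is a random stand-in for the 169-point SCC data set (the actual code and data are in the Supplementary Data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Illustrative stand-in for the 169-point SCC data set:
# 11 mixture inputs per row, one compressive-strength output.
X = rng.uniform(0.0, 1.0, size=(169, 11))
y = rng.uniform(20.0, 80.0, size=169)

# 70% of the rows train the model, 30% test it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)

print(len(X_train), len(X_test))  # 118 and 51
```

With 169 rows, the 30% test split yields 51 test points, which matches the test-set percentages reported later (e.g., 41.18% = 21/51).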
Moreover, the histograms give the relative frequency distribution of all the variables, as shown in Figure 1. A frequency distribution of the input variables describes how often different values occur in the complete data set. Relative frequency distributions are valuable because they show how common a value is in a data set in comparison to all other values. In addition, the violin plot distribution for all the variables is shown in Figure 2.

Multilayer Perceptron (MLP)
An MLP is a type of feedforward ANN that maps a set of inputs onto outputs. Between the input and output layers, a directed graph connects multiple layers of nodes, with signals moving in only one direction across the nodes. In MLP, the network is trained with backpropagation. Every node, with the exception of the input nodes, possesses its own nonlinear activation function. MLPs are a form of supervised learning and make use of backpropagation in their learning processes. MLP is often called a deep learning approach because it uses many layers of neurons. MLP is often used in studies of supervised learning, imputation, parallel distributed processing, and pure science; machine translation, image recognition, and speech recognition are all examples of applications. To begin, the algorithm selects the predictors to be utilized throughout the regression phase in order to compute the variance inflation factor (VIF). The VIF then quantifies how much an estimated regression coefficient has been inflated because of collinearity. Figure 3 is a flowchart that shows the whole process of predicting the results of the MLP model.
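As a minimal sketch of the approach, the following trains a scikit-learn MLPRegressor by backpropagation on synthetic data; the hidden-layer size, activation, and data are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.uniform(size=(169, 11))                           # stand-in mixture proportions
y = 30 + 40 * X[:, 0] + rng.normal(scale=2.0, size=169)   # synthetic CS target (MPa)

# Feedforward net: one hidden layer of nonlinear (ReLU) units,
# trained by backpropagation via the default 'adam' solver.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(50,), activation='relu',
                 max_iter=2000, random_state=1))
mlp.fit(X, y)
print(round(mlp.score(X, y), 2))  # training R^2
```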


Support Vector Machine (SVM)
SVM refers to a class of supervised learning algorithms used for evaluating data for both regression and classification. An SVM technique represents the samples as points in space, mapped in such a way that the patterns of the different classifications are separated by a dividing vector (line/plane) with the largest possible gap. Figure 4 depicts the classification of additional cases based on the side of the vector on which they lie. Figure 5 displays the implementation approach for the SVM model. This model is used to assess the material's strength, taking into account the influence of multiple factors. An optimization strategy is used to determine the SVM model's parameters.
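A comparable sketch for the SVM (here its regression form, SVR, since the target is continuous); the RBF kernel and hyperparameters are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.uniform(size=(169, 11))                           # stand-in mixture proportions
y = 30 + 40 * X[:, 0] + rng.normal(scale=2.0, size=169)   # synthetic CS target (MPa)

# Epsilon-SVR with an RBF kernel: fits a tube of width epsilon
# around the data while maximizing the margin.
svr = make_pipeline(StandardScaler(),
                    SVR(kernel='rbf', C=100.0, epsilon=0.5))
svr.fit(X, y)
print(round(svr.score(X, y), 2))  # training R^2
```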


Bagging Regressor (BR)
BR, also referred to as bootstrap aggregation, is a method for combining multiple versions of a predictive model: each model is independently trained, and then the results are averaged. BR's primary objective is to attain a lower variance than any single model. The process of producing bootstrap samples from a data set is known as bootstrapping. The samples are formed by selecting data points at random with replacement. The characteristics of the resampled data differ from those of the original data as a whole; this reflects how the data are spread out and keeps bootstrapped samples from becoming too similar, which aids in the development of robust models. In addition, bootstrapping helps prevent the overfitting issue. When constructing a model, the utilization of a large number of training data sets results in a decreased likelihood of errors and improved performance when applied to test data. This reduces variance by giving the test set a strong base, and the multiple permutations of the model ensure that it is not biased towards an inaccurate outcome. The BR model's flowchart can be seen in Figure 6.
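The bagging procedure above, with the 20 sub-models used in this study, can be sketched as follows; the synthetic data and the default decision-tree base learner are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(3)
X = rng.uniform(size=(169, 11))                           # stand-in mixture proportions
y = 30 + 40 * X[:, 0] + rng.normal(scale=2.0, size=169)   # synthetic CS target (MPa)

# 20 sub-models, each trained on a bootstrap resample of the data;
# their predictions are averaged to reduce variance.
br = BaggingRegressor(n_estimators=20, random_state=3)
br.fit(X, y)
print(len(br.estimators_))  # 20 fitted sub-models
```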

MLP Model Outcome

Figure 7 depicts the relationship between the actual and predicted values for the compressive strength of self-compacting concrete (SCC), giving a coefficient of determination (R2) of 0.86. Figure 8 illustrates the disparity between the actual and predicted results. In the fitted line y = a + bx annotated in the figure, x is the explanatory variable, y is the response variable, b is the slope, and a is the intercept (the value of y when x equals 0). The differences range from a minimum of 0.18 MPa to a maximum of 21.50 MPa. Moreover, 41.18% of the difference data fell between the minimum value (0.18 MPa) and 5 MPa, and 45.10% fell between 5 MPa and 10 MPa, while only 13.73% lay above 10 MPa.

The box plot shown in Figure 9 gives further statistical information, such as the minimum, maximum, median, mean, and first and third quartile values for both the experimental and forecasted outcomes from the test set. The values on the graph clearly indicate the difference between the predicted and actual results.
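For reference, the coefficient of determination and the fitted line y = a + bx used in these actual-versus-predicted comparisons can be computed directly; the values below are illustrative, not the paper's data:

```python
import numpy as np

# Illustrative actual vs. predicted CS values (MPa).
actual = np.array([30.0, 45.0, 52.0, 61.0, 70.0])
predicted = np.array([32.0, 43.0, 55.0, 60.0, 68.0])

# Least-squares line: predicted = a + b * actual
b, a = np.polyfit(actual, predicted, 1)

# Coefficient of determination between actual and predicted values.
ss_res = np.sum((actual - predicted) ** 2)
ss_tot = np.sum((actual - actual.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(b, 2), round(r2, 3))
```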


SVM Model Output
As shown in Figure 10, the SVM model provides a stronger link between the experimental CS of SCC and the predicted outcome than the MLP model, with a determined R2 value of 0.90. Figure 11 illustrates the distribution of the data, showing the disparity between the actual and targeted values. The maximum, minimum, and average values of this distribution are 14.81 MPa, 0.21 MPa, and 5.72 MPa, respectively. In addition, 50.98% of these measurements were obtained between 0.21 MPa and 5 MPa, 33.33% between 5 MPa and 10 MPa, and only 15.61% at or above 10 MPa.

Figure 11. Indication of the error's difference between the actual and forecasted CS result of SCC from the SVM model.

In addition, Figure 12 provides further statistical information, including the minimum, maximum, median, mean, first quartile, and third quartile values for both the experimental and predicted outcomes from the test set. The data on the graph make it evident that there is a disparity between the predicted results and those actually achieved.

BR Model Outcome
As can be seen in Figure 13, the output of the bagging model demonstrates a stronger relationship with the experimental CS result of the self-compacting concrete than the predictions of the MLP and SVM models, giving an R2 value of 0.95. Figure 14 provides a visual representation of the error distribution. The variation yields a maximum of 13.05 MPa, a minimum of 0.16 MPa, and an average of 3.87 MPa. Additionally, 72.55% of these data fell between 0.16 MPa and 5 MPa, while 19.61% fell between 5 MPa and 10 MPa. However, only 5.88% of the values exceeded the 10 MPa threshold.
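The binned error percentages quoted for each model follow from counting absolute errors in the 0-5 MPa, 5-10 MPa, and above-10 MPa ranges; the error values below are illustrative, not the paper's data:

```python
import numpy as np

# Illustrative absolute prediction errors (MPa).
errors = np.array([0.5, 1.2, 2.8, 3.9, 4.4, 6.1, 7.5, 9.0, 11.2, 13.0])

# Count errors falling in [0, 5), [5, 10), and [10, inf).
bins = [0.0, 5.0, 10.0, np.inf]
counts, _ = np.histogram(errors, bins=bins)
percent = 100.0 * counts / errors.size
print(percent)  # share of errors in each range
```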

Moreover, further statistical information is provided in Figure 15, including the minimum, maximum, median, mean, and first and third quartile values for both the experimental and predicted test set results. The discrepancy between the expected and actual outcomes is graphically represented by the graph's values. The results of the bagging model (actual and predicted) appear closer to one another than those of both the SVM and MLP models.
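The box-plot quantities reported for each model (minimum, quartiles, median, mean, maximum) can be recovered from a test-set sample as follows; the values below are illustrative:

```python
import numpy as np

# Illustrative test-set CS values (MPa).
values = np.array([25.0, 33.0, 41.0, 47.0, 52.0, 58.0, 66.0, 74.0])

# The five-number summary (plus the mean) that a box plot displays.
stats = {
    'min': values.min(),
    'q1': np.percentile(values, 25),
    'median': np.median(values),
    'mean': values.mean(),
    'q3': np.percentile(values, 75),
    'max': values.max(),
}
print(stats)
```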


K-Fold Cross Validation Outcomes and Statistical Metrics
K-fold cross validation and statistical tests were applied to validate the ML algorithms in use. Typically, the k-fold method tests the viability of a model by randomly shuffling and dividing the relevant data into 10 groups. As shown in Figure 16, nine groups are used to train the machine learning models, while one is used to validate them. An ML approach is more accurate when the errors (MAPE, MAE, and RMSE) are small and R2 is high. The procedure must be repeated 10 times, so that each group serves once as the validation set; this effort is a major reason for the accuracy of the validated models. Moreover, the statistical metrics obtained from the models are listed in Table 2. Meanwhile, Figure 17 gives statistical information about the accuracy level of the employed models for the CS of SCC. This Taylor diagram also indicates the better performance of the bagging model towards the required outcome as compared to the SVM and MLP models. The error for the BR model is less than 8 MPa, while the MLP and SVM models give 12.96 MPa and 11.44 MPa, respectively.
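The 10-fold procedure can be sketched with scikit-learn's KFold; the model choice and synthetic data below are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.ensemble import BaggingRegressor

rng = np.random.default_rng(4)
X = rng.uniform(size=(169, 11))                           # stand-in mixture proportions
y = 30 + 40 * X[:, 0] + rng.normal(scale=2.0, size=169)   # synthetic CS target (MPa)

# 10-fold split: nine folds train the model, one validates it,
# repeated so that every fold serves once as the validation set.
cv = KFold(n_splits=10, shuffle=True, random_state=4)
scores = cross_val_score(BaggingRegressor(n_estimators=20, random_state=4),
                         X, y, cv=cv, scoring='r2')
print(len(scores), round(scores.mean(), 2))  # 10 fold-wise R^2 values
```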

Statistical and k-Fold Analysis
In order to determine whether the employed models are legitimate, a k-fold cross validation check was implemented as a standard. To investigate the results, the statistical metrics R2, MAE, and RMSE were taken into consideration. According to the k-fold study, the highest values of R2, MAE, and RMSE for the MLP model, shown in Figure 18, were 0.86, 18.53 MPa, and 24.46 MPa, respectively. Similarly, the highest values of the same metrics for the SVM model were reported as 0.90, 19.20 MPa, and 20.98 MPa, as shown in Figure 19. The corresponding values of R2, MAE, and RMSE for the bagging model were noted as 0.95, 19.74 MPa, and 18.94 MPa, respectively, and can be seen in Figure 20. Using Equations (1)-(3), derived from previous research [66], the statistical prediction performance of the techniques was evaluated.
MAE = (1/n) Σ |Pi − Ti| (1)

RMSE = [(1/n) Σ (Pi − Ti)²]^(1/2) (2)

MAPE = (100/n) Σ |(Ai − Fi)/Ai| (3)

where n = number of data points, Pi = anticipated values, Ti = experimental values, Ai = actual values, and Fi = forecasted values from the data set.
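These standard error metrics translate directly into code; the experimental and predicted values below are illustrative:

```python
import numpy as np

def mae(t, p):
    """Mean absolute error: average of |T_i - P_i|."""
    return np.mean(np.abs(t - p))

def rmse(t, p):
    """Root mean square error: sqrt of the mean squared residual."""
    return np.sqrt(np.mean((t - p) ** 2))

def mape(t, p):
    """Mean absolute percent error, relative to the actual values."""
    return 100.0 * np.mean(np.abs((t - p) / t))

t = np.array([40.0, 50.0, 60.0])   # illustrative experimental CS (MPa)
p = np.array([44.0, 47.0, 60.0])   # illustrative predicted CS (MPa)
print(mae(t, p), round(rmse(t, p), 3), round(mape(t, p), 3))
```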
Figure 17. Statistical evaluations of the models used for this investigation [67].


Discussion on the Main Findings
This study describes the predictive performance of three different types of ML algorithms for the CS of SCC. The multilayer perceptron (MLP), SVM, and bagging regressor (BR) were investigated for the analysis. Even though MLP and SVM are individual ML techniques, the precision of their predictive results was noted to be within acceptable limits. The BR belongs to the ensemble ML approach, which normally splits the model into 20 sub-models during optimization to obtain a strong outcome. The results of the bagging sub-models can be seen in Figure 21. It has been noted that the input parameters and the number of data points have a significant effect on the required outcomes. Therefore, the descriptive statistics of the input variables, the relative frequency distribution of the input data, and a sensitivity analysis evaluating their influence on the outcome were incorporated into the study. The correlation between the experimental CS results and the predicted CS results from all employed models was determined to be satisfactory. The k-fold cross-validation approach was also introduced to check the legitimacy of the models. The comparison of the present study with other relevant studies was also taken into consideration and found to be in reasonable agreement.
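The bagging idea described above, i.e., training sub-models on bootstrap resamples and averaging their predictions, can be sketched as below. This is a conceptual illustration, not the paper's BR implementation: the base learner here is a deliberately simple 1-nearest-neighbour regressor on a single feature, whereas the actual BR model would use a stronger base estimator over all 11 inputs.

```python
# Conceptual bagging sketch: 20 sub-models, each fitted on a bootstrap
# sample drawn with replacement, with predictions aggregated by averaging.
import random

def fit_bagging(X, y, n_estimators=20, seed=0):
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        models.append(([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, x):
    preds = []
    for Xb, yb in models:
        # 1-NN prediction within this sub-model's bootstrap sample.
        j = min(range(len(Xb)), key=lambda i: abs(Xb[i] - x))
        preds.append(yb[j])
    return sum(preds) / len(preds)  # average over the 20 sub-models
```

Averaging over resampled sub-models reduces the variance of the prediction, which is why ensemble methods such as BR tend to outperform individual learners like MLP and SVM on small data sets.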

Comparison with Other Studies
The comparison of results for the application of ML approaches in predicting the same type of outcome, as reported in published articles, is listed in Table 3.


Sensitivity Analysis
This approach was introduced to investigate the impact of each input variable on the predicted CS of SCC. The analysis reveals that the highest contribution was made by the binding material (cement), giving 16.25% towards the anticipated CS of SCC, whereas rice husk ash contributed the least (4.25%) to predicting the required outcome. Moreover, the impacts of the other variables, in descending order, were superplasticizers (13.44%), silica fume (11%), fly ash (9.94%), coarse aggregate (9.50%), limestone powder (8.90%), fine aggregate (8.80%), VMA (6.65%), and water (6.40%), as depicted in Figure 22. Equations (4) and (5) were used to calculate the percent contribution of each parameter towards the required outcome.
$$N_i = f_{\max}(x_i) - f_{\min}(x_i) \tag{4}$$

$$S_i = \frac{N_i}{\sum_{j=1}^{n} N_j} \tag{5}$$

where $f_{\max}(x_i)$ and $f_{\min}(x_i)$ are the highest and lowest of the anticipated output over the $i$th input.

Figure 22. Influence of the input parameters towards the predicted output.
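The sensitivity calculation of Equations (4) and (5) normalises each input's range of predicted output by the sum of all ranges, giving a percent contribution. A minimal sketch follows; the function name and the dictionary-based interface are assumptions for illustration.

```python
# Sketch of the Equation (4)-(5) sensitivity calculation: the span of
# predicted output for each input, normalised to a percent contribution.
def sensitivity_percent(f_max, f_min):
    """f_max/f_min map each input name to the highest/lowest anticipated
    output obtained while varying that input over its domain."""
    spans = {name: f_max[name] - f_min[name] for name in f_max}  # Eq. (4)
    total = sum(spans.values())
    return {name: 100.0 * s / total for name, s in spans.items()}  # Eq. (5)
```

By construction the contributions sum to 100%, matching how the shares in Figure 22 (cement 16.25% down to rice husk ash 4.25%) are reported.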

Limitations and Future Perspective
The following are the limitations regarding the application of machine learning approaches along with recommendations for future studies.

•
The fact that it is difficult to describe how these algorithms arrive at their findings is a key shortcoming of machine learning.

•
A machine learning algorithm is analogous to a black box in that it receives inputs and generates outputs without providing an explanation of how the results were generated.

•
In addition to the fact that these algorithms operate in a mysterious manner, machine learning is also susceptible to the fallacy of "garbage in, garbage out".

•
According to this dictum, the output quality is directly proportional to the quality of the data sets used in the analysis.

•
The outputs of the algorithm will reflect any mistakes caused by inaccurate labeling of the images/data that are used as inputs.

•
Studies based on ML approaches for the prediction of required outcomes can also be enriched by applying proper hyperparameters for the employed models. Convergence curves based on the RMSE of the selected models and newer evaluation indexes such as PI and the A10-index can also be included to check the accuracy level of the employed models. Moreover, the employed models can also be split into training and testing sets to evaluate their performance separately.
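The train/test split recommended above can be sketched as follows. The 70/30 proportion and the fixed seed are assumptions for illustration; the paper does not prescribe specific values.

```python
# Hedged sketch of a reproducible train/test split for model evaluation.
import random

def train_test_split(data, test_fraction=0.3, seed=42):
    rng = random.Random(seed)       # fixed seed keeps the split reproducible
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)
```

Evaluating R2, MAE, and RMSE on the held-out test portion, rather than on the data the model was fitted to, gives a fairer estimate of how the model would perform on unseen SCC mixes.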


Conclusions
This investigation was predicated on the utilization of supervised ML techniques for estimating the CS of SCC. The MLP, SVM, and BR models were all investigated for predicting the CS of SCC. The following inferences can be made based on the findings of the study:

•
ML algorithms successfully predict the CS of SCC using Python coding.

•
In contrast to the SVM and MLP models, the BR model demonstrates outstanding predictive performance with a high degree of precision.

•
The high coefficient of determination (R 2 ) value (0.95) for the BR model demonstrates its great accuracy in predicting the CS of SCC.

•
The K-fold cross validation and statistical metrics also confirm the legitimacy of the employed models.

•
The sensitivity analysis provides the impact of each input parameter towards the anticipation of CS of SCC, and it was found that cement contributed the most towards the required outcome.

•
This study can also provide civil engineering researchers with a better understanding of how to select suitable machine learning (ML) algorithms for researching the strength qualities of any form of concrete. The study further provides insight into the significance of the input characteristics for the anticipated outcome. Moreover, an experimental approach should also be included for the data points (results) to improve the accuracy level of the employed models. Consequently, it is essential to use the required measures and methods when selecting the input variables. Overall, the application of soft-computing techniques and tools provides a simple and cost-effective method for analyzing the properties of complicated materials such as concrete.