Prediction of Compressive Strength of Sustainable Foam Concrete Using Individual and Ensemble Machine Learning Approaches

The entrainment and distribution of air voids in the concrete matrix is a complex process that makes the mechanical properties of lightweight foamed concrete (LFC) highly unpredictable. Studying the complex nature of aerated concrete therefore requires a reliable and robust prediction model, which can be built with machine learning (ML) techniques. This study aims to predict the compressive strength of LFC by using a support vector machine (SVM) as an individual learner, along with bagging and boosting as ensemble learners and random forest (RF) as a modified ensemble learner. For that purpose, a database of 191 data points was collected from published literature, where the mix design ingredients, i.e., cement content, sand content, water-to-cement ratio, and foam volume, were chosen to predict the compressive strength of LFC. The 10-fold cross-validation method and different statistical error and regression measures, i.e., mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²), were used to evaluate the performance of the developed ML models. The modified ensemble learner (RF) outperforms all models by yielding a strong correlation of R² = 0.96 along with the lowest statistical error values of MAE = 1.84 MPa and RMSE = 2.52 MPa. Overall, the results suggest that ensemble learners can significantly enhance the performance and robustness of ML models.


Introduction
The production of normal concrete consumes a large quantity of cement and natural aggregates, which raises concerns about environmental degradation and sustainability. The emission of carbon dioxide (CO2) from cement production plants is considered one of the main sources of greenhouse gas (GHG) emissions [1]. It is roughly estimated that cement production plants are responsible for 7-8% of CO2 emissions into the atmosphere [1][2][3]. As the production of cement is expected to increase, CO2 emissions are also expected to rise rapidly [4]. The cement production process requires raw materials and fuel, and the continuous mining of these materials leads to loss of topsoil and deforestation [5]. In addition, the continuous quarrying and use of resources greatly disturb the natural habitats of organisms. The construction industry is expected to consume 60% of the materials extracted from the lithosphere [6]. Thus, there is a need for concrete that minimizes or replaces the use of cement and natural aggregates, moves the construction industry towards sustainability, and helps to alleviate the above-mentioned issues.
Foamed concrete (FC) is a lightweight material composed of either cement or mortar paste with entrapped air voids. The LFC is used as an insulating material having interesting structural features [7]. The LFC can also be used as a structural element for short-and long-term purposes [8]. By controlling the dosage of foaming agent in LFC, a broad range of densities (400-1850 kg/m 3 ) can be obtained for different application purposes, i.e., insulation, structural, filling grade, partition, etc. [9,10]. The compressive strength of LFC decreases rapidly with a decrease in its dry density [11]. The fracture energy of the FC notched beam is relatively high, around 18 to 25 N/m, with compressive strength of 6.4-14 MPa [12]. It has been estimated that the entrained air bubbles can replace up to 50% of the total concrete volume, which results in less consumption of cement and natural aggregates [13]. The entrained air voids exhibit a strong plasticizing effect, thus increasing the workability of foamed concrete [14]. The strength of FC can also be affected by the shape and size of the sample specimen, loading direction, pore formation method, and curing method [15]. The LFC has been identified as a light, economic, durable, and sustainable construction material [16]. The possibility of replacing concrete volume with entrained air bubbles has enhanced the sustainability feature and reduced the consumption of cement and aggregates in concrete production.
For the production and practical application of sustainable LFC, the optimization of the main ingredient of mix design is very important. The mix design will significantly affect the behavior and performance of LFC [17][18][19]. The strength of LFC is dependent on mix design ingredients, i.e., cement and sand content, water to binder ratio, foam volume, and curing method [20][21][22]. All the significant properties of concrete, such as durability, permeability, resistance to abrasion, etc., can be represented in terms of its compressive strength [23]. The durability and safety of the concrete elements are evaluated in terms of concrete compressive strength and is considered as the most important parameter [24]. The presence of entrained air voids in LFC makes it difficult to estimate the concrete strength accurately. Normally, the strength of concrete samples in the laboratory is calculated by casting and crushing concrete samples of standard dimension after the stipulated time of curing [25]. However, this is a hit-and-trial method that requires extensive laboratory work and is uneconomical and time-consuming.
Nowadays, advances in artificial intelligence (AI) and machine learning (ML) techniques have made it possible to predict and estimate the different physical and mechanical properties of concrete [26][27][28][29][30]. The strength of concrete can be forecasted accurately against different parameters by using different ML techniques, such as classification, regression, and clustering [31][32][33]. ML techniques provide accurate and reliable results as compared to previous regression methods [34]. Different ML techniques, such as random forest (RF), decision tree (DT), deep learning (DL), gene expression programming (GEP), artificial neural network (ANN), and support vector machine (SVM), use their pattern recognition ability to resolve complex engineering problems. In the case of RF and DT, tree-like structures are used to predict the response. The RF technique randomly selects subsets of the data and of the input parameters and builds multiple prediction trees, whereas DT uses the whole database with the parameters of interest to build a single tree. In RF, the final result is obtained by majority voting or by averaging the predictions of the individual trees. The nonlinear computational approach of ANN can resolve complex engineering problems by developing input and output relations without using any specific equation and can solve complex problems having imprecise or incomplete information. SVM is designed to handle nonlinear regression problems with high generalization ability and provides a globally optimal solution. GEP is an advanced form of genetic algorithm based on Darwinian evolution theory; it solves complex engineering problems in the form of non-linear parse tree-like structures called expression trees and provides an explicit numerical expression for the practical application of the developed model. Among all the ML techniques, the DL approach uses a robust design algorithm to resolve complex and rigorous engineering problems and provides better prediction results. Siddique et al. [35] studied the incorporation of bottom ash in self-compacting concrete by using the ANN approach. Similarly, Dantas et al. [36] utilized the ANN technique to evaluate the strength of recycled concrete made from construction wastes. Chou et al. [37] employed SVM and ANN techniques to estimate the load-bearing capacity of concrete. In the research work of Zhang et al. [38], the RF regression method was used to predict and assess the strength of concrete, and the significant input parameters were also discussed.
The ML approach relies on pattern recognition, using both a database and statistical analysis. The required information is extracted from a large dataset, and different relations are established to simplify the complex pattern and provide a simple resolution. In the ML approach, two types of techniques are used for prediction modeling. The first is the standard technique, where a single ML model is used for prediction. The second uses the more recently developed ensemble learning algorithms, i.e., bagging, RF, and boosting. Studies suggest that the results of ensemble learning models are more robust and reliable than those of individual ML models [39]. The individual standard ML approaches, i.e., ANN, SVM, GEP, etc., form the weak learners. In the ensemble learning approach, the training data are used to train several weak learners, which are then integrated into a strong learner. High-performing ML techniques are used to model the complex nature of concrete by incorporating ensemble learning algorithms and classifier generators. The ensemble learning approach has become increasingly popular in recent prediction modeling studies because its results are more accurate than those of individual standard learners [40].
This research aims to evaluate and compare the prediction capability of network and tree-based ML models, i.e., SVM and RF. This study also addressed the enhancement in the performance of models by using ensemble techniques, such as bagging, boosting, and modified ensemble learner (RF). The novelty and significance of the present study are concerned with the prediction and estimation of LFC compressive strength against different combinations of input ingredients, i.e., cement content, sand content, water to cement ratio, and volume of foam, by implementing the ensemble algorithm over individual learners. Different statistical regression and error tools along with the 10-K fold cross-validation approach were used to assess the performance, reliability, and generalization capability of the prediction models.

Development of Data
The data required to develop the ML models were collected from the experimental results of seven previously published studies [2,13,[41][42][43][44][45]. The collected database comprises 191 data points, where the basic mix design ingredients, i.e., cement content (kg/m³), sand content (kg/m³), water-cement ratio, and foam volume (dm³/m³), are taken as inputs, and the 28-day compressive strength of LFC is the output variable. All the compressive strength test results used in this study are from cube specimens with dimensions of 15 × 15 × 15 cm. Table 1 gives the statistical description of the collected data, which contains the maximum and minimum values, average values, and standard deviation (SD) of all the input and output variables. To obtain reliable predictions of the compressive strength of LFC, it is suggested to use the proposed models within the specified range of each variable. The statistical analysis shows that the data cover a large range of mix design ingredients, and the SD indicates the spread of the data about its mean value: the greater the SD, the wider the spread. The distribution histogram of the different input variables against the strength of LFC is shown in Figure 1. The histogram shows that the collected data are highly diverse and well distributed. The performance of an AI model is highly dependent on the distribution and dispersion of the available data [46]. The collected 191 data points were randomly divided into training and testing data. Here, 80% of the data (152 data points) was used to train and develop the ML model, and the other 20% of the data (39 data points) was used to evaluate the performance of the prediction model.
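The random 80/20 partition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `split_train_test`, the seed, and the choice to round the training size down are assumptions made so that 191 points yield the reported 152/39 split.

```python
import random

def split_train_test(n_samples, train_fraction=0.8, seed=0):
    """Randomly split sample indices into training and testing index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle (seed is arbitrary)
    n_train = int(n_samples * train_fraction)   # rounds down: 191 * 0.8 -> 152
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_train_test(191)
print(len(train_idx), len(test_idx))  # 152 39
```

The index lists would then be used to select the corresponding rows of the mix design inputs and the measured strengths.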

Pre-Processing of Data
In AI, the pre-processing of data is a key step that is used to evaluate the relation between input and output parameters before the development of any ML model. This step is used to check the validity of the collected data and to assess the trend followed by the output parameter under the influence of the inputs. To avoid any complexity in the assessment of the final ML model, the correlation between the input and output variables is evaluated before the development of the AI model [47]. The Pearson correlation coefficient (r) was evaluated to find the relation between the given variables [48]. The Pearson correlation (r) matrix of the given variables is shown in Table 2 and was calculated by using the statistical software Minitab-16. Here, values of r close to ±1 indicate a strong correlation, and 0 means no linear relation between the input and output parameters. A positive sign shows a direct relation, and a negative sign means that an inverse relationship exists between the variables. Figure 2 shows the relationship between the mix design parameters and the strength of LFC in the form of contour maps, which show that all the input parameters followed the global trend. For example, cement and sand content show a direct relation, as shown in Figure 2a,b, whereas w/c and foam volume follow an inverse relation, as illustrated in Figure 2c,d. The dark colors of the contour maps show the intensity of the input variables within a range. The results of pre-processing show that all the input parameters hold a strong correlation with the compressive strength of LFC and follow the global trend. Hence, the collected data are valid and can be used for the development of ML models.
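For readers who want to reproduce the correlation check without Minitab, the Pearson coefficient can be computed directly from its definition. The sketch below is illustrative only: the function name `pearson_r` is hypothetical, and the foam volume and strength values are made up for the example and are not taken from the collected database.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only (not from the collected database).
foam_volume = [100.0, 200.0, 300.0, 400.0]   # dm3/m3
strength = [30.0, 22.0, 15.0, 9.0]           # MPa
r = pearson_r(foam_volume, strength)         # negative: inverse relation
```

A negative value of `r` here mirrors the inverse relation between foam volume and strength described above.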

Methodology
The AI models are developed by training on the available data and are calibrated and validated against laboratory test results. The pattern recognition ability of AI techniques transforms the complex pattern of the available data into a simplified pattern to resolve complex engineering problems. Table 3 summarizes the different ML algorithms used in recent years for predicting the various properties of concrete. In this study, the ML approaches are chosen to evaluate and compare the prediction performance of tree- and network-based decision-making techniques. The ensemble learning algorithms were applied to individual ML models to further enhance the prediction capability of the developed models. Furthermore, the validity of the models is evaluated by using a 10-fold cross-validation method and different statistical evaluation tools.
The RF technique supports both classification and regression and has been used by different researchers [38,76]. Although DT and RF both work on tree-based decision methods, there is a major difference between them. In DT modeling, a single tree is developed, whereas the RF technique constructs several trees, collectively called a forest, and arbitrarily chosen data are assigned to each of them. The data are provided in matrix form, and different subsets of rows and columns are selected [77]. Large datasets can be more effectively handled by RF than any other ML technique. There are three main steps in RF regression model development. First, the training dataset is used to assemble the trained regression trees. Then, the mean value of the single regression tree outcomes is evaluated, and finally, validation datasets are used to validate the predicted results. The new training data set for each tree, which is comprised of bootstrap data, is drawn from the original data set. The data points left out of a bootstrap sample by this drawing with replacement form a new dataset called the out-of-bag data points. In the end, about two-thirds of the data points are used for the estimation of the regression function, and the developed regression model is validated against the remaining out-of-bag data points. The process continues until the required accuracy is achieved. Setting data points aside in the out-of-bag dataset and using them for validation is a distinctive feature of the RF technique [29]. Finally, the gross error is computed for all trees, which indicates the accuracy and effectiveness of each developed tree.
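The bootstrap and out-of-bag mechanism described above can be illustrated with a short sketch. It assumes plain sampling with replacement, as in the standard RF procedure; the function name `bootstrap_with_oob` and the seed are hypothetical. With this scheme, a bootstrap sample contains on average about 63% of the distinct data points, which is close to the two-thirds figure quoted in the text.

```python
import random

def bootstrap_with_oob(n_samples, seed=0):
    """Draw a bootstrap sample of indices and return it with the out-of-bag indices."""
    rng = random.Random(seed)
    # Sampling with replacement: some indices repeat, others are never drawn.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

in_bag, out_of_bag = bootstrap_with_oob(152)   # 152 = size of the training set
unique_share = len(set(in_bag)) / 152          # fraction of distinct points in the bag
```

Each tree of the forest would be fitted on its own `in_bag` sample and checked against its own `out_of_bag` points.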

Support Vector Machine (SVM) Models
The SVM is a supervised learner that analyzes data for classification and regression problems. The SVM approach generalizes well and can handle practical difficulties such as nonlinearity, high-dimensional input spaces, and small databases. To achieve better accuracy, the SVM transforms the input space into a high-dimensional feature space by means of a non-linear transformation, which is defined by an inner product function. Non-linear regression problems are solved efficiently by using SVM regression models [78]. The data are first mapped into an n-dimensional feature space. Non-linear kernel functions are used for this mapping, so that the original input data become easier to separate and fit in the high-dimensional space. Equation (1) shows the linear function in the feature space in terms of f(x, w).
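Assuming the standard SVM regression formulation, the function referred to as Equation (1) can be written as follows. The symbol m (the number of non-linear transformations) is introduced here for illustration only.

```latex
f(x, w) = \sum_{j=1}^{m} w_j \, g_j(x) + b \tag{1}
```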
where w, g_j(x), and b refer to the weight vector, the non-linear transformations of the input space, and the bias term, respectively. The loss function L_ε is a measure of estimation quality and is given in Equation (2).
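A common choice, assumed here, is the ε-insensitive loss, which ignores deviations smaller than ε. The symbol y (the measured output) is introduced for illustration.

```latex
L_\varepsilon\bigl(y, f(x, w)\bigr) =
\begin{cases}
0, & \lvert y - f(x, w) \rvert \le \varepsilon \\
\lvert y - f(x, w) \rvert - \varepsilon, & \text{otherwise}
\end{cases} \tag{2}
```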
In the SVM regression approach, linear regression is performed in the new higher-dimensional feature space, and the complexity of the model is reduced at the same time by minimizing ||w||². Non-negative slack variables ξ_i and ξ_i*, where i = 1, 2, 3, ..., n, are introduced to account for the training samples that fall outside the ε-insensitive zone. The simplified SVM regression model is constructed from the functions given in Equation (3). The optimization problem can be transformed into its dual problem, whose solution is given in Equation (4).
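Under the same assumption of the standard ε-insensitive formulation, Equations (3) and (4) would take the following form. The regularization constant C, the targets y_i, and the Lagrange multipliers α_i and α_i* (associated with the first and second constraint, respectively) are standard symbols that are not defined in the surrounding text.

```latex
\min_{w,\, b,\, \xi,\, \xi^*} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right)
\quad \text{subject to} \quad
\begin{cases}
y_i - f(x_i, w) \le \varepsilon + \xi_i \\
f(x_i, w) - y_i \le \varepsilon + \xi_i^* \\
\xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, n
\end{cases} \tag{3}

f(x) = \sum_{i=1}^{n_{sv}} \left( \alpha_i - \alpha_i^* \right) K(x, x_i),
\qquad 0 \le \alpha_i \le C, \quad 0 \le \alpha_i^* \le C \tag{4}
```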
where nsv = number of support vectors. The kernel function is given in Equation (5).
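Consistent with the transformations g_j(x) of Equation (1), and assuming the kernel is defined as their inner product (m again being an illustrative symbol), Equation (5) can be written as:

```latex
K(x, x_i) = \sum_{j=1}^{m} g_j(x) \, g_j(x_i) \tag{5}
```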
To find the support vectors in the feature space, a kernel function, i.e., a linear, polynomial, radial basis, or sigmoid function, is chosen based on the training set. It should also be noted that the kernel parameters depend on the software used and on the chosen kernel function.

Ensemble Algorithms Using Bagging and Boosting
The ensemble learners enhance the prediction capability and accuracy of ML techniques. In ensemble techniques, several weak predictive models are trained on the training data, and their outputs are combined and aggregated to reduce the risk of over-fitting. An optimal predictive model is formed from the combination of qualified sub-models (weak predictive models) by combining, averaging, or voting. In ensemble modeling, bagging is an effective technique that combines bootstrap resampling with aggregation. In this process, each component model is trained on a sample drawn with replacement from the initial training set. As a result, a sample may contain some data points several times, while other data points may be left out. The outputs of the component models are averaged to obtain the final output.
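The bagging procedure described above can be sketched as follows. This is a simplified illustration, not the study's implementation: a one-variable least-squares line stands in for the SVR base learner, the function names are hypothetical, and the data are made up. The default of nine sub-models is borrowed from the SVR-bagging architecture reported for Figure 3a.

```python
import random

def fit_line(xs, ys):
    """Least-squares line y = a*x + b; a stand-in for the SVR base learner."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:                      # degenerate sample: fall back to a constant
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return a, mean_y - a * mean_x

def bagging_fit(xs, ys, n_models=9, seed=0):
    """Train each sub-model on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Average the outputs of the component models."""
    return sum(a * x + b for a, b in models) / len(models)

# Made-up, exactly linear data for illustration only.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
models = bagging_fit(xs, ys)
prediction = bagging_predict(models, 5.0)
```

Averaging reduces the variance of the individual sub-models, which is the mechanism behind the reduced over-fitting mentioned above.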
Similarly, the boosting technique builds the component models cumulatively, and the resulting combination attains a higher precision than the individual models. In the boosting technique, the sub-models are combined into the final model as a weighted average of the dependent sub-models. In this research, the SVM regression technique is employed as the base learner along with ensemble algorithms, i.e., bagging and boosting, and the RF technique is used as a modified ensemble learner, to predict the compressive strength of LFC. In the current study, each ensemble learner was built with 1, 2, 3, ..., 20 sub-model components to select the optimum number of base learners, and the best construction was chosen based on the coefficient of correlation (R) values. The performance of the various ensemble models against the number of sub-model components is shown in Figure 3. Figure 3a shows the SVR-bagging ensemble, where 9 sub-models develop a strong correlation, and the prominent effect of the number of sub-models on the boosting and RF ensemble models is shown in Figure 3b,c. This initial analysis shows an enhancement in the individual learner performance with the incorporation of ensemble learners. The chosen architectures for the ensemble learners are described in Table 4.
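To make the sequential nature of boosting concrete, the sketch below uses a deliberately simplified variant: residual-fitting boosting with one-split regression stumps. This is a swapped-in technique chosen for brevity, not the SVR-based boosting used in the study; all function names, the learning rate, and the data are assumptions made for illustration. The 20 rounds echo the upper end of the sub-model range explored in the text.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump: (threshold, left mean, right mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:            # keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2], best[3]

def boost_fit(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each new stump is fitted to the errors left by the previous ones."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, stumps, learning_rate

def boost_predict(model, x):
    """Weighted sum of the sequentially built sub-models."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if x <= t else rm) for t, lm, rm in stumps)

def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [12.0, 11.0, 9.0, 8.0, 6.0, 5.0, 3.0, 2.0]
model = boost_fit(xs, ys)
mse_mean = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_boost = mse(ys, [boost_predict(model, x) for x in xs])
```

Unlike bagging, where sub-models are trained independently, each sub-model here depends on the ones built before it.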

10-Fold Cross-Validation and Statistical Evaluation
The 10-fold cross-validation algorithm is used to minimize the bias associated with the random sampling of the training and hold-out data sets. The 10-fold validation approach yields a reliable variance within a reasonable computational time [79]. In this study, the 10-fold approach was applied to evaluate the performance of the developed models by dividing the data set into ten equal subsets. In each of the ten rounds, a different subset is held out for testing, while the remaining nine subsets are used for training. The accuracy of the algorithm is expressed as the average accuracy over the ten validation rounds.
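The rotation of training and hold-out subsets can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, a mean predictor stands in for the actual models, and because 191 is not divisible by ten the folds are only nearly equal (one fold of 20 points and nine of 19).

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]        # near-equal fold sizes
    for i in range(k):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, folds[i]

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def cross_validate(xs, ys, fit, predict, score, k=10, seed=0):
    """Return the average score over the k rounds and the per-fold scores."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k, seed):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [predict(model, xs[i]) for i in test_idx]
        scores.append(score([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores), scores

def fit_mean(train_x, train_y):
    """Placeholder model: predicts the mean of the training targets."""
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

# Made-up constant targets, for illustration only.
xs = [float(i) for i in range(191)]
ys = [5.0] * 191
avg_mae, fold_maes = cross_validate(xs, ys, fit_mean, predict_mean, mae)
```

Any of the regression models could be plugged in through the `fit` and `predict` arguments, and any error measure through `score`.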
Furthermore, different statistical regression and error measures were used to evaluate and gauge the performance of the developed models and are given in Equations (6)-(8). Several studies suggest that models with a high value of R² and low values of statistical error are accurate and reliable [46,80].
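Assuming the standard definitions of these measures, with R² taken as the squared Pearson correlation between the actual and predicted values (an assumption suggested by the use of both averages among the defined symbols), Equations (6)-(8) can be written as follows. The order of the three expressions is also assumed.

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \lvert a_i - p_i \rvert

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( a_i - p_i \right)^2}

R^2 = \left( \frac{\sum_{i=1}^{n} (a_i - \bar{a})(p_i - \bar{p})}
{\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2 \, \sum_{i=1}^{n} (p_i - \bar{p})^2}} \right)^2
```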
where a_i = ith actual value, p_i = ith predicted value, ā = average of the actual output values, p̄ = average of the predicted output values, and n = the total number of data points.
Figure 4 shows the prediction results of SVM regression and the ensemble models along with the prediction error distribution graphs. The individual SVM model yields a correlation of R² = 0.78, and the ensemble models yield R² = 0.96 and R² = 0.91 for the bagging and boosting models, respectively, as shown in Figure 4a,c,e. From Figure 4b, the error distribution graph shows an average error of 4.96 MPa for the SVM regression model, whereas average errors of 2.05 MPa and 2.72 MPa were recorded for bagging and boosting, respectively, as shown in Figure 4d,f. The results also show that 80% of the individual SVM model results have error values less than 6 MPa, and that for both bagging and boosting, 92% of the model results have error values less than 5 MPa. It is observed from the results that the ensemble learning models have a stronger prediction capability than the individual SVM regression model. Moreover, the robustness of the models is also reflected in the statistical analysis. Table 5 presents the statistical evaluation of the models.

Results of Random Forest Regression
Random forest is a modified ensemble ML technique that combines the bagging ensemble learner with random feature selection; it is user-friendly and can be employed for the development of reliable prediction models. Better accuracy in the prediction of the compressive strength of LFC has been achieved by employing the RF technique, as shown in Figure 5. Figure 5a shows a strong correlation of R² = 0.96 between the experimental and RF prediction values. From Figure 5b, it can be seen that 90% of the data points have error values less than 5 MPa, with maximum and minimum error values of 6.65 MPa and 0.015 MPa, respectively. An average prediction error of 1.85 MPa was recorded for the RF regression model. The low prediction errors and the high value of the coefficient of determination (R²) show that the performance of prediction models can be enhanced and better accuracy achieved with the application of ensemble and modified ensemble techniques.

The statistical evaluation of the developed ML models is illustrated in Table 5. The individual SVR model performance is enhanced with the application of ensemble techniques, and the coefficient of determination R² increases from 0.81 for SVR to 0.96 for the SVR-bagging model. Similarly, after the application of ensemble learners, the statistical error values are also reduced significantly. For example, the MAE value for SVR is recorded as 4.96 MPa, which is reduced to 2.05 MPa for the SVR-bagging ensemble learner. The modified ensemble learner (RF) outperforms all the ML techniques used in this research and yields R² = 0.96 along with the lowest statistical error values of MAE = 1.84 MPa and RMSE = 2.52 MPa, proving to be a more efficient technique with robust results.

10-Fold Cross-Validation and Statistical Evaluation
A desired level of accuracy is required for the validity of prediction models. The 10 K-fold cross-validation method is used to ensure the accuracy of the model by shuffling The statistical evaluation of the developed ML models is illustrated in Table 5. The individual SVR model performance is enhanced with the application of ensemble techniques and the coefficient of regression R 2 is increased from 0.81 for SVR to 0.96 for the SVR-bagging model. Similarly, after the application of ensemble learners, the statistical error values also reduced significantly. For example, the MAE value for SVR is recorded as 4.96 MPa, which is reduced to 2.05 MPa for SVR-bagging ensemble learners. The modified ensemble learner (RF) outperforms all the ML techniques used in this research and yields R 2 = 0.96 along with the least statistical error values of MAE = 1.84 MPa and RMSE = 2.52 MPa, proving to be a more efficient technique with adamant results.

10-K Fold Cross-Validation and Statistical Evaluation
A prediction model is valid only if it reaches a desired level of accuracy. The 10-fold cross-validation method was used to verify the accuracy of the models by shuffling the available data, which minimizes the bias associated with random sampling of the training data set. The technique divides the experimental data samples into ten equal subsets; nine subsets are used to develop and shape the strong learner, and the remaining subset is used to gauge the validity of the developed model. The process is repeated ten times, each time with a different subset held out, and the average accuracy over the ten repetitions is reported. The generalization performance and the reliability of a model are well represented by 10-fold cross-validation [79].
The cross-validation results for the individual non-linear, ensemble, and modified ensemble models are presented in Figure 6. They show that applying ensemble techniques strengthens the performance of the model from a weak to a strong correlation and yields consistent results across the folds. The results were assessed using the coefficient of determination R2 (regression tool) together with MAE and RMSE (statistical error tools). In Figure 6a, the value of R2 fluctuates across the ten folds for the different ML techniques, but a high level of accuracy is maintained in each fold: the R2 ranges for SVR-Bagging, SVR-Boosting, and RF are 0.84-0.96, 0.82-0.96, and 0.86-0.95, respectively. The accuracy of the cross-validation was also assessed in terms of MAE and RMSE, shown in Figure 6b,c, respectively. The average MAE values for SVR-bagging, SVR-Adaboost, and RF are 5.6 MPa, 5.8 MPa, and 4.2 MPa, respectively (Figure 6b), and the average RMSE values are 5.7 MPa, 5.6 MPa, and 5.7 MPa, respectively (Figure 6c). Overall, the results of the 10-fold cross-validation reflect the accuracy and reliability of the developed models.
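The 10-fold procedure and the three evaluation measures can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the implementation used in this study: the least-squares line is a stand-in for the SVR, bagging, boosting, and RF learners, the data are synthetic (191 made-up points with a hypothetical linear trend between foam volume and strength), and all function names are chosen here for illustration.

```python
import math
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Least-squares line y = a + b*x (a stand-in for the paper's learners)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def scores(y_true, y_pred):
    """Return (MAE, RMSE, R2) for one validation fold."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return mae, rmse, r2

def cross_validate(xs, ys, k=10, seed=0):
    """Train on k-1 folds, validate on the held-out fold, repeat k times."""
    folds = k_fold_indices(len(xs), k, seed)
    per_fold = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in held_out]
        per_fold.append(scores([ys[i] for i in held_out], preds))
    # Average each metric over the k validation folds.
    averages = tuple(sum(s[j] for s in per_fold) / k for j in range(3))
    return per_fold, averages

# Synthetic, made-up data: foam volume (%) versus compressive strength (MPa).
rng = random.Random(42)
foam = [rng.uniform(10.0, 60.0) for _ in range(191)]
strength = [45.0 - 0.6 * f + rng.gauss(0.0, 2.0) for f in foam]

per_fold, (avg_mae, avg_rmse, avg_r2) = cross_validate(foam, strength)
print(f"MAE = {avg_mae:.2f} MPa, RMSE = {avg_rmse:.2f} MPa, R2 = {avg_r2:.2f}")
```

Because 191 is not divisible by ten, the folds here hold 19 or 20 samples each. Within any single fold the RMSE cannot be smaller than the MAE, which is a useful sanity check when reading per-fold results.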

Conclusions
Different machine learning approaches, an individual learner and ensemble learners, were used to predict the compressive strength of lightweight foamed concrete. The conclusions drawn from this analysis are as follows.
(1) The performance of the individual SVR learner increased significantly with the application of bagging and boosting ensemble learners. The modified ensemble learner (RF) enhanced the performance of the prediction model by 23% compared to the individual SVR learner and yields a high correlation of R2 = 0.96.
(2) In the 10-fold cross-validation, all the ensemble learning approaches maintained high accuracy along with low statistical error values of MAE and RMSE.
(3) The statistical evaluation was performed using MAE, RMSE, and R2. The modified ensemble learner (RF) shows an error reduction of about 62% for both MAE and RMSE compared to the individual SVR learner.
(4) SVR-bagging reports 58% and 61% lower MAE and RMSE values, respectively, than the individual SVR learner, together with a 20% enhancement in the robustness of the performance, yielding R2 = 0.96.
(5) The SVR-boosting approach records 45% and 38% lower MAE and RMSE values, respectively, and yields R2 = 0.91, a 17% enhancement in model performance compared to the individual SVR learner.

Data Availability Statement: The data will be available on request.

Conflicts of Interest: The authors declare no conflict of interest.