The Application of Response Surface Methodology and Machine Learning for Predicting the Compressive Strength of Recycled Aggregate Concrete Containing Polypropylene Fibers and Supplementary Cementitious Materials

Alkharisi, Mohammed K.; Dahish, Hany A.

doi:10.3390/su17072913


by Mohammed K. Alkharisi and Hany A. Dahish *

Department of Civil Engineering, College of Engineering, Qassim University, Buraidah 52571, Saudi Arabia

* Author to whom correspondence should be addressed.

Sustainability 2025, 17(7), 2913; https://doi.org/10.3390/su17072913

Submission received: 15 February 2025 / Revised: 15 March 2025 / Accepted: 19 March 2025 / Published: 25 March 2025


Abstract

The construction industry’s development trend has resulted in a large volume of demolished concrete. Using this waste more efficiently as a recycled aggregate (RA) in concrete is a promising solution. In this study, we utilized response surface methodology (RSM) and three machine learning (ML) techniques, namely the M5P algorithm, the random forest (RF) algorithm, and extreme gradient boosting (XGB), to optimize and predict the compressive strength (CS) of RA concrete containing fly ash (FA), silica fume (SF), and polypropylene fiber (PPF). The models were built on a dataset of 529 data points with varying numbers of input parameters (out of a total of ten). The CS quadratic model under RSM exhibited acceptable prediction accuracy. The best CS was found with a 100% volume of RA consisting of coarse aggregate, 1.13% PPF by volume of concrete, 7.90% FA, and 5.30% SF as partial replacements of binders by weight. The XGB model exhibited superior performance and high prediction accuracy, with a higher R² and lower errors, as measured by MAE, RMSE, and MAPE, when compared to the other developed models. Furthermore, SHAP analysis showed that PPF had a positive impact on the predicted CS, while the curing age and superplasticizer dose had the highest positive impact on the predicted CS of RA concrete.

Keywords: recycled aggregate; polypropylene fiber; supplementary cementitious materials; response surface methodology; optimization; machine learning

1. Introduction

Concrete is among the most versatile composite materials commonly utilized in the construction industry. Consequently, there exists a substantial annual demand for crushed stone, natural gravel, sand, and cement. In addition, the demolition of concrete buildings, for a variety of reasons, including structural decay, changes in purpose, variances in traffic flow and load, and urban reorganization, produces colossal amounts of concrete waste [1]. Reducing construction and demolition waste and conserving natural resources are two ways in which recycled aggregates help make buildings more sustainable [2]. The high demand for aggregates, coupled with the scarcity of natural resources, intensifies the environmental concerns associated with concrete production [3]. Recycling and transforming construction waste into aggregates not only help with the recycling of waste materials but also help save natural resources, protect the environment, and pave the way for green and sustainable construction (Figure 1).

There are several fundamental limitations of recycled aggregates, including their increased porosity and decreased strength in concrete [4,5] and the existence of attached mortar, which limit the amount of RA that can be utilized for practical concrete applications [6]. Despite these limitations, using RA in concrete offers substantial sustainability benefits [7]. There have been numerous investigations into the durability and mechanical properties of recycled aggregate concretes (RACs) [8]. Several research studies have been undertaken to enhance the durability and mechanical properties of RA concrete [9,10,11,12]. Removal of adhering mortar and mechanical treatment are standard ways of improving the characteristics of RA (Figure 2). An appropriate treatment process for the surface of RA could improve the compressive strength of RA concrete [13]. Pstrowska et al. [14] investigated the chemical modification of road binding materials with formalin in a hermetic container. The authors concluded that the production of road binding materials via tar chemical modification with formaldehyde is quite a flexible process. The incorporation of supplementary cementitious materials (SCMs) is a widely adopted method for enhancing the strength of RA concrete [15]. Many studies have investigated the effect of the inclusion of FA in RACs [16,17]. FA reduces pore size and chloride permeability in concrete [18]. The incorporation of FA in RAC enhances its properties, minimizes environmental impacts, and provides economic advantages [19]. Researchers have found that adding up to 30% FA improves the microstructure of RAC [20]. Many studies have investigated how adding SF to concrete containing RA affects the concrete’s mechanical properties [21,22,23,24,25,26,27]. The addition of SF improved the performance of RAC at early and late curing ages [27]. The influence of metakaolin, FA, and SF on the durability and mechanical properties of RAC was investigated in [28]. It was concluded that incorporating 15% metakaolin and 10% FA in RAC improved the durability and mechanical properties of the resultant concrete.

The effect of the inclusion of PPF in concrete containing RA on its properties was also investigated [29,30]. Polypropylene fibers are frequently utilized to enhance the flexural strength, energy absorption, impact resistance, post-cracking behavior, and toughness of concrete [31,32]. Concrete applications incorporate a variety of fibers, including steel, polymer, glass, and carbon-based fibers. PPFs possess numerous advantageous properties that make them well suited for concrete applications [33]. The effect of the combined incorporation of FA and PPF in RA concrete was investigated. The effects of steel fiber and PPF on the properties of fully RA concrete containing FA, including on concrete strength, were also investigated [34]. The authors concluded that the combined incorporation of fibers had a more significant effect on mechanical properties than a single fiber mix. The effect of the inclusion of SF and PPF in RA concrete was investigated by Ahmed et al. [22]. They found that the addition of up to 0.6% PPF enhanced the compressive strength of concrete by 20.8%, 15.2%, and 11.6% for concrete containing 0%, 50%, and 100% RA, respectively. The effects of adding up to 3% PPF to concrete with a fixed amount of FA, at 87 kg/m³, and of SF, at 21.75 kg/m³, as well as RA at percentages of 50%, 75%, and 100%, on the durability and strength of concrete were investigated by Alharthai et al. [1]. They concluded that the combination of FA and PPF in RAC improved its strength and durability.

In previous studies, researchers have employed various statistical methods, including RSM, for the prediction of concrete properties [35,36,37,38,39]. Response surface methodology (RSM) allows one to analyze the effects of multiple independent variables through mathematical and statistical techniques, facilitating the examination of process parameter interactions and the development of a precise mathematical model to describe the process. Several researchers have utilized RSM to optimize and construct models and plan experiments [38,40,41,42,43,44,45]. RSM was utilized to model the compressive strength (CS) of RAC containing crumb rubber [38]. The author reported that the complete replacement of NCA with RA resulted in a 12% reduction in concrete compressive strength. Additionally, a quadratic model for predicting the CS of RAC containing crumb rubber demonstrated improved accuracy, with a coefficient of determination (R²) of 0.9854.

Machine learning (ML) involves the development of algorithms enabling computers to learn and make predictions without explicit programming [46]. In ML approaches, statistical methods are used to find patterns, correlations, and trends in historical data [47]. ML approaches include unsupervised, semi-supervised, supervised, and reinforcement learning. In supervised learning, the most used method, previous data are used to identify trends and make predictions [48]. Unsupervised learning requires data preparation, model setup, feature extraction, algorithm selection, training, validation, and testing, generally carried out alongside supervised learning. It identifies efficient model parameters and unrecognized relationships by using a dataset for training. Ideally, hyperparameters would not be adjusted using testing or training data [49]. In model training, overfitting occurs when the model aligns too closely with the training dataset without appropriate regularization. In such instances, the trained model rarely performs well on test data. In this work, we used cross-validation (CV) to alleviate overfitting difficulties with a small dataset. Using the k-fold CV method, the training data were divided into k separate folds. In each round, the model was trained on k − 1 folds and evaluated on the remaining fold. This was repeated k times so that every fold served once as the validation set, and the model’s performance was measured by averaging the results. Although computationally expensive, this technique makes efficient use of the data, which matters particularly for small datasets [50].
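The k-fold procedure can be sketched in a few lines of plain Python. In the standard formulation, each fold is held out once for evaluation while the model is fitted on the other k − 1 folds, and the fold scores are averaged. This is only an illustration and not the authors' code: the function names `k_fold_indices` and `cross_validate`, the placeholder "model" that predicts the training mean, and the strength values are all assumptions.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(y, k=5, seed=0):
    """Return the mean absolute error averaged over k held-out folds.

    The 'model' is a placeholder that predicts the training-set mean;
    a real study would fit M5P, RF, or XGB at that step instead.
    """
    folds = k_fold_indices(len(y), k, seed)
    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_y = [y[i] for i in range(len(y)) if i not in held_out_set]
        prediction = mean(train_y)  # "training" on the other k - 1 folds
        errors = [abs(y[i] - prediction) for i in held_out]
        fold_errors.append(mean(errors))  # evaluation on the held-out fold
    return mean(fold_errors)


# Hypothetical compressive strengths in MPa, for illustration only.
strengths = [32.1, 41.5, 28.7, 55.0, 47.3, 38.9, 60.2, 35.4, 44.8, 51.6]
cv_mae = cross_validate(strengths, k=5)
```

Every sample is used for validation exactly once, which is why the approach suits small datasets despite the cost of fitting the model k times.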

With the growing interest in sustainable concrete materials, there is a pressing need to develop prediction models in order to better understand the complex relationship between compressive strength and concrete constituents. Prior studies focused mostly on the use of RA with FA, SF, or PPF. Few studies have used RA with three components (FA, SF, and PPF) in concrete. No previous research has focused on establishing a model for predicting the compressive strength of concrete containing RA, SF, FA, and PPF utilizing RSM and ML. The addition of PPF, SF, and FA, as well as RA, complicates the forecast process because of the complex relationship between CS and its components.

This study focuses on employing the central composite design (CCD) method in response surface methodology (RSM) alongside three machine learning (ML) algorithms, namely, M5P, random forest (RF), and extreme gradient boosting (XGB), to simulate the effects of varying the ratios of ten input parameters on the compressive strength (CS) of concrete. A dataset containing 10 input parameters and 529 datapoints was created using the literature. The experimental data were evaluated by fitting them to second-order polynomial models using RSM. To verify the significance of the model, analysis of variance (ANOVA) was employed. k-fold cross-validation and statistical measures such as the coefficient of determination (R²), mean absolute percentage error (MAPE), root mean squared error (RMSE), and mean absolute error (MAE) were used to test how well the ML model performed. We also aimed to assess the positive or negative impact of raw ingredients and fibers on the compressive strength of RA concrete containing PPF, SF, and FA utilizing SHAP analysis. The RSM approach and ML algorithm were used to validate the combined effect of SF, FA, and PPF. Finally, numerical optimization was utilized to optimize the properties of concrete containing RA, SF, FA, and PPF. This study elucidates the fundamental concepts of several machine learning algorithms and evaluates their efficacy in forecasting the compressive strength of RAC using PPF, SF, and FA. This research is intended to improve our understanding of the efficiency of various RA ratios and SF, FA, and PPF doses, thereby contributing to the sustainability of the concrete industry. It encourages increasing the use of recycled materials in construction, supporting industry practices by providing a data-driven approach that leads to more sustainable concrete production.

2. Materials and Methods

In this study, we employed RSM and three machine learning models to predict the CS of concrete containing RA, SF, FA, and PPF utilizing data obtained from previously published investigations [1,16,17,19,21,22,23,24,25,26,27,28,29,30,34,51,52,53,54,55,56,57,58,59,60]. A total of 529 compressive strength data samples were obtained. The collected compressive strength data were based on compressive strength tests of concrete conforming to various standards, such as ASTM C39 [61], BS188 [62], GB/T 50081-2019 [63], BS EN 12390 [64], and IS 516-1959 [65]. The compressive strengths of cylindrical samples were modified to be equivalent to those of cubic specimens.

The ten input variables chosen were C, NFA, NCA, RA, FA, SF, PPF, W/C, SP, and AGE, while the output was the compressive strength of concrete. Training and test variables for the developed machine learning models were selected based on the available data from the existing datasets. These variables were selected to provide a comprehensive representation of the factors that impact the model’s predictions. We chose these training variables so that we could include a wide range of samples that could address the key factors affecting the model’s predictions, including the proportions of RA, FA, SF, and PPF, as well as other important material properties. When selecting the data for training and testing the constructed machine learning models, we aimed to ensure a representative assessment of the model’s performance. A random splitting method was employed to guarantee effective model generalization. The dataset was partitioned into a training set for model training and a test set reserved exclusively for assessing the model’s performance with respect to unseen data. The training/testing dataset ratio was 70:30. Prediction models are frequently trained to predict outcomes with a high level of correctness for the output data. The test phase is used to ensure that an algorithm has a prediction power for the output that is based on a variety of data sources. Table 1 provides a description of the input and output CS data. All the variables showed a good range of distributional symmetry, with skewness values between −3 and +3. Kurtosis in this investigation was within the adequate range of −10 to +10, indicating that the variables were adequately distributed and had a suitable number of peaks [66].
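A random 70:30 partition of this kind can be sketched as follows. The helper name `train_test_split_indices`, the seed, and the rounding rule are assumptions made for illustration; the source does not state how the split was implemented or the exact subset sizes.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.7, seed=42):
    """Randomly partition sample indices into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 529 samples, as in the compiled dataset.
train_idx, test_idx = train_test_split_indices(529)
```

Under this rounding rule, 529 samples yield 370 training and 159 test indices, and no index appears in both subsets.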

Figure 3 provides a flowchart of the methodology for developing the prediction models for the compressive strength of concrete containing RA, SF, FA, and PPF. This study includes 10 input variables—C, NFA, NCA, RA, FA, SF, PPF, W/C, SP, and AGE—and CS is the output. Central composite design in RSM was employed using Design-Expert 13 software. The developed prediction models were M5P, RF, and XGB. Weka software (version 3.8.6) [67] was utilized to construct the M5P model, while Anaconda-based Python (version 3.12) [68] was utilized to develop the RF and XGB models. The created models’ performance was assessed utilizing four statistical measures, including R², MAPE, MAE, and RMSE. The models were then tested for performance using K-fold cross-validation. SHAP analysis was conducted on the input variables to assess the impact of each parameter on model accuracy and the effectiveness of predictions of CS.

To ensure that every feature contributed equally to the training process, min–max normalization was employed. Equation (1) shows the mathematical formula for the normalization of features.

X_{i}^{\prime} = \frac{X_{i} - X_{min}}{X_{max} - X_{min}} \quad (i = 1, 2, \dots, n)

(1)

where X_{i} is the original value of feature X, X_{i}^{\prime} is its normalized value, and X_{min} and X_{max} are the minimum and maximum values of feature X.
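A minimal sketch of the min–max scaling in Equation (1) is given below. The function name `min_max_normalize`, the handling of a constant feature, and the cement contents are illustrative assumptions rather than details from the study.

```python
def min_max_normalize(values):
    """Scale a feature to the range [0, 1] following Equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros
        # (an assumption, since Equation (1) is undefined in this case).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical cement contents in kg/m^3.
cement = [300.0, 350.0, 400.0, 500.0]
scaled = min_max_normalize(cement)  # [0.0, 0.25, 0.5, 1.0]
```

In practice, the minimum and maximum would normally be taken from the training set only and reused for the test set, so that no test information leaks into training.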

The correlation coefficients (R) and heat map of the Pearson’s correlation coefficient (Figure 4) with respect to the input variables and the output (CS) for the preprocessing of the data for the RSM and ML models provide useful information about the relationships between these variables. In regression analyses, the R value is useful for identifying potential multicollinearity issues. This statistical analysis offers a comprehensive examination of the distribution of and interrelationship between the independent variables that affect the response (CS). An R value near −1 indicates a strong negative correlation, a value near 1 indicates a strong positive correlation, and a value around 0 indicates no or little correlation. The values range from −1 to 1. When examining the input–output relationship, it is essential to pay attention to the size and significance of the correlation coefficients. With R-values of 0.06, 0.26, 0.02, 0.48, 0.22, 0.72, and 0.24, it is clear that the C, NFA, NCA, SF, PPF, SP, and AGE are positively correlated with CS. RA (R = −0.14), FA (R = −0.07), and W/C (R = −0.49) are negatively correlated with CS. Figure 5 shows the scatter plots for the relationship between the input variables and the output, indicating whether each variable is positively correlated, negatively correlated, or uncorrelated with the output (CS).
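For reference, Pearson's correlation coefficient between one input and CS can be computed as in the sketch below. The function name `pearson_r` and the W/C and CS values are hypothetical and serve only to show a negative correlation of the kind reported for W/C.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: strength falls as the water-to-cement ratio rises.
w_c = [0.35, 0.40, 0.45, 0.50, 0.55]
cs = [52.0, 47.5, 41.0, 38.2, 33.1]
r = pearson_r(w_c, cs)  # close to -1, i.e., strongly negative
```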

The CS is given as a function of the ten input variables of the dataset to show their effect on the output parameter. Analysis of Figure 5 and Figure 6 reveals that the compressive strength of concrete containing RA, SF, FA, and PPF is greatly affected by the RA, SF, PPF, W/C, SP, and AGE. As the amount of RA increases, the CS decreases. There is an inverse relationship between the W/C and compressive strength. Figure 5 and Figure 6 show that the compressive strength of concrete was positively affected by C, NFA, SF, PPF, SP, and AGE. SF and SP also tend to boost compressive strength more than PPF (Figure 5 and Figure 6).

2.1. Response Surface Methodology (RSM)

RSM is used to design, simulate, evaluate, develop, and improve optimization processes using advanced statistical and mathematical approaches. It is a powerful multi-objective optimization method for response and variable objectives [45]. RSM supports central composites, historical data, and user-defined models. The type of data and number of variables govern which model should be used. This method can be used to develop models and analyze a pre-prepared experiment matrix using a user-defined approach [69]. CCD was implemented using Design-Expert (Version 13). CCD is the simplest and most frequent RSM design approach used in construction [45,70,71]. Linear or higher-degree polynomials connect independent variables to responses. Equation (2) shows the multiple-degree polynomials that best express RSM-generalized equations [72].

Y = ω_{0} + \sum_{i = 1}^{t} ω_{i} X_{i} + \sum_{i = 1}^{t} ω_{ii} X_{i}^{2} + \sum_{i < j} ω_{ij} X_{i} X_{j} + \mathrm{error},

(2)

where Y is the response; X_{i} and X_{j} are the coded values of the i-th and j-th inputs; ω_{0} is the intercept; ω_{i}, ω_{ii}, and ω_{ij} are the coefficients of the linear, quadratic, and interaction terms, respectively; and t is the number of inputs.
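To make the structure of the second-order model concrete, the sketch below evaluates a quadratic response surface with linear, squared, and pairwise interaction terms for a given input vector. The function name `quadratic_response` and all coefficient values are made up for illustration and are not the fitted coefficients of this study.

```python
def quadratic_response(x, intercept, linear, quadratic, interaction):
    """Evaluate Y = w0 + sum(w_i x_i) + sum(w_ii x_i^2) + sum_{i<j}(w_ij x_i x_j)."""
    y = intercept
    for i, xi in enumerate(x):
        y += linear[i] * xi + quadratic[i] * xi ** 2
    # Interaction coefficients are keyed by index pairs (i, j) with i < j.
    for (i, j), w_ij in interaction.items():
        y += w_ij * x[i] * x[j]
    return y


# Two hypothetical coded inputs with made-up coefficients.
y = quadratic_response(
    x=[1.0, 2.0],
    intercept=10.0,
    linear=[2.0, -1.0],
    quadratic=[0.5, 0.25],
    interaction={(0, 1): 0.1},
)
# 10 + (2 - 2) + (0.5 + 1.0) + 0.2 = 11.7
```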

In this study, ANOVA was used to examine the actual data and determine which factors of the input variables (C, NFA, NCA, RA, FA, SF, PPF, W/C, SP, and AGE) had the greatest effects on the response (CS). Design Expert software was utilized for the statistical analysis. A model-based analysis of the responses was carried out to determine the effect of the input parameters on the predicted output. To ensure that the model accurately reflected actual data, it was subjected to statistical validation and verification. Table 2 summarizes the ANOVA results. At 95% confidence and a p-value < 0.05, the R², adjusted R², predicted R², adequate precision, p-value, CV, and F-value are shown. All responses had p-values less than 0.05 and F-values greater than 4, showing that the suggested models were appropriate and that independent parameters, model coefficients, and interaction terms affected the results. To validate the model, the signal-to-noise ratio was computed precisely. This ratio should exceed four. The model can navigate the design space because the adequate precision of the responses was greater than four, indicating a sufficient signal. The lack-of-fit F-value of 0.99 indicates that the lack of fit is not significant, which means that the model is good at predicting the response. The significant model terms were NFA, NCA, RCA, AGE, C × NFA, C × (W/C), NFA × FA, NFA × PPF, NFA × (W/C), NFA × SP, NFA × AGE, FA × (W/C), FA × AGE, SF × (W/C), SF × SP, SF × AGE, (W/C) × SP, (W/C) × AGE, SP × AGE, RCA², PPF², (W/C)², SP², and AGE². The predicted R² of 0.8670 aligns reasonably with the adjusted R² of 0.8787, with a difference < 0.2. The adequate precision ratio was 65.673, which indicates that the signal is adequate and the model can serve as a tool for navigating the design space.

2.2. Machine Learning (ML) Approach

Machine learning (ML) approaches were utilized to predict the CS of concrete containing RA, SF, FA, and PPF, and precise machine learning prediction models were developed. These models were built primarily using the M5P and random forest (RF) algorithms [73,74].

2.2.1. M5P-Tree (M5P)

The M5P algorithm [75], an enhanced variant of Quinlan’s M5 algorithm [76], is a decision tree with linear regression (LR) functions at its leaf nodes. M5P combines decision trees and linear regression to improve interpretability and forecasting accuracy. This method makes it easier to solve complicated problems by breaking them down into their constituent parts and then combining the results of solving each of these parts. Figure 7a shows a decision tree structure that takes two parameters into account and uses the sample-space-partitioning method. Data splitting is optimized by choosing the input variables and split values that most reduce the node error. For each node, the standard deviation reduction (SDR) is used to assess the error.
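The SDR criterion of the M5 family is commonly written as SDR = sd(T) − Σ (|T_i|/|T|) · sd(T_i), where T is the set of samples reaching a node and T_i are the subsets produced by a candidate split. The sketch below applies that formula to a single feature. The function names `sdr` and `best_split`, the use of the population standard deviation, and the data are assumptions; Weka's M5P implementation may differ in detail.

```python
from statistics import pstdev


def sdr(parent, subsets):
    """Standard deviation reduction achieved by splitting `parent` into `subsets`."""
    n = len(parent)
    weighted = sum(len(s) / n * pstdev(s) for s in subsets)
    return pstdev(parent) - weighted


def best_split(x, y):
    """Return the (threshold, SDR) pair that maximizes SDR for one feature."""
    pairs = sorted(zip(x, y))
    best = (None, float("-inf"))
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical feature values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [t for _, t in pairs[:k]]
        right = [t for _, t in pairs[k:]]
        gain = sdr(y, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical curing ages (days) and strengths (MPa).
age = [3, 7, 7, 28, 28, 90]
cs = [18.0, 25.0, 27.0, 40.0, 42.0, 48.0]
threshold, gain = best_split(age, cs)  # splits between 7 and 28 days
```

For these values, the split at 17.5 days separates early-age from later-age strengths and gives the largest reduction.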

Repeated splitting of the data produces a very large tree. In the subsequent phase, this tree is pruned, and the pruned subtrees are replaced with LR functions. The terminal (leaf) nodes in Figure 7a contain the LR functions.

2.2.2. Random Forest (RF)

Random forest is an ensemble learning algorithm that generates several decision trees and subsequently amalgamates their predictions to yield a final outcome. Each decision tree is constructed from a random selection of training data attributes and instances. The ultimate prediction is derived by averaging all the individual tree forecasts, as seen in Figure 7b [77]. The equation for random forest prediction is shown in Equation (3).

\mathrm{Response} = \frac{1}{n} \sum_{i = 1}^{n} m_{i}

(3)

where m_{i} denotes the predicted value of the i-th individual tree and n is the number of trees, so that the response is the average of the tree forecasts.
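The averaging step can be illustrated with a deliberately small forest. In the sketch below, each "tree" is a one-split stump fitted to a bootstrap sample of a single feature, and the forest prediction is the mean of the tree outputs. The names `fit_stump`, `fit_forest`, and `predict_forest`, the number of trees, and the data are assumptions. A full random forest also samples a random subset of features at each split, which is omitted here because only one feature is used.

```python
import random
from statistics import mean


def fit_stump(x, y):
    """Fit a one-split regression tree (stump) on a single feature."""
    best = None
    for threshold in sorted(set(x))[1:]:
        left = [t for v, t in zip(x, y) if v < threshold]
        right = [t for v, t in zip(x, y) if v >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum(
            (t - right_mean) ** 2 for t in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all feature values equal: predict the mean everywhere
        overall = mean(y)
        return lambda v: overall
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v < threshold else right_mean


def fit_forest(x, y, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(x)) for _ in x]
        trees.append(fit_stump([x[i] for i in sample], [y[i] for i in sample]))
    return trees


def predict_forest(trees, v):
    """Average the individual tree predictions."""
    return mean(tree(v) for tree in trees)


# Hypothetical curing ages (days) and strengths (MPa).
age = [3, 7, 7, 28, 28, 90]
cs = [18.0, 25.0, 27.0, 40.0, 42.0, 48.0]
trees = fit_forest(age, cs)
cs_at_28 = predict_forest(trees, 28)
```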

2.2.3. Extreme Gradient Boosting (XGB)

XGB is a tree-based technique introduced by Chen and Guestrin [78] that utilizes the concept of boosting. XGB improves the accuracy of predictions by combining the outputs of several decision trees that are built sequentially. Each new tree is fitted to reduce the errors left by the earlier trees [79]. A predictive model characterized by high accuracy and generalizability was developed through the XGB algorithm, which builds on decision trees and gradient boosting. In this method, decision trees serve as weak learners, and the ensemble is improved iteratively by adding new trees that correct the errors generated by earlier ones. Compared to alternative techniques, it demonstrates a superior capacity to manage numerous attributes, performs effectively with high-dimensional data, and mitigates overfitting through the incorporation of a regularization term [80]. Additionally, it approximates the loss function using a second-order Taylor series expansion. It enables precise control over model parameters, significantly enhancing performance, and offers extensive customization options [81]. The objective function is the sum of the loss function and the regularization term. The loss function quantifies the discrepancy between model predictions and actual values, while the regularization term imposes a penalty on overly complex models.

Figure 7c depicts the overall XGB prediction process. The model starts with a single decision tree. Next, the residual, i.e., the discrepancy between the predicted and experimental values, is calculated. A new decision tree is then trained on this residual to rectify the errors generated by the preceding one. In this study, each additional tree was integrated into the ensemble, and the predicted values were revised by combining the estimates from all trees. The residual calculation and tree addition were repeated until the stopping criterion was met.
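The residual-fitting loop can be sketched with plain gradient boosting on squared error. This is a simplified stand-in and not XGBoost itself: it omits the regularization term and the second-order approximation of the loss, and it uses one-split stumps on a single feature. The names `fit_stump` and `boost`, the learning rate, the number of rounds, and the data are assumptions.

```python
from statistics import mean


def fit_stump(x, residuals):
    """Fit a one-split regression tree to the current residuals.

    Assumes x contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(x))[1:]:
        left = [r for v, r in zip(x, residuals) if v < threshold]
        right = [r for v, r in zip(x, residuals) if v >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v < threshold else right_mean


def boost(x, y, n_rounds=50, learning_rate=0.3):
    """Add trees one at a time, each fitted to the residuals of the ensemble."""
    base = mean(y)  # initial prediction
    trees = []
    predictions = [base] * len(y)
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, predictions)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(v) for p, v in zip(predictions, x)]
    return lambda v: base + learning_rate * sum(tree(v) for tree in trees)


# Hypothetical water-to-cement ratios and strengths (MPa).
w_c = [0.35, 0.40, 0.45, 0.50, 0.55]
cs = [52.0, 47.5, 41.0, 38.2, 33.1]
model = boost(w_c, cs)

baseline_mse = mean((t - mean(cs)) ** 2 for t in cs)
boosted_mse = mean((t - model(v)) ** 2 for v, t in zip(w_c, cs))
```

Here the stopping criterion is simply a fixed number of rounds; the training error of the boosted ensemble falls below that of the constant baseline because each stump removes part of the remaining residual.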

2.3. Model Efficiencies

In this study, the developed models’ predictions were evaluated using the following statistical performance metrics: R², MAE, RMSE, and MAPE. In many applications, including regression analysis and machine learning, these measures are employed to evaluate model performance. Each indicator reflects model reliability and correctness, which affects engineering decision making [82]. The R² is critical in regression analysis because it measures the explanatory power of a model rather than the size of the prediction error. High R² values indicate that the independent variables account for a considerable share of the dependent variable’s variation. Equation (4) is the mathematical expression for R². MAE represents the average absolute discrepancy between predicted and experimental values. Equation (5) gives the MAE. RMSE measures how far predicted values differ from actual values and, because the differences are squared, penalizes large errors more heavily. Equation (6) is used for the calculation of RMSE. MAPE quantifies the average error of a model’s predictions relative to the actual values. Because it is a relative measure, it simplifies prediction model comparison. In ML applications where several algorithms produce varied outcomes, MAPE can be used to evaluate which model makes the most accurate predictions. Equation (7) shows the mathematical expression for MAPE. A low MAPE value indicates greater prediction accuracy.

R^{2} = {\left[\frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2} \sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}\right]}^{2}

(4)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} \left| x_{i} - y_{i} \right|

(5)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} {(x_{i} - y_{i})}^{2}}{n}}

(6)

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \frac{\left| x_{i} - y_{i} \right|}{x_{i}}

(7)

where x_{i} and y_{i} are the actual and predicted data, n is the number of instances, and \bar{x} and \bar{y} are the means of the experimental data and predictions.
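The four metrics can be computed as in the following sketch, with R² taken as the squared Pearson correlation between actual and predicted values, MAE and RMSE in the units of the data, and MAPE returned as a fraction (multiply by 100 for a percentage). The function names and the three sample values are illustrative assumptions.

```python
from math import sqrt


def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted strengths in MPa.
actual = [40.0, 50.0, 60.0]
predicted = [42.0, 49.0, 63.0]

print(f"R2   = {r_squared(actual, predicted):.4f}")
print(f"MAE  = {mae(actual, predicted):.2f} MPa")    # 2.00 MPa
print(f"RMSE = {rmse(actual, predicted):.2f} MPa")   # 2.16 MPa
print(f"MAPE = {100 * mape(actual, predicted):.2f}%")  # 4.00%
```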

3. Results and Discussion

3.1. RSM

Several regression transformation approaches and independent variable interactions were studied to model the CS response. Quadratic regression was used to model the interaction between input variables and the subsequent output. The models generated throughout the experiment exhibit statistical significance and are applicable for predicting outputs based on input variables, particularly when R² values are close to one. The regression Equation (8) shows the quadratic model for CS. The R², adjusted R², and predicted R² were 0.8860, 0.8787, and 0.8670, respectively, for the CS model. The variations between the adjusted and predicted R² were < 0.2 for the quadratic model for CS, indicating reasonable agreement.

\begin{matrix} C S = 493.55 - 0.574 C - 0.25 N F A - 0.0397 N C A - 0.053 R C A - 0.859 F A - 0.313 S F + 27 P P F - 864.7 (W / C) \\ + 7.03 S P + 0.585 A G E + 0.0003 C \times N F A - 0.00021 C \times S F + 0.83 C \times (W / C) \\ + 0.00053 N F A \times F A - 0.025 N F A \times P P F + 0.173 N F A \times (W / C) + 0.008 N F A \times S P \\ + 0.00054 N F A \times A G E + 0.949 F A \times (W / C) + 0.00074 F A \times A G E + 0.923 S F \times (W / C) \\ + 0.06 S F \times S P - 0.003 S F \times A G E - 1.05 P P F \times S P - 14.88 (W / C) \times S P - 1.038 (W / C) \times A G E \\ + 0.029 S P \times A G E + 5.052 \times 10^{- 6} {R C A}^{2} - 1.91 {P P F}^{2} + 387.42 {(W / C)}^{2} - 0.983 {S P}^{2} \\ - 0.0024 {A G E}^{2} \end{matrix}

(8)

The residuals of the predictions and the experimental vs. predicted data are plotted in Figure 8 for the developed model for CS. In this figure, blue dots represent the minimum output values, gradually transitioning to red dots for higher values. These plots indicate that the experimental and predicted outcomes matched well and that most points lie near the reference line, supporting the regression model’s performance. Figure 8a shows the residuals arranged by run order. The run-order plot of the residuals is used to validate the regression model because any consistent and regular pattern could indicate issues. The regression analysis assumes that the residuals are independent, which is supported by the random distribution of data points along the straight line. The model appears to be steady and not to drift, because the data points fluctuate up and down across the run order without a sustained trend. This shows that the projections are valid and can predict explanatory factor effects [83]. Overall, the diagnostic charts demonstrate that the model is acceptable.

Optimization by RSM

Figure 9 depicts the results of the RSM optimization performed on the CS as well as the associated factors. The red and blue dots represent the CS parameters (C, NFA, NCA, RA, FA, SF, PPF, W/C, SP, and AGE) and responses (CS), respectively, for succeeding optimization. The optimization objectives were set to “maximize” as the target of the solution for CS. The optimization objective for RA was set to “maximize”, the AGE was set to a value of 28 days, and the other factors were set to “in range”. The best CS was found to be 115 MPa at a 100% volume of RA of coarse aggregate, 1.13% PPF by volume of concrete, 7.90% FA, and 5.30% SF as partial replacements of binders by weight.

3.2. Performance of ML Models

The XGB, M5P, and RF—all ML approaches—were utilized to establish a basis for assessing the given data and gain significant insights. The results obtained using this methodology are presented in the next subsection, along with a comprehensive analysis of the findings. To evaluate the performance and fitting effectiveness of the ML models, they were tested on new and unseen data after hyperparameter selection and training.

3.2.1. XGB Model

Figure 10a,b present the results of the statistical analysis of both the actual and predicted values for the compressive strength of the RAC, as obtained with the XGB model. The scatter plots provide a comparison of the experimental and predicted CS. Almost all the predicted values fall within the ±20% error margins. The scatter plots of the training and testing datasets indicate that the points are closely clustered around the ideal line. The scatter around the ideal line in both datasets is minimal, suggesting that the predictions made by the XGB model closely align with the actual values. The XGB model demonstrates a high determination coefficient and minimal error between the predicted and actual values, indicating accurate predictions. The R², MAPE, RMSE, and MAE for the training dataset were 0.97896, 3.19%, 2.32 MPa, and 1.15 MPa, respectively. For the test dataset, these values were 0.9485, 6.45%, 3.98 MPa, and 2.48 MPa, respectively.
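The share of predictions falling inside a ±20% band around the measured values can be checked with a short helper such as the one below. The function name `fraction_within_band` and the sample values are hypothetical; they do not reproduce the results in Figure 10.

```python
def fraction_within_band(actual, predicted, tolerance=0.20):
    """Fraction of predictions within +/- tolerance of the actual values."""
    inside = sum(
        1 for a, p in zip(actual, predicted) if abs(p - a) <= tolerance * abs(a)
    )
    return inside / len(actual)


# Hypothetical measured and predicted strengths in MPa.
actual = [30.0, 45.0, 60.0, 25.0]
predicted = [33.0, 44.0, 75.0, 24.0]
share = fraction_within_band(actual, predicted)  # 0.75: one point lies outside
```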

3.2.2. M5P Model

Figure 11 provides an M5P data split diagram for compressive strength prediction. Equations (9) to (20) present the constructed LR models, LM1 to LM12, for RAC incorporating FA, SF, and PPF at the leaf nodes. The dataset shows an error range of ±20% for the CS of the concrete. The findings reveal that almost all the recorded values were within the ±20% error limit (Figure 10c,d). The training and testing datasets’ scatter plots exhibit tight clustering of points around the ideal line, with few scattered points indicating minimal variation. In both datasets, the divergence from the optimal line is negligible, indicating that the model can accurately predict the result.

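\begin{matrix} \mathrm{LM1}: \mathrm{CS} = 29.9604 - 0.002\,\mathrm{C} - 0.0015\,\mathrm{NFA} - 0.0026\,\mathrm{NCA} - 0.0068\,\mathrm{RA} - 0.008\,\mathrm{FA} - 0.0076\,\mathrm{SF} + 0.349\,\mathrm{PPF} - \\ 11.3011\,(\mathrm{W/C}) + 0.3019\,\mathrm{SP} + 0.5783\,\mathrm{AGE} \end{matrix}

(9)

\begin{matrix} \mathrm{LM2}: \mathrm{CS} = 32.1631 - 0.006\,\mathrm{C} - 0.0117\,\mathrm{NFA} - 0.0019\,\mathrm{NCA} - 0.0056\,\mathrm{RA} - 0.0121\,\mathrm{FA} - 0.0053\,\mathrm{SF} + \\ 0.349\,\mathrm{PPF} + 34.7258\,(\mathrm{W/C}) + 0.3019\,\mathrm{SP} + 0.0413\,\mathrm{AGE} \end{matrix}

(10)

\begin{matrix} \mathrm{LM3}: \mathrm{CS} = 29.9604 - 0.002\,\mathrm{C} - 0.0015\,\mathrm{NFA} - 0.0026\,\mathrm{NCA} - 0.0068\,\mathrm{RA} - 0.008\,\mathrm{FA} - 0.0076\,\mathrm{SF} + 0.349\,\mathrm{PPF} - \\ 11.3011\,(\mathrm{W/C}) + 0.3019\,\mathrm{SP} + 0.5783\,\mathrm{AGE} \end{matrix}

(11)

\begin{matrix} \mathrm{LM4}: \mathrm{CS} = 63.679 - 0.0071\,\mathrm{C} - 0.001\,\mathrm{NFA} - 0.0027\,\mathrm{NCA} - 0.0056\,\mathrm{RA} - 0.0221\,\mathrm{FA} - 0.0053\,\mathrm{SF} + 0.349\,\mathrm{PPF} - \\ 50.9311\,(\mathrm{W/C}) + 0.3019\,\mathrm{SP} + 0.0544\,\mathrm{AGE} \end{matrix}

(12)

\begin{matrix} \mathrm{LM5}: \mathrm{CS} = 86.5485 + 0.0062\,\mathrm{C} + 0.005\,\mathrm{NCA} - 0.0076\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 6.0598\,\mathrm{PPF} - 122.0337\,(\mathrm{W/C}) + \\ 0.7471\,\mathrm{SP} + 0.371\,\mathrm{AGE} \end{matrix}

(13)

\begin{matrix} \mathrm{LM6}: \mathrm{CS} = 10.1455 + 0.045\,\mathrm{C} + 0.0149\,\mathrm{NFA} + 0.0019\,\mathrm{NCA} - 0.004\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 1.984\,\mathrm{PPF} - \\ 24.8504\,(\mathrm{W/C}) + 1.1376\,\mathrm{SP} + 0.3579\,\mathrm{AGE} \end{matrix}

(14)

\begin{matrix} \mathrm{LM7}: \mathrm{CS} = 3.0095 + 0.0547\,\mathrm{C} + 0.0225\,\mathrm{NFA} - 0.0003\,\mathrm{NCA} - 0.0077\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 1.984\,\mathrm{PPF} - \\ 24.8504\,(\mathrm{W/C}) + 0.6006\,\mathrm{SP} + 0.5604\,\mathrm{AGE} \end{matrix}

(15)

\begin{matrix} \mathrm{LM8}: \mathrm{CS} = 10.5398 + 0.0448\,\mathrm{C} + 0.0227\,\mathrm{NFA} - 0.0003\,\mathrm{NCA} - 0.0088\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 1.984\,\mathrm{PPF} - \\ 24.8504\,(\mathrm{W/C}) + 0.6006\,\mathrm{SP} + 0.4331\,\mathrm{AGE} \end{matrix}

(16)

\begin{matrix} \mathrm{LM9}: \mathrm{CS} = 46.2424 + 0.022\,\mathrm{C} + 0.0069\,\mathrm{NFA} - 0.0053\,\mathrm{NCA} - 0.0113\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 4.6347\,\mathrm{PPF} - \\ 35.6615\,(\mathrm{W/C}) + 0.5282\,\mathrm{SP} + 0.0452\,\mathrm{AGE} \end{matrix}

(17)

\begin{matrix} \mathrm{LM10}: \mathrm{CS} = 57.9508 + 0.0104\,\mathrm{C} + 0.0069\,\mathrm{NFA} - 0.0061\,\mathrm{NCA} - 0.0113\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 6.0981\,\mathrm{PPF} - \\ 40.5718\,(\mathrm{W/C}) - 0.9591\,\mathrm{SP} + 0.0508\,\mathrm{AGE} \end{matrix}

(18)

\begin{matrix} \mathrm{LM11}: \mathrm{CS} = 52.6181 + 0.0366\,\mathrm{C} + 0.0056\,\mathrm{NFA} - 0.0061\,\mathrm{NCA} - 0.0154\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 2.9047\,\mathrm{PPF} - \\ 43.2214\,(\mathrm{W/C}) + 0.9227\,\mathrm{SP} + 0.053\,\mathrm{AGE} \end{matrix}

(19)

\begin{matrix} \mathrm{LM12}: \mathrm{CS} = 74.9424 - 0.0437\,\mathrm{C} + 0.0056\,\mathrm{NFA} - 0.0074\,\mathrm{NCA} - 0.0174\,\mathrm{RA} - 0.0032\,\mathrm{FA} + 3.6085\,\mathrm{PPF} - \\ 36.2764\,(\mathrm{W/C}) + 0.9227\,\mathrm{SP} + 0.0687\,\mathrm{AGE} \end{matrix}

(20)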
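Each leaf of the M5P tree is an ordinary linear model, so once a sample has been routed to a leaf, the prediction is an intercept plus a weighted sum of the inputs. The sketch below evaluates LM5 using the coefficients of Equation (13). The mix values and their units are hypothetical assumptions made for illustration, the function name `predict_leaf` is not from the paper, and the routing of a sample through the splits in Figure 11 is not reproduced here.

```python
# Coefficients of leaf model LM5, copied from Equation (13).
# NFA and SF do not appear in LM5, so they carry no weight in this leaf.
LM5 = {
    "intercept": 86.5485,
    "C": 0.0062,
    "NCA": 0.005,
    "RA": -0.0076,
    "FA": -0.0032,
    "PPF": 6.0598,
    "W/C": -122.0337,
    "SP": 0.7471,
    "AGE": 0.371,
}


def predict_leaf(coeffs, mix):
    """Evaluate one M5P leaf: intercept plus the weighted sum of the listed inputs."""
    return coeffs["intercept"] + sum(
        weight * mix[name] for name, weight in coeffs.items() if name != "intercept"
    )


# Hypothetical mix (values and units are assumptions, not data from the study).
mix = {
    "C": 400.0, "NFA": 700.0, "NCA": 500.0, "RA": 500.0, "FA": 40.0,
    "SF": 20.0, "PPF": 0.5, "W/C": 0.40, "SP": 4.0, "AGE": 28.0,
}
cs = predict_leaf(LM5, mix)
print(f"LM5 prediction for the hypothetical mix: {cs:.2f}")
```

Because every leaf is linear in the inputs, a product such as AGE × SP can only be captured through the tree splits, not within a leaf.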

Compared to the M5P model, the XGB model predicted CS more accurately, as evidenced by its higher R² and lower prediction errors. For the XGB model, the R² values were 0.9790 for the training dataset and 0.9485 for the test dataset. Its MAE values were 1.15 and 2.48 MPa for the training and testing datasets, respectively, and its RMSE values were 2.32 MPa for the training dataset and 3.98 MPa for the test dataset. The MAPE values were 3.19% and 6.45% for the training and test datasets, respectively. These numbers indicate the higher precision of the XGB model.

Figure 12a,b compare the values predicted by the XGB model with the actual values of CS for the training and test datasets. Figure 12c,d present the same comparison for the M5P model. The CS values predicted by the XGB model were closer to the experimental results than those predicted by the M5P model. The inferior accuracy of the M5P model compared to the XGB model can be ascribed to its failure to consider interactions among variables, which may result in suboptimal outcomes when significant feature interactions are present.

3.2.3. Random Forest Model (RF)

The RF technique also predicted CS effectively, as demonstrated in Figure 10e,f, which show the predicted versus actual CS values together with the ±20% error bounds around the ideal line. The scatter plots show points closely clustered around the ideal line in both the training and testing datasets, suggesting that the RF predictions are close to the actual values, within the ±20% error margin. With R² values of 0.9751 and 0.9307 for the training and test datasets, respectively, it can be concluded that the RF model effectively predicted CS. For the RF model, the MAE values were 1.481 MPa and 3.116 MPa for the training and test datasets, respectively. The RMSE was 2.56 MPa for the training dataset and 4.728 MPa for the test dataset. The MAPE values for the training and test datasets were 4.187% and 8.208%, respectively. These numbers indicate the good accuracy of the RF model. The predictions of the XGB and RF models are similar to each other and closer to the actual results than those made by the M5P model, but the XGB model is more accurate in predicting CS than the RF model (Figure 12).

3.3. Comparison Between the Developed Models

The statistical measures shown in Figure 13, including R², MAE, RMSE, and MAPE, were utilized to assess the performance of the developed ML models. The XGB model has the highest R² and the lowest error of prediction when compared to the RF and M5P models, indicating the superior accuracy of the XGB model constructed. Figure 14 shows a comparison of the CS prediction errors based on all the developed models for the training and test datasets. It is clear that the error values of the XGB model’s predictions were less than those of the RF and M5P models. The XGB model demonstrated superior accuracy in predicting the compressive strength of RAC containing FA, SF, and PPF.

3.4. Cross Validation

K-fold cross-validation was employed to evaluate the robustness of the developed models over various data subsets. This method helps reduce overfitting and bias in the performance estimate. Ten folds were used: in each iteration, 90% of the dataset was used for training, while the remaining 10% was used for testing, and the statistical measures were computed on the held-out fold. The scores remained precise across the ten iterations. Figure 15 depicts the findings of 10-fold cross-validation for all the developed models, including the R², MAE, RMSE, and MAPE. The XGB model has R² values that range between 0.9338 and 0.9920, with an average of 0.9768 and an SD of 0.0182. The M5P model has R² values ranging between 0.7724 and 0.9224, with a mean of 0.8826 and an SD of 0.0437. The RF model has R² values ranging between 0.8927 and 0.9545, with a mean of 0.9340 and an SD of 0.0171.
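The 10-fold procedure can be sketched as follows, using only the Python standard library. This is an illustrative outline, not the authors' pipeline: a one-variable least-squares line stands in for the XGB, M5P, and RF models, the data are synthetic, the shuffling seed is arbitrary, and the sample standard deviation is assumed for the SD.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, score, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, and repeat for every fold."""
    scores = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        preds = [predict(model, x[i]) for i in held_out]
        scores.append(score([y[i] for i in held_out], preds))
    return scores


def fit_line(xs, ys):
    """Stand-in model: ordinary least squares with a single input."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    return slope, my - slope * mx


def predict_line(model, value):
    slope, intercept = model
    return slope * value + intercept


def mae(actual, predicted):
    return statistics.fmean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic data with 529 samples, matching only the size of the study's dataset.
x = [i / 10 for i in range(529)]
y = [2.0 * v + 1.0 + 0.1 * ((i % 7) - 3) for i, v in enumerate(x)]

scores = cross_validate(x, y, fit_line, predict_line, mae, k=10)
print("fold MAE range:", min(scores), max(scores))
print("mean:", statistics.fmean(scores), "SD:", statistics.stdev(scores))
```

With 529 samples and ten folds, nine folds hold 53 samples and one holds 52, so each iteration trains on roughly 90% of the data and tests on the remaining 10%.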

The XGB model has MAE values that range between 0.4762 and 1.8941 MPa, with a mean of 1.315 MPa and an SD of 0.554 MPa. The M5P model has MAE values ranging between 3.507 and 4.496 MPa, with a mean of 3.994 MPa and an SD of 0.328 MPa. The RF model has MAE values ranging between 2.007 and 3.114 MPa, with a mean of 2.708 MPa and an SD of 0.358 MPa. The XGB model has RMSE values that range from 0.6085 to 2.687 MPa, yielding a mean of 1.892 MPa and an SD of 0.813 MPa. The M5P model has RMSE values that range from 4.603 to 6.212 MPa, yielding a mean of 5.45 MPa and an SD of 0.478 MPa. The RF model has RMSE values that range from 3.031 to 5.08 MPa, yielding a mean of 4.244 MPa and an SD of 0.624 MPa. The XGB model has MAPE values that range from 1.865% to 4.67%, yielding a mean of 3.46% and an SD of 1.15%. The M5P model has MAPE values that range from 10.40% to 14.82%, yielding a mean of 12.197% and an SD of 1.38%. The RF model has MAPE values that range from 6.10% to 8.20%, yielding a mean of 7.49% and an SD of 0.70%. These results indicate that XGB outperformed both the RF and M5P models in the 10-fold cross-validation analysis.
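The ranking implied by these figures can be checked with a few lines of code. The sketch below uses only the 10-fold mean values reported in this section; the dictionary layout and the function name `rank_models` are illustrative choices, and the units (MPa for MAE and RMSE, % for MAPE) are assumed from the earlier subsections.

```python
# 10-fold cross-validation means reported in this section
# (MAE and RMSE assumed to be in MPa, MAPE in %).
cv_means = {
    "XGB": {"R2": 0.9768, "MAE": 1.315, "RMSE": 1.892, "MAPE": 3.46},
    "M5P": {"R2": 0.8826, "MAE": 3.994, "RMSE": 5.45, "MAPE": 12.197},
    "RF": {"R2": 0.9340, "MAE": 2.708, "RMSE": 4.244, "MAPE": 7.49},
}

# R2 should be as high as possible; the three error metrics as low as possible.
higher_is_better = {"R2": True, "MAE": False, "RMSE": False, "MAPE": False}


def rank_models(means, metric):
    """Order the model names from best to worst for one metric."""
    return sorted(means, key=lambda name: means[name][metric], reverse=higher_is_better[metric])


rankings = {metric: rank_models(cv_means, metric) for metric in higher_is_better}
for metric, order in rankings.items():
    print(metric, "->", " > ".join(order))
```

On the reported means, the order is XGB, then RF, then M5P for all four metrics.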

3.5. SHAP Analysis for Feature Importance of the RF and XGB Models

SHAP (SHapley Additive exPlanations) analysis, proposed by Lundberg and Lee [84], is an approach to interpreting machine learning models; here, it is used to clarify the key factors affecting the compressive strength of RA concrete. By aggregating local SHAP explanations across the whole dataset, the analysis characterizes the global influence of each factor. The SHAP algorithm quantifies the influence of each individual input variable on the output. The SHAP analysis was applied to the RF model for CS. Figure 16 illustrates how the various features correlate with SHAP values for the CS of RAC containing SF, FA, and PPF. The features are ordered by influence: the top variable has the most significant effect on the predictions, whereas the bottom variable has the least effect. A transition in dot color from blue (low feature value) to red (high feature value) as the SHAP value increases indicates a positive correlation between the feature and the model’s outcome. Figure 16 indicates that the curing age of concrete exhibits the highest SHAP value in predicting the CS of RAC, followed by SP and the water/cement ratio. A greater curing age correlates with a higher SHAP value, indicating that the compressive strength of concrete increases with a longer curing duration. This outcome is consistent with the findings of Alamri et al. [28], which indicated that an increase in curing age is associated with an improvement in the compressive strength of concrete. Longer curing improves the hydration process of cement and supplementary cementitious materials, thereby improving the strength of the concrete [26,28]. This investigation’s findings, supported by experimental data from Ref. [85], indicate that an increase in SP concentration enhances the compressive strength of concrete. A smaller W/C results in a higher SHAP value, demonstrating that in concrete containing supplementary cementitious materials and RA, the compressive strength increases as the W/C decreases.
The contributions of cement, FA, and SF to the strength of concrete are positive. The PPF positively influences the prediction of CS. This influence is similar to what has been observed in recent experimental studies [1,22,29]. An increase in the replacement ratio of RA correlates with lower SHAP values, suggesting that a higher RA replacement ratio leads to a decrease in compressive strength. Recent research [1,22,24,25] has demonstrated that the use of 100% RA results in a reduction in the strength of concrete when compared to 0% RA. Increased NFA and NCA content generally resulted in higher SHAP values.
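The idea behind a SHAP value can be illustrated with a brute-force computation of exact Shapley values for a very small model. The sketch below is a conceptual illustration only: the `toy_strength` function, its coefficients, the baseline, and the sample are all hypothetical and are not the fitted RF or XGB models, and absent features are represented by a single baseline value, which is a simplification of what SHAP implementations for tree ensembles do.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution over all orderings."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)           # start with every feature at its baseline value
        previous = model(current)
        for name in order:
            current[name] = x[name]        # switch this feature to its actual value
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_strength(f):
    """Hypothetical strength model with one AGE x SP interaction (illustration only)."""
    return 20.0 + 0.4 * f["AGE"] + 2.0 * f["SP"] - 30.0 * f["W/C"] + 0.05 * f["AGE"] * f["SP"]


baseline = {"AGE": 7.0, "SP": 1.0, "W/C": 0.50}   # hypothetical reference mix
sample = {"AGE": 28.0, "SP": 3.0, "W/C": 0.40}    # hypothetical mix to explain

phi = shapley_values(toy_strength, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and in this toy case the lower W/C receives a positive contribution, in line with the trend described above.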

4. Conclusions

In this study, the XGB, M5P, and RF algorithms were used to model the CS of concrete containing RA, SF, FA, and PPF. Ten input variables were considered, namely, C, NFA, NCA, RA, FA, SF, PPF, W/C ratio, SP, and AGE, while the output was the compressive strength of concrete. Four statistical metrics were utilized to evaluate the models’ performance. SHAP analysis was employed to assess the significance and impact of the input parameters on CS prediction. The following conclusions were drawn.

The correlation coefficient values were illustrated as a heat map, demonstrating the relationships between the input and output parameters. The input parameter SP significantly influenced CS, with a value of 0.7155, followed by SF, which had a value of 0.48. The W/C input parameter had a large negative impact on CS, with a value of −0.4917, followed by RA, which had a value of −0.1365. PPF exhibited a positive correlation with CS, indicated by an R value of 0.2220.
Based on the optimization of the CS of RAC containing FA, SF, and PPF, the optimal CS was found to be 115 MPa at a 100% volume of RA consisting of coarse aggregate, 1.13% PPF by volume of concrete, 7.90% FA, and 5.30% SF as partial replacements of binders by weight.
The XGB model outperformed the RF and M5P models regarding robustness and accuracy in the 10-fold cross-validation.
The XGB prediction model demonstrated a robust correlation between the predicted and experimental data, achieving R² values of 0.9790 and 0.9485 for the training and test datasets, respectively, indicating the model’s high predictive power and its accurate representation of the dataset’s trend.
According to the results of the XGB model, the MAPE, RMSE, and MAE values for CS on the training dataset were 3.19%, 2.324 MPa, and 1.149 MPa, respectively, demonstrating that the XGB prediction model exhibited a training error rate below 5%. The XGB model had superior overall performance, with a higher R² and lower MAE, RMSE, and MAPE values than the other models.
The SHAP analysis conducted using the XGB and RF models indicated that factors such as curing age, SP, cement, NFA, NCA, SF, and FA positively influence compressive strength. In contrast, W/C and RA negatively affect CS. Curing age and SP exert the most significant influence relative to the other factors. Moreover, the RA input parameter plays a more significant role than the NCA. PPF positively affects compressive strength. While SF and FA contribute to compressive strength, their impact is less pronounced than that of cement.

5. Limitations and Future Work

This study combined a compiled dataset, three ML techniques (M5P, RF, and XGB), and SHAP analysis; however, its limitations must be acknowledged. ML models are tuned to the input parameters on which they were trained, so introducing new input parameters requires retraining, testing, and hyperparameter tuning. Data completeness is crucial to prediction model accuracy. In this study, we used 10 input variables with 529 data points, but additional input variables should be considered to determine their importance in predicting the CS of RAC. These input parameters include the sources of NFA, NCA, and RCA; the crushing index and absorption of both NCA and RA; the RA treatment process; the fineness modulus of NFA; the pozzolanic component ratios and the specific surface area of SF and FA; and the tensile strength and length-to-diameter ratio of PPF. The size of the required dataset depends on the number of input variables considered in the study. ML models should also be used to forecast the mechanical and durability properties of RAC using a larger dataset with more explanatory variables, and current databases should be investigated using additional ML models. Finally, an analysis of the economic feasibility and cost-effectiveness of the combined incorporation of RA, PPF, SF, and FA in concrete production would provide a more comprehensive understanding of their potential for use in large-scale applications.

Author Contributions

Conceptualization, M.K.A. and H.A.D.; methodology, M.K.A. and H.A.D.; software, H.A.D.; validation, M.K.A.; formal analysis, H.A.D.; investigation, M.K.A. and H.A.D.; resources, M.K.A.; data curation, H.A.D.; writing—original draft preparation, H.A.D.; writing—review and editing, M.K.A.; visualization, H.A.D.; supervision, M.K.A.; project administration, M.K.A.; funding acquisition, M.K.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All the raw data supporting the conclusion of this paper were provided by the authors.

Acknowledgments

The researchers would like to thank the Deanship of Graduate Studies and Scientific Research at Qassim University for financial support (QU-APC-2025).

Conflicts of Interest

The authors declare there are no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RA	Recycled aggregate
RAC	Recycled-aggregate concrete
PPF	Polypropylene fiber
FA	Fly ash
SF	Silica fume
CS	Compressive strength
LR	Linear regression
ML	Machine learning
RF	Random forest
SCMs	Supplementary cementitious materials
CV	Cross-validation
C	Cement
NFA	Natural fine aggregate
CCD	Central composite design
NCA	Natural coarse aggregate
W/C	Water/binder ratio
SP	Superplasticizer
AGE	Curing period
SDR	Standard deviation reduction
R	Correlation coefficient
R²	Coefficient of determination
MAE	Mean absolute error
RMSE	Root mean squared error
MAPE	Mean absolute percentage error
SD	Standard deviation
RSM	Response surface methodology
ANOVA	Analysis of variance

References

Alharthai, M.; Ali, T.; Qureshi, M.Z.; Ahmed, H. The Enhancement of Engineering Characteristics in Recycled Aggregates Concrete Combined Effect of Fly Ash, Silica Fume and PP Fiber. Alex. Eng. J. 2024, 95, 363–375. [Google Scholar] [CrossRef]
Younis, K.H.; Mustafa, S.M. Feasibility of Using Nanoparticles of SiO₂ to Improve the Performance of Recycled Aggregate Concrete. Adv. Mater. Sci. Eng. 2018, 2018, 1512830. [Google Scholar] [CrossRef]
Younis, K.H.; Pilakoutas, K. Strength Prediction Model and Methods for Improving Recycled Aggregate Concrete. Constr. Build. Mater. 2013, 49, 688–701. [Google Scholar] [CrossRef]
Kurda, R.; de Brito, J.; Silvestre, J.D. Water Absorption and Electrical Resistivity of Concrete with Recycled Concrete Aggregates and Fly Ash. Cem. Concr. Compos. 2019, 95, 169–182. [Google Scholar] [CrossRef]
Kurad, R.; Silvestre, J.D.; de Brito, J.; Ahmed, H. Effect of Incorporation of High Volume of Recycled Concrete Aggregates and Fly Ash on the Strength and Global Warming Potential of Concrete. J. Clean. Prod. 2017, 166, 485–502. [Google Scholar] [CrossRef]
Adessina, A.; Ben Fraj, A.; Barthélémy, J.F.; Chateau, C.; Garnier, D. Experimental and Micromechanical Investigation on the Mechanical and Durability Properties of Recycled Aggregates Concrete. Cem. Concr. Res. 2019, 126, 105900. [Google Scholar] [CrossRef]
Tam, V.W.Y.; Tam, C.M.; Wang, Y. Optimization on Proportion for Recycled Aggregate in Concrete Using Two-Stage Mixing Approach. Constr. Build. Mater. 2007, 21, 1928–1939. [Google Scholar] [CrossRef]
Thomas, J.; Thaickavil, N.N.; Wilson, P.M. Strength and Durability of Concrete Containing Recycled Concrete Aggregates. J. Build. Eng. 2018, 19, 349–365. [Google Scholar] [CrossRef]
Kazmi, S.M.S.; Munir, M.J.; Wu, Y.F.; Patnaikuni, I.; Zhou, Y.; Xing, F. Influence of Different Treatment Methods on the Mechanical Behavior of Recycled Aggregate Concrete: A Comparative Study. Cem. Concr. Compos. 2019, 104, 103398. [Google Scholar] [CrossRef]
Xuan, D.; Zhan, B.; Poon, C.S. Durability of Recycled Aggregate Concrete Prepared with Carbonated Recycled Concrete Aggregates. Cem. Concr. Compos. 2017, 84, 214–221. [Google Scholar] [CrossRef]
Choi, H.; Choi, H.; Lim, M.; Inoue, M.; Kitagaki, R.; Noguchi, T. Evaluation on the Mechanical Performance of Low-Quality Recycled Aggregate Through Interface Enhancement Between Cement Matrix and Coarse Aggregate by Surface Modification Technology. Int. J. Concr. Struct. Mater. 2016, 10, 87–97. [Google Scholar] [CrossRef]
Wang, R.; Yu, N.; Li, Y. Methods for Improving the Microstructure of Recycled Concrete Aggregate: A Review. Constr. Build. Mater. 2020, 242, 118164. [Google Scholar] [CrossRef]
Saeed, M.K.; Al Sayed, A.A.-K.A.; Almutairi, A.D.; Dahish, H.A.; Al-Fasih, M.Y.M. Utilizing Alkali-Activated Recycled Concrete Aggregates from Demolished Structures to Investigate Concrete Properties in the Jeddah Region of Saudi Arabia. Sustainability 2025, 17, 1903. [Google Scholar] [CrossRef]
Pstrowska, K.; Gunka, V.; Prysiazhnyi, Y.; Demchuk, Y.; Hrynchuk, Y.; Sidun, I.; Kułażyński, M.; Bratychak, M. Obtaining of Formaldehyde Modified Tars and Road Materials on Their Basis. Materials 2022, 15, 5693. [Google Scholar] [CrossRef]
Xuan, D.; Zhan, B.; Poon, C.S. Assessment of Mechanical Properties of Concrete Incorporating Carbonated Recycled Concrete Aggregates. Cem. Concr. Compos. 2016, 65, 67–74. [Google Scholar] [CrossRef]
Katar, I.; Ibrahim, Y.; Abdul Malik, M.; Khahro, S.H. Mechanical Properties of Concrete with Recycled Concrete Aggregate and Fly Ash. Recycling 2021, 6, 23. [Google Scholar] [CrossRef]
Karthik, C.H.; Nagaraju, A. An Experimental Study on Recycled Aggregate Concrete with Partial Replacement of Cement with Flyash and Alccofine. In Proceedings of the Innovative Technology for Smart Construction Materials and Sustainable Infrastructure, online, 14–15 October 2022; Institute of Physics: London, UK, 2023; Volume 1130. [Google Scholar]
Shicong, K.; Poon, C.S. Compressive Strength, Pore Size Distribution and Chloride-Ion Penetration of Recycled Aggregate Concrete Incorporating Class-F Fly Ash. J. Wuhan Univ. Technol. Mater. Sci. Ed. 2006, 21, 130–136. [Google Scholar] [CrossRef]
Ali, B.; Qureshi, L.A.; Nawaz, M.A.; Aslam, H.M.U. Combined Influence of Fly Ash and Recycled Coarse Aggregates on Strength and Economic Performance of Concrete. Civ. Eng. J. 2019, 5, 832–844. [Google Scholar] [CrossRef]
Sunayana, S.; Barai, S.V. Partially Fly Ash Incorporated Recycled Coarse Aggregate Based Concrete: Microstructure Perspectives and Critical Analysis. Constr. Build. Mater. 2021, 278, 122322. [Google Scholar] [CrossRef]
Ali, B.; Ahmed, H.; Ali Qureshi, L.; Kurda, R.; Hafez, H.; Mohammed, H.; Raza, A. Enhancing the Hardened Properties of Recycled Concrete (RC) through Synergistic Incorporation of Fiber Reinforcement and Silica Fume. Materials 2020, 13, 4112. [Google Scholar] [CrossRef]
Ahmed, T.W.; Ali, A.A.M.; Zidan, R.S. Properties of High Strength Polypropylene Fiber Concrete Containing Recycled Aggregate. Constr. Build. Mater. 2020, 241, 118010. [Google Scholar] [CrossRef]
Nazir, S.; Mahajan, A.; Jaggi, S. An Experimental Study on Enhancing Recycled Aggregate Concrete Properties Through Silica Fume Incorporation. Res. Sq. 2023. [Google Scholar] [CrossRef]
Shahab, M.; Bashar, N. Effect of Silica Fume on Strength of Recycled Aggregate Concrete. Int. J. Res. Eng. Innov. 2024, 8, 101–107. [Google Scholar] [CrossRef]
Ismail, A.J.; Younis, K.H.; Maruf, S.M. Recycled Aggregate Concrete Made with Silica Fume: Experimental Investigation. Civ. Eng. Archit. 2020, 8, 1136–1143. [Google Scholar] [CrossRef]
Jahandari, S.; Mohammadi, M.; Rahmani, A.; Abolhasani, M.; Miraki, H.; Mohammadifar, L.; Kazemi, M.; Saberian, M.; Rashidi, M. Mechanical Properties of Recycled Aggregate Concretes Containing Silica Fume and Steel Fibres. Materials 2021, 14, 7065. [Google Scholar] [CrossRef] [PubMed]
Nadim, F.; Hasan, R.; Rahman Sobuz, H.; Ashraf, J.; Sadiqul Hasan, N.; Dip Datta, S.; Islam, H.; Islam, A.; Awall, R.; Rahman, S.A.; et al. Effect of Silica Fume on the Microstructural and Mechanical Properties of Concrete Made with 100% Recycled Aggregates. Rev. Constr. 2024, 23, 413–435. [Google Scholar] [CrossRef]
Alamri, M.; Ali, T.; Ahmed, H.; Qureshi, M.Z.; Elmagarhe, A.; Adil Khan, M.; Ajwad, A.; Sarmad Mahmood, M. Enhancing the Engineering Characteristics of Sustainable Recycled Aggregate Concrete Using Fly Ash, Metakaolin and Silica Fume. Heliyon 2024, 10, e29014. [Google Scholar] [CrossRef]
Ye, P.; Chen, Z.; Su, W. Mechanical Properties of Fully Recycled Coarse Aggregate Concrete with Polypropylene Fiber. Case Stud. Constr. Mater. 2022, 17, e01352. [Google Scholar] [CrossRef]
Sun, S.; Du, Y.; Sun, S.; Yu, Q.; Li, Y. Mechanical Properties of Recycled Concrete with Polypropylene Fiber and Its Bonding Performance with Rebars. Mater. Sci. 2024, 30, 396–403. [Google Scholar] [CrossRef]
Zhang, H.; Liu, Y.; Sun, H.; Wu, S. Transient Dynamic Behavior of Polypropylene Fiber Reinforced Mortar under Compressive Impact Loading. Constr. Build. Mater. 2016, 111, 30–42. [Google Scholar] [CrossRef]
Fallah, S.; Nematzadeh, M. Mechanical Properties and Durability of High-Strength Concrete Containing Macro-Polymeric and Polypropylene Fibers with Nano-Silica and Silica Fume. Constr. Build. Mater. 2017, 132, 170–187. [Google Scholar] [CrossRef]
Yan, P.; Chen, B.; Afgan, S.; Aminul Haque, M.; Wu, M.; Han, J. Experimental Research on Ductility Enhancement of Ultra-High Performance Concrete Incorporation with Basalt Fibre, Polypropylene Fibre and Glass Fibre. Constr. Build. Mater. 2021, 279, 122489. [Google Scholar] [CrossRef]
Zhang, L.; Li, X.; Li, C.; Zhao, J.; Cheng, S. Mechanical Properties of Fully Recycled Aggregate Concrete Reinforced with Steel Fiber and Polypropylene Fiber. Materials 2024, 17, 1156. [Google Scholar] [CrossRef] [PubMed]
Imran, H.; Al-Abdaly, N.M.; Shamsa, M.H.; Shatnawi, A.; Ibrahim, M.; Ostrowski, K.A. Development of Prediction Model to Predict the Compressive Strength of Eco-Friendly Concrete Using Multivariate Polynomial Regression Combined with Stepwise Method. Materials 2022, 15, 317. [Google Scholar] [CrossRef]
Dahish, H.A.; Elsayed, M.; Mohamed, M.; Elymany, M. Experimental Investigation on the Effect of Using Crumb Rubber and Recycled Aggregate on the Mechanical Properties of Concrete. ARPN J. Eng. Appl. Sci. 2021, 16, 2157–2168. [Google Scholar]
Dahish, H.A.; Bakri, M.; Alfawzan, M.S. Predicting the Strength of Cement Mortars Containing Natural Pozzolan and Silica Fume Using Multivariate Regression Analysis. Int. J. GEOMATE 2021, 20, 68–76. [Google Scholar] [CrossRef]
Dahish, H.A. Predicting the Compressive Strength of Concrete Containing Crumb Rubber and Recycled Aggregate Using Response Surface Methodology. Int. J. GEOMATE 2023, 24, 117–124. [Google Scholar] [CrossRef]
Dahish, H.A.; Alkharisi, M.K. Hybrid Fiber Reinforcement in HDPE-Concrete: Predictive Analysis of Fresh and Hardened Properties Using Response Surface Methodology. Buildings 2024, 14, 3479. [Google Scholar] [CrossRef]
Raveendran, N.; K, V. Synergistic Effect of Nano Silica and Metakaolin on Mechanical and Microstructural Properties of Concrete: An Approach of Response Surface Methodology. Case Stud. Constr. Mater. 2024, 20, e03196. [Google Scholar] [CrossRef]
Yin, Y.; Qiao, L.; Li, Q.; Chen, L.; Miao, M.; Dong, J.; Song, L.; Luo, A.; Zheng, H. Thermodynamic Performance of SiC-Enhanced MicroPCM Backfill Based on Response Surface Methodology. Case Stud. Constr. Mater. 2024, 20, e03345. [Google Scholar] [CrossRef]
Hassani, A.; Kazemian, F. Investigating Geopolymer Mortar Incorporating Industrial Waste Using Response Surface Methodology: A Sustainable Approach for Construction Materials. Case Stud. Constr. Mater. 2024, 21, e03609. [Google Scholar] [CrossRef]
Haque, M.; Ray, S.; Mita, A.F.; Mozumder, A.; Karmaker, T.; Akter, S. Prediction and Optimization of Hardened Properties of Concrete Prepared with Granite Dust and Scrapped Copper Wire Using Response Surface Methodology. Heliyon 2024, 10, e24705. [Google Scholar] [CrossRef] [PubMed]
Patil, S.; Ramesh, B.; Sathish, T.; Saravanan, A. RSM-Based Modelling for Predicting and Optimizing the Rheological and Mechanical Properties of Fibre-Reinforced Laterized Self-Compacting Concrete. Heliyon 2024, 10, e25973. [Google Scholar] [CrossRef] [PubMed]
Dahish, H.A.; Almutairi, A.D. Effect of Elevated Temperatures on the Compressive Strength of Nano-Silica and Nano-Clay Modified Concretes Using Response Surface Methodology. Case Stud. Constr. Mater. 2023, 18, e02032. [Google Scholar] [CrossRef]
Mitchell, T.M. Machine Learning; McGraw-Hill: New York, NY, USA, 1997; ISBN 0070428077. [Google Scholar]
Pereira, F.; Mitchell, T.; Botvinick, M. Machine Learning Classifiers and FMRI: A Tutorial Overview. Neuroimage 2009, 45, S199–S209. [Google Scholar] [CrossRef]
Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160. [Google Scholar] [CrossRef]
Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
Kohavi, R. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In Proceedings of the 14th International Joint Conference on Artificial Intelligence-Volume 2, Montreal, QC, Canada, 20–25 August 1995; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 1995; pp. 1137–1143. [Google Scholar]
Wan Mohammad, W.N.S.; Ismail, S.; Wan Alwi, W.A. Properties of Recycled Aggregate Concrete Reinforced with Polypropylene Fibre. MATEC Web Conf. 2016, 66, 00077. [Google Scholar] [CrossRef]
Hanumesh, B.; Harish, B.; Venkata Ramana, N. Influence of Polypropylene Fibres on Recycled Aggregate Concrete. Mater. Today Proc. 2018, 5, 1147–1155. [Google Scholar] [CrossRef]
Ali, B.; Fahad, M.; Mohammed, A.S.; Ahmed, H.; Elhag, A.B.; Azab, M. Improving the Performance of Recycled Aggregate Concrete Using Nylon Waste Fibers. Case Stud. Constr. Mater. 2022, 17, e01468. [Google Scholar] [CrossRef]
Matar, P.; Zéhil, G.-P. Effects of Polypropylene Fibers on the Physical and Mechanical Properties of Recycled Aggregate Concrete. J. Wuhan Univ. Technol. Mater. Sci. Ed. 2019, 34, 1327–1344. [Google Scholar] [CrossRef]
Ahmed, L.A.; Hassan, S.S.; Al-Ameer, O.A. Ultra-High Performance Reinforced by Polypropylene Fiber Concrete Made with Recycled Coarse Aggregat. Kufa J. Eng. 2017, 8, 128–141. [Google Scholar] [CrossRef]
Turk, O. Evaluation of Compressive Strength for Recycled Aggregate Concrete Reinforced with Polypropylene Fibers. Master’s Thesis, The British University in Dubai, Dubai, United Arab Emirates, 2021. [Google Scholar]
Sonkhla, P. Effect of Silica Fume and Recycled Coarse Aggregate in Concrete; Jaypee University of Information Technology: Waknaghat, India, 2016. [Google Scholar]
Saravanakumar, P.; Dhinakaran, G. Strength Characteristics of High-Volume Fly Ash–Based Recycled Aggregate Concrete. J. Mater. Civ. Eng. 2013, 25, 1127–1133. [Google Scholar] [CrossRef]
Sowmith, N.; Anjaneya Babu, P.S.S. Influence of Fly Ash on the Performance of Recycled Aggregate Concrete. Int. J. Sci. Res. (IJSR) 2016, 5, 1740–1744. [Google Scholar] [CrossRef]
Bajad, M.N.; Mutha, N.; Husain, H.; Kshirsagar, N. Effect of Recycled Aggregate and Fly Ash in Concrete. IOSR J. Mech. Civ. Eng. (IOSR-JMCE) 2015, 12, 28–35. [Google Scholar] [CrossRef]
ASTM C39/C39M-18; ASTM International Standard Test Method for Compressive Strength of Cylindrical Concrete Specimens. ASTM International: West Conshohocken, PA, USA, 2018.
BS 1881; P. 116 Testing Concrete. Method for Determination of Compressive Strength of Concrete Cubes. BSI: London, UK, 1983.
GB/T 50081-2019; Standard for Test Methods of Concrete Physical and Mechanical Properties. China Architecture & Building Press: Beijing, China, 2019.
BS EN 12390-3; BSI Hardened Concrete-Part 3: Testing Hardened Concrete. Compressive Strength of Test Specimens. BSI: London, UK, 2019.
IS 516; Indian Standards Methods of Tests for Strength of Concrete. Bur Indian Stand: New Delhi, India, 1959.
Khan, M.A.; Farooq, F.; Javed, M.F.; Zafar, A.; Ostrowski, K.A.; Aslam, F.; Malazdrewicz, S.; Maślak, M. Simulation of Depth of Wear of Eco-Friendly Concrete Using Machine Learning Based Computational Approaches. Materials 2021, 15, 58. [Google Scholar] [CrossRef] [PubMed]
Weka 3 Data Mining Software in Java; University of Waikato: Hamilton, New Zealand, 2011; Volume 19, p. 52.
Anaconda Inc. Anaconda Individual Edition, Anaconda Website. 2024. Available online: https://www.anaconda.com/download (accessed on 20 January 2025).
Montgomery, D.C. Design and Analysis of Experiments, 10th ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2019; ISBN 9781118146927.
Junaid, M.; Jiang, C.; Eltwati, A.; Khan, D.; Alamri, M.; Eisa, M.S. Statistical Analysis of Low-Density and High-Density Polyethylene Modified Asphalt Mixes Using the Response Surface Method. Case Stud. Constr. Mater. 2024, 21, e03697.
Elsayed, M.; Almutairi, A.D.; Hussein, M.; Dahish, H.A. Axial Capacity of Rubberized RC Short Columns Comprising Glass Powder as a Partial Replacement of Cement. Structures 2024, 64, 106612.
Adamu, M.; Trabanpruek, P.; Limwibul, V.; Jongvivatsakul, P.; Iwanami, M.; Likitlersuang, S. Compressive Behavior and Durability Performance of High-Volume Fly-Ash Concrete with Plastic Waste and Graphene Nanoplatelets by Using Response-Surface Methodology. J. Mater. Civ. Eng. 2022, 34, 04022222.
Bishop, C.M. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006.
Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
Wang, Y.; Witten, I.H. Induction of Model Trees for Predicting Continuous Classes; Computer Science Working Papers; University of Waikato: Hamilton, New Zealand, 1996.
Quinlan, J.R. Learning with Continuous Classes. In Proceedings of the Australian Joint Conference on Artificial Intelligence, Hobart, Australia, 16–18 November 1992; pp. 343–348.
Alkharisi, M.K.; Dahish, H.A.; Youssf, O. Prediction Models for the Hybrid Effect of Nano Materials on Radiation Shielding Properties of Concrete Exposed to Elevated Temperatures. Case Stud. Constr. Mater. 2024, 21, e03750.
Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA; pp. 785–794.
Wu, J.; Ma, D.; Wang, W. Leakage Identification in Water Distribution Networks Based on XGBoost Algorithm. J. Water Resour. Plan. Manag. 2022, 148, 04021107.
Wang, T.; Bian, Y.; Zhang, Y.; Hou, X. Classification of Earthquakes, Explosions and Mining-Induced Earthquakes Based on XGBoost Algorithm. Comput. Geosci. 2023, 170, 105242.
Chakraborty, D.; Awolusi, I.; Gutierrez, L. An Explainable Machine Learning Model to Predict and Elucidate the Compressive Behavior of High-Performance Concrete. Results Eng. 2021, 11, 100245.
Elsayed, M.; Almutairi, A.D.; Dahish, H.A. Effect of Elevated Temperatures on the Residual Capacity of Rubberized RC Columns Containing Waste Glass Powder. Case Stud. Constr. Mater. 2024, 20, e02944.
Obaid, H.A.; Enieb, M.; Eltwati, A.; Al-Jumaili, M.A. Prediction and Optimization of Asphalt Mixtures Performance Containing Reclaimed Asphalt Pavement Materials and Warm Mix Agents Using Response Surface Methodology. Int. J. Pavement Res. Technol. 2024.
Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
Demie, S.; Nuruddin, M.F.; Shafiq, N. Effects of Micro-Structure Characteristics of Interfacial Transition Zone on the Compressive Strength of Self-Compacting Geopolymer Concrete. Constr. Build. Mater. 2013, 41, 91–98.

Figure 1. Importance of recycled aggregate concrete.

Figure 2. (a) Concrete with natural aggregate; (b) concrete with recycled aggregate.

Figure 3. Flowchart of the present research methodology.

Figure 4. Heat map of the Pearson’s correlation coefficients of the input and output variables.

Figure 5. Scatter plots depicting the relationship between the input variables and the output.

Figure 6. Scatter plots depicting the relationship between the input parameters and CS: (a) C, (b) NFA, (c) NCA, (d) RCA, (e) FA, (f) SF, (g) PPF, (h) W/C, (i) SP, and (j) AGE.

Figure 7. ML algorithms’ structures: (a) M5P; (b) RF; and (c) XGB.

Figure 8. RSM results: (a) residual vs. run plot; (b) predicted vs. actual data plot.

Figure 9. Optimal RAC compressive strength response at 28 days.

Figure 10. Regression plots of the developed models: (a) training XGB; (b) testing XGB; (c) training M5P; (d) testing M5P; (e) training RF; and (f) testing RF.

Figure 11. Data split diagram (M5P).

Figure 12. Absolute error plots of the developed models: (a) training XGB; (b) testing XGB; (c) training M5P; (d) testing M5P; (e) training RF; and (f) testing RF.

Figure 13. Statistical metrics of all the developed models: (a) R²; (b) MAE; (c) RMSE; and (d) MAPE.

Figure 14. Prediction errors: (a) training and (b) testing.

Figure 15. Spider plots for K-fold statistical metrics: (a) R²; (b) MAE; (c) RMSE; and (d) MAPE.

Figure 16. SHAP plots: (a) RF model and (b) XGB model.

Table 1. Description of the experimental data.

Parameters	Unit	Min.	Max.	Mean	Std. Deviation	Skewness	Kurtosis
Cement (C)	kg/m³	243	704	423.08	107.37	1.004	0.332
Natural fine aggregate (NFA)	kg/m³	320	962.3	662.34	145.46	−0.429	−0.332
Natural coarse aggregate (NCA)	kg/m³	0	1548	629.52	411.55	−0.103	−0.987
Recycled aggregate (RA)	kg/m³	0	1278	481.93	389.59	0.425	−0.864
Fly ash (FA)	kg/m³	0	162	28.75	43.48	1.213	0.340
Silica fume (SF)	kg/m³	0	79.2	12.53	20.98	1.764	2.437
Polypropylene fiber (PPF)	%	0	3.0	0.249	0.607	2.924	8.709
Water/cement ratio (W/C)	–	0.26	0.66	0.447	0.091	0.099	−0.496
Superplasticizer (SP)	kg/m³	0	7.84	1.779	2.156	1.295	1.092
Curing period (AGE)	day	3	180	28.52	29.0	2.808	9.380
Compressive strength (CS)	MPa	7.88	115.30	36.803	16.53	1.526	2.983

Table 2. ANOVA analysis of CS response.

Fit statistics:

Metrics	Value
R²	0.8860
Adjusted R²	0.8787
Predicted R²	0.867
Adeq. Precision	65.67
Std. Dev.	5.76
Mean	36.8
C.V. %	15.64

Source	Sum of Squares	df	Mean Square	F-Value	p-Value	Significance
Model	1.28 × 10⁵	32	3993.47	120.51	<0.0001	significant
A-C	7.91	1	7.91	0.2387	0.6254
B-NFA	274.44	1	274.44	8.28	0.0042
C-NCA	1432.61	1	1432.6	43.23	<0.0001
D-RCA	1814.14	1	1814.1	54.74	<0.0001
E-FA	21.82	1	21.82	0.6585	0.4175
F-SF	24.8	1	24.8	0.7483	0.3874
G-PPF	29.84	1	29.84	0.9005	0.3431
H-W/C	90.42	1	90.42	2.73	0.0992
J-SP	16.3	1	16.3	0.492	0.4834
K-AGE	138.81	1	138.8	4.19	0.0412
AB	1126.23	1	1126.2	33.99	<0.0001
AF	35.16	1	35.16	1.06	0.3035
AH	676.2	1	676.2	20.41	<0.0001
BE	700.09	1	700.09	21.13	<0.0001
BG	340.3	1	340.3	10.27	0.0014
BH	314.96	1	314.96	9.5	0.0022
BJ	364.36	1	364.36	11	0.001
BK	391.56	1	391.56	11.82	0.0006
EH	588.4	1	588.4	17.76	<0.0001
EK	555.73	1	555.73	16.77	<0.0001
FH	200.58	1	200.58	6.05	0.0142
FJ	646.99	1	646.99	19.52	<0.0001
FK	621.97	1	621.97	18.77	<0.0001
GJ	111.36	1	111.36	3.36	0.0674
HJ	280.43	1	280.43	8.46	0.0038
HK	1490.4	1	1490.4	44.98	<0.0001
JK	339.67	1	339.67	10.25	0.0015
D²	256.09	1	256.09	7.73	0.0056
G²	269.4	1	269.4	8.13	0.0045
H²	1006.79	1	1006.8	30.38	<0.0001
J²	1692.32	1	1692.3	51.07	<0.0001
K²	4833.3	1	4833.3	145.85	<0.0001
Residual	16,436.49	496	33.14
Lack of Fit	13,902.16	420	33.1	0.9926	0.533	not significant
Pure Error	2534.34	76	33.35
Cor Total	1.44 × 10⁵	528
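
The fit statistics reported with Table 2 can be cross-checked from the ANOVA rows themselves. The following is a minimal sketch, not the authors' procedure: it assumes the standard ANOVA identities (F = MS_model / MS_residual, R² = 1 − SS_residual / SS_total, and so on), takes the total sum of squares as model plus residual, and recovers the model sum of squares from the tabulated mean square because the table gives it only in rounded form (1.28 × 10⁵). All variable names are illustrative.

```python
import math

# Values transcribed from Table 2 (ANOVA of the CS response).
ms_model, df_model = 3993.47, 32
ss_residual, df_residual = 16436.49, 496
ss_lack_of_fit, df_lack_of_fit = 13902.16, 420
ss_pure_error, df_pure_error = 2534.34, 76
mean_cs = 36.8  # mean compressive strength (MPa) from the fit statistics

# Assumption: SS_model is recovered from the mean square, since the table
# lists it only as the rounded value 1.28e5.
ss_model = ms_model * df_model
ss_total = ss_model + ss_residual
df_total = df_model + df_residual

ms_residual = ss_residual / df_residual

# Standard ANOVA identities.
f_model = ms_model / ms_residual
r2 = 1.0 - ss_residual / ss_total
adj_r2 = 1.0 - ms_residual / (ss_total / df_total)
std_dev = math.sqrt(ms_residual)
cv_percent = 100.0 * std_dev / mean_cs
f_lack_of_fit = (ss_lack_of_fit / df_lack_of_fit) / (ss_pure_error / df_pure_error)

print(f"F (model)       = {f_model:.2f}")
print(f"R^2             = {r2:.4f}")
print(f"Adjusted R^2    = {adj_r2:.4f}")
print(f"Std. Dev.       = {std_dev:.2f}")
print(f"C.V. %          = {cv_percent:.2f}")
print(f"F (lack of fit) = {f_lack_of_fit:.4f}")
```

Under these assumptions, the recomputed values should agree with the tabulated F-value, R², adjusted R², standard deviation, C.V. %, and lack-of-fit F-value to within rounding.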

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
