Article

Machine Learning-Based Prediction of Compressive Strength in Recycled Aggregate Self-Compacting Concrete: An Ensemble Modeling Approach with SHAP Interpretability Analysis

1 School of Engineering, University of Manchester, Manchester M13 9PL, UK
2 College of Civil Engineering, Hunan University, Changsha 410082, China
3 Department of Civil Engineering, Zhejiang University, Hangzhou 310058, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2026, 16(5), 2432; https://doi.org/10.3390/app16052432
Submission received: 20 January 2026 / Revised: 27 February 2026 / Accepted: 27 February 2026 / Published: 3 March 2026

Abstract

The incorporation of recycled concrete aggregates (RCAs) into self-compacting concrete (SCC) represents a critical sustainable construction strategy addressing both construction waste management and natural resource conservation. However, predicting the compressive strength of recycled aggregate self-compacting concrete (RASCC) remains challenging due to complex nonlinear interactions among mixture parameters. This study develops a robust predictive framework using ensemble machine learning algorithms to accurately estimate RASCC compressive strength across diverse mixture compositions. A comprehensive database comprising 301 experimental specimens with 18 input variables—including curing age, binder components, water-to-binder ratio, recycled aggregate properties, and supplementary cementitious materials—was systematically analyzed. Four advanced modeling approaches were evaluated: Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Stacked Generalization with Ridge regression meta-learner, and Voting ensemble with Non-Negative Least Squares optimization. The Stacking ensemble model demonstrated superior predictive performance on the independent test set, with R2 = 0.963, RMSE = 3.321 MPa, and MAE = 2.506 MPa. Rigorous residual analysis confirmed model validity through satisfaction of normality, homoscedasticity, and independence assumptions. SHAP interpretability analysis identified specimen age as the dominant predictor, followed by recycled aggregate density and water-to-binder ratio, while elucidating the complex nonlinear contributions of supplementary cementitious materials including fly ash and ground granulated blast furnace slag. 
The developed framework demonstrates practical applicability for predicting RASCC compressive strength across conventional to high-performance grades, facilitating sustainable mix design optimization while maintaining structural performance requirements, and advancing circular economy principles through confident integration of recycled aggregates in SCC applications.

1. Introduction

Concrete stands as the second most consumed substance on Earth after water and remains the most extensively utilized artificial construction material in contemporary infrastructure development [1,2,3,4]. The global construction industry’s substantial reliance on concrete has resulted in significant environmental challenges, particularly regarding natural resource depletion and carbon dioxide emissions [5,6,7]. The cement industry alone accounts for approximately 8% of global anthropogenic CO2 emissions, having contributed 44.9 Gt of CO2 between 1928 and 2021 [8,9]. This environmental burden is further compounded by the construction sector’s generation of substantial solid waste, which constitutes approximately 36% of global waste production [10]. The accelerating pace of urbanization and population growth has intensified these environmental pressures, creating an urgent need for sustainable alternatives in concrete production.
The management of construction and demolition waste has emerged as a critical environmental challenge in recent decades. As aging infrastructure undergoes renovation and replacement, large volumes of concrete waste are generated, requiring substantial financial resources for proper disposal [11]. Traditional disposal methods, such as landfilling, not only occupy valuable land resources but also contribute to environmental pollution through groundwater contamination and greenhouse gas emissions [12,13]. This situation has catalyzed research into sustainable waste management strategies, with the utilization of recycled concrete aggregate (RCA) representing a particularly promising approach [14]. By processing construction waste into recycled aggregates, the construction industry can simultaneously address waste management challenges and reduce the demand for natural aggregates, thereby promoting circular economy principles within the built environment [15,16]. Beyond conventional recycled concrete aggregates, other construction waste streams—such as recycled ceramic waste used as aggregate or cementitious replacement—have been shown to maintain or improve strength while reducing embodied carbon, especially when combined with fiber reinforcement [17].
Self-compacting concrete (SCC), recognized as one of the most significant advancements in construction materials since its development in Japan during the 1980s, offers exceptional flowability, uniformity, and stability in its fresh state while demonstrating superior mechanical and durability properties when hardened [18,19]. The integration of recycled aggregates into SCC production presents a synergistic opportunity to combine the benefits of advanced concrete technology with sustainable waste utilization. However, the inherent characteristics of RCA, particularly its higher water absorption capacity, lower density, and irregular morphology compared to natural aggregates, significantly influence the fresh and hardened properties of the resulting concrete [20,21]. Research has demonstrated that RCA incorporation can affect both the rheological properties and mechanical performance of SCC, with the water absorption of recycled aggregates playing a particularly critical role in determining the effective water-to-cement ratio and consequently influencing the workability and strength development of the mixture [22,23].
The complex relationships between recycled aggregate properties, mixture proportions, and concrete performance characteristics present substantial challenges for traditional empirical prediction methods. Conventional approaches, including linear regression and polynomial models, have demonstrated limited effectiveness in capturing the nonlinear interactions among multiple variables that influence recycled concrete strength, typically achieving coefficient of determination (R2) values of only 0.22 to 0.28 [24]. This inadequacy has motivated researchers to explore more sophisticated computational approaches capable of modeling the intricate relationships inherent in recycled concrete systems. Machine learning and deep learning techniques have emerged as powerful tools for addressing these complex prediction challenges, offering the ability to identify patterns and relationships within large datasets without requiring explicit programming of physical relationships [25,26,27,28].
Recent advances in artificial intelligence have demonstrated considerable promise in materials engineering applications, particularly in predicting concrete compressive strength. Naderpour et al. developed an artificial neural network model for forecasting recycled aggregate concrete strength using 139 datasets, demonstrating the capability of neural networks for precise prediction [29]. Similarly, research employing deep neural networks, multivariate adaptive regression splines, and extreme learning machines has validated the accuracy of advanced computational methods in predicting compressive strength with variable compositions [30]. Researchers predicted the compressive strength of recycled concrete by employing support vector regression (SVR), one-dimensional convolutional neural networks (1D-CNN), and a hybrid artificial intelligence model that integrates elastic net, random forest algorithms, and light gradient boosting decision trees (LGBM), while incorporating Gaussian noise during training to enhance the model’s generalization capability [31]. These studies have established that machine learning approaches can effectively capture the complex interactions between mixture components and resulting mechanical properties, though the optimal modeling approach remains subject to ongoing investigation.
Despite substantial progress in applying machine learning to recycled concrete strength prediction, several critical research gaps persist. First, existing studies have not comprehensively examined how recycled coarse aggregate characteristics—particularly water absorption capacity and particle morphology—affect the rheological behavior of self-compacting concrete, limiting the development of targeted strategies to maintain adequate flow properties [32]. Second, the underlying mechanisms by which aggregate quality parameters influence both fresh-state workability and hardened-state mechanical performance remain incompletely understood, especially regarding the relative importance of moisture-related effects compared to geometric and surface texture considerations [33]. Third, the majority of previous research has utilized pre-saturated recycled aggregates or implemented additional water adjustment protocols, approaches that may obscure critical interactions between aggregate moisture state and concrete performance under realistic production scenarios [34].
Although the RAGN-R method proposed by Kazemi et al. [35] has achieved excellent results in predicting the strength of PAC and FRC materials, this study differs fundamentally in the following aspects: (1) Research object: this paper focuses on recycled aggregate self-compacting concrete (RASCC), whose nonlinear complexity stems from the diverse physical properties of recycled aggregates (density, water absorption, particle shape) rather than from fiber reinforcement effects. (2) Methodological innovation: we introduce a dual-base-learner combination of LightGBM and CatBoost, coupled with Non-Negative Least Squares (NNLS) weight optimization, offering an ensemble strategy complementary to RAGN-R. (3) Interpretability: through SHAP analysis, we systematically quantify the contribution mechanisms of 18 mix variables to strength, including an analysis of base model contribution weights at the meta-learner level, an aspect not addressed in prior studies. (4) Validation rigor: we conducted cross-domain validation on an external UCI dataset (n = 1030) together with multi-seed stability analysis, demonstrating the domain-independent applicability of our method.
This investigation addresses these gaps through systematic analysis of experimental observations, employing advanced ensemble machine learning approaches including LightGBM, CatBoost, and stacked generalization to predict compressive strength across diverse mixture compositions. SHAP (SHapley Additive exPlanations) interpretability analysis was applied to quantify the marginal contribution of each input feature to individual model predictions. Based on cooperative game theory, SHAP assigns each feature a Shapley value (φ_j) representing the average contribution across all possible feature subsets. For tree-based ensemble models, the TreeSHAP algorithm was employed, reducing computational complexity from O(TL·2^M) to O(TLD^2). Summary plots and dependency plots were generated to visualize global feature importance and feature interaction effects, respectively, enabling transparent interpretation of the ‘black-box’ ensemble models. By integrating SHAP-based interpretability analysis, this study quantifies the relative contributions of individual mixture parameters and aggregate characteristics to mechanical performance, providing actionable insights for mix design optimization. The findings support the advancement of sustainable construction practices by enabling more confident utilization of recycled aggregate self-compacting concrete in infrastructure development, thereby reducing natural resource consumption and construction waste while maintaining structural performance requirements.

2. Data Sources and Analysis

2.1. Data Sources and Descriptive Statistics

This study utilized a publicly available dataset from Yang’s research [36], listed in Appendix A, Table A1, comprising 301 RASCC specimens with 18 input variables. Key parameters include curing age (2–91 days), cement content (101–520 kg/m3), water-to-binder ratio (0.24–0.60), recycled aggregate replacement ratio (0–100%), and supplementary cementitious materials (fly ash, GGBFS, silica fume). Detailed descriptive statistics are provided in Table 1.
To provide a more rigorous characterization of the dataset scope and coverage, the following aspects are elaborated upon. The database comprises 301 experimentally tested specimens compiled from Yang’s research [36], which aggregated results from multiple independent experimental campaigns reported in the literature. These campaigns were conducted by different research groups under varying laboratory conditions, thereby introducing inherent diversity in testing protocols, curing environments, and material sourcing. The specimens encompass a wide range of curing ages from 2 to 112 days, covering early-age (2–7 days), standard-age (14–28 days), and extended-age (56–112 days) strength development stages. This temporal coverage enables the models to learn the nonlinear hydration kinetics governing strength evolution across different maturity levels, as confirmed by the SHAP analysis in Section 6, which identifies specimen age as the most influential predictor with SHAP values spanning approximately ±15 MPa.
The diversity of mixture configurations within the dataset is substantial. The 289 specimens originate from at least 10 distinct experimental series, encompassing three cement strength grades (31.25, 42.5, and 52.5 MPa), recycled aggregate replacement ratios spanning 0% to 100%, water-to-binder ratios from 0.24 to 0.562, and supplementary cementitious material combinations including binary (cement + fly ash), ternary (cement + fly ash + GGBFS), and quaternary (cement + fly ash + GGBFS + silica fume) binder systems. Recycled aggregate properties vary considerably, with densities ranging from 2205 to 2685 kg/m3 and water absorption values from 1.77% to 7.7%, representing recycled aggregates of varying quality from different demolition sources. This compositional diversity ensures that the predictive models are exposed to a broad spectrum of practically relevant RASCC formulations, spanning from conventional structural grades to high-performance concrete applications.
Regarding curing conditions and environmental exposure, the compiled dataset primarily reflects standard laboratory curing conditions (approximately 20 ± 2 °C and ≥95% relative humidity) as reported in the original experimental studies. The present investigation focuses exclusively on compressive strength prediction as a function of mixture design parameters and curing age under controlled curing environments. Durability-related degradation phenomena, including carbonation depth progression, chloride ion penetration, and freeze–thaw cycling damage, are not within the scope of this study and would require dedicated experimental datasets incorporating environmental exposure variables and long-term monitoring protocols. This scope definition ensures that the predictive framework maintains clear physical interpretability, with all input features directly related to mixture proportioning and material characterization rather than service-life environmental factors.
Nevertheless, certain limitations of the dataset should be acknowledged. First, the dataset size of 301 specimens, while adequate for ensemble machine learning approaches as demonstrated by the robust cross-validated performance (Section 4.2.3, mean R2 = 0.941 ± 0.019), is moderate compared to some benchmark concrete datasets. Second, the geographic and climatic diversity of the source experiments is not explicitly documented, which may limit the generalizability of the models to regions with significantly different raw material characteristics. Third, the absence of field-cured specimens restricts the direct applicability of the current framework to laboratory-scale strength prediction scenarios. These limitations are partially mitigated through the external validation on the independent UCI Concrete Compressive Strength dataset containing 1030 samples (Section 5.3), which demonstrates the framework’s domain-agnostic applicability, and through the seed stability analysis (Section 4.2.3), which confirms robustness to data partitioning variability.

2.2. Data Limitations

The 301-specimen database compiled in this study was drawn from 57 published sources spanning multiple countries and decades, which inevitably introduces a degree of heterogeneity in testing conditions. Four specific limitations are acknowledged. First, specimen geometry and size are not uniform across the dataset; some studies report results from standard cylinders (e.g., 150 × 300 mm or 100 × 200 mm) while others use cubes (e.g., 150 × 150 × 150 mm). Cylindrical specimens generally yield lower f_c values than cubic specimens of comparable cross-sectional dimensions. This difference is primarily attributable to the height-to-width ratio (h/d) effect and its interaction with platen friction-induced confinement. During compression testing, friction between the specimen ends and the machine platens restricts lateral expansion near the loading surfaces, creating confined zones. According to the St. Venant principle, this restraining effect extends to approximately 0.87 times the lateral dimension from each platen. For cubic specimens (h/d ≈ 1), the friction-induced confinement covers the entire specimen height, subjecting the material to an effective triaxial stress state and thereby producing higher apparent compressive strength. For standard cylinders (h/d = 2), the mid-height region remains largely unconfined, producing failure under conditions closer to true uniaxial compression [37,38]. This cube-to-cylinder strength ratio decreases with increasing concrete strength, from approximately 1.25 for normal-strength concrete to 1.12 for high-strength concrete, as documented in the CEB-FIP Model Code 1990 [39].
The conversion factors applied to harmonize compressive strength values were determined based on established international standards and empirical relationships reported in the literature. Specifically, the cube-to-cylinder conversion factor (K = f_c,cyl/f_c,cube) was adopted from EN 206-1/Eurocode 2 (EN 1992-1-1:2004), which defines strength-dependent values ranging from approximately 0.80 for normal-strength concrete (≤C50/60) to 0.87 for high-strength concrete (≈C90/105) [40]. The CEB-FIP Model Code 1990 further corroborates this strength-dependent relationship, indicating a progressive decrease in the cube-to-cylinder strength ratio from 1.25 to 1.12 for cylinder strengths of 40 to 80 MPa [39]. For recycled aggregate concrete specifically, Pacheco et al. [41] demonstrated that full incorporation of coarse recycled aggregates reduces the mean K factor to 0.77, compared to 0.81 for natural aggregate concrete. In the present study, where the original source literature explicitly reported cube strengths, a conversion factor of 0.80 was applied for normal-strength concrete (<60 MPa cube strength) and 0.85 for higher-strength concrete (≥60 MPa cube strength), consistent with the Eurocode 2 framework. However, residual inconsistencies may remain. Second, the effect of specimen end-surface preparation—including grinding, sulfur capping, or neoprene pad systems—could not be systematically controlled, as this information was not uniformly reported in the source literature. Such variability is known to influence measured compressive strength by up to 15% and should be regarded as a source of scatter in the dataset. Third, the dataset encompasses a wide curing age range (2–91 days), and predictions for early-age specimens (<7 days) carry greater uncertainty than those at the standard 28-day benchmark, particularly for mixtures incorporating supplementary cementitious materials (SCMs) such as fly ash or slag, whose strength contributions are more pronounced at later ages. 
Curing age is explicitly included as a model input feature to partially account for this variability, though it cannot fully resolve differences arising from distinct hydration kinetics across binder types. These limitations are inherent to any large-scale data aggregation effort and do not invalidate the predictive capacity of the machine learning framework; rather, they underscore the need for future studies to adopt standardized reporting protocols when constructing training databases for concrete strength prediction models.
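The cube-to-cylinder harmonization rule adopted above can be sketched in a few lines. This is an illustrative helper, not code from the study; the function name is ours, while the factors of 0.80 (cube strength below 60 MPa) and 0.85 (60 MPa and above) follow the Eurocode 2-based convention stated in the text.

```python
def cube_to_cylinder(f_cube_mpa):
    """Convert a cube compressive strength to an equivalent cylinder
    strength using the strength-dependent factors adopted in this study:
    K = 0.80 below 60 MPa cube strength, K = 0.85 at or above 60 MPa."""
    k = 0.80 if f_cube_mpa < 60.0 else 0.85
    return k * f_cube_mpa
```

Applying this rule consistently across sources reduces, but does not eliminate, the specimen-geometry scatter discussed above.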
Early-age data separation and analysis. Following the recommendation for distinguishing early-age behavior, the dataset was stratified into two subgroups: early-age specimens with curing age < 7 days (n = 11, comprising 2-day, 3-day, and 5-day specimens) and standard/extended-age specimens with curing age ≥ 7 days (n = 290). The early-age subset represents only 3.7% of the total dataset, with compressive strength values ranging from 18.40 to 48.30 MPa (mean: 32.57 MPa, standard deviation: 11.02 MPa). The standard/extended-age subset exhibits a broader range of 5.36 to 89.00 MPa (mean: 46.80 MPa, standard deviation: 16.58 MPa). The descriptive statistics of both subgroups are summarized in Table 2.

2.3. Pearson Correlation Coefficient Analysis

The Pearson correlation coefficient, which quantifies the linear relationship between input and output variables, is defined as the quotient of the covariance between two variables and their respective standard deviations. This coefficient ranges from −1 to 1, providing a standardized measure of association strength and direction. The formula is given below:
$$r_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} \frac{(x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y}$$
where $r_{xy}$ is the Pearson correlation coefficient between variables $x$ and $y$; $n$ is the number of observations; $x_i$ and $y_i$ are the $i$-th observations of variables $x$ and $y$, respectively; $\bar{x}$ and $\bar{y}$ are the sample means; and $\sigma_x$ and $\sigma_y$ are the standard deviations of $x$ and $y$, respectively.
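As a quick numerical check, the coefficient can be computed directly from its definition. This is a minimal NumPy sketch (the function name `pearson_r` is ours); the $1/(n-1)$ factors in the covariance and standard deviations cancel, but are kept to mirror the equation.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation from its definition: sample covariance divided
    by the product of the sample standard deviations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
    sx = np.sqrt(np.sum((x - x.mean()) ** 2) / (n - 1))
    sy = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))
    return cov / (sx * sy)
```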
Figure 1 presents the Pearson correlation coefficient matrix examining the relationships between concrete mixture parameters and compressive strength, with particular emphasis on recycled aggregate concrete performance factors. The correlation analysis reveals several significant findings regarding compressive strength dependencies. Age demonstrates a strong positive correlation with compressive strength (r = 0.42), consistent with the fundamental principle of concrete strength development over time. Cement content similarly exhibits a robust positive correlation (r = 0.51), indicating that increased cement dosage serves as an effective mechanism for enhancing concrete strength. Notably, the water-to-binder ratio shows a negative correlation (r = −0.25), corroborating the established theory that lower water-to-binder ratios promote higher strength development.
Regarding mineral admixtures, silica fume demonstrates a modest positive correlation with compressive strength (r = 0.026), suggesting a beneficial though limited contribution. Conversely, fly ash (r = −0.22) and ground granulated blast furnace slag (r = −0.15) exhibit negative correlations, potentially reflecting the influence of these supplementary cementitious materials on early-age strength development within the scope of this investigation. Concerning recycled aggregate effects, the replacement ratio shows a slight negative correlation with compressive strength (r = −0.053), suggesting a declining strength trend with increased recycled aggregate incorporation. The recycled aggregate absorption rate similarly correlates negatively (r = −0.11), indicating that elevated water absorption characteristics adversely affect concrete performance. In contrast, recycled aggregate density displays a positive correlation (r = 0.36), demonstrating that higher-density recycled aggregates facilitate superior strength development.
The matrix also reveals noteworthy inter-parameter relationships. A strong positive correlation exists between ground granulated blast furnace slag and silica fume (r = 0.56), while cement type exhibits a moderate negative correlation with natural aggregate content (r = −0.34). The recycled aggregate replacement ratio demonstrates strong negative and positive correlations with natural aggregate content (r = −0.97) and recycled aggregate content (r = 0.98), respectively, reflecting the inherent interdependencies among mixture components in proportion design. Fineness modulus and maximum aggregate size both show negative correlations with compressive strength (r = −0.18 and r = −0.17, respectively), suggesting that aggregate particle size characteristics exert measurable influences on concrete strength. Collectively, this correlation analysis provides valuable empirical evidence for optimizing recycled concrete mixture proportions and enhancing engineering performance characteristics.

3. Algorithms and Evaluation Metrics

3.1. Algorithm

LightGBM (v4.1.0) and CatBoost (v1.2.3) are both gradient boosting frameworks that construct ensembles of decision trees sequentially, but they differ fundamentally in their optimization strategies: LightGBM employs Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) to accelerate training efficiency, while CatBoost introduces ordered boosting and symmetric tree structures to mitigate prediction shift and reduce overfitting. In contrast, stacked generalization (Stacking) is a meta-learning technique that does not build trees sequentially but instead trains multiple heterogeneous base learners in parallel and combines their outputs using a secondary meta-learner (Ridge regression in this study). This hierarchical architecture allows the meta-learner to exploit the complementary strengths of diverse base models, potentially achieving superior generalization beyond any individual algorithm.

3.1.1. Light Gradient Boosting Machine (LightGBM)

LightGBM (v4.1.0) is a gradient boosting framework based on decision tree algorithms, originally developed by Microsoft Research [42]. The fundamental prediction function of LightGBM can be expressed as:
$$\hat{y}_i = \sum_{t=1}^{T} f_t(x_i)$$
where $f_t$ represents the $t$-th tree in the ensemble and $T$ denotes the total number of trees in the model.
The optimization objective function combines the loss function with regularization terms:
$$\mathrm{Obj} = \sum_{i=1}^{n} L(y_i, \hat{y}_i) + \sum_{t=1}^{T} \Omega(f_t)$$
where $L(y_i, \hat{y}_i)$ represents the loss function measuring the discrepancy between actual and predicted values, and the regularization term is defined as:
$$\Omega(f_t) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$
with $\gamma$ controlling the number of leaves and $\lambda$ penalizing the magnitude of leaf weights.
LightGBM introduces two key algorithmic innovations. The Gradient-based One-Side Sampling (GOSS) technique retains instances with large gradients while randomly sampling instances with small gradients, significantly reducing computational complexity. The Exclusive Feature Bundling (EFB) algorithm merges mutually exclusive features to reduce dimensionality while maintaining information gain accuracy [42].
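The GOSS sampling rule can be illustrated compactly. The sketch below is not LightGBM's internal implementation; it is a simplified NumPy illustration (function name ours) of the published scheme: retain the top-$a$ fraction of instances by gradient magnitude, sample a $b$ fraction of the remainder, and amplify the sampled small-gradient instances by $(1-a)/b$ to keep the total gradient contribution approximately unbiased.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, rng=None):
    """GOSS sketch: keep the top-a fraction of instances by |gradient|,
    randomly sample a b fraction of the rest, and scale the sampled
    small-gradient instances by (1 - a) / b."""
    if rng is None:
        rng = np.random.default_rng(0)
    g = np.abs(np.asarray(gradients, float))
    n = len(g)
    n_top, n_rand = int(a * n), int(b * n)
    order = np.argsort(-g)                 # indices sorted by descending |gradient|
    top, rest = order[:n_top], order[n_top:]
    sampled = rng.choice(rest, size=n_rand, replace=False)
    weights = np.ones(n)
    weights[sampled] = (1.0 - a) / b       # compensate for under-sampling
    idx = np.concatenate([top, sampled])
    return idx, weights[idx]
```

With the defaults, only 30% of the data participates in each split search, which is the source of GOSS's training-time savings.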

3.1.2. Category Boosting (CatBoost)

CatBoost (v1.2.3) is a gradient boosting library developed by Yandex that addresses prediction shift through ordered boosting and handles categorical features natively [43]. The prediction function follows the standard gradient boosting framework:
$$\hat{y}_i = \sum_{t=1}^{T} \alpha_t h_t(x_i)$$
where $\alpha_t$ represents the weight of the $t$-th weak learner $h_t$.
CatBoost employs a novel approach for encoding categorical features called Ordered Target Statistics. For a categorical feature $k$ and instance $i$, the encoding is computed as:
$$\hat{x}_k^{i} = \frac{\sum_{j=1}^{i-1} \mathbb{1}[x_k^{j} = x_k^{i}]\, y_j + a\, p}{\sum_{j=1}^{i-1} \mathbb{1}[x_k^{j} = x_k^{i}] + a}$$
where $\mathbb{1}[\cdot]$ is the indicator function, $p$ represents the prior probability of the target variable, and $a$ is a smoothing parameter that prevents overfitting when category counts are low.
CatBoost constructs symmetric trees (also known as oblivious decision trees), where all nodes at the same depth share identical splitting criteria. This structure reduces model complexity and accelerates prediction time [44].
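The ordered target statistics encoding can be sketched directly from the formula above. This illustrative helper (name ours) simplifies CatBoost's actual procedure in two ways, flagged in the docstring: it uses a single data order rather than random permutations, and it takes the overall target mean as the prior $p$.

```python
import numpy as np

def ordered_target_stats(cats, y, a=1.0):
    """Ordered target statistics sketch: the encoding for row i uses only
    the target values of *earlier* rows sharing its category, smoothed
    toward a prior p. Simplifications vs. CatBoost: one fixed data order
    (CatBoost draws random permutations) and p = overall target mean."""
    p = float(np.mean(y))
    enc = np.empty(len(y))
    sums, counts = {}, {}
    for i, (c, t) in enumerate(zip(cats, y)):
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        enc[i] = (s + a * p) / (n + a)     # only preceding rows contribute
        sums[c], counts[c] = s + t, n + 1
    return enc
```

Because each row's encoding never sees its own target, the scheme avoids the target leakage (prediction shift) that plain mean-encoding suffers from.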

3.1.3. Stacked Generalization (Stacking)

Stacking, originally proposed by [45] and formalized by [46], is a meta-learning technique that combines predictions from multiple base learners through a secondary model. The algorithm operates in two distinct layers.
In the first layer, each base learner generates predictions:
$$p_l^{(i)} = h_l(x_i), \quad l = 1, 2, \ldots, L$$
where $h_l$ represents the $l$-th base model and $p_l^{(i)}$ denotes its prediction for instance $i$.
The second layer employs a meta-learner that takes base model predictions as input features:
$$\hat{y}_i = g\left(p_1^{(i)}, p_2^{(i)}, \ldots, p_L^{(i)}\right)$$
For linear meta-learners such as Ridge regression, the prediction function can be expressed as:
$$\hat{y}_i = \beta_0 + \sum_{l=1}^{L} \beta_l\, p_l^{(i)}$$
where $\hat{y}_i$ is the predicted compressive strength for the $i$-th sample; $\beta_0$ is the intercept term; $\beta_l$ is the weight coefficient assigned to the $l$-th base learner; $p_l^{(i)}$ is the prediction of the $l$-th base model for sample $i$; and $L$ is the total number of base learners.
To prevent overfitting and information leakage, base learners utilize K-fold cross-validation to generate out-of-fold predictions on the training data. This process creates a meta-feature matrix:
$$Z = \begin{bmatrix} p_1^{(1)} & p_2^{(1)} & \cdots & p_L^{(1)} \\ p_1^{(2)} & p_2^{(2)} & \cdots & p_L^{(2)} \\ \vdots & \vdots & \ddots & \vdots \\ p_1^{(n)} & p_2^{(n)} & \cdots & p_L^{(n)} \end{bmatrix}$$
Van der Laan et al. [47] provided theoretical guarantees for stacked ensembles under the name “Super Learner”, demonstrating that the ensemble will perform asymptotically as well as the best possible weighted combination of base learners.
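The out-of-fold construction of the meta-feature matrix $Z$ can be sketched end to end. In this illustrative NumPy example, two closed-form ridge regressors with different penalties stand in for the paper's LightGBM/CatBoost base learners, and a plain least-squares fit stands in for the Ridge meta-learner; the point is only to show how each row of $Z$ is predicted by a model that never saw that row.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution beta = (X'X + alpha*I)^{-1} X'y
    (no intercept; features and target treated as centered)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + rng.normal(0.0, 0.1, size=200)

alphas = [0.1, 10.0]            # two ridge "base learners" stand in for LightGBM/CatBoost
n, L = len(y), len(alphas)
Z = np.zeros((n, L))            # out-of-fold meta-feature matrix

# Each base learner predicts every row using a model fitted without that
# row's fold, preventing information leakage into the meta-learner.
for val_idx in np.array_split(rng.permutation(n), 5):
    train_idx = np.setdiff1d(np.arange(n), val_idx)
    for l, a in enumerate(alphas):
        beta = ridge_fit(X[train_idx], y[train_idx], a)
        Z[val_idx, l] = X[val_idx] @ beta

# Meta-learner: a plain least-squares combination of the base predictions
# (the paper's Ridge meta-learner adds an L2 penalty on these weights).
w, *_ = np.linalg.lstsq(Z, y, rcond=None)
y_hat = Z @ w
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```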

3.1.4. Voting with Non-Negative Least Squares (VotingNNLS)

VotingNNLS represents a weighted voting ensemble that determines optimal model weights through non-negative least squares optimization. The ensemble prediction is computed as a weighted linear combination [48]:
$$\hat{y}_i = \sum_{l=1}^{L} w_l\, p_l^{(i)}$$
where $w_l$ represents the weight assigned to the $l$-th base model.
The weight optimization problem is formulated as a constrained least squares minimization:
$$\min_{w \ge 0} \; \left\| Y - P w \right\|_2^2$$
subject to the non-negativity constraint $w_l \ge 0$ for all $l = 1, \ldots, L$, where $Y = [y_1, y_2, \ldots, y_n]^T$ is the vector of true labels, $P = [p_1, p_2, \ldots, p_L]$ is the prediction matrix whose columns contain the base model predictions, and $w = [w_1, w_2, \ldots, w_L]^T$ is the weight vector.
Following weight optimization, normalization is applied to ensure interpretability:
$$w_l^{\mathrm{norm}} = \frac{w_l}{\sum_{j=1}^{L} w_j}$$
The non-negativity constraint ensures that each base model contributes positively to the ensemble, preventing counterintuitive negative weights that can arise in unconstrained regression. Ref. [46] demonstrated that non-negative weighting schemes typically yield superior generalization performance compared to unconstrained linear combinations.
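The constrained minimization and the subsequent normalization can be sketched as follows. A production implementation would typically call a dedicated NNLS solver (e.g., the active-set algorithm available in SciPy); the projected gradient descent below (function name ours) is only meant to make the constraint and the normalization step concrete.

```python
import numpy as np

def nnls_weights(P, y, iters=5000):
    """Solve min_w ||y - P w||^2 subject to w >= 0 by projected gradient
    descent, then normalize the weights to sum to one. Illustrative
    stand-in for a dedicated NNLS solver."""
    P, y = np.asarray(P, float), np.asarray(y, float)
    lr = 1.0 / np.linalg.norm(P.T @ P, 2)      # step size from the spectral norm
    w = np.zeros(P.shape[1])
    for _ in range(iters):
        grad = P.T @ (P @ w - y)
        w = np.maximum(0.0, w - lr * grad)     # gradient step + projection onto w >= 0
    return w / w.sum() if w.sum() > 0 else w
```

The projection `np.maximum(0.0, ...)` is what enforces the non-negativity that unconstrained least squares cannot guarantee.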

3.1.5. Ridge Regression with Cross-Validation (RidgeCV)

Ridge regression extends ordinary least squares by incorporating an L2 regularization penalty, effectively addressing multicollinearity and overfitting [49]. The objective function is:
$$\min_{\beta} \sum_{i=1}^{n} \left( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \alpha \sum_{j=1}^{p} \beta_j^2$$
where $\alpha$ is the regularization parameter controlling the strength of the penalty term.
The closed-form solution for the coefficient estimates is:
\hat{\beta} = (X^T X + \alpha I)^{-1} X^T Y
where $I$ represents the identity matrix.
RidgeCV automates hyperparameter selection by evaluating multiple candidate values $\{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ through cross-validation and selecting the parameter that minimizes validation error. This automated tuning ensures optimal regularization strength without manual intervention.
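A minimal RidgeCV sketch on synthetic data; the features, targets, and candidate alphas here are illustrative:

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic data with a known linear signal (illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 2))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=50)

# RidgeCV evaluates each candidate alpha by internal cross-validation
# and retains the one with the lowest validation error.
model = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X, y)
best_alpha = model.alpha_
```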

3.1.6. SHapley Additive exPlanations (SHAP)

SHAP provides a unified framework for interpreting machine learning model predictions based on Shapley values from cooperative game theory [50]. The Shapley value for feature j is defined as:
\phi_j = \sum_{S \subseteq F \setminus \{j\}} \frac{|S|! \, (|F| - |S| - 1)!}{|F|!} \left[ f_{S \cup \{j\}}(x_{S \cup \{j\}}) - f_S(x_S) \right]
where $F$ represents the set of all features, $S$ is a subset of features excluding feature $j$, $f_S$ denotes the model prediction using only features in subset $S$, and $\phi_j$ quantifies the contribution of feature $j$ to the prediction.
SHAP satisfies the additive property, ensuring that the sum of feature contributions equals the difference between the model prediction and the expected value:
f(x) = \phi_0 + \sum_{j=1}^{p} \phi_j
where $\phi_0$ represents the baseline value, typically computed as the average prediction across the training dataset.
For tree-based models, Ref. [51] developed TreeSHAP, an efficient algorithm that reduces the computational complexity of exact Shapley value estimation from $O(TL2^M)$ to $O(TLD^2)$, where $T$ is the number of trees, $L$ is the maximum number of leaves, $M$ is the number of features, and $D$ is the maximum depth. This optimization enables practical application of SHAP to large-scale ensemble models.
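The Shapley formula above can be illustrated with a brute-force implementation that enumerates all feature subsets. As a simplification, $f_S$ is approximated here by fixing the excluded features at baseline values; the linear toy model is purely illustrative.

```python
import math
from itertools import combinations

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating every feature subset.
    f_S(x) is approximated by evaluating f with the features outside S
    held at their baseline values (a common simplification)."""
    p = len(x)
    features = range(p)

    def f_S(S):
        z = [x[i] if i in S else baseline[i] for i in features]
        return f(z)

    phi = []
    for j in features:
        others = [i for i in features if i != j]
        total = 0.0
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                S = set(subset)
                # Weight |S|! (|F| - |S| - 1)! / |F|! from the Shapley formula.
                weight = (math.factorial(len(S))
                          * math.factorial(p - len(S) - 1)
                          / math.factorial(p))
                total += weight * (f_S(S | {j}) - f_S(S))
        phi.append(total)
    return phi

# For a linear model, the Shapley value of feature j is its coefficient
# times (x_j - baseline_j), and the values sum to f(x) - f(baseline).
phi = shapley_values(lambda z: 2.0 * z[0] + 3.0 * z[1],
                     x=[1.0, 2.0], baseline=[0.0, 0.0])
```

The exponential subset enumeration is exactly why TreeSHAP's polynomial-time algorithm matters for large tree ensembles.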

3.2. Performance Evaluation Metrics

Four performance evaluation metrics serve as comprehensive measures to assess the accuracy and reliability of regression models in predicting concrete compressive strength [52]. The coefficient of determination (R2, Equation (18)) quantifies the proportion of variance in the dependent variable explained by the model, with values closer to unity indicating superior predictive capability. Mean Squared Error (MSE, Equation (19)) and Root Mean Squared Error (RMSE, Equation (20)) evaluate prediction accuracy by penalizing larger deviations more heavily, with RMSE offering the advantage of maintaining the same units as the target variable for more intuitive interpretation [53]. Mean Absolute Error (MAE, Equation (21)) provides a robust assessment of average prediction deviation that exhibits greater resilience to outliers compared to squared error metrics [52]. Together, these metrics enable comprehensive model evaluation from multiple perspectives, ensuring that the selected ensemble approaches demonstrate both statistical validity and practical applicability for concrete strength prediction tasks.
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|
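These four metrics can be computed directly from their definitions. The sketch below uses NumPy, with an illustrative toy example:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute R2, MSE, RMSE, and MAE from their standard definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    mse = float(np.mean(resid ** 2))
    rmse = float(np.sqrt(mse))
    mae = float(np.mean(np.abs(resid)))
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"R2": 1.0 - ss_res / ss_tot, "MSE": mse, "RMSE": rmse, "MAE": mae}

# Toy example: one prediction off by 1 MPa out of four samples.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 4.0])
```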

4. Dataset and Methods

4.1. Dataset Preparation and Preprocessing

The experimental dataset comprised concrete compressive strength measurements along with various mixture composition parameters, sourced from a comprehensive materials testing database [36]. The dataset was subjected to rigorous quality control procedures to ensure data integrity and model reliability [54]. Prior to model training, a systematic preprocessing pipeline was implemented to address common data quality issues and standardize feature scales.
Missing values in the numerical features were imputed using the median strategy, which demonstrates greater robustness to outliers compared to mean imputation [55]. This approach preserves the central tendency of each feature distribution while minimizing the influence of extreme values that may result from measurement errors or data collection inconsistencies. Following imputation, all numerical features underwent standardization through z-score normalization, transforming each feature to have zero mean and unit variance [56,57]. This standardization step ensures that features with different measurement scales contribute equally to model training, preventing features with larger numerical ranges from dominating the learning process [58].
The preprocessed dataset was partitioned into training and testing subsets using stratified random sampling with an 80-20 split ratio. This partitioning strategy maintains the statistical distribution of the target variable across both subsets while ensuring sufficient data volume for model training [59]. A fixed random seed of 42 was employed throughout all experiments to guarantee reproducibility of results and enable fair comparison across different modeling approaches.
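The preprocessing steps above can be sketched with scikit-learn on toy data. For brevity the sketch uses a plain random split; stratifying on a continuous target, as the study describes, would additionally require binning the strength values.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Median imputation followed by z-score standardization.
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Toy feature matrix with one missing entry (illustrative only).
X = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [3.0, 30.0],
              [4.0, 40.0]])
y = np.array([10.0, 20.0, 30.0, 40.0])

Xt = preprocess.fit_transform(X)

# 80-20 hold-out split with the fixed seed used throughout the study.
X_train, X_test, y_train, y_test = train_test_split(
    Xt, y, test_size=0.2, random_state=42)
```

In practice the pipeline would be fitted on the training split only and then applied to the test split, so that imputation medians and scaling statistics never leak from test data.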

4.2. Hyperparameter Optimization and Sensitivity Analysis

4.2.1. Hyperparameter Optimization Strategy

To address concerns regarding the empirical nature of the optimization procedure, a systematic hyperparameter optimization (HPO) framework was implemented for the LightGBM base learner using randomized search with 5-fold cross-validation, following the recommendations of recent literature emphasizing the importance of structured hyperparameter tuning in machine learning for engineering applications [60].
The selection of tunable hyperparameters and their candidate ranges was guided by both algorithmic theory and domain-specific considerations. Seven key hyperparameters were identified based on their documented influence on gradient boosting model performance: the number of boosting iterations (n_estimators), learning rate, maximum number of leaves per tree (num_leaves), maximum tree depth (max_depth), minimum number of samples per leaf node (min_child_samples), row subsampling ratio (subsample), and column subsampling ratio (colsample_bytree). The search space for each parameter was defined to encompass a practically meaningful range while avoiding extreme configurations that could lead to overfitting or underfitting. Table 3 summarizes the HPO search space and the selected optimal configuration.
The randomized search evaluated 5 candidate configurations across 5-fold cross-validation (totaling 25 fits), with the negative root mean squared error as the scoring metric. The optimal configuration achieved a best cross-validated RMSE of 3.676 MPa. The objective function was chosen as RMSE rather than R2 because RMSE directly penalizes large prediction errors in the same physical units as the target variable (MPa), which is more meaningful for engineering applications where prediction accuracy in absolute terms is critical for structural design decisions.
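The seven-parameter search space might be organized as below. The candidate grids are illustrative stand-ins, not the actual ranges from Table 3, and the commented call sketches how the space would feed into `RandomizedSearchCV`.

```python
# Search space for the seven LightGBM hyperparameters listed above.
# (Values are illustrative placeholders, not the ranges from Table 3.)
param_distributions = {
    "n_estimators": [300, 500, 700, 900],       # boosting iterations
    "learning_rate": [0.01, 0.03, 0.05, 0.1],   # shrinkage step size
    "num_leaves": [15, 20, 31, 63],             # leaves per tree
    "max_depth": [-1, 4, 6, 8],                 # -1 means unlimited depth
    "min_child_samples": [5, 10, 20],           # minimum samples per leaf
    "subsample": [0.7, 0.8, 0.9, 1.0],          # row subsampling ratio
    "colsample_bytree": [0.7, 0.8, 0.9, 1.0],   # column subsampling ratio
}

# The space would then be sampled by randomized search, e.g.:
# from sklearn.model_selection import RandomizedSearchCV
# from lightgbm import LGBMRegressor
# search = RandomizedSearchCV(LGBMRegressor(random_state=42),
#                             param_distributions, n_iter=5, cv=5,
#                             scoring="neg_root_mean_squared_error",
#                             random_state=42)
# search.fit(X_train, y_train)
```

With `n_iter=5` and `cv=5`, the search performs the 25 fits reported in the text.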
The selection of LightGBM and CatBoost as base learners was motivated by their complementary algorithmic characteristics: LightGBM employs leaf-wise tree growth with Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), which excels at capturing fine-grained nonlinear patterns, while CatBoost utilizes symmetric (oblivious) trees with ordered boosting, providing inherent regularization against prediction shift. This complementarity enhances ensemble diversity, a prerequisite for effective ensemble learning. Ridge regression was selected as the meta-learner due to its closed-form solution, L2 regularization that manages multicollinearity between correlated base learner predictions, and its demonstrated effectiveness in stacking frameworks for concrete strength prediction.

4.2.2. Sensitivity Analysis

To demonstrate the stability of the optimized model with respect to hyperparameter perturbations, a sensitivity analysis was conducted by varying key hyperparameters by ±20% from their optimal values while holding all other parameters constant. Figure 2 presents the sensitivity analysis results for learning rate and num_leaves, the two hyperparameters most directly controlling model complexity and convergence behavior.
The results demonstrate remarkable stability across perturbation levels. When the learning rate was decreased by 20% (from 0.05 to 0.04), the mean cross-validated RMSE increased marginally from 3.676 to 3.744 MPa (a 1.9% increase), while a 20% increase (to 0.06) yielded a mean RMSE of 3.662 MPa (a 0.4% decrease). For num_leaves, both the 20% decrease (from 20 to 16) and 20% increase (to 24) produced identical mean RMSE values of 3.676 MPa, indicating that the model performance is insensitive to moderate variations in tree complexity within this range. The overlapping confidence intervals across all perturbation conditions confirm that the optimized hyperparameters reside within a stable plateau of the performance landscape, rather than at a sharp optimum that would be sensitive to minor configuration changes.

4.2.3. Reproducibility and Seed Stability

To verify the reproducibility of the modeling results, the complete training and evaluation pipeline was executed across three independent random seeds (0, 1, 2), each generating different train-test splits with an 80-20 ratio. Figure 3 presents the distribution of performance metrics across these random seeds.
The model consistently achieved R2 values exceeding 0.926 across all random splits, with a mean R2 of 0.941 ± 0.019 and mean RMSE of 3.698 ± 0.706 MPa, as shown in Table 4. The relatively low standard deviation in R2 (0.019) confirms that the model’s predictive capability is robust to variations in data partitioning, demonstrating that the reported performance is not an artifact of a particular favorable train-test split. The observed variability in RMSE is attributable to the moderate dataset size (301 samples), where different random splits may include varying proportions of extreme mixture compositions in the test set.

4.3. Base Learner Configuration

Two gradient boosting algorithms were selected as base learners for the ensemble framework due to their complementary strengths in handling complex non-linear relationships and diverse data characteristics. The selection of these specific algorithms was motivated by their demonstrated success in materials science prediction tasks and their ability to capture different aspects of the underlying data patterns.

4.3.1. Hyperparameter Configuration of LightGBM

The LightGBM implementation employed a leaf-wise tree growth strategy with carefully tuned hyperparameters to balance model complexity and generalization performance [42]. The model was configured with 700 gradient boosting iterations, allowing sufficient capacity to capture intricate patterns in the compressive strength relationships. A conservative learning rate of 0.05 was selected to ensure stable convergence and prevent overfitting through gradual parameter updates. Each decision tree was constructed with a maximum of 20 leaves, providing adequate model expressiveness while maintaining computational efficiency through the leaf-wise growth algorithm [42].
To enhance model robustness and reduce variance, stochastic training strategies were implemented through both row and column subsampling [61]. The subsample parameter was set to 0.8, meaning that each boosting iteration utilized 80 percent of the training instances selected through random sampling without replacement. Similarly, the column sample ratio (colsample_bytree) was configured at 0.8, ensuring that each tree was constructed using 80 percent of the available features. These subsampling techniques introduce beneficial randomization that prevents individual trees from overfitting to specific data patterns while promoting diversity among ensemble members.
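The stated LightGBM configuration can be collected into a parameter dictionary. Keys follow the lightgbm scikit-learn API; any parameter not listed here is assumed to keep the library default.

```python
# Reported LightGBM configuration (sketch; unlisted parameters
# are assumed to keep the library defaults).
lgbm_params = {
    "n_estimators": 700,      # gradient boosting iterations
    "learning_rate": 0.05,    # conservative step size for stable convergence
    "num_leaves": 20,         # cap on leaves under leaf-wise growth
    "subsample": 0.8,         # 80% row sampling per boosting iteration
    "colsample_bytree": 0.8,  # 80% feature sampling per tree
    "random_state": 42,       # fixed seed for reproducibility
}
# Usage (requires lightgbm):
# LGBMRegressor(**lgbm_params).fit(X_train, y_train)
```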

4.3.2. Categorical Boosting (CatBoost)

The CatBoost regressor was likewise configured with 700 gradient boosting iterations [43]. Preliminary experiments indicated that CatBoost reaches stable performance within this iteration budget owing to its symmetric tree structure and advanced regularization techniques [62]. The learning rate was set to 0.06, slightly higher than that of LightGBM, while maintaining training stability.
A maximum tree depth of 4 was imposed to construct relatively shallow decision trees, which serves as an effective regularization mechanism against overfitting. Shallow trees with limited depth are less prone to memorizing noise in the training data and tend to capture more generalizable patterns. This depth limitation, combined with CatBoost’s built-in ordered boosting algorithm, provides strong protection against overfitting while maintaining predictive accuracy [43]. The random seed was consistently set to 42 across all experiments to ensure reproducibility and enable direct comparison with other modeling approaches.
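Similarly, the stated CatBoost configuration as a parameter dictionary; keys follow the catboost Python API, `verbose` is an added convenience, and unlisted parameters are assumed to keep library defaults.

```python
# Reported CatBoost configuration (sketch; unlisted parameters
# are assumed to keep the library defaults).
catboost_params = {
    "iterations": 700,      # gradient boosting iterations
    "learning_rate": 0.06,  # slightly higher step size than LightGBM
    "depth": 4,             # shallow symmetric trees as regularization
    "random_seed": 42,      # fixed seed for reproducibility
    "verbose": False,       # suppress per-iteration logging
}
# Usage (requires catboost):
# CatBoostRegressor(**catboost_params).fit(X_train, y_train)
```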

4.4. Ensemble Learning Architectures

Two distinct ensemble learning strategies were implemented to investigate different approaches for combining base learner predictions. These strategies represent fundamentally different philosophies in ensemble construction: mathematical optimization of linear combinations versus hierarchical learning of meta-patterns [63,64].

4.4.1. Voting Ensemble with Non-Negative Least Squares (VotingNNLS)

The VotingNNLS ensemble implements a mathematically rigorous approach to optimal weight determination through constrained least squares optimization [46]. Following independent training of the LightGBM and CatBoost base learners on the preprocessed training data, both models generate prediction vectors for the training set. These prediction vectors are concatenated column-wise to form a prediction matrix P of dimensions n × 2, where n represents the number of training samples and each column corresponds to predictions from one base learner.
The optimal ensemble weights are determined by solving a constrained optimization problem that minimizes the squared residual between the weighted combination of base predictions and the true target values. Formally, this optimization seeks weight vector w that minimizes the objective function ||Pw − y||22, subject to the non-negativity constraint w ≥ 0 [48]. This constraint ensures that all base learners contribute positively to the final prediction, preventing negative weights that could lead to counterintuitive model behavior and reduced interpretability.
The optimization problem is solved using the Non-Negative Least Squares (NNLS) algorithm implemented through the scipy.optimize module, which employs an active set method to efficiently identify the optimal weight configuration [48,65]. Upon convergence, the algorithm produces a weight vector that may assign zero weight to underperforming models, effectively implementing automatic model selection within the ensemble framework. The resulting weights are normalized to sum to unity, creating a convex combination that maintains the prediction scale while optimally balancing the contributions of each base learner according to their performance on the training data.
The final ensemble prediction for any input instance is computed as a weighted linear combination of the base learner predictions, expressed mathematically as $\hat{y}_{\mathrm{final}} = w_1 \hat{y}_{\mathrm{LGBM}} + w_2 \hat{y}_{\mathrm{CatBoost}}$, where the weight vector $w^* = [w_1, w_2]^T$ contains the optimal weights determined through Non-Negative Least Squares (NNLS) optimization. These weights minimize the squared error $\lVert Pw - y \rVert_2^2$ under the constraint $w \ge 0$, ensuring that each base model contributes positively to the final prediction. This approach offers computational efficiency since no additional model training is required beyond solving a single convex optimization problem, while still achieving optimal linear aggregation of base learner outputs, as shown in Figure 4.

4.4.2. Stacked Generalization

The stacked generalization framework implements a hierarchical two-layer architecture that enables learning of complex non-linear relationships between base model predictions and target values [45,66]. This approach treats the base learner predictions as derived features for a higher-level meta-learning process, potentially capturing synergistic interactions that simple weighted averaging cannot represent [67].
In the base layer (Layer 0), the LightGBM and CatBoost models are trained independently on the preprocessed training data using their respective configurations as previously described. Unlike the VotingNNLS approach, the predictions from these base learners are conceptualized as meta-features rather than final predictions requiring combination [68]. Each base learner generates prediction vectors for all training instances, and these vectors are concatenated to form a meta-dataset where each original training instance is now represented by a two-dimensional feature vector comprising the predictions from both base models.
The meta-learning layer (Layer 1) employs Ridge regression as the meta-learner, selected for its ability to learn smooth combinations of base predictions while providing robustness against multicollinearity through L2 regularization [49]. The Ridge regression model was implemented using cross-validated hyperparameter selection (RidgeCV) to automatically determine the optimal regularization strength from a predefined set of candidate alpha values: 0.1, 1.0, and 10.0 [69]. This cross-validation procedure evaluates each alpha value using internal cross-validation folds on the training data, selecting the configuration that minimizes prediction error on held-out validation sets.
The meta-learner is trained on the meta-dataset, where the input features are the base learner predictions and the target remains the original compressive strength values. Through this training, the Ridge regression learns a blending function comprising an intercept and per-model coefficients that, unlike the VotingNNLS weights, are constrained neither to be non-negative nor to sum to unity. The learned blend may therefore differ significantly from simple averaging or from the optimal non-negative weights found by VotingNNLS.
During inference on the test set, the stacked generalization pipeline first generates predictions from both base learners for each test instance. These predictions are then fed as input features to the trained meta-learner, which produces the final ensemble prediction. This two-stage prediction process maintains the hierarchical structure and ensures that the meta-learner’s learned blending strategy is consistently applied to new data. The stacking approach offers greater modeling flexibility compared to VotingNNLS, as the meta-learner can potentially identify and exploit patterns in how base learner prediction errors vary across different regions of the input space, though this increased flexibility requires additional computational resources for meta-learner training and cross-validation, as shown in Figure 5.
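The two-layer architecture can be sketched with scikit-learn's `StackingRegressor`. Generic regressors stand in for LightGBM and CatBoost, and synthetic data replaces the RASCC database; the candidate alphas match those stated above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (stand-in for the RASCC features and strengths).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = X[:, 0] ** 2 + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Layer 0: stand-in base learners (the study uses LightGBM and CatBoost);
# Layer 1: RidgeCV meta-learner fitted on out-of-fold base predictions.
stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(random_state=42)),
        ("tree", DecisionTreeRegressor(max_depth=4, random_state=42)),
    ],
    final_estimator=RidgeCV(alphas=[0.1, 1.0, 10.0]),
    cv=5,  # K-fold out-of-fold meta-features prevent information leakage
)
stack.fit(X_train, y_train)
test_r2 = stack.score(X_test, y_test)
```

At prediction time, `stack.predict` first queries both base learners and then feeds their outputs through the fitted Ridge meta-learner, mirroring the two-stage inference described above.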

4.5. Model Evaluation and Performance Metrics

Model performance was assessed using a comprehensive suite of regression metrics computed on both training and testing datasets to evaluate both fit quality and generalization capability. The primary evaluation metric was the coefficient of determination (R2), which quantifies the proportion of variance in compressive strength explained by model predictions. Additional metrics included Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE), providing complementary perspectives on prediction accuracy at different scales and with varying sensitivity to outliers.
For each modeling approach, predictions were generated for all instances in both training and testing sets. These predictions were systematically recorded and exported as comma-separated value files containing paired actual and predicted values, facilitating detailed error analysis and visualization. All computational experiments were conducted using Python 3.x with the scikit-learn, LightGBM, and CatBoost libraries, and results were archived in timestamped directories to maintain a complete experimental record and enable result reproduction.

5. Results and Discussion

5.1. Model Performance Comparison and Selection

Based on the comparative analysis of the four machine learning methodologies, the performance metrics reveal distinct characteristics across different modeling approaches, as shown in Table 5. The Stacking ensemble method demonstrated superior generalization capability, achieving the lowest testing RMSE of 3.321 and the highest testing R2 of 0.963, along with the minimal testing MAE of 2.506. This performance superiority comes at a computational cost, requiring 2.168 s for model training, which represents approximately seventeen times the execution time of the LGBM baseline. The VotingNNLS approach exhibited competitive performance with a testing RMSE of 3.553 and R2 of 0.957, while maintaining reasonable computational efficiency at 0.559 s. Among the individual learners, CatBoost showed marginally better training fit with an R2 of 0.990, though its testing performance (RMSE: 3.749, R2: 0.953) lagged the ensemble methods. The LGBM model presented the most favorable balance between predictive accuracy and computational efficiency, achieving a testing RMSE of 3.386 with merely 0.124 s of training time. Notably, all models maintained testing R2 values above 0.95, indicating robust predictive capability. The relatively modest gap between training and testing metrics across all approaches suggests appropriate model complexity without severe overfitting. Considering the trade-off between predictive precision and computational demand, the Stacking ensemble emerges as the optimal choice for applications prioritizing accuracy, while LGBM represents a pragmatic alternative for scenarios requiring rapid model deployment or real-time predictions.

5.2. Enhanced Model Performance Comparison and Validation Analysis

The correlation between predicted and actual compressive strength values provides comprehensive validation of model performance across the four machine learning approaches, as illustrated in Figure 6. The LGBM model (Figure 6a) exhibited the strongest linear relationship with an R2 of 0.9841, demonstrating exceptional predictive fidelity with minimal deviation from the ideal prediction line across the entire strength spectrum from 5.36 to 89 MPa. This superior correlation validates the model’s capability to capture the complex nonlinear relationships between mix design parameters and mechanical properties in recycled aggregate concrete. The residual distribution in the LGBM predictions demonstrates remarkable consistency, with the majority of data points falling within a narrow band around the theoretical prediction line, suggesting that the model maintains uniform accuracy without exhibiting systematic bias toward either underestimation or overestimation across different strength ranges. The homoscedastic nature of the residuals, evidenced by the consistent scatter width along the entire prediction range, further confirms the model’s robustness and reliability for practical applications in mix design optimization.
The CatBoost model (Figure 6b) achieved an R2 of 0.9529, displaying slightly increased scatter around the regression line, particularly at higher strength values exceeding 70 MPa, though maintaining strong overall predictive accuracy. The increased deviation in the upper strength range suggests potential limitations in the model’s capacity to extrapolate beyond the most densely represented training data regions, which typically occur in the moderate strength range of 30–60 MPa where conventional recycled aggregate concrete formulations are most common. Despite this marginal increase in prediction variance, the CatBoost model demonstrates commendable performance with more than 95 percent of the variance in compressive strength explained by the model predictions. The distribution pattern indicates that the model’s predictive uncertainty increases proportionally with compressive strength magnitude, a characteristic that practitioners should consider when applying the model to high-performance recycled aggregate concrete formulations.
The Stacking ensemble approach (Figure 6c) yielded an R2 of 0.9429, demonstrating that while this method achieved the lowest testing RMSE of 3.321 MPa as shown in Table 5, the correlation coefficient indicates marginally higher variance in individual predictions compared to LGBM. This apparent discrepancy between RMSE performance and correlation strength can be attributed to the Stacking model’s superior handling of extreme outliers and its ability to minimize large prediction errors through the ensemble weighting mechanism. The scatter plot reveals a balanced distribution of residuals across the strength spectrum, with no evident clustering of mispredictions in any particular range, suggesting that the model’s reduced R2 value stems from a more uniform distribution of small to moderate prediction errors rather than from systematic bias or heteroscedastic variance. This characteristic makes the Stacking approach particularly valuable in applications where minimizing maximum prediction error is prioritized over achieving the tightest overall correlation.
The VotingNNLS model (Figure 6d) produced an R2 of 0.9257, representing the lowest correlation among the four approaches yet still maintaining robust predictive capability exceeding 92 percent. The increased scatter observed in this model’s predictions, particularly evident in the mid-range strength values between 40–60 MPa, reflects the inherent trade-off between model complexity and generalization capability in ensemble learning approaches. The non-negative least squares constraint imposed during the voting weight optimization process, while providing theoretical guarantees regarding prediction bounds, appears to limit the model’s flexibility in capturing certain nonlinear interaction effects between input features. Nevertheless, the consistent performance across the full range of observed compressive strength values indicates that the VotingNNLS approach remains a viable option for practical applications, particularly in scenarios where computational efficiency and interpretability are valued over marginal improvements in prediction accuracy.
All models exhibited consistent performance across the range of observed compressive strength values, with data points clustering tightly around the diagonal reference line, particularly in the moderate strength range of 30–60 MPa where the majority of recycled aggregate concrete formulations occurred. This concentration of data in the intermediate strength range reflects the practical reality of recycled aggregate concrete applications, where most commercial and research formulations target conventional structural strength requirements. The uniform distribution of residuals within this region across all four models confirms their reliability for the most common use cases in sustainable construction practice. The visual analysis corroborates the quantitative metrics presented in Table 5, confirming that LGBM demonstrates superior balance between correlation strength and computational efficiency, while the Stacking approach achieves optimal accuracy metrics at the cost of increased computational demand. The marginal performance differences between the four models, with R2 values spanning only 5.84 percentage points, underscore the maturity and effectiveness of modern gradient boosting and ensemble learning techniques for predicting concrete mechanical properties. For practical implementation, the selection among these models should be guided by specific project requirements, balancing prediction accuracy, computational resources, interpretability needs, and the tolerance for prediction variance in different strength ranges. Reliable strength prediction of this kind supports the wider adoption of recycled aggregate self-compacting concrete, contributing to carbon reduction and sustainable development goals.

5.3. Generalizability Validation on the UCI Concrete Compressive Strength Dataset

The curing age in the database spans from 2 to 112 days (mean: 33.7 days, standard deviation: 27.1 days), covering three stages: early age (2–7 days), standard age (14–28 days), and extended age (56–112 days). SHAP analysis indicates that curing age is the most influential variable affecting prediction results, with SHAP values ranging approximately ±15 MPa. This reflects the nonlinear characteristics of cement hydration kinetics: strength gains rapidly during the early age (consistent with the Powers hydration model) and gradually plateaus in later stages. Variations in curing conditions exist among experimental samples from different sources (standard laboratory conditions: 20 ± 2 °C, relative humidity ≥ 95%). These discrepancies constitute an inherent challenge to the model’s cross-dataset generalization capability and serve as one of the motivations for the external validation design presented in this paper (UCI dataset, 1030 samples). To further address concerns regarding the generalizability of the proposed ensemble framework, the four models (LGBM, CatBoost, Stacking, and VotingNNLS) were re-evaluated on an independent, publicly available benchmark dataset: the UCI Concrete Compressive Strength dataset [70], which contains 1030 samples with eight input features including cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and age. The same model configurations, hyperparameters, and ensemble strategies as described in Section 4 were applied without modification, ensuring a fair cross-domain comparison.
The performance metrics for all models on both the training and test splits are summarized in Table 6.
The results demonstrate that the proposed Stacking ensemble achieved the best generalization performance on this external dataset, attaining a test R2 of 0.9387, RMSE of 3.9749 MPa, and MAE of 2.6725 MPa. Among individual base learners, CatBoost marginally outperformed LGBM on the test set (R2: 0.9345 vs. 0.9299; RMSE: 4.1083 vs. 4.2500 MPa). Notably, VotingNNLS yielded identical predictions to LGBM on this dataset, suggesting that the NNLS weight optimization converged to a solution dominated by LGBM under this data distribution, a behavior consistent with the theoretical property of NNLS of assigning zero weight to weaker contributors. Overall, the consistent superiority of the Stacking ensemble across both the original dataset and this independent benchmark confirms the robustness and domain-agnostic applicability of the proposed modeling framework.

5.4. Comprehensive Residual Analysis of the Stacking Ensemble Model for Compressive Strength Prediction

Because the Stacking ensemble achieved the highest predictive accuracy, its suitability for engineering application was examined through detailed residual diagnostics. The residual analysis of the Stacking ensemble model, presented in Figure 7, provides critical diagnostic information regarding model performance, prediction reliability, and the appropriateness of the underlying statistical assumptions. The five-panel diagnostic plot systematically examines different aspects of model behavior, revealing characteristics that validate the Stacking approach as the optimal methodology for predicting recycled aggregate self-compacting concrete compressive strength.
Figure 7a presents the residual plot against the independent variable sequence, demonstrating a relatively random scatter pattern of prediction errors across the dataset. The residuals fluctuate within an approximate range of −10 to +10 MPa, with the majority concentrated within ±5 MPa of the zero line. This distribution pattern indicates that the model does not exhibit systematic bias across different data points in the sequence, suggesting robust generalization capability regardless of the order in which samples were collected or tested. The absence of discernible trends or patterns in the residual plot confirms that the model has successfully captured all significant relationships within the data, as a well-fitted model yields residuals that scatter randomly around zero, whereas any discernible pattern would indicate model misspecification [71]. This characteristic is particularly important for practical applications, as it demonstrates that the model maintains consistent prediction accuracy across different experimental batches and testing campaigns, which may have been conducted under varying conditions or by different research teams.
The histogram of conventional residuals in Figure 7b reveals a distribution that approximates normality, with the highest frequency occurring in the central bins near zero and a gradual decrease in frequency toward the tails. The distribution exhibits approximate symmetry around zero, with roughly 16 observations falling within the central bin and progressively fewer observations in the peripheral ranges. While the distribution shows slight deviations from perfect normality, particularly in the presence of a few observations with residuals approaching ±10 MPa, the overall pattern satisfies the normality assumption required for valid statistical inference in regression modeling. The near-normal distribution of residuals indicates that prediction errors arise primarily from random variation rather than systematic model inadequacies, supporting the validity of confidence intervals and hypothesis tests based on the model outputs, as regression experts consistently recommend plotting residuals for model diagnosis despite the availability of many numerical hypothesis test procedures [72]. Research has demonstrated that normally distributed residuals enhance the interpretability of error metrics and provide more reliable estimates of prediction uncertainty in regression applications.
Figure 7c illustrates the relationship between conventional residuals and fitted values, providing crucial insights into heteroscedasticity and model specification adequacy. The residuals display relatively constant variance across the entire range of fitted values from approximately 10 to 90 MPa, with no evident funnel-shaped pattern that would indicate heteroscedastic behavior. This homoscedastic distribution confirms that the model maintains consistent prediction precision regardless of the magnitude of the predicted compressive strength, a property essential for reliable application across the full spectrum of recycled aggregate concrete formulations from low-strength to high-performance mixtures. The absence of systematic patterns or trends validates the linear relationship assumption and confirms that the model provides proper specification, as heteroscedastic residual patterns in ensemble learning applications would compromise the effectiveness of the meta-learner’s blending strategy [73]. Studies on concrete strength prediction using ensemble learning have emphasized that homoscedastic residual patterns indicate proper model specification and adequate feature representation, demonstrating the superiority of combining several models to produce more precise predictions [74].
The residual sequence plot in Figure 7d examines the temporal or sequential independence of prediction errors, revealing no discernible autocorrelation patterns across the 60 test samples. The residuals oscillate irregularly around the zero line without exhibiting runs of consistently positive or negative values, confirming the independence assumption crucial for valid statistical inference. This independence property indicates that prediction errors for individual samples do not systematically influence the errors for subsequent samples, validating the assumption that each concrete mixture represents an independent observation drawn from the same underlying population. The absence of autocorrelation is particularly significant for ensemble learning applications, as correlated errors among base learners can compromise the effectiveness of ensemble aggregation strategies and reduce the diversity benefits that justify ensemble construction, necessitating cautious quantification of uncertainty from both experimental data and machine learning architectures [75]. The independence of residual errors confirms the appropriateness of the ensemble aggregation strategy and validates the use of cross-validation procedures employed during model training and evaluation.
Figure 7e presents a quantile-quantile (Q-Q) plot comparing the distribution of conventional residuals against theoretical normal distribution quantiles, providing a formal graphical assessment of the normality assumption. The data points align closely with the diagonal reference line across most of the distribution, with slight deviations observable only in the extreme tails. This pattern indicates that the residuals follow an approximately normal distribution for the central 95 percent of observations, with minor departures occurring only for the most extreme predictions. The strong linear relationship between observed and theoretical quantiles confirms the validity of parametric statistical inference based on normality assumptions, supporting the use of standard confidence intervals and hypothesis testing procedures for model evaluation. Research on stacking ensemble learning methods has demonstrated that Ridge regression meta-learners provide superior performance due to their ability to handle correlated inputs while providing robustness against multicollinearity through L2 regularization, with Ridge regression being selected as a meta-learner specifically because it can effectively capture important information without overfitting [76].
The comprehensive residual analysis reveals several key strengths of the Stacking ensemble approach that justify its selection as the optimal modeling framework for this application. First, the absence of systematic patterns in residual plots confirms that the model has successfully captured all significant relationships within the data without overfitting to noise or introducing spurious correlations. Second, the approximately normal distribution of residuals validates the statistical assumptions underlying the Ridge regression meta-learner and supports the reliability of reported performance metrics. Third, the homoscedastic residual variance demonstrates consistent prediction precision across the entire strength range, ensuring that the model provides equally reliable predictions for both conventional and high-performance recycled aggregate concrete formulations. Fourth, the independence of residual errors confirms the appropriateness of the ensemble aggregation strategy and validates the use of cross-validation procedures employed during model training and evaluation.
These diagnostic characteristics collectively demonstrate that the Stacking ensemble model not only achieves superior quantitative performance metrics as reported in Table 2, with a testing RMSE of 3.321 MPa and R2 of 0.963, but also satisfies the fundamental statistical assumptions required for valid inference and reliable prediction. The residual analysis provides empirical evidence that the ensemble learning approach successfully combines several weak learners to obtain a comprehensive strong learner, with SHAP-based interpretability analysis confirming that the model captures the underlying relationships while maintaining high precision and generalization capability [77,78]. The model’s predictions can be trusted for practical mix design optimization applications, supporting the advancement of sustainable construction practices through the increased utilization of recycled aggregate self-compacting concrete in infrastructure development.
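The graphical checks above have simple numeric counterparts. The following sketch (illustrative only, using simulated residuals with an error scale comparable to the reported 3.3 MPa RMSE, not the study's data) computes a Shapiro-Wilk normality p-value, a Durbin-Watson statistic for sequential independence, and a Spearman correlation of absolute residuals with fitted values as a heteroscedasticity check:

```python
import numpy as np
from scipy import stats

def residual_diagnostics(y_true, y_pred):
    """Numeric counterparts of the residual diagnostic plots.

    Returns the Shapiro-Wilk p-value (normality; large p supports
    normality), the Durbin-Watson statistic (independence; ~2 means no
    autocorrelation), and the Spearman correlation of |residuals| with
    fitted values (homoscedasticity; ~0 means constant variance).
    """
    r = np.asarray(y_true) - np.asarray(y_pred)
    _, shapiro_p = stats.shapiro(r)
    dw = np.sum(np.diff(r) ** 2) / np.sum(r ** 2)
    rho, _ = stats.spearmanr(np.abs(r), y_pred)
    return shapiro_p, dw, rho

# Hypothetical well-behaved residuals over 60 test samples.
rng = np.random.default_rng(42)
y_pred = rng.uniform(10, 90, size=60)
y_true = y_pred + rng.normal(0, 3.3, size=60)
p, dw, rho = residual_diagnostics(y_true, y_pred)
```

In practice these statistics complement, rather than replace, the visual inspection of the five-panel plot.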

5.5. Early-Age Versus Standard-Age Prediction Performance

To evaluate the model reliability across different curing ages, the dataset was stratified into early-age (<7 days) and standard/extended-age (≥7 days) subsets, and the prediction performance of all four models was assessed separately. Table 7 presents the results for the full dataset (training + testing).
As shown in Table 7, all four models exhibited lower R2 values for the early-age subset (0.8676–0.9696) compared to the standard/extended-age subset (0.9806–0.9835), confirming that predictions for early-age specimens carry greater uncertainty. Among the individual base learners, CatBoost demonstrated the strongest early-age performance (R2 = 0.9696, RMSE = 1.832 MPa), while LGBM showed the most notable performance degradation for early-age specimens (R2 = 0.8676, RMSE = 3.824 MPa). The Stacking ensemble maintained robust performance across both subsets, achieving the highest R2 (0.9835) and lowest RMSE (2.127 MPa) for the standard-age subset while delivering competitive early-age performance (R2 = 0.9578, RMSE = 2.158 MPa).
The reduced prediction accuracy for early-age specimens is attributable to three factors: (1) the limited representation of early-age data in the dataset (n = 11, 3.7%), which constrains the models’ ability to learn the complex early-age strength development patterns; (2) the highly nonlinear and rapid hydration kinetics during the first few days after mixing, which are more sensitive to minor variations in curing conditions and cement chemistry; and (3) the negligible pozzolanic contribution of supplementary cementitious materials (fly ash, GGBFS) at early ages, despite their presence in the mixture proportions as input features. At early ages, the SCM particles have not yet undergone significant pozzolanic reactions and primarily act as inert fillers, whereas at later ages their contribution to strength development becomes substantial. This temporal shift in the role of SCMs introduces additional complexity for models trained predominantly on standard-age data.
It should be noted that the early-age subset contains only 11 specimens, and the results should be interpreted with caution given this limited sample size. Users of this predictive framework should exercise additional care when applying the model to predict compressive strength of RASCC at ages below 7 days. Future studies should prioritize the collection of additional early-age experimental data to improve prediction reliability in this critical strength development period.
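The age stratification used above is straightforward to reproduce. A minimal sketch (toy data; function name and threshold argument are illustrative, mirroring the 7-day split used in this section):

```python
import numpy as np

def stratified_metrics(age, y_true, y_pred, threshold_days=7):
    """Split predictions into early-age (< threshold) and standard-age
    (>= threshold) subsets and report n, R2, and RMSE for each."""
    age = np.asarray(age)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    out = {}
    for name, mask in [("early", age < threshold_days),
                       ("standard", age >= threshold_days)]:
        yt, yp = y_true[mask], y_pred[mask]
        ss_res = np.sum((yt - yp) ** 2)
        ss_tot = np.sum((yt - yt.mean()) ** 2)
        out[name] = {"n": int(mask.sum()),
                     "R2": float(1 - ss_res / ss_tot),
                     "RMSE": float(np.sqrt(ss_res / mask.sum()))}
    return out

# Hypothetical toy data: two early-age and four standard-age specimens.
age    = np.array([3, 3, 28, 28, 90, 90])
y_true = np.array([10.0, 12.0, 40.0, 42.0, 60.0, 58.0])
y_pred = np.array([11.0, 11.0, 41.0, 41.0, 61.0, 59.0])
m = stratified_metrics(age, y_true, y_pred)
```

Reporting the subset size alongside each metric, as done here, makes the small-n caveat for the early-age group explicit.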

6. SHAP-Based Model Interpretability Analysis

To enhance the transparency and interpretability of the developed ensemble models, SHAP (SHapley Additive exPlanations) analysis was conducted to quantify the contribution of each input feature to the predicted compressive strength values. SHAP applies concepts from cooperative game theory, computing the contribution of each ‘player’ (feature) to the final ‘game outcome’ (model prediction) and providing a unified measure of feature importance through the marginal contribution of each feature averaged over all possible feature combinations [79]. The SHAP method is a comprehensive interpretable approach that provides both global and local explanations: global interpretation assesses feature importance and dependency, while local interpretation quantifies the contribution of each input variable to individual predicted values [80].
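The marginal-contribution averaging described above can be made concrete with an exact, brute-force Shapley computation on a toy two-feature model. This is a didactic sketch only: the study uses the SHAP library's tree explainers, and the linear "strength model", feature values, and baseline below are purely hypothetical. For a linear model the Shapley value of each feature reduces to its coefficient times its deviation from the baseline:

```python
from itertools import combinations
from math import factorial

def exact_shapley(model, x, baseline, n_features):
    """Exact Shapley values by enumerating all feature coalitions.
    `model` maps a feature vector to a prediction; features absent from
    a coalition take their baseline value. Feasible only for very few
    features (cost grows as 2^n)."""
    def value(coalition):
        inp = list(baseline)
        for j in coalition:
            inp[j] = x[j]
        return model(inp)

    n = n_features
    phi = [0.0] * n
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for r in range(n):
            for S in combinations(others, r):
                # Shapley coalition weight: |S|! (n - |S| - 1)! / n!
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[j] += w * (value(S + (j,)) - value(S))
    return phi

# Toy linear "strength model" in age and w/b ratio (hypothetical).
model = lambda z: 2.0 * z[0] - 30.0 * z[1] + 20.0
x = [28.0, 0.45]        # age = 28 d, w/b = 0.45
baseline = [7.0, 0.55]  # hypothetical reference mixture
phi = exact_shapley(model, x, baseline, 2)
# Efficiency property: contributions sum to f(x) - f(baseline).
```

For this toy case phi works out to [42.0, 3.0], and the two contributions sum exactly to the prediction difference from the baseline, illustrating the additivity that tree-based SHAP implementations compute efficiently for real models.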

6.1. LightGBM Model Feature Importance Analysis

Figure 8 presents the SHAP summary plot for the LightGBM base learner, illustrating the distribution of SHAP values for all input features across the entire dataset. The vertical axis displays features ranked by their mean absolute SHAP value, indicating their overall importance in determining compressive strength predictions. The horizontal axis represents the SHAP value magnitude, where positive values indicate an increase in predicted strength and negative values suggest a decrease. The color gradient from blue to red represents the feature value magnitude, with red indicating high feature values and blue representing low values.
The analysis reveals that specimen age emerges as the most influential parameter in the LightGBM model, exhibiting the widest distribution of SHAP values ranging from approximately −15 to +15 MPa. The color distribution indicates that increased curing age consistently produces positive SHAP values, with SHAP analysis identifying curing age and cement content as the most influential variables, reinforcing domain knowledge about cement hydration and strength development, as curing age segmentation enhances predictions for long-term strength [81,82]. High age values, represented by red data points, predominantly cluster in the positive SHAP value region, demonstrating that prolonged curing periods contribute substantially to enhanced compressive strength predictions. This finding aligns with the fundamental principles of concrete technology, where the progressive hydration of cementitious materials over time leads to the formation of calcium silicate hydrate gel and the consequent densification of the concrete matrix.
Recycled aggregate density (RA-midu) demonstrates the second highest feature importance, with SHAP values spanning from approximately −10 to +8 MPa. High density values of recycled aggregates predominantly contribute positively to strength predictions, confirming that denser recycled aggregates facilitate superior mechanical performance, as the SHAP technique reveals that physical properties of aggregates are dominant parameters in estimating concrete strength [83]. The color distribution pattern shows that high-density recycled aggregates, indicated by red points, are associated with positive SHAP values, while low-density aggregates, shown in blue, correspond to negative contributions. This relationship reflects the physical reality that denser aggregates typically possess lower porosity, reduced water absorption capacity, and stronger inherent mechanical properties, all of which contribute to enhanced interfacial bonding with the cement paste matrix and improved overall concrete performance.
The water-to-binder ratio (w/b) exhibits substantial negative influence on compressive strength predictions, with SHAP values distributed between approximately −8 to +5 MPa. The feature demonstrates a clear inverse relationship with strength, as evidenced by the concentration of red points representing high w/b ratio values in the negative SHAP value region. Feature importance analysis using SHAP identified the water-to-binder ratio as the most influential factor negatively affecting strength, with Partial Dependence Plots employed to further examine the relationships between key input features and strength outputs [84]. This pattern corroborates the well-established principle in concrete technology that elevated water content increases capillary porosity, reduces the density of the hardened cement paste, and weakens the interfacial transition zone between aggregate particles and the binding matrix.
Fly ash content (FA) presents considerable variability in its impact on strength predictions, with SHAP values ranging from approximately −5 to +5 MPa and displaying a more dispersed distribution pattern compared to the previously discussed features. The mixed influence of fly ash, characterized by both positive and negative SHAP contributions across its value range, reflects the complex dual nature of this supplementary cementitious material. High fly ash dosages can provide long-term strength enhancement through pozzolanic reactions that consume calcium hydroxide and produce additional calcium silicate hydrate. However, at early ages or when used in excessive quantities, fly ash may dilute the cement content and delay strength development, resulting in the heterogeneous SHAP value distribution observed in the analysis.
Cement content exhibits a predominantly positive correlation with compressive strength, as indicated by the concentration of high cement dosage values in the positive SHAP region. However, the magnitude of this effect appears more moderate compared to specimen age and recycled aggregate density, with SHAP values typically ranging from −5 to +5 MPa. Water content demonstrates similar inverse patterns to the w/b ratio, where high water dosages, represented by red data points, predominantly occupy the negative SHAP value space, reflecting the detrimental impact of excess water on concrete strength through increased porosity and reduced matrix density.
Fineness modulus shows a concentrated SHAP value distribution near zero with slight negative tendency for high values, suggesting that while aggregate gradation influences workability and packing efficiency, its direct impact on compressive strength is relatively limited within the studied range. Recycled aggregate water absorption (RAabsorption) displays notable negative influence when absorption values are high, reflecting the adverse effects of porous recycled aggregates on concrete performance through reduced effective water-to-cement ratio and compromised aggregate-paste bonding. SHAP analysis indicates that cement content and recycled aggregate percentages are the effective input parameters affecting concrete mechanical properties [85].
The remaining parameters, including maximum aggregate size, sand content, recycled aggregate content, silica fume, natural aggregate density, natural aggregate absorption, natural aggregate content, recycled aggregate replacement ratio, ground granulated blast furnace slag, and cement type, exhibit progressively diminishing influence on model predictions. These features display narrower SHAP value distributions concentrated near zero, indicating their contributions are relatively minor or highly context-dependent based on complex interactions with other mixture parameters. The hierarchical importance ranking provided by SHAP analysis offers valuable insights for mixture proportion optimization, suggesting that practitioners should prioritize controlling specimen age, aggregate density, water-to-binder ratio, and cementitious material dosages to achieve targeted strength performance.
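The importance ranking that orders the summary-plot axis is simply the mean absolute SHAP value per feature. A minimal sketch (the SHAP matrix and feature names below are hypothetical stand-ins, not values from Figure 8):

```python
import numpy as np

def rank_by_mean_abs_shap(shap_values, feature_names):
    """Order features by mean(|SHAP|) across all samples, the ranking
    convention used in SHAP summary plots."""
    importance = np.abs(np.asarray(shap_values)).mean(axis=0)
    order = np.argsort(importance)[::-1]
    return [(feature_names[i], float(importance[i])) for i in order]

# Hypothetical SHAP matrix (samples x features) for three features.
shap_matrix = np.array([[ 12.0, -4.0,  1.0],
                        [-10.0,  5.0, -0.5],
                        [  8.0, -3.0,  0.2]])
ranking = rank_by_mean_abs_shap(shap_matrix, ["age", "RA-midu", "FM"])
```

Note that the absolute value is taken before averaging, so a feature with large but sign-alternating effects (like fly ash in Figure 8) still ranks as important even though its signed contributions partly cancel.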

6.2. CatBoost Model Feature Importance Analysis

Figure 9 presents the SHAP summary plot for the CatBoost base learner, revealing distinct feature importance patterns that complement the insights obtained from the LightGBM analysis. While the overall feature ranking structure exhibits similarities with LightGBM, notable differences in SHAP value distributions and feature influence magnitudes provide valuable information regarding how different gradient boosting implementations capture relationships within the recycled aggregate concrete dataset.
Specimen age maintains its position as the dominant predictor in the CatBoost model, demonstrating SHAP values spanning approximately −10 to +10 MPa. The distribution pattern remains consistent with the LightGBM findings, where high age values represented by red points predominantly cluster in the positive SHAP value region. However, the CatBoost model exhibits a slightly more concentrated distribution compared to LightGBM, suggesting that CatBoost’s ordered boosting algorithm and symmetric tree structure capture the age-strength relationship with somewhat different granularity. Recycled aggregate properties maintain critical importance across different model architectures, with SHAP-based feature attribution providing precise illustration of feature interdependencies and quantifying their complex relationships to establish a hierarchy of importance, demonstrating exceptional predictive accuracy with R2 values exceeding 0.94 across multiple ensemble learning approaches [86,87].
Recycled aggregate density continues to demonstrate substantial influence in the CatBoost model, with SHAP values distributed between approximately −8 to +8 MPa. The color gradient pattern reveals a consistent positive correlation between high aggregate density and positive strength contributions, similar to the LightGBM findings. However, the CatBoost model shows a marginally tighter clustering of SHAP values in the mid-range, potentially reflecting the model’s categorical feature handling capabilities and its approach to splitting decisions through symmetric trees. This characteristic may enable CatBoost to more efficiently partition the feature space when dealing with the continuous density values of recycled aggregates.
The water-to-binder ratio maintains its strong negative influence in the CatBoost model, with high w/b values consistently associated with negative SHAP contributions to predicted strength. The distribution pattern closely mirrors the LightGBM results, confirming the robust identification of this inverse relationship across different modeling approaches. The consistency of this finding across both base learners validates the physical principle that excess water compromises concrete strength and demonstrates that both gradient boosting implementations successfully capture this fundamental relationship despite their algorithmic differences.
Fly ash content exhibits comparable variability in the CatBoost model as observed in LightGBM, with SHAP values spanning both positive and negative regions. The distribution suggests that CatBoost similarly captures the complex dual nature of fly ash contributions, where the pozzolanic benefits must be balanced against potential early-age strength dilution effects. Cement content shows predominantly positive influence with moderate magnitude, maintaining consistency with the LightGBM findings and confirming cement dosage as a reliable strength enhancement parameter across different model architectures.
Water content demonstrates clear negative correlation patterns in the CatBoost model, with high water dosages occupying predominantly negative SHAP value space. The distribution characteristics closely align with the w/b ratio findings, reflecting the interconnected nature of these mixture proportion parameters and their combined influence on concrete porosity and strength development. Ensemble boosting algorithms including CatBoost, XGBoost, and LightGBM demonstrated superior predictive accuracy, with models excelling in estimation across different concrete properties through effective feature interdependency analysis [88].
Fineness modulus, recycled aggregate water absorption, and maximum aggregate size exhibit similar importance rankings and distribution patterns in CatBoost compared to LightGBM, though with subtle differences in SHAP value spread and concentration. These similarities suggest that both models converge on comparable feature importance hierarchies despite their different tree-building strategies, with CatBoost’s ordered boosting and symmetric trees producing results that align well with LightGBM’s leaf-wise growth approach.
The lower-ranked features, including sand content, recycled aggregate content, silica fume, natural aggregate properties, recycled aggregate replacement ratio, ground granulated blast furnace slag, and cement type, maintain their limited influence in the CatBoost model. The consistency of these findings across both base learners enhances confidence that these parameters genuinely exert minimal direct impact on compressive strength within the studied dataset, though they may participate in higher-order interactions that the ensemble framework captures through the meta-learner integration.
The comparative analysis of LightGBM and CatBoost SHAP distributions reveals that while both models identify similar primary drivers of compressive strength, subtle differences in their feature importance quantification reflect their distinct algorithmic approaches. These complementary perspectives provide the foundation for the Stacking ensemble’s superior performance, as the Ridge regression meta-learner can leverage the unique insights from each base learner to achieve enhanced prediction accuracy and robustness.

6.3. Meta-Learner Base Model Contributions

Figure 10 quantifies the mean absolute SHAP values for base learner contributions in the Stacking ensemble meta-learner. The horizontal bar chart reveals that LightGBM generates a mean absolute SHAP value of 7.06 MPa, while CatBoost produces 6.28 MPa. The stacking ensemble models improved prediction metrics by reaching higher R2 values and lower RMSE compared to base learners, confirming the effectiveness of ensemble learning in enhancing prediction accuracy through synergistic combination of diverse model predictions [89]. The modest difference of 0.78 MPa indicates that both models contribute substantially and comparably to the final ensemble prediction.
The slightly higher SHAP value for LightGBM suggests marginally greater influence on the ensemble output. This differential may reflect LightGBM’s leaf-wise tree growth strategy, which enables deeper, more specialized trees that capture certain nonlinear relationships with higher fidelity. The model’s gradient-based one-side sampling and exclusive feature bundling techniques contribute to identifying patterns that provide more informative predictions for the Ridge regression meta-learner.
Conversely, CatBoost’s symmetric tree structure and ordered boosting algorithm offer complementary strengths in handling categorical variables and reducing prediction shift. The comparable magnitude of CatBoost’s mean absolute SHAP value confirms that this model provides meaningful, non-redundant information that enhances ensemble predictive capability. Ridge regression was selected as a meta-learner due to its superior performance in stacking models, as it can effectively capture important information without overfitting when the number of base learners is appropriate, and its regularization mechanism helps manage correlation between base learner predictions [76].
The balanced contribution structure enhances ensemble robustness and generalization capability. When multiple base learners contribute comparably to final predictions, the ensemble becomes less vulnerable to the failure modes of any single model. If one base learner encounters input data outside its optimal operating range, the other can compensate, maintaining overall prediction reliability. This redundancy mechanism enhances the ensemble’s ability to generalize to new concrete mixtures that may differ from the training distribution, while preserving the essential relationships captured by both models.
The learned attention structure of the meta-learner—quantified via SHAP values of 7.06 MPa for LightGBM and 6.28 MPa for CatBoost—is physically interpretable in the context of RASCC behavior. LightGBM’s marginally higher weight aligns with the fact that the dataset exhibits strong time-dependent strength evolution (curing age being the top feature in both base models, Figure 8 and Figure 9). LightGBM’s leaf-wise growth strategy is particularly adept at capturing the nonlinear, threshold-like nature of cement hydration kinetics—a well-established physicochemical process where strength gains accelerate rapidly in early curing stages and plateau at later stages [90,91]. This behavior is consistent with Powers’ hydration model and the S-curve strength development pattern widely reported in the concrete literature [92,93]. CatBoost’s slightly lower but comparable weight reflects its strength in handling categorical and bounded continuous variables such as cement type and recycled aggregate replacement ratio (RAreplace), parameters whose influence on strength is more discrete and less time dependent. Crucially, the near-parity of the two weights (7.06 vs. 6.28 MPa, a difference of ~11%) indicates that the meta-learner does not disproportionately favor one base model, suggesting that the ensemble has learned to exploit complementary physical mechanisms—hydration kinetics captured by LightGBM and aggregate–matrix interaction effects captured by CatBoost—rather than relying on a single modeling strategy. This balanced attribution is consistent with the known multi-mechanism nature of RASCC strength development, where both binder chemistry and aggregate physical properties play indispensable roles [94,95,96].
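The Ridge meta-learner step itself admits a compact closed-form sketch. The version below omits the intercept and uses simulated stand-ins for out-of-fold base-learner predictions (the study presumably uses a library Ridge implementation); it illustrates how L2 regularization blends the two highly correlated base-learner outputs without letting either weight explode:

```python
import numpy as np

def ridge_meta_learner(base_preds, y, lam=1.0):
    """Closed-form Ridge fit of a stacking meta-learner on (out-of-fold)
    base-learner predictions: w = (P^T P + lam * I)^-1 P^T y.
    The L2 penalty tempers the collinearity between base learners."""
    P = np.asarray(base_preds)                    # (n_samples, n_models)
    A = P.T @ P + lam * np.eye(P.shape[1])
    return np.linalg.solve(A, P.T @ y)

# Simulated out-of-fold predictions from two correlated base learners.
rng = np.random.default_rng(1)
y = rng.uniform(20, 80, size=100)
preds = np.column_stack([y + rng.normal(0, 2, 100),   # "LGBM"-like
                         y + rng.normal(0, 3, 100)])  # "CatBoost"-like
w = ridge_meta_learner(preds, y)
stacked = preds @ w
```

Because the two columns carry independent errors, the blended prediction has lower in-sample error than either base learner alone, which is the mechanism behind the stacked model's gains reported in Table 2.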

6.4. Integrated Ensemble Feature Attribution

Figure 11 presents the comprehensive SHAP beeswarm plot for the complete Stacking ensemble model, synthesizing feature importance analysis across the integrated framework. This visualization captures final feature attribution after the Ridge regression meta-learner has optimally combined predictions from both base learners. The consistency between the integrated ensemble SHAP rankings and the individual base learner rankings (both identifying age, RA-midu, and w/b as top contributors) further validates that the meta-learner’s attention weights genuinely reflect domain-informed feature prioritization rather than spurious statistical correlations.
Specimen age maintains its position as the most influential feature, with SHAP values spanning approximately −15 to +15 MPa. Using the SHAP algorithm, the impact of different input features on model output has been visualized, with negative SHAP values corresponding to a decreasing effect on model prediction, whereas positive SHAP values indicate an increasing effect [97]. The dense clustering of red points in the positive SHAP region and blue points in the negative region demonstrates a consistent, monotonic relationship between curing age and predicted strength. The broader SHAP value distribution compared to individual base learners reflects the ensemble’s enhanced sensitivity to age variations.
Recycled aggregate density emerges as the second most critical parameter, exhibiting SHAP values ranging from approximately −10 to +10 MPa. The clear separation between high-density values contributing positively and low-density values contributing negatively demonstrates that the ensemble has successfully learned the fundamental relationship between aggregate quality and concrete performance. The ensemble’s SHAP distribution for this feature shows greater definition compared to individual base learners.
The water-to-binder ratio continues to demonstrate strong negative influence, with high w/b values predominantly associated with negative SHAP contributions. The SHAP analysis demonstrates that the developed ensemble models capture physically meaningful relationships between mixture design parameters and compressive strength, where ensemble learning models trained on comprehensive datasets enhance predictions by 15–20% and reduce the root mean squared error [98]. The concentration and consistency of this pattern reflects the robustness of this relationship, as both base learners independently identify the inverse correlation and the meta-learner reinforces this finding through optimal weight assignment.
Fly ash content maintains its complex, bidirectional influence pattern, with SHAP values distributed across both positive and negative regions. The ensemble captures a more nuanced representation of fly ash effects, reflecting the meta-learner’s ability to identify context-dependent relationships where fly ash contributions vary based on interactions with other mixture parameters such as cement content, water-to-binder ratio, and curing age. This sophisticated pattern recognition capability exemplifies the advantage of ensemble learning for modeling complex material behaviors.
Cement content shows predominantly positive influence with moderate SHAP value magnitudes, while water content displays clear negative correlation patterns. Fineness modulus, recycled aggregate water absorption, maximum aggregate size, and sand content exhibit moderate influence, with SHAP value distributions centered near zero but showing occasional significant deviations. These patterns suggest that while these parameters exert secondary effects on average, they can become critically important under specific mixture proportion combinations.
The lower-ranked features maintain limited direct influence in the ensemble model. However, their inclusion remains justified, as they may participate in higher-order interactions that contribute to the ensemble’s overall predictive accuracy. The consistency of findings across different model configurations demonstrates that precise predictions are crucial for enhancing structural reliability and optimizing resource usage in construction projects, with machine learning algorithms successfully capturing the intricate interactions between input variables [86]. The interpretability provided by SHAP analysis enables practitioners to understand not only which features are important but also how their specific values influence predictions, thereby facilitating informed decision-making in mix design optimization for recycled aggregate self-compacting concrete applications.

7. Conclusions

This investigation successfully developed and validated advanced ensemble machine learning frameworks for predicting the compressive strength of recycled aggregate self-compacting concrete, addressing critical gaps in sustainable construction materials research. Through systematic analysis of 301 experimental observations encompassing diverse mixture compositions, curing ages, and recycled aggregate characteristics, this study establishes robust predictive capabilities while providing interpretable insights into the complex relationships governing concrete mechanical performance.
The comparative evaluation of four machine learning methodologies revealed that the Stacking ensemble approach, employing LightGBM and CatBoost as base learners with Ridge regression as the meta-learner, achieved the highest predictive accuracy, with a coefficient of determination of 0.963, a root mean squared error of 3.321 MPa, and a mean absolute error of 2.506 MPa on the testing dataset. This performance represents a substantial advancement over conventional empirical prediction methods, which typically achieve coefficient of determination values below 0.30 for recycled concrete systems. The Light Gradient Boosting Machine demonstrated an optimal balance between predictive accuracy and computational efficiency, achieving a coefficient of determination of 0.961 with an execution time of only 0.124 s, suggesting its suitability for real-time applications in construction practice where rapid mix design optimization is required.
Comprehensive residual diagnostic analysis validated the statistical integrity of the developed models, confirming that prediction errors satisfy fundamental assumptions including approximate normality, homoscedasticity across the entire strength range, temporal independence, and random distribution without systematic bias. These characteristics demonstrate that the ensemble learning approach successfully captures the underlying physical relationships governing concrete strength development without overfitting to spurious patterns in the training data. The consistent performance across training and testing datasets, with minimal variance between evaluation metrics, provides strong evidence of robust generalization capability to new mixture compositions not encountered during model development.
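The three diagnostic assumptions named above can each be scripted. The sketch below runs them on simulated residuals (standing in for y_test − y_pred): Shapiro-Wilk for normality, a Spearman correlation of |residual| against fitted values as a simple proxy for a Breusch-Pagan homoscedasticity test, and the Durbin-Watson statistic for independence.

```python
import numpy as np
from scipy import stats

# Simulated well-behaved prediction errors; with a fitted model, `resid`
# would be y_test - y_pred and `fitted` the model predictions.
rng = np.random.default_rng(7)
fitted = rng.uniform(5, 90, size=61)    # predicted strengths (MPa)
resid = rng.normal(0.0, 3.3, size=61)   # errors with spread ~ reported RMSE

# 1) Normality: Shapiro-Wilk should not reject for approximately normal errors.
_, p_norm = stats.shapiro(resid)

# 2) Homoscedasticity: |residual| should be uncorrelated with the fitted
#    value across the strength range (cheap stand-in for Breusch-Pagan).
rho, p_homo = stats.spearmanr(np.abs(resid), fitted)

# 3) Independence: Durbin-Watson statistic, close to 2 for uncorrelated errors.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

print(f"Shapiro p={p_norm:.3f}  |resid|~fitted rho={rho:.3f}  DW={dw:.2f}")
```

Large p-values on the first two checks and a Durbin-Watson statistic near 2 correspond to the pass conditions described in the text; systematic bias would additionally show up as a nonzero mean residual.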
The application of SHAP-based interpretability analysis quantified the relative contributions of individual mixture parameters to compressive strength predictions, revealing a clear hierarchy of feature importance that aligns with established concrete technology principles. Specimen age emerged as the most influential parameter, exhibiting SHAP values ranging from approximately −15 to +15 MPa, reflecting the fundamental role of hydration kinetics in strength development. Recycled aggregate density demonstrated the second-highest importance, with high-density aggregates consistently contributing positively to strength predictions through reduced porosity and enhanced interfacial bonding characteristics. The water-to-binder ratio exhibited a strong negative influence, corroborating the well-established inverse relationship between water content and concrete strength through its effects on capillary porosity and matrix density.
The SHAP analysis revealed complex, context-dependent contributions from supplementary cementitious materials, with fly ash displaying bidirectional influence patterns that reflect the competing effects of pozzolanic strength enhancement versus cement dilution and delayed early-age strength development. Recycled aggregate water absorption demonstrated notable negative influence when absorption values exceeded four percent, highlighting the critical importance of aggregate moisture management in mix design optimization. The consistency of feature importance rankings across different base learner architectures enhances confidence that these relationships represent genuine physical dependencies rather than model-specific artifacts.
The developed predictive framework enables several practical advances for sustainable construction applications. First, the models facilitate rapid optimization of recycled aggregate self-compacting concrete mixture proportions to achieve targeted strength requirements while maximizing recycled content, thereby supporting circular economy principles within the construction sector. Second, the interpretability provided by SHAP analysis allows practitioners to understand which mixture parameters require precise control and which parameters offer flexibility for cost optimization or material substitution. Third, the robust performance across strength ranges from 5.36 to 89 MPa demonstrates applicability to both conventional structural applications and high-performance concrete formulations, expanding the potential scope of recycled aggregate utilization.
Several limitations of this investigation warrant acknowledgment. The dataset, while encompassing substantial compositional diversity with 301 specimens spanning at least 10 experimental series, three cement strength grades, recycled aggregate replacement ratios from 0% to 100%, and multiple supplementary cementitious material combinations, derives from laboratory-scale specimens tested under standard curing conditions (approximately 20 ± 2 °C, ≥95% relative humidity). The current predictive framework addresses compressive strength as a function of mixture design parameters and curing age and does not incorporate durability-related phenomena such as carbonation depth progression, chloride ion ingress, or freeze–thaw degradation, which are governed by distinct environmental exposure variables beyond the scope of the present mixture-property modeling approach. Validation using field-scale concrete placements subjected to variable curing conditions, construction practices, and environmental exposures would enhance confidence in the models’ applicability to real-world construction scenarios. The temporal scope of strength predictions extends to 112 days, yet many infrastructure applications require service life predictions spanning decades. Future work should integrate long-term strength development models and dedicated durability datasets incorporating environmental exposure parameters to extend the predictive framework from mechanical strength estimation to comprehensive service life assessment, thereby bridging the gap between laboratory-calibrated models and field-level engineering applications.
This study’s dataset focuses on concrete compressive strength and does not systematically include fresh concrete workability indicators (such as slump or spread flow). However, the water-to-binder ratio (w/b), a core parameter influencing workability, is included among the 18 input variables. SHAP analysis (Section 6) reveals that w/b exerts a significant negative effect on strength (with SHAP values ranging approximately from −8 to +5 MPa), which aligns with existing literature regarding the dual impact of w/b on both the flowability and strength of self-compacting concrete. Future research should incorporate rheological parameters (such as plastic viscosity and yield stress) to construct a synergistic prediction framework for workability and strength.
Future research should explore several promising extensions of this work. The incorporation of additional input features characterizing recycled aggregate microstructure, such as residual mortar content, interfacial transition zone properties, and micro-crack distributions, may enable more refined predictions and deeper mechanistic understanding. The development of multi-objective optimization frameworks that simultaneously consider compressive strength, workability characteristics, environmental impact metrics, and economic costs would support holistic decision-making in sustainable concrete design. The application of physics-informed machine learning approaches that embed fundamental conservation laws and constitutive relationships within neural network architectures may improve extrapolation capabilities and reduce data requirements for model training.
The integration of uncertainty quantification methods, such as Bayesian neural networks or conformal prediction intervals, would provide practitioners with confidence bounds on strength predictions, enabling risk-informed decision-making in structural design. The extension of interpretability analysis to investigate higher-order feature interactions, beyond the individual feature attributions provided by SHAP, may reveal synergistic or antagonistic effects among mixture components that could inform novel mix design strategies. The development of transfer learning approaches that leverage knowledge gained from natural aggregate concrete datasets to improve predictions for recycled aggregate systems with limited experimental data represents another valuable research direction.
In conclusion, this investigation demonstrates that ensemble machine learning, combined with rigorous interpretability analysis, provides powerful tools for advancing sustainable construction materials research and practice. The developed models achieve a coefficient of determination above 0.96 on unseen data while maintaining transparency regarding the physical relationships underlying their predictions. By enabling confident utilization of recycled aggregate self-compacting concrete across diverse applications, this work supports the construction industry’s transition toward circular economy principles, contributing to reduced natural resource consumption, decreased construction waste generation, and diminished carbon dioxide emissions while maintaining the structural performance requirements essential for safe and durable infrastructure. The methodological framework established herein offers a template for addressing similar prediction challenges in other sustainable construction materials, accelerating the development and deployment of environmentally responsible building technologies.

Author Contributions

Conceptualization, Z.Z. and B.L.; methodology, Z.Z., B.L. and Y.S.; software, Z.Z., B.L. and Y.S.; validation, Z.Z., B.L. and Y.S.; formal analysis, Z.Z.; investigation, Z.Z.; resources, Z.Z., B.L. and Y.S.; data curation, Z.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, B.L. and Y.S.; visualization, Z.Z., B.L. and Y.S.; supervision, B.L. and Y.S.; project administration, B.L.; funding acquisition, B.L. and Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (Grant No. 52408449) and the Postgraduate Scientific Research Innovation Project of Hunan Province (Grant No. CX20240431).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SCC = Self-Compacting Concrete
RCA = Recycled Concrete Aggregate
FA = Fly Ash
w/b = Water-to-Binder Ratio
RMSE = Root Mean Squared Error
R2 = Coefficient of Determination
MAE = Mean Absolute Error
MSE = Mean Squared Error
StD (Std) = Standard Deviation
Skew = Skewness
MPa = Megapascal (unit of pressure/strength)
LightGBM = Light Gradient Boosting Machine
CatBoost = Categorical Boosting
LGBM = Light Gradient Boosting Machine (alternative abbreviation)
SHAP = SHapley Additive exPlanations
SVR = Support Vector Regression
1D-CNN = One-Dimensional Convolutional Neural Network
L2 = L2 Regularization (Ridge penalty)
NNLS = Non-Negative Least Squares
VotingNNLS = Voting ensemble with Non-Negative Least Squares optimization
RidgeCV = Ridge Regression with Cross-Validation
GOSS = Gradient-based One-Side Sampling (LightGBM algorithm)
EFB = Exclusive Feature Bundling (LightGBM algorithm)
CV = Cross-Validation
AI = Artificial Intelligence
NA = Natural (Coarse) Aggregate
RA = Recycled (Coarse) Aggregate
GGBFS = Ground Granulated Blast Furnace Slag
SF = Silica Fume
RASCC = Recycled Aggregate Self-Compacting Concrete
OPC = Ordinary Portland Cement
IQR = Interquartile Range
P05, P25, P75, P95 = 5th, 25th, 75th, 95th Percentiles

Appendix A

Data Set

Table A1. Data set from Yang’s research [36].
NO. | Age (d) | Cement type | Cement (kg/m3) | FA (kg/m3) | GGBFS (kg/m3) | SF (kg/m3) | w/b | Water (kg/m3) | NA (kg/m3) | RA (kg/m3) | NA-Midu (kg/m3) | RA-Midu (kg/m3) | NAabsorption (%) | RAabsorption (%) | RAreplace (%) | Sand (kg/m3) | Maxsize (mm) | Fineness | Compressive (MPa)
11442.55200000.351828670269825940.54.90785206.9539.33
22842.55200000.351828670269825940.54.90785206.9553.45
35642.55200000.351828670269825940.54.90785206.9554.65
411242.55200000.351828670269825940.54.90785206.9556.33
51442.55200000.35206.7433416.4269825940.54.950785206.9543.16
62842.55200000.35206.7433416.4269825940.54.950785206.9546.54
75642.55200000.35206.7433416.4269825940.54.950785206.9550.26
811242.55200000.35206.7433416.4269825940.54.950785206.9553.12
91442.5260260000.35206.7433416.4269825940.54.950785206.9517.08
102842.5260260000.35206.7433416.4269825940.54.950785206.9518.04
115642.5260260000.35206.7433416.4269825940.54.950785206.9526.37
1211242.5260260000.35206.7433416.4269825940.54.950785206.9538.06
131442.526013013000.35206.7433416.4269825940.54.950785206.9531.2
142842.526013013000.35206.7433416.4269825940.54.950785206.9533.51
155642.526013013000.35206.7433416.4269825940.54.950785206.9539.55
1611242.526013013000.35206.7433416.4269825940.54.950785206.9547.27
171442.5260104104520.35206.7433416.4269825940.54.950785206.9538.04
182842.5260104104520.35206.7433416.4269825940.54.950785206.9540.31
195642.5260104104520.35206.7433416.4269825940.54.950785206.9546.08
2011242.5260104104520.35206.7433416.4269825940.54.950785206.9554
211442.5130390000.35206.7433416.4269825940.54.950785206.955.36
222842.5130390000.35206.7433416.4269825940.54.950785206.957.17
235642.5130390000.35206.7433416.4269825940.54.950785206.9512.66
2411242.5130390000.35206.7433416.4269825940.54.950785206.9522.85
251442.513019519500.35206.7433416.4269825940.54.950785206.9518.27
262842.513019519500.35206.7433416.4269825940.54.950785206.9519.66
275642.513019519500.35206.7433416.4269825940.54.950785206.9524.63
2811242.513019519500.35206.7433416.4269825940.54.950785206.9531.44
291442.5130156156780.35206.7433416.4269825940.54.950785206.9529.93
302842.5130156156780.35206.7433416.4269825940.54.950785206.9535.54
315642.5130156156780.35206.7433416.4269825940.54.950785206.9542.43
3211242.5130156156780.35206.7433416.4269825940.54.950785206.9549.76
332842.55200000.35214.90832269825940.54.9100785206.9543.89
342842.5260260000.35214.90832269825940.54.9100785206.9521
352842.526013013000.35214.90832269825940.54.9100785206.9538.38
362842.5260104104520.35214.90832269825940.54.9100785206.9549.44
372842.5130390000.35214.90832269825940.54.9100785206.9513.64
382842.513019519500.35214.90832269825940.54.9100785206.9530.84
392842.5130156156780.35214.90832269825940.54.9100785206.9542.75
402842.54550000.4214.90832269825940.54.9100785206.9529.81
412842.5113.75341.2000.4214.90832269825940.54.9100785206.958.31
422842.5113.75170.6170.6300.4214.90832269825940.54.9100785206.9518.35
432842.5113.75136.5136.568.30.4214.90832269825940.54.9100785206.9526.23
442842.54040000.45214.90832269825940.54.9100785206.9519.75
452842.5101303000.45214.90832269825940.54.9100785206.955.43
462842.5101151.5151.500.45214.90832269825940.54.9100785206.9512.07
472842.5101121.2121.260.60.45214.90832269825940.54.9100785206.9518.86
482842.54970000.391948070272626850.540853207.1342
492842.54970000.39207404392272626850.5450853207.1340.31
502842.54970000.392210783272626850.54100853207.1337.89
512842.5248248000.39207404392272626850.5450853207.1326.98
522842.5248248000.392210783272626850.54100853207.1325.03
532842.524812412400.39207404392272626850.5450853207.1336.4
542842.524812412400.392210783272626850.54100853207.1333.24
552842.512418618600.39207404392272626850.5450853207.1334.76
562842.512418618600.392210783272626850.54100853207.1331.39
572842.5124159159530.39207404392272626850.5450853207.1338.95
582842.5124159159530.392210783272626850.54100853207.1336.72
592842.54970000.391948070272626850.540853207.1333.09
602842.54970000.39207404392272626850.5450853207.1330.11
612842.54970000.392210783272626850.54100853207.1328.43
622842.5248248000.39207404392272626850.5450853207.1323.5
632842.5248248000.392210783272626850.54100853207.1322.81
642842.524812412400.39207404392272626850.5450853207.1329.14
652842.524812412400.392210783272626850.54100853207.1325.29
662842.512418618600.39207404392272626850.5450853207.1324.46
672842.512418618600.392210783272626850.54100853207.1321.03
682842.5124159159530.39207404392272626850.5450853207.1331.98
692842.5124159159530.392210783272626850.54100853207.1329.13
702852.52500000.4511211500261024101.164.15065012.56.6849.09
712852.52500000.45112920250261024101.164.152065012.56.6849.98
722852.52500000.45112540540261024101.164.155067012.56.6855.64
732852.52500000.4511201040261024101.164.1510072012.56.6856.75
742852.52900000.3811211500261024101.164.15065012.56.6858.3
752852.52900000.38112920250261024101.164.152065012.56.6860.25
762852.52900000.38112540540261024101.164.155067012.56.6858.52
772852.52900000.3811201040261024101.164.1510072012.56.6870.56
782852.53200000.3511211500261024101.164.15065012.56.6863.36
792852.53200000.35112920250261024101.164.152065012.56.6864.13
802852.53200000.35112540540261024101.164.155067012.56.6866.82
812852.53200000.3511201040261024101.164.1510072012.56.6872.81
82542.53500000.511796230250023001.47.70975167.2329.9
83542.53500000.49172467195250023001.47.713896167.2328.7
84542.53500000.49172374313250023001.47.721849167.2337.1
85542.53500000.511790589250023001.47.740975167.2336.6
862842.53500000.511796230250023001.47.70975167.2343.8
872842.53500000.49172467195250023001.47.713896167.2345.4
882842.53500000.49172374313250023001.47.721849167.2350.3
892842.53500000.511790589250023001.47.740975167.2351.1
90742.5284.90000.559159.3807.50270524970.1454.0750730.719.15.7243.8
91742.5284.90000.559159.3726.874.9270524970.1454.07510730.719.15.7243.3
92742.5284.90000.56159.5646149.8270524970.1454.07520730.719.15.7243.1
93742.5284.90000.562160.1565.2224.6270524970.1454.07530730.719.15.7242.9
94742.5284.90000.562160.1484.5299.4270524970.1454.07540730.719.15.7242.5
952842.5284.90000.559159.3807.50270524970.1454.0750730.719.15.7254.2
962842.5284.90000.559159.3726.874.9270524970.1454.07510730.719.15.7253.9
972842.5284.90000.56159.5646149.8270524970.1454.07520730.719.15.7253.7
982842.5284.90000.562160.1565.2224.6270524970.1454.07530730.719.15.7253.3
992842.5284.90000.562160.1484.5299.4270524970.1454.07540730.719.15.7253
100252.53900000.51950700254023802.24.41001025127.6445
101252.53900000.51950700254022802.25.71001025127.6647.5
102252.53900000.51950700254022202.26.91001025127.1548.3
103752.53900000.51950700254023802.24.41001025127.6450.8
104752.53900000.51950700254022802.25.71001025127.6652.5
105752.53900000.51950700254022202.26.91001025127.1553.1
1062852.53900000.51950700254023802.24.41001025127.6459.3
1072852.53900000.51950700254022802.25.71001025127.6660.1
1082852.53900000.51950700254022202.26.91001025127.1561.7
109742.5300169000.41918750267025500.54.73076120 22.6
110742.5300169000.4191656206267025500.54.732576120 21.5
111742.5300169000.4191437413267025500.54.735076120 20.4
112742.5300169000.4191219619267025500.54.737576120 17.9
113742.5300158000.41940825267025500.54.7310076120 16.4
1142842.5300169000.41918750267025500.54.73076120 36.2
1152842.5300169000.4191656206267025500.54.732576120 35.4
1162842.5300169000.4191437413267025500.54.735076120 34.7
1172842.5300169000.4191219619267025500.54.737576120 32.8
1182842.5300158000.41940825267025500.54.7310076120 30.3
119352.5440146.67000.36211.27500270024000.726.1084612.56.118.4
120352.5440146.6000.36211.2600133.6270024000.726.12084612.56.119.5
121352.5440146.6000.36211.27500270024000.726.10820.6212.56.123.3
122352.5440146.6000.36211.2600133.6270024000.726.120820.6212.56.124
123752.5440146.6000.36211.27500270024000.726.1084612.56.131.3
124752.5440146.6000.36211.2600133.6270024000.726.12084612.56.129
125752.5440146.6000.36211.27500270024000.726.10820.6212.56.130
126752.5440146.6000.36211.2600133.6270024000.726.120820.6212.56.131
1272852.5440146.6000.36211.27500270024000.726.1084612.56.143.4
1282852.5440146.6000.36211.2600133.6270024000.726.12084612.56.148
1292852.5440146.6000.36211.27500270024000.726.10820.6212.56.145.6
1302852.5440146.6000.36211.2600133.6270024000.726.120820.6212.56.146.9
131742.5270247000.361877870266526001.21.80698 29
132742.5270247000.36187591193266526001.21.825690 27
133742.5270247000.36187394385266526001.21.850682 26.5
134742.5270247000.361870770266526001.21.8100698 25
135742.5270247000.361877870266526001.21.80667 21
1362842.5270247000.361877870266526001.21.80698 43.5
1372842.5270247000.36187591193266526001.21.825690 39
1382842.5270247000.36187394385266526001.21.850682 37.2
1392842.5270247000.361870770266526001.21.8100698 34
1402842.5270247000.361877870266526001.21.80667 29
1419142.5270247000.361877870266526001.21.80698 49
1429142.5270247000.36187591193266526001.21.825690 47
1439142.5270247000.36187394385266526001.21.850682 44
1449142.5270247000.361870770266526001.21.8100698 44
1459142.5270247000.361877870266526001.21.80667 39
146742.5437148000.321887870266524901.22.20695 66
147742.5437148000.32188591184266524901.22.225677 61.7
148742.5437148000.32188394369266524901.22.250659 59
149742.5437148000.321880737266524901.22.2100695 59
150742.5437148000.321887870266524901.22.20624 56.8
1512842.5437148000.321887870266524901.22.20695 79
1522842.5437148000.32188591184266524901.22.225677 77.5
1532842.5437148000.32188394369266524901.22.250659 76.2
1542842.5437148000.321880737266524901.22.2100695 74
1552842.5437148000.321887870266524901.22.20624 69
1569142.5437148000.321887870266524901.22.20695 89
1579142.5437148000.32188591184266524901.22.225677 87.9
1589142.5437148000.32188394369266524901.22.250659 87
1599142.5437148000.321880737266524901.22.2100695 85
1609142.5437148000.321887870266524901.22.20624 78.5
161731.25340200000.331800895 2530 3.8910069520 32.9
162731.25340200000.331800895 2530 3.8910067420 34
163731.25340200000.331800895 2530 3.8910065320 31.1
164731.25340200000.331800895 2530 3.8910063220 29.7
165731.25340200000.331800895 2530 3.8910061020 29.2
1662831.25340200000.331800895 2530 3.8910069520 44.3
1672831.25340200000.331800895 2530 3.8910067420 44.5
1682831.25340200000.331800895 2530 3.8910065320 43.4
1692831.25340200000.331800895 2530 3.8910063220 41.3
1702831.25340200000.331800895 2530 3.8910061020 38.7
1719031.25340200000.331800895 2530 3.8910069520 56.5
1729031.25340200000.331800895 2530 3.8910067420 54.7
1739031.25340200000.331800895 2530 3.8910065320 55.7
1749031.25340200000.331800895 2530 3.8910063220 50.8
1759031.25340200000.331800895 2530 3.8910061020 50.1
176731.25340270000.31800850 2530 3.8910066220 36.8
177731.25340270000.31800850 2530 3.8910064220 43.9
178731.25340270000.31800850 2530 3.8910062220 42.1
179731.25340270000.31800850 2530 3.8910060220 40.9
180731.25340270000.31800850 2530 3.8910058120 38.3
1812831.25340270000.31800850 2530 3.8910066220 53.7
1822831.25340270000.31800850 2530 3.8910064220 64.3
1832831.25340270000.31800850 2530 3.8910062220 62.3
1842831.25340270000.31800850 2530 3.8910060220 56.3
1852831.25340270000.31800850 2530 3.8910058120 53.2
1869031.25340270000.31800850 2530 3.8910066220 78.9
1879031.25340270000.31800850 2530 3.8910064220 82.6
1889031.25340270000.31800850 2530 3.8910062220 81.4
1899031.25340270000.31800850 2530 3.8910060220 75.3
1909031.25340270000.31800850 2530 3.8910058120 71.7
191731.25340270000.31800850 2530 3.8910058120 38.3
192731.25340270000.271650850 2530 3.8910061620 44
193731.25340270000.241450850 2530 3.8910066220 43.8
1942831.25340270000.31800850 2530 3.8910058120 53.2
1952831.25340270000.271650850 2530 3.8910061620 59.1
1962831.25340270000.241450850 2530 3.8910066220 64.2
1979031.25340270000.31800850 2530 3.8910058120 71.7
1989031.25340270000.271650850 2530 3.8910061620 77
1999031.25340270000.241450850 2530 3.8910066220 81.8
2002831.254451550300.35220660026502450 081510 59.4
2012831.254451550300.3522049515226502450 2581510 63.7
2022831.254451550300.3522033030526502450 5081510 65.3
2032831.254451550300.3522016545826502450 7581510 60
2042831.254451550300.35220061026502450 10081510 53.8
2055642.5427.50142.500.3171859.80270023700.57.390765165.677.96
2065642.5370.50142.5570.3171851.40270023700.57.390757.5165.681.4
2075642.5360012000.43206.4869.30270023700.57.390773.4165.666.63
2085642.53120120480.43206.4862.30270023700.57.390767.2165.672.47
2095642.5427.50142.500.31710749.2270023700.57.39100765165.668.67
2105642.5370.50142.5570.31710741.9270023700.57.39100757.5165.670.39
2115642.5360012000.43206.40757.5270023700.57.39100773.4165.655.38
2125642.53120120480.43206.40751.3270023700.57.39100767.2165.663.89
2135642.5427.50142.500.3171859.80270023700.57.390667165.661.97
2145642.5370.50142.5570.3171851.40270023700.57.390660.5165.664.61
2155642.5360012000.43206.4869.30270023700.57.390674.4165.648.69
2165642.53120120480.43206.4862.30270023700.57.390668.9165.661.04
2175642.5427.50142.500.31710749.2270023700.57.39100667165.655.76
2185642.5370.50142.5570.31710741.9270023700.57.39100660.5165.657.41
2195642.5360012000.43206.40757.5270023700.57.39100674.4165.646.04
2205642.53120120480.43206.40751.3270023700.57.39100668.9165.652.92
2212844.6315135000.431947450264024800.214.80738166.4352.3
2222844.6315135000.482167200264024800.214.80713166.4342.2
2232844.6315135000.532396940264024800.214.80688166.4337.2
2242844.6315135000.43194596149264024800.214.820738166.4354.7
2252844.6315135000.48216576144264024800.214.820713166.4344
2262844.6315135000.53239555139264024800.214.820688166.4337.7
2272844.6315135000.43194447298264024800.214.840738166.4357.2
2282844.6315135000.48216432288264024800.214.840713166.4344.3
2292844.6315135000.53239416278264024800.214.840688166.4338.2
2302844.6315135000.43194298447264024800.214.860738166.4351.1
2312844.6315135000.48216288432264024800.214.860713166.4340.9
2322844.6315135000.53239278416264024800.214.860688166.4335.8
2332842.5440110000.382090705 2440 7.66100748166.8853
2342842.5440110000.382090708.7 2455 6.84100748166.8855.6
2352842.5440110000.382090680.9 2350 1.77100750166.8858.5
2362842.5440110000.382090705 100748166.8860
2372842.5440110000.382090635.5 2205 7.35100748166.8850
2389042.5440110000.382090705 2440 7.66100748166.8860.5
2399042.5440110000.382090708.7 2455 6.84100748166.8861.5
2409042.5440110000.382090680.9 2350 1.77100750166.8866
2419042.5440110000.382090705 100748166.8867.2
2429042.5440110000.382090635.5 2205 7.35100748166.8855
243742.5350100000.31140986027202440 0720225.9450
244742.5350100000.34153966027202440 0706225.9445.8
245742.5350100000.37167947027202440 0692225.9442.5
246742.5350100000.4180927027202440 0678225.9439.8
247742.5350100000.311400100227202440 100720225.9449
248742.5350100000.34153098227202440 100706225.9443.8
249742.5350100000.37167096327202440 100692225.9440
250742.5350100000.4180094327202440 100678225.9441
2512842.5350100000.31140986027202440 0720225.9457
2522842.5350100000.34153966027202440 0706225.9456.6
2532842.5350100000.37167947027202440 0692225.9456
2542842.5350100000.4180927027202440 0678225.9455.8
2552842.5350100000.311400100227202440 100720225.9455
2562842.5350100000.34153098227202440 100706225.9453.6
2572842.5350100000.37167096327202440 100692225.9453.1
2582842.5350100000.4180094327202440 100678225.9451.8
259752.5410260000.271788600264025241.23.7067620 54.5
260752.5410260000.27178645215264025241.23.72567620 52
261752.5410260000.27178430430264025241.23.75067620 50
262752.5410260000.271780860264025241.23.710067620 45
263752.5410260000.27178645215264025001.24.82567620 48
264752.5410260000.27178430430264025001.24.85067620 40.5
265752.5410260000.271780860264025001.24.810067620 36.5
2661452.5410260000.271788600264025241.23.7067620 59.5
2671452.5410260000.27178645215264025241.23.72567620 57
2681452.5410260000.27178430430264025241.23.75067620 55.2
2691452.5410260000.271780860264025241.23.710067620 49
2701452.5410260000.27178645215264025001.24.82567620 53.2
2711452.5410260000.27178430430264025001.24.85067620 50
2721452.5410260000.271780860264025001.24.810067620 44.6
2732852.5410260000.271788600264025241.23.7067620 69.5
2742852.5410260000.27178645215264025241.23.72567620 65.9
2752852.5410260000.27178430430264025241.23.75067620 62
2762852.5410260000.271780860264025241.23.710067620 55.2
2772852.5410260000.27178645215264025001.24.82567620 60.2
2782852.5410260000.27178430430264025001.24.85067620 57.4
2792852.5410260000.271780860264025001.24.810067620 52.9
2805652.5410260000.271788600264025241.23.7067620 73
2815652.5410260000.27178645215264025241.23.72567620 71
2825652.5410260000.27178430430264025241.23.75067620 68.7
2835652.5410260000.271780860264025241.23.710067620 63.2
2845652.5410260000.27178645215264025001.24.82567620 66.5
2855652.5410260000.27178430430264025001.24.85067620 64.4
2865652.5410260000.271780860264025001.24.810067620 59.2
287742.5257.71259.3000.33169.95848026502515 6.850706.86206.7936.9
288742.5257.71259.3000.33169.95636199.9426502515 6.8525706.86206.7929.8
289742.5257.71259.3000.33169.95424399.8926502515 6.8550706.86206.7929.3
290742.5257.71259.3000.33169.95212599.8326502515 6.8575706.86206.7929
291742.5257.71259.3000.33169.950799.7726502515 6.85100706.86206.7924.9
2922842.5257.71259.3000.33169.95848026502515 6.850706.86206.7949.7
2932842.5257.71259.3000.33169.95636199.9426502515 6.8525706.86206.7938.3
2942842.5257.71259.3000.33169.95424399.8926502515 6.8550706.86206.7942.2
2952842.5257.71259.3000.33169.95212599.8326502515 6.8575706.86206.7939.9
2962842.5257.71259.3000.33169.950799.7726502515 6.85100706.86206.7936.5
2976042.5257.71259.3000.33169.95848026502515 6.850706.86206.7956.8
2986042.5257.71259.3000.33169.95636199.9426502515 6.8525706.86206.7945
2996042.5257.71259.3000.33169.95424399.8926502515 6.8550706.86206.7948.2
3006042.5257.71259.3000.33169.95212599.8326502515 6.8575706.86206.7937.4
3016042.5257.71259.3000.33169.950799.7726502515 6.85100706.86206.7933.5
Note: FA = Fly Ash; GGBFS = Ground Granulated Blast Furnace Slag; SF = Silica Fume; w/b = Water-to-Binder Ratio; NA = Natural Coarse Aggregate; RA = Recycled Coarse Aggregate; NA-midu = Apparent Density of Natural Coarse Aggregate; RA-midu = Apparent Density of Recycled Coarse Aggregate; NAabsorption = Water Absorption of Natural Coarse Aggregate; RAabsorption = Water Absorption of Recycled Coarse Aggregate; RAreplace = Recycled Aggregate Replacement Ratio; Maxsize = Maximum Aggregate Size; fineness = Fineness Modulus of Fine Aggregate; cement type = OPC Strength Grade (MPa).

References

  1. World Business Council for Sustainable Development. Cement Technology Roadmap 2009: Carbon Emission Reductions up to 2050; IEA: Paris, France, 2009.
  2. Luo, B.; Su, Y.; Ding, X.; Chen, Y.; Liu, C. Modulation of initial CaO/Al2O3 and SiO2/Al2O3 ratios on the properties of slag/fly ash-based geopolymer stabilized clay: Synergistic effects and stabilization mechanism. Mater. Today Commun. 2025, 47, 113295. [Google Scholar] [CrossRef]
  3. Luo, B.; Su, Y.; Hu, X.; Chen, Z.; Chen, Y.; Ding, X. Strength behavior and microscopic mechanisms of geopolymer-stabilized waste clays considering clay mineralogy. J. Clean. Prod. 2025, 530, 146877. [Google Scholar] [CrossRef]
  4. Su, Y.; Luo, B.; Luo, Z.; Xu, F.; Huang, H.; Long, Z.; Shen, C. Mechanical characteristics and solidification mechanism of slag/fly ash-based geopolymer and cement solidified organic clay: A comparative study. J. Build. Eng. 2023, 71, 106459. [Google Scholar] [CrossRef]
  5. Shi, Y.; Wang, Y.; Wang, L.-N.; Wang, W.-N.; Yang, T.-Y. Bridge Tower Warning Method Based on Improved Multi-Rate Fusion Under Strong Wind Action. Buildings 2025, 15, 2733. [Google Scholar] [CrossRef]
  6. Shi, Y.; Wang, Y.; Wang, L.-N.; Wang, W.-N.; Yang, T.-Y. Bridge Cable Performance Warning Method Based on Temperature and Displacement Monitoring Data. Buildings 2025, 15, 2342. [Google Scholar] [CrossRef]
  7. Liu, Z.; Qi, X.; Ke, J.; Shui, Z. Enhancing the toughness of ultra-high performance concrete through improved fiber-matrix interface bonding. Constr. Build. Mater. 2025, 491, 142616. [Google Scholar] [CrossRef]
  8. Andrew, R.M. Global CO2 Emissions from Cement production, 1928–2018. Earth Syst. Sci. Data 2019, 11, 1675–1710. [Google Scholar] [CrossRef]
  9. Miller, S.A.; John, V.M.; Pacca, S.A.; Horvath, A. Carbon dioxide reduction potential in the global cement industry by 2050. Cem. Concr. Res. 2018, 114, 115–124. [Google Scholar] [CrossRef]
  10. Wilson, D.C. Global Waste Management Outlook; United Nations Environment Programme: Nairobi, Kenya, 2015.
  11. Zheng, L.; Wu, H.; Zhang, H.; Duan, H.; Wang, J.; Jiang, W.; Dong, B.; Liu, G.; Zuo, J.; Song, Q. Characterizing the generation and flows of construction and demolition waste in China. Constr. Build. Mater. 2017, 136, 405–413. [Google Scholar] [CrossRef]
  12. Zhang, N.; Duan, H.; Sun, P.; Li, J.; Zuo, J.; Mao, R.; Liu, G.; Niu, Y. Characterizing the generation and environmental impacts of subway-related excavated soil and rock in China. J. Clean. Prod. 2020, 248, 119242. [Google Scholar] [CrossRef]
  13. Wu, H.; Zuo, J.; Yuan, H.; Zillante, G.; Wang, J. Cross-regional mobility of construction and demolition waste in Australia: An exploratory study. Resour. Conserv. Recycl. 2020, 156, 104710. [Google Scholar] [CrossRef]
  14. Su, T.; Yu, X.; Jin, H.; Chen, L.; Tan, Z.; Ngo, T. Macro-mechanical properties and freeze thaw evaluation of innovative nano-silica modified concrete reinforced by recycled carpet fibers. Constr. Build. Mater. 2025, 492, 142894. [Google Scholar] [CrossRef]
  15. Akhtar, A.; Sarmah, A.K. Construction and demolition waste generation and properties of recycled aggregate concrete: A global perspective. J. Clean. Prod. 2018, 186, 262–281. [Google Scholar] [CrossRef]
  16. Kim, J. Influence of quality of recycled aggregates on the mechanical properties of recycled aggregate concretes: An overview. Constr. Build. Mater. 2022, 328, 127071. [Google Scholar] [CrossRef]
  17. Geng, Y.; Ji, Y.; Wang, D.; Yuan, Y.; Zhang, H. Analyzing the impact of fiber-reinforced recycled ceramic waste concrete on carbon emissions and mechanical performance: A systematic review. Low-Carbon Mater. Green Constr. 2025, 3, 26. [Google Scholar] [CrossRef]
  18. Safhi, A.e.M.; Benzerzour, M.; Rivard, P.; Abriak, N.-E.; Ennahal, I. Development of self-compacting mortars based on treated marine sediments. J. Build. Eng. 2018, 22, 252–261. [Google Scholar] [CrossRef]
  19. Kebaïli, O.; Mouret, M.; Arabi, N.; Cassagnabere, F. Adverse effect of the mass substitution of natural aggregates by air-dried recycled concrete aggregates on the self-compacting ability of concrete: Evidence and analysis through an example. J. Clean. Prod. 2015, 87, 752–761. [Google Scholar] [CrossRef]
  20. Poon, C.S.; Shui, Z.H.; Lam, L.; Fok, H.; Kou, S.C. Influence of moisture states of natural and recycled aggregates on the slump and compressive strength of concrete. Cem. Concr. Res. 2004, 34, 31–36. [Google Scholar] [CrossRef]
  21. Silva, R.V.; de Brito, J.; Dhir, R.K. The influence of the use of recycled aggregates on the compressive strength of concrete: A review. Eur. J. Environ. Civ. Eng. 2014, 19, 825–849. [Google Scholar] [CrossRef]
  22. Li, T.; Nogueira, R.; Costa Pereira, M.F.; de Brito, J.; Liu, J. Effect of the incorporation ratio of recycled concrete aggregate on the properties of self-compacting mortar. Cem. Concr. Compos. 2024, 147, 105429. [Google Scholar] [CrossRef]
  23. Evangelista, L.; de Brito, J. Mechanical behaviour of concrete made with fine recycled concrete aggregates. Cem. Concr. Compos. 2007, 29, 397–401. [Google Scholar] [CrossRef]
  24. Makul, N.; Rattanadecho, P.; Agrawal, D.K. Applications of microwave energy in cement and concrete—A review. Renew. Sustain. Energy Rev. 2014, 37, 715–733. [Google Scholar] [CrossRef]
  25. Salehi, H.; Burgueño, R. Emerging artificial intelligence methods in structural engineering. Eng. Struct. 2018, 171, 170–189. [Google Scholar] [CrossRef]
  26. Avci, O.; Abdeljaber, O.; Kiranyaz, S.; Hussein, M.; Gabbouj, M.; Inman, D.J. A review of vibration-based damage detection in civil structures: From traditional methods to Machine Learning and Deep Learning applications. Mech. Syst. Signal Process. 2021, 147, 107077. [Google Scholar] [CrossRef]
  27. Guo, Z.; Li, J.; Wang, T.; Xie, J.; Yang, J.; Niu, B. Dynamically Constrained Digital Twin-Based Mechanical Diagnosis Framework Under Undetermined States Without Fault Data. IEEE Trans. Instrum. Meas. 2025, 74, 3547715. [Google Scholar] [CrossRef]
  28. Zhao, Y.; Ta, Y.; Bi, R.; Tang, B.; Lu, Z.; Yan, Y.; Xie, J.; Guo, Z. A cross-working-condition prediction method for bearing remaining useful life based on SPW-SVDD health indicators and temporal-self -attention mechanism. Adv. Eng. Inform. 2026, 71, 104313. [Google Scholar] [CrossRef]
  29. Naderpour, H.; Rafiean, A.H.; Fakharian, P. Compressive strength prediction of environmentally friendly concrete using artificial neural networks. J. Build. Eng. 2018, 16, 213–219. [Google Scholar] [CrossRef]
  30. Kumar, M.; Biswas, R.; Kumar, D.R.; Samui, P.; Kaloop, M.R.; Eldessouki, M. Soft computing-based prediction models for compressive strength of concrete. Case Stud. Constr. Mater. 2023, 19, e02321. [Google Scholar] [CrossRef]
  31. Geng, Y.; Ji, Y.; Wang, D.; Zhang, H.; Lu, Z.; Xing, A.; Gao, M.; Chen, M. Strength prediction of recycled concrete using hybrid artificial intelligence models with Gaussian noise addition. Eng. Appl. Artif. Intell. 2025, 149, 110566. [Google Scholar] [CrossRef]
  32. Güneyisi, E.; Gesoglu, M.; Algın, Z.; Yazıc, H. Rheological and fresh properties of self-compacting concretes containing coarse and fine recycled concrete aggregates. Constr. Build. Mater. 2016, 113, 622–630. [Google Scholar] [CrossRef]
  33. Jiao, D.; Shi, C.; Yuan, Q.; An, X.; Liu, Y.; Li, H. Effect of constituents on rheological properties of fresh concrete—A review. Cem. Concr. Compos. 2017, 83, 146–159. [Google Scholar] [CrossRef]
  34. Fonseca, N.; de Brito, J.; Evangelista, L. The influence of curing conditions on the mechanical performance of concrete made with recycled concrete waste. Cem. Concr. Compos. 2011, 33, 637–643. [Google Scholar] [CrossRef]
  35. Kazemi, F.; Çiftçioğlu, A.Ӧ; Shafighfard, T.; Asgarkhani, N.; Jankowski, R. RAGN-R: A multi-subject ensemble machine-learning method for estimating mechanical properties of advanced structural materials. Comput. Struct. 2025, 308, 107657. [Google Scholar] [CrossRef]
  36. Yang, S.; Sun, J.; Xu, Z. Prediction on compressive strength of recycled aggregate self-compacting concrete by machine learning method. J. Build. Eng. 2024, 88, 109055. [Google Scholar] [CrossRef]
  37. Yi, S.-T.; Yang, E.-I.; Choi, J.-C. Effect of specimen sizes, specimen shapes, and placement directions on compressive strength of concrete. Nucl. Eng. Des. 2006, 236, 115–127. [Google Scholar] [CrossRef]
  38. Neville, A.M. Properties of Concrete; Harlow Pearson Education: Harlow, UK, 2011. [Google Scholar]
  39. Comite Euro-International Du Beton. CEB-FIP MODEL CODE 1990; Emerald Publishing Limited: Leeds, UK, 1993. [Google Scholar] [CrossRef]
  40. EN 1992-1-1; Eurocode 2: Design of Concrete Structures—Part 1-1: General Rules and Rules for Buildings. British Standards Institution: London, UK, 2004.
  41. Pacheco, J.; de Brito, J.; Chastre, C.; Evangelista, L. Probabilistic Conversion of the Compressive Strength of Cubes to Cylinders of Natural and Recycled Aggregate Concrete Specimens. Materials 2019, 12, 280. [Google Scholar] [CrossRef]
  42. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 3149–3157. [Google Scholar]
  43. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018; Curran Associates Inc.: Red Hook, NY, USA, 2018; pp. 6639–6649. [Google Scholar]
  44. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv 2018, arXiv:1810.11363. [Google Scholar] [CrossRef]
  45. Wolpert, D.H. Stacked generalization. Neural Netw. 1992, 5, 241–259. [Google Scholar] [CrossRef]
  46. Breiman, L. Stacked regressions. Mach. Learn. 1996, 24, 49–64. [Google Scholar] [CrossRef]
  47. Van Der Laan, M.J.; Polley, E.C.; Hubbard, A.E. Super Learner. Stat. Appl. Genet. Mol. Biol. 2007, 6, 25. [Google Scholar] [CrossRef]
  48. Lawson, C.L.; Hanson, R.J. Solving Least Squares Problems; SIAM: Bangkok, Thailand, 1995. [Google Scholar]
  49. Hoerl, A.E.; Kennard, R.W. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
  50. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4768–4777. [Google Scholar]
  51. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.-I. From local explanations to global understanding with explainable AI for trees. Nat. Mach. Intell. 2020, 2, 56–67. [Google Scholar] [CrossRef]
  52. Chicco, D.; Warrens, M.J.; Jurman, G. The Coefficient of Determination R-squared Is More Informative than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. PeerJ Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef]
  53. Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82. [Google Scholar] [CrossRef]
  54. Yaseen, Z.M.; Deo, R.C.; Hilal, A.; Abd, A.M.; Bueno, L.C.; Salcedo-Sanz, S.; Nehdi, M.L. Predicting compressive strength of lightweight foamed concrete using extreme learning machine model. Adv. Eng. Softw. 2018, 115, 112–125. [Google Scholar] [CrossRef]
  55. Zhang, S. Nearest neighbor selection for iteratively kNN imputation. J. Syst. Softw. 2012, 85, 2541–2552. [Google Scholar] [CrossRef]
  56. Jain, A.K.; Dubes, R.C. Algorithms for Clustering Data; Prentice-Hall: Upper Saddle River, NJ, USA, 1988. [Google Scholar]
  57. Patro, S.G.K.; Sahu, K.K. Normalization: A Preprocessing Stage. arXiv 2015, arXiv:1503.06462. [Google Scholar] [CrossRef]
  58. Ali, A.; Shamsuddin, S.M.; Ralescu, A. Classification with class imbalance problem: A review. Int. J. Adv. Soft Compu. Appl. 2015, 7, 176–204. [Google Scholar]
  59. Yeh, I.-C. Modeling of strength of high-performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808. [Google Scholar] [CrossRef]
  60. NAsgarkhani Kazemi, F.; Jankowski, R.; Formisano, A. Dynamic ensemble-learning model for seismic risk assessment of masonry infilled steel structures incorporating soil-foundation-structure interaction. Reliab. Eng. Syst. Saf. 2025, 267, 111839. [Google Scholar] [CrossRef]
  61. Uddin, M.N.; Ye, J.; Deng, B.; Li, L.; Yu, K. Interpretable machine learning for predicting the strength of 3D printed fiber-reinforced concrete (3DP-FRC). J. Build. Eng. 2023, 72, 106648. [Google Scholar] [CrossRef]
  62. Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 94. [Google Scholar] [CrossRef]
  63. Li, Q.; Song, Z. Prediction of compressive strength of rice husk ash concrete based on stacking ensemble learning model. J. Clean. Prod. 2022, 382, 135279. [Google Scholar] [CrossRef]
  64. Ahmad, M.; Al-Shayea, N.A.; Tang, X.-W.; Jamal, A.; MAl-Ahmadi, H.; Ahmad, F. Predicting the Pillar Stability of Underground Mines with Random Trees and C4.5 Decision Trees. Appl. Sci. 2020, 10, 6486. [Google Scholar] [CrossRef]
  65. Bro, R.; De Jong, S. A fast non-negativity-constrained least squares algorithm. J. Chemom. 1997, 11, 393–401. [Google Scholar] [CrossRef]
  66. Jiang, Y.; Li, H.; Zhou, Y. Compressive Strength Prediction of Fly Ash Concrete Using Machine Learning Techniques. Buildings 2022, 12, 690. [Google Scholar] [CrossRef]
  67. Kaloop, M.R.; Kumar, D.; Samui, P.; Hu, J.W.; Kim, D. Compressive strength prediction of high-performance concrete using gradient tree boosting machine. Constr. Build. Mater. 2020, 264, 120198. [Google Scholar] [CrossRef]
  68. Lee, S.; Nguyen, N.; Karamanli, A.; Lee, J.; Vo, T.P. Super learner machine-learning algorithms for compressive strength prediction of high performance concrete. Struct. Concr. 2022, 24, 2208–2228. [Google Scholar] [CrossRef]
  69. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
  70. Yeh, I. Concrete Compressive Strength; UCI Machine Learning Repository: Irvine, CA, USA, 1998. [Google Scholar] [CrossRef]
  71. Verma, V. A Comprehensive Framework for Residual Analysis in Regression and Machine Learning. J. Inf. Syst. Eng. Manag. 2025, 10, 34–46. [Google Scholar] [CrossRef]
  72. Li, W.; Cook, D.; Tanaka, E.; VanderPlas, S. A Plot is Worth a Thousand Tests: Assessing Residual Diagnostics with the Lineup Protocol. J. Comput. Graph. Stat. 2024, 33, 1497–1511. [Google Scholar] [CrossRef]
  73. Farhadi, S.; Tatullo, S.; Ferrian, F. Comparative analysis of ensemble learning techniques for enhanced fatigue life prediction. Sci. Rep. 2025, 15, 11136. [Google Scholar] [CrossRef]
  74. Hosseinzadeh, M.; Mousavi, S.; Dehestani, M. An ensemble learning-based prediction model for the compressive strength degradation of concrete containing superabsorbent polymers (SAP). Sci. Rep. 2024, 14, 18535. [Google Scholar] [CrossRef]
  75. Jin, H.; Zhang, E.; Espinosa, H.D. Recent Advances and Applications of Machine Learning in Experimental Solid Mechanics: A Review. Appl. Mech. Rev. 2023, 75, 061001. [Google Scholar] [CrossRef]
  76. Wang, Q.; Lu, H. A novel stacking ensemble learner for predicting residual strength of corroded pipelines. npj Mater. Degrad. 2024, 8, 87. [Google Scholar] [CrossRef]
  77. Liu, Y.; Yang, Z.; Zou, X.; Ma, S.; Liu, D.; Avdeev, M.; Shi, S. Data quantity governance for machine learning in materials science. Natl. Sci. Rev. 2023, 10, nwad125. [Google Scholar] [CrossRef]
  78. Wang, W.; Zhao, Y.; Li, Y. Ensemble machine learning for predicting the homogenized elastic properties of unidirectional composites: A SHAP-based interpretability analysis. Acta Mech. Sin. 2024, 40, 423301. [Google Scholar] [CrossRef]
  79. Elhishi, S.; Elashry, A.M.; El-Metwally, S. Unboxing machine learning models for concrete strength prediction using XAI. Sci. Rep. 2023, 13, 19892. [Google Scholar] [CrossRef]
  80. Wang, W.; Zhong, Y.; Liao, G.; Ding, Q.; Zhang, T.; Li, X. Prediction of Compressive Strength of Concrete Specimens Based on Interpretable Machine Learning. Materials 2024, 17, 3661. [Google Scholar] [CrossRef]
  81. Lv, Q.; Zhang, J.; Zhang, L.; Zhao, H.; Ren, J. Machine learning-based optimization of concrete strength using interpretable models. Mater. Today Commun. 2025, 47, 112872. [Google Scholar] [CrossRef]
  82. Tipu, R.K.; Goyal, A.; Singh, D.; Kumar, A.K.A. Integrated deep learning and Bayesian optimization approach for enhanced prediction of high-performance concrete strength. Asian J. Civ. Eng. 2025, 26, 2371–2390. [Google Scholar] [CrossRef]
  83. Cakiroglu, C.; Shahjala, M.; Islam, K.; Mahmood, S.; Billah, M.; Nehdi, M.L. Explainable ensemble learning data-driven modeling of mechanical properties of fiber-reinforced rubberized recycled aggregate concrete. J. Build. Eng. 2023, 76, 107279. [Google Scholar] [CrossRef]
  84. Yousafzai, M.H.; Javed, M.F.; Rehan, M.; Jameel, M.; Alabduljabbar, H.; Ahmad, F. Application of Ensemble Learning, Deep Neural Networks, and Explainable AI in Predicting Compressive and Split Tensile Strength of Steel Fiber-Reinforced Recycled Aggregate Concrete. Case Stud. Constr. Mater. 2025, 23, e05392. [Google Scholar] [CrossRef]
  85. Manan, A.; Pu, Z.; Ahmad, J.; Umar, M. Multi-targeted strength properties of recycled aggregate concrete through a machine learning approach. Eng. Comput. 2024, 42, 388–430. [Google Scholar] [CrossRef]
  86. Abioye, S.O.; Babatunde, Y.O.; Abikoye, O.A.; Shaibu, A.N.; Bankole, B.J. Optimized machine learning algorithms with SHAP analysis for predicting compressive strength in high-performance concrete. AI Civ. Eng. 2025, 4, 16. [Google Scholar] [CrossRef]
  87. Gazi, M.U.; Hasan, M.T.; Debnath, P. Few-shot meta-learning for concrete strength prediction: A model-agnostic approach with SHAP analysis. AI Civ. Eng. 2025, 4, 20. [Google Scholar] [CrossRef]
  88. Alizamir, M.; Wang, M.; Ikram, R.M.A.; Gholampour, A.; Ahmed, K.O.; Heddam, S.; Kim, S. An Interpretable XGBoost-SHAP Machine Learning Model for Reliable Prediction of Mechanical Properties in Waste Foundry Sand-Based Eco-Friendly Concrete. Results Eng. 2025, 25, 104307. [Google Scholar] [CrossRef]
  89. Katlav, M.; Turk, K. Stacking ensemble models for data-driven intelligent modelling of compressive strength of sustainable recycled brick aggregate concrete (RBAC). Mater. Today Commun. 2025, 49, 113937. [Google Scholar] [CrossRef]
  90. Gholizadeh-Vayghan, A.; Hernandez, G.M.; Kingne, F.K.; Gu, J.; Dilissen, N.; El Kadi, M.; Tysmans, T.; Vleugels, J.; Rahier, H.; Snellings, R. Thermal Reactivation of Hydrated Cement Paste: Properties and Impact on Cement Hydration. Materials 2024, 17, 2659. [Google Scholar] [CrossRef] [PubMed]
  91. Fan, C.; Qian, J.; Sun, H.; Fan, Y. Development and Promotion of Concrete Strength at Initial 24 Hours. Materials 2023, 16, 4452. [Google Scholar] [CrossRef]
  92. Yang, Q.; Wang, X.; Peng, X.; Qin, F. General Curve Model for Evaluating Mechanical Properties of Concrete at Different Ages. Coatings 2023, 13, 2002. [Google Scholar] [CrossRef]
  93. Mariak, A.; Kurpińska, M.; Wilde, K. Maturity curve for estimating the in-place strength of high performance concrete. MATEC Web Conf. 2019, 262, 06007. [Google Scholar] [CrossRef]
  94. Almutairi, A. Explainable Machine Learning-Based Prediction of Compressive Strength in Sustainable Recycled Aggregate Self-Compacting Concrete Using SHAP Analysis. Sustainability 2025, 17, 11334. [Google Scholar] [CrossRef]
  95. Dong, S.; Zhang, Z. Hybrid Deep Learning with Conformal Prediction for Recycled Aggregate Self-Compacting Concrete Strength Prediction. Buildings 2025, 15, 4419. [Google Scholar] [CrossRef]
  96. Phoeuk, M.; Kwon, M. Accuracy Prediction of Compressive Strength of Concrete Incorporating Recycled Aggregate Using Ensemble Learning Algorithms: Multinational Dataset. Adv. Civ. Eng. 2023, 2023, e5076429. [Google Scholar] [CrossRef]
  97. Cakiroglu, C.; Bekdaş, G. Predictive Modeling of Recycled Aggregate Concrete Beam Shear Strength Using Explainable Ensemble Learning Methods. Sustainability 2023, 15, 4957. [Google Scholar] [CrossRef]
  98. Onyelowe, K.C.; Kamchoom, V.; Hanandeh, S.; Kumar, S.A.; Vizuete, R.F.Z.; Murillo, R.O.S.; Polo, S.M.Z.; Castillo, R.M.T.; Ebid, A.M.; Awoyera, P.; et al. Physics-informed modeling of splitting tensile strength of recycled aggregate concrete using advanced machine learning. Sci. Rep. 2025, 15, 7135. [Google Scholar] [CrossRef]
Figure 1. Heatmap of Pearson correlation coefficient matrix for input and output variables.
Figure 2. Sensitivity analysis of LightGBM hyperparameters (±20% perturbation from optimum). Error bars represent ±1 standard deviation across 5-fold cross-validation.
Figure 3. Seed stability analysis showing the distributions of R2, RMSE, and MAE across three independent random train-test splits. Solid orange lines mark median values and dashed dark lines mark means; box extents span the interquartile range (IQR) and whiskers span the data range. Shaded background colors distinguish the three performance metrics.
Figure 4. VotingNNLS Ensemble Flowchart.
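The Voting ensemble in Figure 4 combines base-learner predictions with non-negative weights fitted by least squares. The sketch below illustrates the idea with a simple projected-gradient solver standing in for the Lawson–Hanson NNLS algorithm used in practice; the target values and learner outputs are invented for illustration.

```python
def fit_nnls_weights(P, y, lr=0.01, iters=5000):
    """Fit non-negative weights w minimizing ||P w - y||^2.

    P: list of prediction columns (one list per base learner), y: targets.
    Projected gradient descent: after each step, negative weights are
    clipped to zero, enforcing the non-negativity constraint.
    """
    m, n = len(P), len(y)
    w = [1.0 / m] * m  # start from equal weights
    for _ in range(iters):
        # residual r_i = sum_k w_k * P[k][i] - y_i
        r = [sum(w[k] * P[k][i] for k in range(m)) - y[i] for i in range(n)]
        # gradient of 0.5*||r||^2 with respect to w_k
        g = [sum(P[k][i] * r[i] for i in range(n)) for k in range(m)]
        w = [max(0.0, w[k] - lr * g[k] / n) for k in range(m)]
    return w

# Toy example: learner A tracks the target, learner B is anti-correlated,
# so almost all of the weight should go to A.
y = [1.0, 2.0, 3.0, 4.0]
A = [1.1, 1.9, 3.0, 4.1]
B = [4.0, 3.0, 2.0, 1.0]
wA, wB = fit_nnls_weights([A, B], y)
```

Without the non-negativity constraint, ordinary least squares could assign B a negative weight; clipping keeps the ensemble an interpretable weighted average of its base learners.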
Figure 5. Stacked Generalization Flowchart.
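The stacked generalization of Figure 5 trains the Ridge meta-learner on out-of-fold (OOF) predictions, so every meta-feature comes from a base model that never saw the corresponding target. A minimal stdlib sketch of the fold bookkeeping (the fold count and function name are illustrative, not from the paper's code):

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous folds, yielding (train, test) lists."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

# Every sample appears in exactly one test fold, so each OOF meta-feature
# is produced by a base model fitted without that sample.
covered = sorted(i for _, test in kfold_indices(10, 3) for i in test)
```

In the full pipeline, each base learner is refitted k times, its test-fold predictions are concatenated into one OOF column per learner, and the meta-learner is fitted on those columns.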
Figure 6. Correlation analysis between predicted and actual compressive strength values across four machine learning models. Scatter plots demonstrate the predictive accuracy of (a) Light Gradient Boosting Machine (LGBM, R2 = 0.9841), (b) Categorical Boosting (CatBoost, R2 = 0.9529), (c) Stacking ensemble model (R2 = 0.9429), and (d) Voting Non-Negative Least Squares (VotingNNLS, R2 = 0.9257).
Figure 7. Residual diagnostic plots for the Stacking ensemble model. (a) Residuals versus independent variable, (b) histogram of residual distribution, (c) residuals versus fitted values, (d) residual sequence plot, and (e) normal quantile-quantile (Q-Q) plot. The blue scatter points represent the individual residual values for each data point. The solid black horizontal lines in (a,c,d) indicate the zero-residual baseline, while the red reference line in (e) represents the expected distribution for normally distributed residuals. The blue-shaded bars in (b) illustrate the frequency distribution of residual magnitudes. The five-panel analysis validates model assumptions including randomness, normality, homoscedasticity, independence, and distributional conformity.
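The independence check in panel (d) of Figure 7 can be quantified with the Durbin–Watson statistic, DW = Σₜ(eₜ − eₜ₋₁)² / Σₜ eₜ², which is near 2 for uncorrelated residuals, near 0 for positive first-order autocorrelation, and near 4 for negative autocorrelation. A minimal sketch (illustrative, not the paper's diagnostic code):

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for first-order residual autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating signs -> strong negative autocorrelation (DW well above 2):
dw_neg = durbin_watson([1, -1, 1, -1, 1, -1])
# Slowly varying, constant-sign residuals -> positive autocorrelation:
dw_pos = durbin_watson([1.0, 1.1, 0.9, 1.0, 1.1, 0.9])
```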
Figure 8. SHAP summary plot for the LightGBM model showing feature importance rankings and value distributions. Point color indicates the feature value (blue = low, red = high); positive/negative SHAP values increase/decrease the predicted compressive strength, respectively. The horizontal axis represents the SHAP value (MPa), indicating the magnitude of each feature’s contribution to the predicted compressive strength.
Figure 9. SHAP summary plot for the CatBoost model showing feature importance rankings and value distributions. Point color indicates the feature value (blue = low, red = high); positive/negative SHAP values increase/decrease the predicted compressive strength, respectively. The horizontal axis represents the SHAP value (MPa), indicating the magnitude of each feature’s contribution to the predicted compressive strength.
Figure 10. Mean absolute SHAP values (MPa) for LightGBM and CatBoost base learner contributions in the Stacking ensemble.
Figure 11. Comprehensive SHAP beeswarm plot for the Stacking ensemble model showing integrated feature importance. Point color indicates the feature value (blue = low, red = high); positive/negative SHAP values increase/decrease the predicted compressive strength, respectively. The horizontal axis represents the SHAP value (MPa), indicating the magnitude of each feature’s contribution to the predicted compressive strength.
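The SHAP values in Figures 8–11 are Shapley values of the prediction function: each feature receives its average marginal contribution over all feature orderings, and the contributions sum to the prediction minus the baseline (the efficiency property). A self-contained sketch computing exact Shapley values for a toy 3-feature model (the model and inputs are illustrative, not drawn from the paper's dataset; TreeSHAP computes the same quantity efficiently for tree ensembles):

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values by averaging marginal contributions over all
    feature orderings. Features not yet revealed keep baseline values."""
    n = len(x)
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]        # reveal feature j
            val = predict(current)
            phi[j] += val - prev     # marginal contribution of j
            prev = val
    return [p / len(perms) for p in phi]

# Toy "model" with an interaction between features 1 and 2.
def model(z):
    return 2.0 * z[0] + 1.0 * z[1] * z[2] + 0.5

x = [1.0, 2.0, 3.0]
base = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, base)
# Efficiency: sum(phi) equals model(x) - model(base); the interaction
# term's credit is split evenly between features 1 and 2.
```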
Table 1. Descriptive statistics of test database.
| Parameter | Missing % | Min | P05 | P25 | Median | P75 | P95 | Average | Max | Std | Skewness | Kurtosis | IQR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age (d) | 0 | 2 | 7 | 14 | 28 | 28 | 91 | 33.66777 | 112 | 27.14951 | 1.308078 | 0.991965 | 14 |
| Cement type | 0 | 31.25 | 31.25 | 42.5 | 42.5 | 42.5 | 52.5 | 42.96578 | 52.5 | 6.233342 | −0.24739 | −0.0273 | 0 |
| Cement (kg/m3) | 0 | 101 | 130 | 270 | 340 | 410 | 455 | 332.1126 | 520 | 97.49968 | −0.46527 | −0.01399 | 140 |
| FA (kg/m3) | 0 | 0 | 0 | 100 | 148 | 259.3 | 270 | 150.7102 | 390 | 103.3366 | −0.14723 | −0.97333 | 159.3 |
| GGBFS (kg/m3) | 0 | 0 | 0 | 0 | 0 | 0 | 156 | 24.85326 | 195 | 55.6331 | 1.90989 | 1.97168 | 0 |
| SF (kg/m3) | 0 | 0 | 0 | 0 | 0 | 0 | 52 | 5.185714 | 78 | 16.54598 | 3.119407 | 8.545907 | 0 |
| w/b | 0 | 0.24 | 0.27 | 0.32 | 0.35 | 0.39 | 0.51 | 0.366093 | 0.562 | 0.071441 | 0.939124 | 0.63303 | 0.07 |
| Water (kg/m3) | 0 | 112 | 140 | 178 | 187 | 206.7 | 220 | 187.2111 | 239 | 24.64613 | −0.94185 | 1.582966 | 28.7 |
| Sand (kg/m3) | 0 | 581 | 622 | 676 | 720 | 785 | 853 | 739.5728 | 1025 | 90.30659 | 1.055659 | 1.510642 | 109 |
| NA (kg/m3) | 0 | 0 | 0 | 0 | 404 | 636 | 875 | 355.2312 | 1150 | 339.0457 | 0.309048 | −1.24992 | 636 |
| RA (kg/m3) | 0 | 0 | 0 | 184 | 416.4 | 799.77 | 895 | 467.9119 | 1040 | 332.8876 | −0.07381 | −1.42094 | 615.77 |
| NA—density (kg/m3) | 16.28 | 2500 | 2540 | 2640 | 2670 | 2700 | 2726 | 2668.167 | 2726 | 51.03288 | −1.54814 | 2.641303 | 60 |
| RA—density (kg/m3) | 0.66 | 2205 | 2300 | 2440 | 2515 | 2594 | 2685 | 2505.742 | 2685 | 98.97229 | −0.48862 | 0.379628 | 154 |
| NAabsorption (%) | 28.24 | 0.145 | 0.21 | 0.5 | 0.5 | 1.2 | 1.4 | 0.808472 | 2.2 | 0.474631 | 0.949223 | 0.784857 | 0.7 |
| RAabsorption (%) | 7.64 | 1.77 | 1.8 | 3.89 | 4.73 | 4.9 | 7.39 | 4.662914 | 7.7 | 1.517495 | 0.234289 | −0.19227 | 1.01 |
| RAreplace (%) | 0 | 0 | 0 | 20 | 50 | 100 | 100 | 56.77076 | 100 | 39.97796 | −0.15637 | −1.54206 | 80 |
| Maxsize (mm) | 9.97 | 10 | 12.25 | 16 | 20 | 20 | 22 | 18.29151 | 22 | 3.008057 | −1.19032 | 0.222855 | 4 |
| Fineness | 37.21 | 5.6 | 5.6 | 6.1 | 6.88 | 6.95 | 7.23 | 6.622804 | 7.66 | 0.555437 | −0.56088 | −0.82149 | 0.85 |
| Compressive (MPa) | 0 | 5.36 | 19.5 | 34.76 | 45.8 | 56.8 | 75.3 | 46.28332 | 89 | 16.61508 | 0.070976 | −0.24494 | 22.04 |
NOTE: Cement type refers to the strength grade of Ordinary Portland Cement (OPC) according to the Chinese standard GB 175, with grades of 32.5, 42.5, and 52.5 MPa representing the 28-day compressive strength class. Fineness refers to the fineness modulus of fine aggregate (sand), which is dimensionless.
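The Skewness and Kurtosis columns in Table 1 summarize distribution shape. A minimal stdlib sketch of the moment-based estimators, assuming the common population-moment forms g1 = m3/m2^1.5 and excess kurtosis g2 = m4/m2^2 − 3 (the paper does not state which estimator variant was used):

```python
def moments_shape(xs):
    """Return (skewness g1, excess kurtosis g2) from central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# A symmetric sample has zero skewness and, for this flat shape,
# negative excess kurtosis (lighter tails than a normal distribution).
g1, g2 = moments_shape([1.0, 2.0, 3.0, 4.0, 5.0])
```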
Table 2. Descriptive statistics of the early-age and standard/extended-age subsets.
| Subset | n | Percentage (%) | f_c,min (MPa) | f_c,max (MPa) | f_c,mean (MPa) | f_c,std (MPa) |
|---|---|---|---|---|---|---|
| <7 days | 11 | 3.7 | 18.4 | 48.3 | 32.57 | 11.02 |
| ≥7 days | 290 | 96.3 | 5.36 | 89 | 46.8 | 16.58 |
| All data | 301 | 100 | 5.36 | 89 | 46.28 | 16.62 |
Table 3. HPO search space and selected optimum.
| Parameter | Candidates | Best |
|---|---|---|
| n_estimators | [300, 500, 700] | 700 |
| learning_rate | [0.03, 0.05, 0.07] | 0.05 |
| num_leaves | [20, 31, 40] | 20 |
| max_depth | [−1, 5, 8] | −1 |
| min_child_samples | [10, 20, 30] | 20 |
| subsample | [0.8, 1.0] | 0.8 |
| colsample_bytree | [0.8, 1.0] | 0.8 |
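The Table 3 grid can be enumerated exhaustively; the sketch below lists every candidate combination that k-fold cross-validation would score (parameter names follow LightGBM's API; the scoring step itself is omitted):

```python
from itertools import product

search_space = {
    "n_estimators": [300, 500, 700],
    "learning_rate": [0.03, 0.05, 0.07],
    "num_leaves": [20, 31, 40],
    "max_depth": [-1, 5, 8],
    "min_child_samples": [10, 20, 30],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
}

# Exhaustive grid: 3^5 * 2^2 = 972 candidate configurations.
names = list(search_space)
candidates = [dict(zip(names, values))
              for values in product(*search_space.values())]

# The optimum selected in Table 3.
best = {"n_estimators": 700, "learning_rate": 0.05, "num_leaves": 20,
        "max_depth": -1, "min_child_samples": 20,
        "subsample": 0.8, "colsample_bytree": 0.8}
```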
Table 4. Seed stability results across independent random splits.
| Seed | R2 | RMSE (MPa) | MAE (MPa) |
|---|---|---|---|
| 0 | 0.935 | 3.991 | 3.208 |
| 1 | 0.926 | 4.209 | 3.289 |
| 2 | 0.962 | 2.893 | 2.31 |
| Mean ± Std | 0.941 ± 0.019 | 3.698 ± 0.706 | 2.936 ± 0.543 |
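The Mean ± Std row of Table 4 can be reproduced from the three per-seed scores using the sample standard deviation, which matches the reported ±0.019 for R² and ±0.543 for MAE; the RMSE spread reproduces to 0.705 versus the reported 0.706, consistent with rounding of intermediate values:

```python
from statistics import mean, stdev

r2_scores = [0.935, 0.926, 0.962]
rmse_scores = [3.991, 4.209, 2.893]
mae_scores = [3.208, 3.289, 2.310]

# (mean, sample standard deviation) per metric, rounded as in Table 4.
summary = {name: (round(mean(vals), 3), round(stdev(vals), 3))
           for name, vals in [("R2", r2_scores),
                              ("RMSE", rmse_scores),
                              ("MAE", mae_scores)]}
```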
Table 5. Comparison of the performance of the models.
| Model | RMSE (Train) | RMSE (Test) | MSE (Train) | MSE (Test) | R2 (Train) | R2 (Test) | MAE (Train) | MAE (Test) | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| LGBM | 1.889 | 3.386 | 3.567 | 11.466 | 0.987 | 0.961 | 1.264 | 2.589 | 0.124 |
| CatBoost | 1.624 | 3.749 | 2.638 | 14.053 | 0.990 | 0.953 | 1.167 | 2.613 | 0.433 |
| Stacking | 1.693 | 3.321 | 2.867 | 11.028 | 0.989 | 0.963 | 1.205 | 2.506 | 2.168 |
| VotingNNLS | 1.603 | 3.553 | 2.569 | 12.622 | 0.990 | 0.957 | 1.137 | 2.488 | 0.559 |
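The Table 5 metrics are internally related: MSE is the square of RMSE, and R² = 1 − SSE/SST. A stdlib sketch of the test-set metrics (a simplified re-implementation for illustration, not the paper's evaluation code):

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MSE, MAE) for paired observations."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_y) ** 2 for t in y_true)
    mse = sse / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return 1.0 - sse / sst, sqrt(mse), mse, mae

# Internal consistency of the reported Stacking test row:
# 3.321**2 = 11.029, matching the tabulated MSE of 11.028 up to rounding.
stacking_mse_gap = abs(3.321 ** 2 - 11.028)
```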
Table 6. Performance comparison of ML models on the UCI Concrete Compressive Strength dataset.
| Model | RMSE (Train) | RMSE (Test) | MSE (Train) | MSE (Test) | R2 (Train) | R2 (Test) | MAE (Train) | MAE (Test) | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| LGBM | 1.2193 | 4.2500 | 1.4867 | 18.0623 | 0.9948 | 0.9299 | 0.5946 | 2.7856 | 0.4611 |
| CatBoost | 2.6018 | 4.1083 | 6.7696 | 16.8781 | 0.9762 | 0.9345 | 1.8669 | 2.8893 | 0.8919 |
| Stacking | 1.7617 | 3.9749 | 3.1034 | 15.7996 | 0.9891 | 0.9387 | 1.176 | 2.6725 | 5.729 |
| VotingNNLS | 1.2193 | 4.2500 | 1.4867 | 18.0623 | 0.9948 | 0.9299 | 0.5946 | 2.7856 | 1.3573 |
Table 7. Model prediction performance for early-age (<7 days) and standard/extended-age (≥7 days) subsets (full dataset, n = 301).
| Model | Subset | n | R2 | RMSE (MPa) | MAE (MPa) |
|---|---|---|---|---|---|
| LGBM | <7 days | 11 | 0.8676 | 3.824 | 2.945 |
| LGBM | ≥7 days | 290 | 0.9806 | 2.308 | 1.622 |
| LGBM | All data | 301 | 0.9794 | 2.38 | 1.671 |
| CatBoost | <7 days | 11 | 0.9696 | 1.832 | 1.359 |
| CatBoost | ≥7 days | 290 | 0.9817 | 2.239 | 1.464 |
| CatBoost | All data | 301 | 0.982 | 2.225 | 1.46 |
| Stacking | <7 days | 11 | 0.9578 | 2.158 | 1.531 |
| Stacking | ≥7 days | 290 | 0.9835 | 2.127 | 1.438 |
| Stacking | All data | 301 | 0.9835 | 2.128 | 1.441 |
| VotingNNLS | <7 days | 11 | 0.9666 | 1.919 | 1.411 |
| VotingNNLS | ≥7 days | 290 | 0.9821 | 2.214 | 1.453 |
| VotingNNLS | All data | 301 | 0.9823 | 2.204 | 1.451 |

Share and Cite

Zhang, Z.; Luo, B.; Su, Y. Machine Learning-Based Prediction of Compressive Strength in Recycled Aggregate Self-Compacting Concrete: An Ensemble Modeling Approach with SHAP Interpretability Analysis. Appl. Sci. 2026, 16, 2432. https://doi.org/10.3390/app16052432

