1. Introduction
Concrete stands as the second most consumed substance on Earth after water and remains the most extensively utilized artificial construction material in contemporary infrastructure development [
1,
2,
3,
4]. The global construction industry’s substantial reliance on concrete has resulted in significant environmental challenges, particularly regarding natural resource depletion and carbon dioxide emissions [
5,
6,
7]. The cement industry alone accounts for approximately 8% of global anthropogenic CO₂ emissions, having contributed 44.9 Gt of CO₂ between 1928 and 2021 [
8,
9]. This environmental burden is further compounded by the construction sector’s generation of substantial solid waste, which constitutes approximately 36% of global waste production [
10]. The accelerating pace of urbanization and population growth has intensified these environmental pressures, creating an urgent need for sustainable alternatives in concrete production.
The management of construction and demolition waste has emerged as a critical environmental challenge in recent decades. As aging infrastructure undergoes renovation and replacement, large volumes of concrete waste are generated, requiring substantial financial resources for proper disposal [
11]. Traditional disposal methods, such as landfilling, not only occupy valuable land resources but also contribute to environmental pollution through groundwater contamination and greenhouse gas emissions [
12,
13]. This situation has catalyzed research into sustainable waste management strategies, with the utilization of recycled concrete aggregate (RCA) representing a particularly promising approach [
14]. By processing construction waste into recycled aggregates, the construction industry can simultaneously address waste management challenges and reduce the demand for natural aggregates, thereby promoting circular economy principles within the built environment [
15,
16]. Beyond conventional recycled concrete aggregates, other construction waste streams—such as recycled ceramic waste used as aggregate or cementitious replacement—have been shown to maintain or improve strength while reducing embodied carbon, especially when combined with fiber reinforcement [
17].
Self-compacting concrete (SCC), recognized as one of the most significant advancements in construction materials since its development in Japan during the 1980s, offers exceptional flowability, uniformity, and stability in its fresh state while demonstrating superior mechanical and durability properties when hardened [
18,
19]. The integration of recycled aggregates into SCC production presents a synergistic opportunity to combine the benefits of advanced concrete technology with sustainable waste utilization. However, the inherent characteristics of RCA, particularly its higher water absorption capacity, lower density, and irregular morphology compared to natural aggregates, significantly influence the fresh and hardened properties of the resulting concrete [
20,
21]. Research has demonstrated that RCA incorporation can affect both the rheological properties and mechanical performance of SCC, with the water absorption of recycled aggregates playing a particularly critical role in determining the effective water-to-cement ratio and consequently influencing the workability and strength development of the mixture [
22,
23].
The complex relationships between recycled aggregate properties, mixture proportions, and concrete performance characteristics present substantial challenges for traditional empirical prediction methods. Conventional approaches, including linear regression and polynomial models, have demonstrated limited effectiveness in capturing the nonlinear interactions among multiple variables that influence recycled concrete strength, typically achieving coefficient of determination (R²) values of only 0.22 to 0.28 [
24]. This inadequacy has motivated researchers to explore more sophisticated computational approaches capable of modeling the intricate relationships inherent in recycled concrete systems. Machine learning and deep learning techniques have emerged as powerful tools for addressing these complex prediction challenges, offering the ability to identify patterns and relationships within large datasets without requiring explicit programming of physical relationships [
25,
26,
27,
28].
Recent advances in artificial intelligence have demonstrated considerable promise in materials engineering applications, particularly in predicting concrete compressive strength. Naderpour et al. developed an artificial neural network model for forecasting recycled aggregate concrete strength using 139 datasets, demonstrating the capability of neural networks for precise prediction [
29]. Similarly, research employing deep neural networks, multivariate adaptive regression splines, and extreme learning machines has validated the accuracy of advanced computational methods in predicting compressive strength with variable compositions [
30]. Researchers predicted the compressive strength of recycled concrete by employing support vector regression (SVR), one-dimensional convolutional neural networks (1D-CNN), and a hybrid artificial intelligence model that integrates elastic net, random forest algorithms, and light gradient boosting decision trees (LGBM), while incorporating Gaussian noise during training to enhance the model’s generalization capability [
31]. These studies have established that machine learning approaches can effectively capture the complex interactions between mixture components and resulting mechanical properties, though the optimal modeling approach remains subject to ongoing investigation.
Despite substantial progress in applying machine learning to recycled concrete strength prediction, several critical research gaps persist. First, existing studies have not comprehensively examined how recycled coarse aggregate characteristics—particularly water absorption capacity and particle morphology—affect the rheological behavior of self-compacting concrete, limiting the development of targeted strategies to maintain adequate flow properties [
32]. Second, the underlying mechanisms by which aggregate quality parameters influence both fresh-state workability and hardened-state mechanical performance remain incompletely understood, especially regarding the relative importance of moisture-related effects compared to geometric and surface texture considerations [
33]. Third, the majority of previous research has utilized pre-saturated recycled aggregates or implemented additional water adjustment protocols, approaches that may obscure critical interactions between aggregate moisture state and concrete performance under realistic production scenarios [
34].
Although the RAGN-R method proposed by Kazemi et al. [
35] has achieved excellent results in predicting the strength of PAC and FRC materials, this study differs fundamentally in the following aspects: (1) Research Object: This paper focuses on Recycled Aggregate Self-Compacting Concrete (RASCC). The nonlinear complexity of this material system stems from the diverse physical properties of recycled aggregates (density, water absorption, particle shape), rather than fiber reinforcement effects. (2) Methodological Innovation: We introduce a dual-base learner combination of LightGBM and CatBoost, coupled with Non-Negative Least Squares (NNLS) weight optimization, offering an ensemble strategy complementary to RAGN-R. (3) Interpretability: Through SHAP analysis, we systematically quantify the contribution mechanisms of 18 mix variables to strength, including an analysis of base model contribution weights at the meta-learner level, an aspect not addressed in prior studies. (4) Validation Rigor: We conducted cross-domain validation using an external UCI dataset (n = 1030, where n denotes the sample size) together with a multi-seed stability analysis, supporting the applicability of our method beyond the RASCC domain.
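The NNLS weighting step named in aspect (2) can be illustrated with a minimal sketch. It assumes two columns of out-of-fold predictions standing in for the LightGBM and CatBoost outputs and finds the non-negative weights that minimize the squared error against the measured strengths. The function name `nnls_two` and all numbers are hypothetical; this is not the authors' implementation, and the enumeration of active sets used here is only valid for two base learners. A general-purpose solver such as `scipy.optimize.nnls` would normally be used instead.

```python
def nnls_two(p1, p2, y):
    """Non-negative least squares weights for two prediction columns.

    Minimizes sum((w1*p1 + w2*p2 - y)^2) subject to w1 >= 0 and w2 >= 0
    by evaluating every candidate active set (valid for two columns only).
    """
    a11 = sum(a * a for a in p1)
    a22 = sum(b * b for b in p2)
    a12 = sum(a * b for a, b in zip(p1, p2))
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))

    def sse(w1, w2):
        return sum((w1 * a + w2 * b - t) ** 2 for a, b, t in zip(p1, p2, y))

    # Boundary candidates: both weights zero, or exactly one weight active.
    candidates = [(0.0, 0.0)]
    if a11 > 0:
        candidates.append((max(0.0, b1 / a11), 0.0))
    if a22 > 0:
        candidates.append((0.0, max(0.0, b2 / a22)))

    # Interior candidate: the unconstrained solution, kept only if feasible.
    det = a11 * a22 - a12 * a12
    if abs(det) > 1e-12:
        w1 = (b1 * a22 - b2 * a12) / det
        w2 = (a11 * b2 - a12 * b1) / det
        if w1 >= 0 and w2 >= 0:
            candidates.append((w1, w2))

    return min(candidates, key=lambda w: sse(*w))


# Hypothetical out-of-fold predictions (MPa) from two base learners.
lgbm_oof = [28.0, 47.0, 58.0, 55.0]
cat_oof = [33.0, 42.0, 63.0, 48.0]
# Targets built as 0.6 * lgbm_oof + 0.4 * cat_oof, so the weights are known.
y_true = [30.0, 45.0, 60.0, 52.2]

w_lgbm, w_cat = nnls_two(lgbm_oof, cat_oof, y_true)
stacked = [w_lgbm * a + w_cat * b for a, b in zip(lgbm_oof, cat_oof)]
```

Because the weights are constrained to be non-negative, each one can be read directly as the share of the blended prediction attributable to a base learner, which is what makes a meta-learner-level contribution analysis possible.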
This investigation addresses these gaps through systematic analysis of experimental observations, employing advanced ensemble machine learning approaches including LightGBM, CatBoost, and stacked generalization to predict compressive strength across diverse mixture compositions. SHAP (SHapley Additive exPlanations) interpretability analysis is applied to quantify the marginal contribution of each input feature to individual model predictions. Based on cooperative game theory, SHAP assigns each feature j a Shapley value (φ_j) representing its average marginal contribution across all possible feature subsets. For tree-based ensemble models, the TreeSHAP algorithm is employed, reducing computational complexity from O(TL·2^M) to O(TLD²), where T is the number of trees, L the maximum number of leaves, M the number of features, and D the maximum tree depth. Summary plots and dependence plots are generated to visualize global feature importance and feature interaction effects, respectively, enabling transparent interpretation of the ‘black-box’ ensemble models and providing actionable insights for mix design optimization. The findings support the advancement of sustainable construction practices by enabling more confident utilization of recycled aggregate self-compacting concrete in infrastructure development, thereby reducing natural resource consumption and construction waste while maintaining structural performance requirements.
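To make the Shapley value definition concrete, the following minimal sketch computes exact Shapley values by enumerating every feature subset. It is the brute-force calculation whose cost grows exponentially with the number of features, not TreeSHAP. The model `toy_strength`, the feature order, and the input values are hypothetical and chosen only for illustration; features outside a coalition are replaced by a baseline value, which is one common convention and an assumption here.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a baseline input.

    Features outside a coalition take their baseline value. The loop
    visits every subset of the other features, so the cost is exponential
    in the number of features.
    """
    m = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(m)]
        return f(z)

    phi = []
    for j in range(m):
        others = [k for k in range(m) if k != j]
        total = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for s in combinations(others, size):
                coalition = set(s)
                total += weight * (value(coalition | {j}) - value(coalition))
        phi.append(total)
    return phi


def toy_strength(z):
    """Hypothetical strength model (MPa); not calibrated to any dataset."""
    age, cement, wb = z
    return 0.1 * cement * (1.0 - wb) + 0.2 * age


x = [28.0, 400.0, 0.4]         # age (days), cement (kg/m^3), w/b
baseline = [7.0, 300.0, 0.5]   # reference mixture (hypothetical)
phi = shapley_values(toy_strength, x, baseline)
```

The values satisfy the additivity property that underlies SHAP summary plots: they sum to the difference between the prediction at `x` and the prediction at the baseline. Age enters the toy model additively, so its Shapley value equals its direct effect, 0.2 × (28 − 7) = 4.2 MPa.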
2. Data Sources and Analysis
2.1. Data Sources and Descriptive Statistics
This study utilized a publicly available dataset from Yang’s research [36], reproduced in Appendix A, Table A1, comprising 301 RASCC specimens with 18 input variables. Key parameters include curing age (2–91 days), cement content (101–520 kg/m³), water-to-binder ratio (0.24–0.60), recycled aggregate replacement ratio (0–100%), and supplementary cementitious materials (fly ash, GGBFS, silica fume). Detailed descriptive statistics are provided in Table 1.
To provide a more rigorous characterization of the dataset scope and coverage, the following aspects are elaborated upon. The database comprises 301 experimentally tested specimens compiled from Yang’s research [
36], which aggregated results from multiple independent experimental campaigns reported in the literature. These campaigns were conducted by different research groups under varying laboratory conditions, thereby introducing inherent diversity in testing protocols, curing environments, and material sourcing. The specimens encompass a wide range of curing ages from 2 to 112 days, covering early-age (2–7 days), standard-age (14–28 days), and extended-age (56–112 days) strength development stages. This temporal coverage enables the models to learn the nonlinear hydration kinetics governing strength evolution across different maturity levels, as confirmed by the SHAP analysis in
Section 6, which identifies specimen age as the most influential predictor with SHAP values spanning approximately ±15 MPa.
The diversity of mixture configurations within the dataset is substantial. The specimens originate from at least 10 distinct experimental series, encompassing three cement strength grades (31.25, 42.5, and 52.5 MPa), recycled aggregate replacement ratios spanning 0% to 100%, water-to-binder ratios from 0.24 to 0.562, and supplementary cementitious material combinations including binary (cement + fly ash), ternary (cement + fly ash + GGBFS), and quaternary (cement + fly ash + GGBFS + silica fume) binder systems. Recycled aggregate properties vary considerably, with densities ranging from 2205 to 2685 kg/m³ and water absorption values from 1.77% to 7.7%, representing recycled aggregates of varying quality from different demolition sources. This compositional diversity ensures that the predictive models are exposed to a broad spectrum of practically relevant RASCC formulations, spanning from conventional structural grades to high-performance concrete applications.
Regarding curing conditions and environmental exposure, the compiled dataset primarily reflects standard laboratory curing conditions (approximately 20 ± 2 °C and ≥95% relative humidity) as reported in the original experimental studies. The present investigation focuses exclusively on compressive strength prediction as a function of mixture design parameters and curing age under controlled curing environments. Durability-related degradation phenomena, including carbonation depth progression, chloride ion penetration, and freeze–thaw cycling damage, are not within the scope of this study and would require dedicated experimental datasets incorporating environmental exposure variables and long-term monitoring protocols. This scope definition ensures that the predictive framework maintains clear physical interpretability, with all input features directly related to mixture proportioning and material characterization rather than service-life environmental factors.
Nevertheless, certain limitations of the dataset should be acknowledged. First, the dataset size of 301 specimens, while adequate for ensemble machine learning approaches as demonstrated by the robust cross-validated performance (Section 4.2.3, mean R² = 0.941 ± 0.019), is moderate compared to some benchmark concrete datasets. Second, the geographic and climatic diversity of the source experiments is not explicitly documented, which may limit the generalizability of the models to regions with significantly different raw material characteristics. Third, the absence of field-cured specimens restricts the direct applicability of the current framework to laboratory-scale strength prediction scenarios. These limitations are partially mitigated through the external validation on the independent UCI Concrete Compressive Strength dataset containing 1030 samples (Section 5.3), which demonstrates the framework’s domain-agnostic applicability, and through the seed stability analysis (Section 4.2.3), which confirms robustness to data partitioning variability.
2.2. Data Limitations
The 301-specimen database compiled in this study was drawn from 57 published sources spanning multiple countries and decades, which inevitably introduces a degree of heterogeneity in testing conditions. Three specific limitations are acknowledged. First, specimen geometry and size are not uniform across the dataset; some studies report results from standard cylinders (e.g., 150 × 300 mm or 100 × 200 mm) while others use cubes (e.g., 150 × 150 × 150 mm). Cylindrical specimens generally yield lower f_c values than cubic specimens of comparable cross-sectional dimensions. This difference is primarily attributable to the height-to-width ratio (h/d) effect and its interaction with platen friction-induced confinement. During compression testing, friction between the specimen ends and the machine platens restricts lateral expansion near the loading surfaces, creating confined zones. According to the St. Venant principle, this restraining effect extends to approximately 0.87 times the lateral dimension from each platen. For cubic specimens (h/d ≈ 1), the friction-induced confinement covers the entire specimen height, subjecting the material to an effective triaxial stress state and thereby producing higher apparent compressive strength. For standard cylinders (h/d = 2), the mid-height region remains largely unconfined, producing failure under conditions closer to true uniaxial compression [
37,
38]. This cube-to-cylinder strength ratio decreases with increasing concrete strength, from approximately 1.25 for normal-strength concrete to 1.12 for high-strength concrete, as documented in the CEB-FIP Model Code 1990 [
39].
The conversion factors applied to harmonize compressive strength values were determined based on established international standards and empirical relationships reported in the literature. Specifically, the cube-to-cylinder conversion factor (K = f_c,cyl/f_c,cube) was adopted from EN 206-1/Eurocode 2 (EN 1992-1-1:2004), which defines strength-dependent values ranging from approximately 0.80 for normal-strength concrete (≤C50/60) to 0.87 for high-strength concrete (≈C90/105) [
40]. The CEB-FIP Model Code 1990 further corroborates this strength-dependent relationship, indicating a progressive decrease in the cube-to-cylinder strength ratio from 1.25 to 1.12 for cylinder strengths of 40 to 80 MPa [
39]. For recycled aggregate concrete specifically, Pacheco et al. [
41] demonstrated that full incorporation of coarse recycled aggregates reduces the mean K factor to 0.77, compared to 0.81 for natural aggregate concrete. In the present study, where the original source literature explicitly reported cube strengths, a conversion factor of 0.80 was applied for normal-strength concrete (<60 MPa cube strength) and 0.85 for higher-strength concrete (≥60 MPa cube strength), consistent with the Eurocode 2 framework. However, residual inconsistencies may remain. Second, the effect of specimen end-surface preparation—including grinding, sulfur capping, or neoprene pad systems—could not be systematically controlled, as this information was not uniformly reported in the source literature. Such variability is known to influence measured compressive strength by up to 15% and should be regarded as a source of scatter in the dataset. Third, the dataset encompasses a wide curing age range (2–91 days), and predictions for early-age specimens (<7 days) carry greater uncertainty than those at the standard 28-day benchmark, particularly for mixtures incorporating supplementary cementitious materials (SCMs) such as fly ash or slag, whose strength contributions are more pronounced at later ages. Curing age is explicitly included as a model input feature to partially account for this variability, though it cannot fully resolve differences arising from distinct hydration kinetics across binder types. These limitations are inherent to any large-scale data aggregation effort and do not invalidate the predictive capacity of the machine learning framework; rather, they underscore the need for future studies to adopt standardized reporting protocols when constructing training databases for concrete strength prediction models.
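The harmonization rule stated above (0.80 below 60 MPa cube strength, 0.85 at or above it) can be written as a short function. This is a minimal sketch of that rule only; the function name `cube_to_cylinder` is hypothetical, and the recycled-aggregate-specific factors reported by Pacheco et al. are deliberately not applied because the text does not say they were used.

```python
def cube_to_cylinder(fc_cube_mpa):
    """Convert cube compressive strength (MPa) to cylinder strength (MPa).

    Uses the two-level factor described in the text: 0.80 for cube
    strengths below 60 MPa and 0.85 for cube strengths of 60 MPa or more.
    """
    if fc_cube_mpa < 0:
        raise ValueError("compressive strength must be non-negative")
    factor = 0.80 if fc_cube_mpa < 60.0 else 0.85
    return factor * fc_cube_mpa


# Illustrative cube strengths (MPa); not values from the dataset.
converted = [cube_to_cylinder(v) for v in (30.0, 50.0, 60.0, 80.0)]
```

A two-level factor is simple to document and reproduce, but it introduces a step at the threshold: a cube strength just below 60 MPa converts to about 48 MPa, whereas 60 MPa converts to 51 MPa. This is one concrete form the residual inconsistencies mentioned above can take.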
Early-age data separation and analysis. Following the recommendation for distinguishing early-age behavior, the dataset was stratified into two subgroups: early-age specimens with curing age < 7 days (n = 11, comprising 2-day, 3-day, and 5-day specimens) and standard/extended-age specimens with curing age ≥ 7 days (n = 290). The early-age subset represents only 3.7% of the total dataset, with compressive strength values ranging from 18.40 to 48.30 MPa (mean: 32.57 MPa, standard deviation: 11.02 MPa). The standard/extended-age subset exhibits a broader range of 5.36 to 89.00 MPa (mean: 46.80 MPa, standard deviation: 16.58 MPa). The descriptive statistics of both subgroups are summarized in Table 2.
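The stratification described above amounts to a threshold split followed by per-group descriptive statistics. The sketch below shows one way to do this; the record keys `age_days` and `fc_mpa` and the sample records are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def stratify_by_age(records, threshold_days=7):
    """Split records into early-age (< threshold) and later-age (>= threshold)."""
    early = [r for r in records if r["age_days"] < threshold_days]
    later = [r for r in records if r["age_days"] >= threshold_days]
    return early, later


def describe(records):
    """Count, range, mean, and sample standard deviation of strength (MPa)."""
    fc = [r["fc_mpa"] for r in records]
    return {
        "n": len(fc),
        "min": min(fc),
        "max": max(fc),
        "mean": mean(fc),
        "sd": stdev(fc) if len(fc) > 1 else 0.0,
    }


# Hypothetical records; not values from the dataset.
records = [
    {"age_days": 2, "fc_mpa": 18.4},
    {"age_days": 3, "fc_mpa": 25.0},
    {"age_days": 7, "fc_mpa": 33.0},
    {"age_days": 28, "fc_mpa": 46.5},
    {"age_days": 91, "fc_mpa": 61.0},
]

early, later = stratify_by_age(records)
early_stats = describe(early)
early_share = 100.0 * len(early) / len(records)
```

Applied to the counts reported in the text, the same share calculation gives 100 × 11 / 301 ≈ 3.7%, matching the stated proportion of early-age specimens.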
2.3. Pearson Correlation Coefficient Analysis
The Pearson correlation coefficient, which quantifies the linear relationship between input and output variables, is defined as the covariance of two variables divided by the product of their standard deviations. This coefficient ranges from −1 to 1, providing a standardized measure of association strength and direction. The formula is given below:
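$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\,\sigma_x \sigma_y}$$
where $r_{xy}$ is the Pearson correlation coefficient between variables $x$ and $y$; $n$ is the number of observations; $x_i$ and $y_i$ are the $i$-th observations of variables $x$ and $y$, respectively; $\bar{x}$ and $\bar{y}$ are the sample means; and $\sigma_x$ and $\sigma_y$ are the (sample) standard deviations of $x$ and $y$, respectively.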
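The Pearson coefficient defined above can be computed directly from its definition. The sketch below is a plain-Python illustration with hypothetical data. It uses the algebraically equivalent form in which the (n − 1) factors of the covariance and the standard deviations cancel, so only sums of deviations are needed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical curing ages (days) and strengths (MPa), for illustration only.
age = [3.0, 7.0, 28.0, 56.0, 91.0]
strength = [20.0, 31.0, 45.0, 50.0, 53.0]
r_age_strength = pearson_r(age, strength)
```

Because the coefficient measures only linear association, a strongly nonlinear but monotonic relationship, such as strength gain that flattens with age, yields a value below 1 even when the dependence is nearly deterministic.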
Figure 1 presents the Pearson correlation coefficient matrix examining the relationships between concrete mixture parameters and compressive strength, with particular emphasis on recycled aggregate concrete performance factors. The correlation analysis reveals several findings regarding compressive strength dependencies. Age shows a moderate positive correlation with compressive strength (r = 0.42), consistent with the fundamental principle of concrete strength development over time. Cement content exhibits a somewhat stronger positive correlation (r = 0.51), consistent with increased cement dosage enhancing concrete strength. The water-to-binder ratio shows a weak negative correlation (r = −0.25), in line with the established theory that lower water-to-binder ratios promote higher strength development.
Regarding mineral admixtures, silica fume shows a negligible positive correlation with compressive strength (r = 0.026), indicating essentially no linear association in this dataset. Conversely, fly ash (r = −0.22) and ground granulated blast furnace slag (r = −0.15) exhibit weak negative correlations, potentially reflecting the influence of these supplementary cementitious materials on early-age strength development within the scope of this investigation. Concerning recycled aggregate effects, the replacement ratio shows a very weak negative correlation with compressive strength (r = −0.053), which hints at, but does not establish, a declining strength trend with increased recycled aggregate incorporation. The recycled aggregate absorption rate also correlates weakly and negatively (r = −0.11), consistent with elevated water absorption adversely affecting concrete performance. In contrast, recycled aggregate density displays a moderate positive correlation (r = 0.36), suggesting that higher-density recycled aggregates are associated with higher strength.
The matrix also reveals noteworthy inter-parameter relationships. A moderate positive correlation exists between ground granulated blast furnace slag and silica fume (r = 0.56), while cement type exhibits a moderate negative correlation with natural aggregate content (r = −0.34). The recycled aggregate replacement ratio shows near-perfect negative and positive correlations with natural aggregate content (r = −0.97) and recycled aggregate content (r = 0.98), respectively, reflecting the inherent interdependencies among mixture components in proportion design. Fineness modulus and maximum aggregate size both show weak negative correlations with compressive strength (r = −0.18 and r = −0.17, respectively), suggesting that aggregate particle size characteristics exert a measurable but limited influence on concrete strength. Collectively, this correlation analysis provides empirical evidence for optimizing recycled concrete mixture proportions and enhancing engineering performance characteristics.
6. SHAP-Based Model Interpretability Analysis
To enhance the transparency and interpretability of the developed ensemble models, SHAP (SHapley Additive exPlanations) analysis was conducted to quantify the contribution of each input feature to the predicted compressive strength values. SHAP values employ the concept of game theory to compute the contribution of each ‘game player’ (feature) to the final ‘game outcome’ (model prediction), providing a unified measure of feature importance by calculating the marginal contribution of each feature across all possible feature combinations [
79]. The SHAP method is a comprehensive interpretable method that includes both global and local interpretations, where global interpretation assesses feature importance and dependency while local interpretation quantifies the contribution of each input variable to predicted values [
80].
6.1. LightGBM Model Feature Importance Analysis
Figure 8 presents the SHAP summary plot for the LightGBM base learner, illustrating the distribution of SHAP values for all input features across the entire dataset. The vertical axis displays features ranked by their mean absolute SHAP value, indicating their overall importance in determining compressive strength predictions. The horizontal axis represents the SHAP value magnitude, where positive values indicate an increase in predicted strength and negative values suggest a decrease. The color gradient from blue to red represents the feature value magnitude, with red indicating high feature values and blue representing low values.
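The vertical ordering of a SHAP summary plot follows from a simple calculation: the mean absolute SHAP value of each feature across all samples. The sketch below shows that ranking step on a small matrix; the SHAP values and the three feature names are hypothetical and are not taken from Figure 8.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value, largest first.

    shap_rows holds one list of SHAP values per sample, ordered like
    feature_names.
    """
    n = len(shap_rows)
    if n == 0:
        raise ValueError("at least one sample is required")
    means = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, means), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values (MPa) for three samples and three features.
feature_names = ["age", "w/b", "FA"]
shap_rows = [
    [12.0, -3.0, 0.5],
    [-9.0, 4.0, -1.0],
    [6.0, -5.0, 0.0],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
```

Taking absolute values before averaging matters: a feature whose contributions are large but of mixed sign would otherwise average toward zero and appear unimportant.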
The analysis reveals that specimen age emerges as the most influential parameter in the LightGBM model, exhibiting the widest distribution of SHAP values, ranging from approximately −15 to +15 MPa. The color distribution indicates that increased curing age consistently produces positive SHAP values. Previous SHAP-based studies have likewise identified curing age and cement content as the most influential variables, which reinforces domain knowledge about cement hydration and strength development, and have reported that curing age segmentation enhances predictions of long-term strength [
81,
82]. High age values, represented by red data points, predominantly cluster in the positive SHAP value region, demonstrating that prolonged curing periods contribute substantially to enhanced compressive strength predictions. This finding aligns with the fundamental principles of concrete technology, where the progressive hydration of cementitious materials over time leads to the formation of calcium silicate hydrate gel and the consequent densification of the concrete matrix.
Recycled aggregate density (RA-midu) demonstrates the second highest feature importance, with SHAP values spanning from approximately −10 to +8 MPa. High density values of recycled aggregates predominantly contribute positively to strength predictions, indicating that denser recycled aggregates facilitate superior mechanical performance. This agrees with earlier SHAP-based findings that the physical properties of aggregates are dominant parameters in estimating concrete strength [
83]. The color distribution pattern shows that high-density recycled aggregates, indicated by red points, are associated with positive SHAP values, while low-density aggregates, shown in blue, correspond to negative contributions. This relationship reflects the physical reality that denser aggregates typically possess lower porosity, reduced water absorption capacity, and stronger inherent mechanical properties, all of which contribute to enhanced interfacial bonding with the cement paste matrix and improved overall concrete performance.
The water-to-binder ratio (w/b) exhibits substantial negative influence on compressive strength predictions, with SHAP values ranging from approximately −8 to +5 MPa. The feature demonstrates a clear inverse relationship with strength, as evidenced by the concentration of red points representing high w/b values in the negative SHAP value region. A related SHAP-based study likewise identified the water-to-binder ratio as the most influential factor negatively affecting strength and used partial dependence plots to further examine the relationships between key input features and strength outputs [
84]. This pattern corroborates the well-established principle in concrete technology that elevated water content increases capillary porosity, reduces the density of the hardened cement paste, and weakens the interfacial transition zone between aggregate particles and the binding matrix.
Fly ash content (FA) presents considerable variability in its impact on strength predictions, with SHAP values ranging from approximately −5 to +5 MPa and displaying a more dispersed distribution pattern compared to the previously discussed features. The mixed influence of fly ash, characterized by both positive and negative SHAP contributions across its value range, reflects the complex dual nature of this supplementary cementitious material. High fly ash dosages can provide long-term strength enhancement through pozzolanic reactions that consume calcium hydroxide and produce additional calcium silicate hydrate. However, at early ages or when used in excessive quantities, fly ash may dilute the cement content and delay strength development, resulting in the heterogeneous SHAP value distribution observed in the analysis.
Cement content exhibits a predominantly positive correlation with compressive strength, as indicated by the concentration of high cement dosage values in the positive SHAP region. However, the magnitude of this effect appears more moderate compared to specimen age and recycled aggregate density, with SHAP values typically ranging from −5 to +5 MPa. Water content demonstrates similar inverse patterns to the w/b ratio, where high water dosages, represented by red data points, predominantly occupy the negative SHAP value space, reflecting the detrimental impact of excess water on concrete strength through increased porosity and reduced matrix density.
Fineness modulus shows a concentrated SHAP value distribution near zero with a slight negative tendency for high values, suggesting that while aggregate gradation influences workability and packing efficiency, its direct impact on compressive strength is relatively limited within the studied range. Recycled aggregate water absorption (RAabsorption) displays a notable negative influence when absorption values are high, reflecting the adverse effects of porous recycled aggregates on concrete performance through a reduced effective water-to-cement ratio and compromised aggregate-paste bonding. Related SHAP analyses report that cement content and recycled aggregate percentage are the most effective input parameters for concrete mechanical properties [
85].
The remaining parameters, including maximum aggregate size, sand content, recycled aggregate content, silica fume, natural aggregate density, natural aggregate absorption, natural aggregate content, recycled aggregate replacement ratio, ground granulated blast furnace slag, and cement type, exhibit progressively diminishing influence on model predictions. These features display narrower SHAP value distributions concentrated near zero, indicating their contributions are relatively minor or highly context-dependent based on complex interactions with other mixture parameters. The hierarchical importance ranking provided by SHAP analysis offers valuable insights for mixture proportion optimization, suggesting that practitioners should prioritize controlling specimen age, aggregate density, water-to-binder ratio, and cementitious material dosages to achieve targeted strength performance.
6.2. CatBoost Model Feature Importance Analysis
Figure 9 presents the SHAP summary plot for the CatBoost base learner, revealing distinct feature importance patterns that complement the insights obtained from the LightGBM analysis. While the overall feature ranking structure exhibits similarities with LightGBM, notable differences in SHAP value distributions and feature influence magnitudes provide valuable information regarding how different gradient boosting implementations capture relationships within the recycled aggregate concrete dataset.
Specimen age maintains its position as the dominant predictor in the CatBoost model, demonstrating SHAP values spanning approximately −10 to +10 MPa. The distribution pattern remains consistent with the LightGBM findings, where high age values represented by red points predominantly cluster in the positive SHAP value region. However, the CatBoost model exhibits a slightly more concentrated distribution compared to LightGBM, suggesting that CatBoost’s ordered boosting algorithm and symmetric tree structure capture the age-strength relationship with somewhat different granularity. Recycled aggregate properties maintain critical importance across different model architectures. In related studies, SHAP-based feature attribution has been used to illustrate feature interdependencies and establish a hierarchy of importance, with ensemble learning approaches achieving R² values exceeding 0.94 [
86,
87].
Recycled aggregate density continues to demonstrate substantial influence in the CatBoost model, with SHAP values distributed between approximately −8 and +8 MPa. The color gradient pattern reveals a consistent positive correlation between high aggregate density and positive strength contributions, similar to the LightGBM findings. However, the CatBoost model shows a marginally tighter clustering of SHAP values in the mid-range, potentially reflecting the model’s categorical feature handling capabilities and its approach to splitting decisions through symmetric trees. This characteristic may enable CatBoost to more efficiently partition the feature space when dealing with the continuous density values of recycled aggregates.
The water-to-binder ratio maintains its strong negative influence in the CatBoost model, with high w/b values consistently associated with negative SHAP contributions to predicted strength. The distribution pattern closely mirrors the LightGBM results, confirming the robust identification of this inverse relationship across different modeling approaches. The consistency of this finding across both base learners validates the physical principle that excess water compromises concrete strength and demonstrates that both gradient boosting implementations successfully capture this fundamental relationship despite their algorithmic differences.
Fly ash content exhibits variability in the CatBoost model comparable to that observed in LightGBM, with SHAP values spanning both positive and negative regions. The distribution suggests that CatBoost similarly captures the dual nature of fly ash contributions, where the pozzolanic benefits must be balanced against potential early-age strength dilution effects. Cement content shows predominantly positive influence with moderate magnitude, maintaining consistency with the LightGBM findings and confirming cement dosage as a reliable strength enhancement parameter across different model architectures.
Water content demonstrates clear negative correlation patterns in the CatBoost model, with high water dosages occupying predominantly negative SHAP value space. The distribution characteristics closely align with the w/b ratio findings, reflecting the interconnected nature of these mixture proportion parameters and their combined influence on concrete porosity and strength development. Ensemble boosting algorithms, including CatBoost, XGBoost, and LightGBM, have also been reported to deliver superior predictive accuracy across different concrete properties through effective analysis of feature interdependencies [88].
Fineness modulus, recycled aggregate water absorption, and maximum aggregate size exhibit importance rankings and distribution patterns in CatBoost similar to those in LightGBM, though with subtle differences in SHAP value spread and concentration. These similarities suggest that both models converge on comparable feature importance hierarchies despite their different tree-building strategies, with CatBoost’s ordered boosting and symmetric trees producing results that align well with LightGBM’s leaf-wise growth approach.
The lower-ranked features, including sand content, recycled aggregate content, silica fume, natural aggregate properties, recycled aggregate replacement ratio, ground granulated blast furnace slag, and cement type, maintain their limited influence in the CatBoost model. The consistency of these findings across both base learners enhances confidence that these parameters genuinely exert minimal direct impact on compressive strength within the studied dataset, though they may participate in higher-order interactions that the ensemble framework captures through the meta-learner integration.
The comparative analysis of LightGBM and CatBoost SHAP distributions reveals that while both models identify similar primary drivers of compressive strength, subtle differences in their feature importance quantification reflect their distinct algorithmic approaches. These complementary perspectives provide the foundation for the Stacking ensemble’s superior performance, as the Ridge regression meta-learner can leverage the unique insights from each base learner to achieve enhanced prediction accuracy and robustness.
6.3. Meta-Learner Base Model Contributions
Figure 10 quantifies the mean absolute SHAP values for base learner contributions in the Stacking ensemble meta-learner. The horizontal bar chart shows a mean absolute SHAP value of 7.06 MPa for LightGBM and 6.28 MPa for CatBoost. The modest difference of 0.78 MPa indicates that both models contribute substantially and comparably to the final ensemble prediction. Stacking ensemble models have been reported to reach higher R² values and lower RMSE than their base learners, confirming the effectiveness of ensemble learning in enhancing prediction accuracy through the synergistic combination of diverse model predictions [89].
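To make the quantity plotted in Figure 10 concrete, the following is a minimal sketch of how mean absolute SHAP values can be obtained for a linear meta-learner. It assumes the standard result that, for a linear model under the interventional (feature-independence) formulation, the SHAP value of input j is w_j (x_j − mean_j). The base-learner predictions and the coefficients below are hypothetical placeholders, not values from this study.

```python
from statistics import mean

def linear_shap(weights, rows):
    """SHAP values for f(x) = b + sum_j w_j * x_j under the
    feature-independence assumption: phi_j(x) = w_j * (x_j - mean_j)."""
    n_features = len(weights)
    col_means = [mean(row[j] for row in rows) for j in range(n_features)]
    return [
        [weights[j] * (row[j] - col_means[j]) for j in range(n_features)]
        for row in rows
    ]

def mean_abs_shap(shap_rows):
    """Global importance per input: average of |phi_j| over all samples."""
    n_features = len(shap_rows[0])
    return [mean(abs(r[j]) for r in shap_rows) for j in range(n_features)]

# Hypothetical base-learner predictions in MPa: [LightGBM, CatBoost].
base_preds = [[22.0, 23.5], [35.0, 33.0], [48.5, 47.0], [61.0, 63.5]]
# Hypothetical Ridge coefficients (not the fitted values of the paper).
meta_weights = [0.55, 0.45]

phi = linear_shap(meta_weights, base_preds)
importance = mean_abs_shap(phi)
print(dict(zip(["LightGBM", "CatBoost"], importance)))
```

Because each SHAP value scales with both the coefficient and the spread of the corresponding base-learner prediction, the resulting importances are expressed in the units of the target, here MPa.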
The slightly higher SHAP value for LightGBM suggests marginally greater influence on the ensemble output. This differential may reflect LightGBM’s leaf-wise tree growth strategy, which enables deeper, more specialized trees that capture certain nonlinear relationships with higher fidelity. The model’s gradient-based one-side sampling and exclusive feature bundling techniques contribute to identifying patterns that provide more informative predictions for the Ridge regression meta-learner.
Conversely, CatBoost’s symmetric tree structure and ordered boosting algorithm offer complementary strengths in handling categorical variables and reducing prediction shift. The comparable magnitude of CatBoost’s mean absolute SHAP value confirms that this model provides meaningful, non-redundant information that enhances ensemble predictive capability. Ridge regression was selected as a meta-learner due to its superior performance in stacking models, as it can effectively capture important information without overfitting when the number of base learners is appropriate, and its regularization mechanism helps manage correlation between base learner predictions [76].
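The role of regularization with correlated base-learner predictions can be illustrated with a small sketch of the Ridge closed form, (XᵀX + λI) w = Xᵀy, for two inputs. The data, the absence of an intercept, and the penalty values are simplifying assumptions made for illustration only.

```python
def ridge_two_inputs(rows, targets, lam):
    """Solve (X^T X + lam * I) w = X^T y for two inputs, no intercept,
    using Cramer's rule on the 2x2 normal equations."""
    s11 = sum(a * a for a, _ in rows) + lam
    s22 = sum(b * b for _, b in rows) + lam
    s12 = sum(a * b for a, b in rows)
    t1 = sum(a * y for (a, _), y in zip(rows, targets))
    t2 = sum(b * y for (_, b), y in zip(rows, targets))
    det = s11 * s22 - s12 * s12
    return [(t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]

# Hypothetical, highly correlated base-learner predictions (MPa).
base_preds = [(22.0, 23.5), (35.0, 33.0), (48.5, 47.0), (61.0, 63.5)]
# Synthetic targets built as an equal blend of the two columns.
targets = [0.5 * a + 0.5 * b for a, b in base_preds]

w_unregularized = ridge_two_inputs(base_preds, targets, lam=0.0)
w_regularized = ridge_two_inputs(base_preds, targets, lam=1000.0)
print(w_unregularized, w_regularized)
```

With λ = 0 the solve recovers the blend used to build the synthetic targets, while a positive λ shrinks the coefficient vector, which is the mechanism that keeps the weights stable when the two prediction columns are nearly collinear.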
The balanced contribution structure enhances ensemble robustness and generalization capability. When multiple base learners contribute comparably to final predictions, the ensemble becomes less vulnerable to the failure modes of any single model. If one base learner encounters input data outside its optimal operating range, the other can compensate, maintaining overall prediction reliability. This redundancy mechanism enhances the ensemble’s ability to generalize to new concrete mixtures that may differ from the training distribution, while preserving the essential relationships captured by both models.
The weighting learned by the meta-learner, quantified via mean absolute SHAP values of 7.06 MPa for LightGBM and 6.28 MPa for CatBoost, is physically interpretable in the context of RASCC behavior. LightGBM’s marginally higher weight aligns with the strong time-dependent strength evolution in the dataset (curing age is the top feature in both base models; Figure 8 and Figure 9). LightGBM’s leaf-wise growth strategy is particularly adept at capturing the nonlinear, threshold-like nature of cement hydration kinetics, a well-established physicochemical process in which strength gains accelerate rapidly in early curing stages and plateau at later stages [90,91]. This behavior is consistent with Powers’ hydration model and the S-curve strength development pattern widely reported in the concrete literature [92,93]. CatBoost’s slightly lower but comparable weight reflects its strength in handling categorical and bounded continuous variables such as cement type and recycled aggregate replacement ratio (RAreplace), parameters whose influence on strength is more discrete and less time dependent. Crucially, the near-parity of the two weights (7.06 vs. 6.28 MPa, a difference of ~11% relative to the LightGBM value) indicates that the meta-learner does not disproportionately favor one base model, suggesting that the ensemble has learned to exploit complementary physical mechanisms (hydration kinetics captured by LightGBM and aggregate–matrix interaction effects captured by CatBoost) rather than relying on a single modeling strategy. This balanced attribution is consistent with the known multi-mechanism nature of RASCC strength development, where both binder chemistry and aggregate physical properties play indispensable roles [94,95,96].
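The near-parity figures quoted above follow directly from the two reported mean absolute SHAP values. A short worked calculation, using only the 7.06 and 6.28 MPa values from the text (the variable names are illustrative):

```python
# Mean absolute SHAP values reported for the two base learners (MPa).
lightgbm_shap = 7.06
catboost_shap = 6.28

gap = lightgbm_shap - catboost_shap
rel_gap_vs_lightgbm = gap / lightgbm_shap   # gap relative to the larger value
rel_gap_vs_catboost = gap / catboost_shap   # gap relative to the smaller value
share_lightgbm = lightgbm_shap / (lightgbm_shap + catboost_shap)
share_catboost = 1.0 - share_lightgbm

print(f"gap = {gap:.2f} MPa")
print(f"relative gap = {rel_gap_vs_lightgbm:.1%} (vs. LightGBM), "
      f"{rel_gap_vs_catboost:.1%} (vs. CatBoost)")
print(f"attribution share = {share_lightgbm:.1%} LightGBM, "
      f"{share_catboost:.1%} CatBoost")
```

The relative gap depends on the reference value chosen, which is why the ~11% figure is stated relative to LightGBM; either way, the attribution split stays close to even.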
6.4. Integrated Ensemble Feature Attribution
Figure 11 presents the comprehensive SHAP beeswarm plot for the complete Stacking ensemble model, synthesizing feature importance analysis across the integrated framework. This visualization captures final feature attribution after the Ridge regression meta-learner has combined predictions from both base learners. The consistency between the integrated ensemble SHAP rankings and the individual base learner rankings (all identify age, recycled aggregate density (RA-midu), and w/b as top contributors) further supports the interpretation that the meta-learner’s learned weights reflect domain-informed feature prioritization rather than spurious statistical correlations.
Specimen age maintains its position as the most influential feature, with SHAP values spanning approximately −15 to +15 MPa. Using the SHAP algorithm, the impact of different input features on model output has been visualized, with negative SHAP values corresponding to a decreasing effect on model prediction, whereas positive SHAP values indicate an increasing effect [97]. The dense clustering of red points in the positive SHAP region and blue points in the negative region demonstrates a consistent, monotonic relationship between curing age and predicted strength. The broader SHAP value distribution compared to individual base learners reflects the ensemble’s enhanced sensitivity to age variations.
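The sign convention described above can be checked on a toy example. The sketch below computes exact Shapley values by enumerating feature orderings, using a single reference mixture as the baseline rather than a background dataset. The strength function, its coefficients, and both mixtures are hypothetical and are not the fitted ensemble.

```python
from itertools import permutations
from math import factorial

def exact_shapley(model, x, baseline):
    """Exact Shapley values: average marginal contribution of each
    feature over all orderings, switching baseline -> x one at a time."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_output = model(current)
        for j in order:
            current[j] = x[j]
            new_output = model(current)
            phi[j] += new_output - previous_output
            previous_output = new_output
    return [p / factorial(n) for p in phi]

def toy_strength(features):
    """Hypothetical strength model (MPa) with an age-density interaction."""
    age, density, wb = features
    return 0.3 * age + 15.0 * density - 40.0 * wb + 0.05 * age * density

baseline = [28.0, 2.3, 0.40]   # reference mixture: age (d), density, w/b
sample = [90.0, 2.5, 0.55]     # older specimen, denser aggregate, higher w/b

phi = exact_shapley(toy_strength, sample, baseline)
print(dict(zip(["age", "RA density", "w/b"], phi)))
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the higher w/b value receives a negative SHAP value while the greater age receives a positive one.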
Recycled aggregate density emerges as the second most critical parameter, exhibiting SHAP values ranging from approximately −10 to +10 MPa. The clear separation between high-density values contributing positively and low-density values contributing negatively demonstrates that the ensemble has successfully learned the fundamental relationship between aggregate quality and concrete performance. The ensemble’s SHAP distribution for this feature shows greater definition compared to individual base learners.
The water-to-binder ratio continues to demonstrate strong negative influence, with high w/b values predominantly associated with negative SHAP contributions. The SHAP analysis indicates that the developed ensemble models capture physically meaningful relationships between mixture design parameters and compressive strength; related work reports that ensemble learning models trained on comprehensive datasets improve predictions by 15–20% and reduce root mean squared error [98]. The concentration and consistency of this pattern reflect the robustness of the relationship, as both base learners independently identify the inverse correlation and the meta-learner reinforces this finding through its weight assignment.
Fly ash content maintains its complex, bidirectional influence pattern, with SHAP values distributed across both positive and negative regions. The ensemble captures a more nuanced representation of fly ash effects, reflecting the meta-learner’s ability to identify context-dependent relationships where fly ash contributions vary based on interactions with other mixture parameters such as cement content, water-to-binder ratio, and curing age. This sophisticated pattern recognition capability exemplifies the advantage of ensemble learning for modeling complex material behaviors.
Cement content shows predominantly positive influence with moderate SHAP value magnitudes, while water content displays clear negative correlation patterns. Fineness modulus, recycled aggregate water absorption, maximum aggregate size, and sand content exhibit moderate influence, with SHAP value distributions centered near zero but showing occasional significant deviations. These patterns suggest that while these parameters exert secondary effects on average, they can become critically important under specific mixture proportion combinations.
The lower-ranked features maintain limited direct influence in the ensemble model. However, their inclusion remains justified, as they may participate in higher-order interactions that contribute to the ensemble’s overall predictive accuracy. Precise predictions are crucial for enhancing structural reliability and optimizing resource usage in construction projects, and machine learning algorithms have been shown to capture the intricate interactions between input variables [86]. The interpretability provided by SHAP analysis enables practitioners to understand not only which features are important but also how their specific values influence predictions, thereby facilitating informed decision-making in mix design optimization for recycled aggregate self-compacting concrete applications.
7. Conclusions
This investigation successfully developed and validated advanced ensemble machine learning frameworks for predicting the compressive strength of recycled aggregate self-compacting concrete, addressing critical gaps in sustainable construction materials research. Through systematic analysis of 301 experimental observations encompassing diverse mixture compositions, curing ages, and recycled aggregate characteristics, this study establishes robust predictive capabilities while providing interpretable insights into the complex relationships governing concrete mechanical performance.
The comparative evaluation of four machine learning methodologies revealed that the Stacking ensemble approach, employing LightGBM and CatBoost as base learners with Ridge regression as meta-learner, achieved superior predictive accuracy with a coefficient of determination of 0.963, root mean squared error of 3.321 MPa, and mean absolute error of 2.506 MPa on the testing dataset. This performance represents a substantial advancement over conventional empirical prediction methods, which typically achieve coefficient of determination values below 0.30 for recycled concrete systems. The Light Gradient Boosting Machine demonstrated an optimal balance between predictive accuracy and computational efficiency, achieving a coefficient of determination of 0.961 with execution time of merely 0.124 s, suggesting its suitability for real-time applications in construction practice where rapid mix design optimization is required.
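For reference, the three reported metrics follow their standard definitions, sketched below in plain Python. The measured and predicted strengths are hypothetical values chosen only to exercise the formulas.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Coefficient of determination, RMSE, and MAE for paired values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Hypothetical measured and predicted compressive strengths (MPa).
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 41.0, 47.0]

metrics = regression_metrics(measured, predicted)
print(metrics)
```

RMSE and MAE are in the units of the target (MPa), whereas the coefficient of determination is dimensionless, so the three values are not directly comparable with one another.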
Comprehensive residual diagnostic analysis validated the statistical integrity of the developed models, confirming that prediction errors satisfy fundamental assumptions including approximate normality, homoscedasticity across the entire strength range, temporal independence, and random distribution without systematic bias. These characteristics demonstrate that the ensemble learning approach successfully captures the underlying physical relationships governing concrete strength development without overfitting to spurious patterns in the training data. The consistent performance across training and testing datasets, with minimal variance between evaluation metrics, provides strong evidence of robust generalization capability to new mixture compositions not encountered during model development.
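As one illustration of how such residual checks can be computed, the sketch below evaluates the mean residual (systematic bias) and the Durbin-Watson statistic (serial independence). This section does not state which specific tests were applied, so the choice of the Durbin-Watson statistic is an assumption, and the residuals are hypothetical.

```python
def mean_residual(residuals):
    """Average residual; values near zero indicate no systematic bias."""
    return sum(residuals) / len(residuals)

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals. Values near 2 suggest no
    first-order autocorrelation; values toward 0 or 4 suggest positive
    or negative autocorrelation, respectively."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(r * r for r in residuals)
    return numerator / denominator

# Hypothetical residuals (MPa), ordered as the specimens were evaluated.
residuals = [1.2, -0.8, 0.5, -1.5, 2.1, -0.3, -1.9, 0.9]

bias = mean_residual(residuals)
dw = durbin_watson(residuals)
print(f"mean residual = {bias:.3f} MPa, Durbin-Watson = {dw:.3f}")
```

Normality and homoscedasticity would need separate checks, for example a quantile-quantile plot and a plot of residuals against predicted strength.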
The application of SHAP-based interpretability analysis quantified the relative contributions of individual mixture parameters to compressive strength predictions, revealing a clear hierarchy of feature importance that aligns with established concrete technology principles. Specimen age emerged as the most influential parameter, exhibiting SHAP values ranging from approximately −15 to +15 MPa, reflecting the fundamental role of hydration kinetics in strength development. Recycled aggregate density demonstrated the second highest importance, with high-density aggregates consistently contributing positively to strength predictions through reduced porosity and enhanced interfacial bonding characteristics. The water-to-binder ratio exhibited strong negative influence, corroborating the well-established inverse relationship between water content and concrete strength through effects on capillary porosity and matrix density.
The SHAP analysis revealed complex, context-dependent contributions from supplementary cementitious materials, with fly ash displaying bidirectional influence patterns that reflect the competing effects of pozzolanic strength enhancement versus cement dilution and delayed early-age strength development. Recycled aggregate water absorption demonstrated notable negative influence when absorption values exceeded 4%, highlighting the critical importance of aggregate moisture management in mix design optimization. The consistency of feature importance rankings across different base learner architectures enhances confidence that these relationships represent genuine physical dependencies rather than model-specific artifacts.
The developed predictive framework enables several practical advances for sustainable construction applications. First, the models facilitate rapid optimization of recycled aggregate self-compacting concrete mixture proportions to achieve targeted strength requirements while maximizing recycled content, thereby supporting circular economy principles within the construction sector. Second, the interpretability provided by SHAP analysis allows practitioners to understand which mixture parameters require precise control and which parameters offer flexibility for cost optimization or material substitution. Third, the robust performance across strength ranges from 5.36 to 89 MPa demonstrates applicability to both conventional structural applications and high-performance concrete formulations, expanding the potential scope of recycled aggregate utilization.
Several limitations of this investigation warrant acknowledgment. The dataset, while encompassing substantial compositional diversity with 301 specimens spanning at least 10 experimental series, three cement strength grades, recycled aggregate replacement ratios from 0% to 100%, and multiple supplementary cementitious material combinations, derives from laboratory-scale specimens tested under standard curing conditions (approximately 20 ± 2 °C, ≥95% relative humidity). The current predictive framework addresses compressive strength as a function of mixture design parameters and curing age and does not incorporate durability-related phenomena such as carbonation depth progression, chloride ion ingress, or freeze–thaw degradation, which are governed by distinct environmental exposure variables beyond the scope of the present mixture-property modeling approach. Validation using field-scale concrete placements subjected to variable curing conditions, construction practices, and environmental exposures would enhance confidence in the models’ applicability to real-world construction scenarios. The temporal scope of strength predictions extends to 112 days, yet many infrastructure applications require service life predictions spanning decades. Future work should integrate long-term strength development models and dedicated durability datasets incorporating environmental exposure parameters to extend the predictive framework from mechanical strength estimation to comprehensive service life assessment, thereby bridging the gap between laboratory-calibrated models and field-level engineering applications.
This study’s dataset focuses on concrete compressive strength and does not systematically include fresh concrete workability indicators (such as slump or spread flow). However, the water-to-binder ratio (w/b), a core parameter influencing workability, is included among the 18 input variables. SHAP analysis (Section 6) reveals that w/b exerts a significant negative effect on strength (with SHAP values ranging from approximately −8 to +5 MPa), which aligns with existing literature regarding the dual impact of w/b on both the flowability and strength of self-compacting concrete. Future research should incorporate rheological parameters (such as plastic viscosity and yield stress) to construct a synergistic prediction framework for workability and strength.
Future research should explore several promising extensions of this work. The incorporation of additional input features characterizing recycled aggregate microstructure, such as residual mortar content, interfacial transition zone properties, and micro-crack distributions, may enable more refined predictions and deeper mechanistic understanding. The development of multi-objective optimization frameworks that simultaneously consider compressive strength, workability characteristics, environmental impact metrics, and economic costs would support holistic decision-making in sustainable concrete design. The application of physics-informed machine learning approaches that embed fundamental conservation laws and constitutive relationships within neural network architectures may improve extrapolation capabilities and reduce data requirements for model training.
The integration of uncertainty quantification methods, such as Bayesian neural networks or conformal prediction intervals, would provide practitioners with confidence bounds on strength predictions, enabling risk-informed decision-making in structural design. The extension of interpretability analysis to investigate higher-order feature interactions, beyond the individual feature attributions provided by SHAP, may reveal synergistic or antagonistic effects among mixture components that could inform novel mix design strategies. The development of transfer learning approaches that leverage knowledge gained from natural aggregate concrete datasets to improve predictions for recycled aggregate systems with limited experimental data represents another valuable research direction.
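As a hedged illustration of the conformal prediction idea mentioned above, the sketch below implements split conformal intervals around point predictions, using absolute residuals on a held-out calibration set. The calibration data, the miscoverage level, and the new prediction are all hypothetical, and this is not part of the study’s implemented framework.

```python
from math import ceil

def conformal_halfwidth(calib_true, calib_pred, alpha=0.1):
    """Split conformal half-width: the ceil((n + 1) * (1 - alpha))-th
    smallest absolute calibration residual. Returns infinity when the
    calibration set is too small for the requested coverage."""
    scores = sorted(abs(t - p) for t, p in zip(calib_true, calib_pred))
    n = len(scores)
    k = ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")
    return scores[k - 1]

def prediction_interval(point_prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return (point_prediction - halfwidth, point_prediction + halfwidth)

# Hypothetical calibration set: measured vs. predicted strength (MPa).
calib_true = [31.0, 44.5, 52.0, 27.5, 63.0, 38.0, 49.5, 57.0, 35.5]
calib_pred = [29.5, 46.0, 50.0, 30.0, 60.5, 38.5, 52.5, 56.0, 31.5]

halfwidth = conformal_halfwidth(calib_true, calib_pred, alpha=0.2)
low, high = prediction_interval(45.0, halfwidth)
print(f"80% interval for a 45.0 MPa prediction: [{low:.1f}, {high:.1f}] MPa")
```

Under the exchangeability assumption between calibration and new mixtures, intervals built this way cover the true strength with probability at least 1 − alpha on average. That assumption would not hold for mixtures that differ systematically from the calibration data.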
In conclusion, this investigation demonstrates that ensemble machine learning, combined with rigorous interpretability analysis, provides powerful tools for advancing sustainable construction materials research and practice. The best-performing models achieve coefficients of determination above 0.96 while maintaining transparency regarding the physical relationships underlying their predictions. By enabling confident utilization of recycled aggregate self-compacting concrete across diverse applications, this work supports the construction industry’s transition toward circular economy principles, contributing to reduced natural resource consumption, decreased construction waste generation, and diminished carbon dioxide emissions while maintaining the structural performance requirements essential for safe and durable infrastructure. The methodological framework established herein offers a template for addressing similar prediction challenges in other sustainable construction materials, accelerating the development and deployment of environmentally responsible building technologies.