1. Introduction
Coal remains a principal energy source worldwide, with a particularly strong reliance observed across the Asia-Pacific region, where its share has consistently remained elevated. Among the primary by-products of coal mining, coal gangue (CG) represents roughly 10% to 25% of annual coal production, resulting in accumulated stockpiles surpassing 4.5 billion tons, which constitute over 40% of China’s total industrial solid waste [1]. At present, open-air dumping remains the predominant disposal method, causing extensive land occupation and triggering secondary geological hazards, such as landslides and debris flows. Moreover, prolonged exposure leads to the leaching of heavy metals (e.g., Pb²⁺, Zn²⁺, and Cu²⁺) into the surrounding ecosystems, contaminating soil and water bodies. The spontaneous combustion of residual coal and pyrite further aggravates environmental risks by releasing harmful gases (SO₂, NOₓ, CO), intensifying regional air quality concerns [2].
Existing research on coal gangue concrete (CGC) has coalesced into three principal strands. First, controlled experiments establish a pronounced non-monotonic dependence of mechanical performance on the gangue-aggregate replacement ratio: moderate substitution (30% of the coarse aggregate) tends to maximize strength, whereas excessive incorporation induces marked reductions in both elastic modulus and compressive strength [3]. Second, mixture-design optimization, through proportioning strategies and targeted material modifications such as partial substitution of quartz powder with activated gangue or the inclusion of discrete fibers, has yielded improvements in strength, durability, and fatigue resistance [4,5]. Third, thermo-mechanical activation protocols demonstrate that calcination at approximately 800 °C followed by fine grinding produces pozzolanic gangue powders that promote secondary hydration, densify the cementitious matrix, and refine pore structure, thereby enhancing strength development in alkali-activated binders. Collectively, this literature delineates the coupled influences of replacement level, mix design, and activation treatment, while elucidating the microstructural mechanisms governing performance enhancement in CGC [6].
The lack of robust mechanical modeling remains a central impediment to the widespread structural application of CGC. Although numerous constitutive formulations have been advanced across disparate experimental regimes, including uniaxial compression, varying replacement ratios, and freeze–thaw cycles [5,7], persistent deficiencies in generalizability, adaptability, and physical transparency are evident. A critical appraisal isolates three limitations: (1) minimal adoption of machine learning: only a small subset of studies applies ML to CGC [8], and most rely on linear regression to calibrate isolated parameters [6,9], a paradigm unable to capture the nonlinear, hierarchical interactions of heterogeneous cementitious systems; (2) reliance on restrictive statistical premises (normality, homoscedasticity, and feature independence) that are misaligned with CGC’s variability and skewed distributions, degrading robustness across loading and curing regimes [9,10]; and (3) weak interpretability: models rarely identify dominant predictors or coupled mechanisms, limiting engineering deployment and mechanistic insight into CGC’s behavior [10,11].
In contrast, recent advances in machine learning, particularly in geotechnics, fracture mechanics, and structural health monitoring, have addressed the limitations of conventional constitutive models [12,13,14,15]. Building on this, we develop a Gradient Boosted Tree (GBT) framework for predicting CGC constitutive parameters. By iteratively fitting each new tree to the residuals of the current ensemble, the model captures nonlinear interactions without prespecified functional forms and operates effectively in high-dimensional, heterogeneous feature spaces. To reconcile accuracy with physical interpretability, we couple permutation-based global importance with Shapley additive explanations (SHAP) to quantify both aggregate influence and context-dependent effects. The framework yields three advances: (1) a unified, scalable pipeline linking feature attribution, multi-parameter prediction, stress–strain reconstruction, and finite element validation; (2) a data-driven yet interpretable alternative to regression formulations that identifies the critical mix and curing variables (gangue replacement ratio, water–binder ratio, and age) governing peak stress and fracture toughness; and (3) identification of a three-stage stiffness evolution (brittle, quasi-ductile, re-brittle) modulated by gangue content and consistent with micromechanical observations and numerical simulations. Collectively, these results establish a robust basis for embedding explainable AI into constitutive modeling and the engineering application of sustainable concrete.
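The residual-fitting idea and the permutation-based ranking described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes single-split decision stumps as weak learners, squared-error loss, and a small synthetic dataset whose three unnamed features are hypothetical stand-ins (two informative, one irrelevant). SHAP's local attributions are not reproduced here.

```python
import random

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds: every distinct value except the largest.
        for t in values[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]

def stump_predict(stump, row):
    j, t, ml, mr = stump
    return ml if row[j] <= t else mr

def fit_gbt(X, y, n_rounds=60, learning_rate=0.2):
    """Gradient boosting for squared error: each stump is fitted to current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, row) for p, row in zip(pred, X)]
    return base, stumps, learning_rate

def predict(model, row):
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)

def permutation_importance(model, X, y, rng, n_repeats=5):
    """Mean increase in MSE when one feature column is shuffled."""
    baseline = mse(model, X, y)
    scores = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += mse(model, shuffled, y) - baseline
        scores.append(total / n_repeats)
    return scores

# Synthetic toy data (an assumption for illustration, not the CGC dataset):
# the target depends nonlinearly on x0, linearly on x1, and not at all on x2.
rng = random.Random(0)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(60)]
y = [4 * (x0 - 0.5) ** 2 + 2 * x1 for x0, x1, _ in X]

model = fit_gbt(X, y)
mean_only_mse = sum((yi - sum(y) / len(y)) ** 2 for yi in y) / len(y)
train_mse = mse(model, X, y)
importance = permutation_importance(model, X, y, rng)
print("mean-only MSE:", mean_only_mse, "boosted MSE:", train_mse)
print("permutation importance per feature:", importance)
```

For brevity the importance is computed on the training data; in practice a held-out set gives a less optimistic ranking, and production libraries such as LightGBM, XGBoost, and CatBoost use deeper trees and regularization in place of the stumps assumed here.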
5. Conclusions and Future Directions
(1) The CPI-based heatmap analysis suggests that ensemble models perform best on coal gangue concrete data when trained with moderately low learning rates (around 0.07–0.13) and relatively shallow tree depths (no more than 6). This combination offers a good trade-off between capturing nonlinear patterns and avoiding overfitting, making it well-suited for modeling the complex and heterogeneous behavior of CGC materials.
(2) Among the evaluated models, LightGBM performs best in terms of accuracy, robustness, and error mitigation, particularly in predicting high-dimensional targets such as the elastic modulus and uniaxial compressive strength. CatBoost demonstrates superior adaptability in small-sample, discrete datasets, while XGBoost offers strong boundary detection and local fitting capabilities, making it well suited to scenarios with well-defined structural patterns.
(3) SHAP-based interpretability analysis elucidated a three-stage mechanical response trajectory—brittle, quasi-ductile, and re-brittle—governing the stiffness evolution of CGC with increasing gangue incorporation. This phase-structured framework, grounded in feature–response attribution patterns, was corroborated through micromechanical evidence and deformation morphologies, thereby affirming the model’s capacity to capture nonlinear degradation thresholds and interfacial instability regimes intrinsic to hybrid aggregate systems.
(4) Hybrid explainability analysis revealed compounded degradation effects arising from the coupled influence of coal gangue content with both crush index and water–cement ratio. The permutation-based global ranking established these variables as jointly dominant across all targets, while SHAP-based local visualizations elucidated nonlinear synergistic weakening mechanisms—wherein friable gangue particles and elevated moisture levels collectively destabilize the matrix–aggregate interface—thereby intensifying stiffness loss and precipitating compressive failure. These findings align with prior micromechanical observations linking interfacial debonding, pore coalescence, and moisture-induced damage to hybrid aggregate deterioration.
(5) Finite element validation, calibrated through fracture energy–based softening curves, independently substantiated the reliability of the SHAP-informed constitutive model in reproducing key structural responses across failure modes, load trajectories, and post-peak behavior. Notably, the model accurately captured the brittle → quasi-ductile → re-brittle evolution path of CGC, aligning simulation outcomes with SHAP-predicted transitions. The pronounced energy dissipation observed in the intermediate regime highlights the potential of coal gangue concrete as a ductile and damage-tolerant material for seismic applications. These findings establish a mechanistically interpretable and numerically verifiable framework for guiding the development of performance-based, data-driven structural materials.
Building on the generalizability of the proposed modeling framework, there is significant potential for cross-domain extension. Future research should focus on the following key directions:
(1) Enhancement of dataset scale and representativeness for advanced interaction modeling.
The present study, while based on a systematically curated dataset, remains constrained by the limited number and diversity of available samples. Future investigations should incorporate more extensive and heterogeneous datasets encompassing broader variations in material composition, curing regimes, and testing protocols. Such expansions would not only strengthen the statistical robustness of machine learning models but also enable a more rigorous identification of nonlinear feature interactions and latent coupling mechanisms that govern the mechanical behavior of coal gangue concrete.
(2) Integration of inter-material coupling mechanisms into numerical simulations.
Although the current finite element modeling framework successfully captures the macroscopic response of CGC components, it inherently simplifies the material system by neglecting the meso-structural interactions among distinct phases, such as aggregates, cement paste, and interfacial transition zones. To enhance the physical fidelity of simulation outcomes, future work should aim to incorporate multi-phase constitutive relationships or cohesive interface models that account for inter-material bonding, slip, and damage evolution. This would facilitate a more comprehensive representation of the hybrid nature of CGC and improve the generalizability of the simulation results for complex structural scenarios.
(3) Integration of advanced machine learning algorithms.
The accelerating convergence of artificial intelligence and materials science is ushering in a new era of modeling paradigms that transcend traditional input–output mappings. Future research may leverage architecture-aware frameworks—such as graph-based learning or physics-informed networks—to capture spatially distributed interactions, multiscale heterogeneity, and path-dependent responses under complex loading regimes. Such approaches hold promise for embedding structural priors and mechanistic constraints directly into learning processes, thereby enabling more robust and generalizable constitutive formulations for sustainable concrete systems.