Explainable Machine Learning for Multicomponent Concrete: Predictive Modeling and Feature Interaction Insights
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Pretreatment
2.1.1. Test Protocol and Standard
2.1.2. Data Description
2.1.3. Correlation Analysis
2.2. Modeling
2.2.1. Multicollinearity
2.2.2. Linear and Polynomial Regression
2.2.3. Tree-Based Modeling
- (1) Decision tree regressor
- (2) Random forest regressor
- (3) Extra trees regressor
- (4) Adaptive boosting regressor (AdaBoost)
- (5) CatBoost regressor
- (6) XGBoost regressor
2.2.4. TabPFN
2.3. Interpretation
2.4. Evaluation Indicators
3. Results and Discussion
3.1. Prediction of the ML Models
3.1.1. Linear and Polynomial Prediction
3.1.2. Tree-Based Model Prediction
3.1.3. TabPFN Prediction
3.2. Performance of the ML Models
3.2.1. Linear and Polynomial Fit
3.2.2. Tree-Based Model Fit
3.2.3. TabPFN Model Fit
3.3. Feature Analysis
3.3.1. Feature Impact on the Concrete Strength
3.3.2. Dependence and Interpretation
4. Conclusions and Perspective
- (1) Machine learning models make concrete strength prediction markedly more efficient, reducing experimental cost and time. Among them, tree-based models generalize well and predict accurately, with XGBoost achieving the best overall performance across datasets (test set R² = 0.91, MSE = 6.75, RMSE = 2.60, MAE = 1.91), demonstrating high robustness and practical reliability.
- (2) Traditional linear and polynomial regression methods are limited in performance. Regularization techniques such as ridge, Lasso, and elastic net do not guarantee improvement: model accuracy remains highly sensitive to the regularization parameters and, without careful tuning, generally falls short of the tree-based models.
- (3) Like any empirical method, TabPFN has its limitations. It performs best in small-scale, high-precision prediction tasks: when the dataset is divided by curing age and normalized, TabPFN achieves its highest accuracy on the 28-day strength test set. When it is applied to the full dataset, however, discrepancies between the actual and predicted results become evident.
- (4) Slag, age, and cement are identified as the most influential positive features in the development of concrete strength, whereas higher contents of sand and aggregates have a negative impact. Notably, feature interaction analysis reveals a strong positive synergy between slag and cement (+0.12), indicating that their combined effect on strength exceeds the sum of their individual contributions.
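The notion of a positive interaction in conclusion (4) can be made concrete with a toy calculation. The sketch below is not the study's SHAP pipeline: `toy_strength` and its coefficients are hypothetical, and the features are reduced to "absent" (0) or "present" (1). It only illustrates how a Shapley interaction value separates a synergy term from the main effects in a two-feature model.

```python
def toy_strength(slag_on, cement_on):
    """Hypothetical response (arbitrary units) with an explicit synergy term."""
    return 3.0 * slag_on + 2.0 * cement_on + 1.5 * slag_on * cement_on


def shapley_slag(f):
    """Shapley value of slag: its marginal gain averaged over both orderings."""
    gain_first = f(1, 0) - f(0, 0)   # slag added before cement
    gain_second = f(1, 1) - f(0, 1)  # slag added after cement
    return 0.5 * (gain_first + gain_second)


def pairwise_interaction(f):
    """Shapley interaction value for the (slag, cement) pair.

    With two features the empty coalition is the only term and carries
    weight 1/2, so each ordered pair receives half of the joint surplus.
    """
    surplus = f(1, 1) - f(1, 0) - f(0, 1) + f(0, 0)
    return 0.5 * surplus


interaction = pairwise_interaction(toy_strength)        # 0.75 per ordered pair
slag_total = shapley_slag(toy_strength)                 # 3.75
slag_main = slag_total - interaction                    # 3.0, the pure main effect
print(interaction, slag_total, slag_main)
```

A positive interaction value means the two features together add more than the sum of their separate gains; a value of zero would mean the contributions are purely additive.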
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
TabPFN | Tabular Prior-data Fitted Network |
SHAP | Shapley Additive exPlanations |
SD | Standard deviation |
VIF | Variance inflation factor |
OLS | Ordinary least squares |
MSE | Mean squared error |
MAE | Mean absolute error |
RMSE | Root mean squared error |
R² | Coefficient of determination |
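For readers who want the four evaluation indicators spelled out, the following is a minimal sketch using their standard textbook definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical measured and predicted strengths in MPa.
measured = [30.0, 40.0, 50.0]
predicted = [32.0, 38.0, 50.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

MSE and RMSE penalize large errors more heavily than MAE, which is why all three are usually reported alongside R².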
Appendix A
Appendix A.1
Variable | VIF Value |
---|---|
cement | 3.33 |
slag | 3.53 |
fa | 1.99 |
sand1 | 10.25 |
sand2 | 4.52 |
gravel | 7.36 |
water | 2.01 |
aggregatewater | 3.82 |
additiveratio | 1.38 |
adensity | 11.01 |
density | 2.84 |
SC_ratio | 3.46 |
SC_density | 7.88 |
SC_gravelratio | 3.55 |
SC_soakratio | 4.82 |
GC_density | 35.73 |
GC_porosity | 22.59 |
GC_soakratio | 1.56 |
GC_adensity | 15.64 |
age | 1.72 |
accelerate | 1.71 |
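The VIF values above can be read through the standard identity VIF = 1 / (1 − R²), where R² comes from regressing one feature on all the others. The sketch below inverts that identity for a subset of the tabulated values. The threshold of 10 is a common rule of thumb assumed here for illustration; it is not a cutoff stated in this section.

```python
# VIF values copied from a subset of the rows in Appendix A.1.
vif = {
    "cement": 3.33,
    "sand1": 10.25,
    "adensity": 11.01,
    "GC_density": 35.73,
    "GC_porosity": 22.59,
    "age": 1.72,
}

VIF_THRESHOLD = 10.0  # assumed rule of thumb, not a value given in the source


def auxiliary_r2(vif_value):
    """Invert VIF = 1 / (1 - R²) to get the R² of the auxiliary regression."""
    return 1.0 - 1.0 / vif_value


# Features whose VIF exceeds the assumed threshold.
flagged = {name for name, value in vif.items() if value > VIF_THRESHOLD}

for name, value in vif.items():
    marker = "high" if name in flagged else "ok"
    print(f"{name:12s} VIF={value:6.2f}  implied R²={auxiliary_r2(value):.3f}  {marker}")
```

Under this reading, a VIF of 35.73 for GC_density implies that about 97% of its variance is explained by the other features, whereas a VIF of 1.72 for age implies about 42%.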
Appendix A.2
Algorithm | Training MSE | Training RMSE | Training MAE | Training R² | Testing MSE | Testing RMSE | Testing MAE | Testing R² |
---|---|---|---|---|---|---|---|---|
Decision trees | 9.5776 | 3.0948 | 2.2985 | 0.8679 | 15.9180 | 3.9897 | 3.0279 | 0.7871 |
Random forests | 1.7193 | 1.3112 | 0.9927 | 0.9763 | 9.9388 | 3.1526 | 2.4360 | 0.8671 |
Extra trees | 0.0049 | 0.0699 | 0.0019 | 0.9999 | 9.3698 | 3.0610 | 2.3280 | 0.8747 |
AdaBoost | 0.1427 | 0.3778 | 0.1141 | 0.9980 | 9.8365 | 3.1363 | 2.3697 | 0.8685 |
CatBoost | 2.5916 | 1.6099 | 1.2018 | 0.9643 | 6.8385 | 2.6151 | 1.9289 | 0.9086 |
XGBoost | 1.5776 | 1.2560 | 0.9289 | 0.9782 | 6.7487 | 2.5978 | 1.9131 | 0.9098 |
TabPFN | 3.5752 | 1.8908 | 1.4210 | 0.9410 | 11.9858 | 3.4621 | 2.5662 | 0.8216 |
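Two quick checks can be run on the table above: whether each reported RMSE equals the square root of the reported MSE, and how far R² drops from training to testing. The sketch below copies the tabulated values. The "R² gap" is an informal overfitting indicator introduced here for illustration, not a metric used by the authors.

```python
import math

# (train MSE, train RMSE, train R², test MSE, test RMSE, test R²) from Appendix A.2
table = {
    "Decision trees": (9.5776, 3.0948, 0.8679, 15.9180, 3.9897, 0.7871),
    "Random forests": (1.7193, 1.3112, 0.9763, 9.9388, 3.1526, 0.8671),
    "Extra trees": (0.0049, 0.0699, 0.9999, 9.3698, 3.0610, 0.8747),
    "AdaBoost": (0.1427, 0.3778, 0.9980, 9.8365, 3.1363, 0.8685),
    "CatBoost": (2.5916, 1.6099, 0.9643, 6.8385, 2.6151, 0.9086),
    "XGBoost": (1.5776, 1.2560, 0.9782, 6.7487, 2.5978, 0.9098),
    "TabPFN": (3.5752, 1.8908, 0.9410, 11.9858, 3.4621, 0.8216),
}

# Check 1: RMSE should match sqrt(MSE) up to rounding of the printed digits.
rmse_consistent = all(
    abs(math.sqrt(tr_mse) - tr_rmse) < 1e-3 and abs(math.sqrt(te_mse) - te_rmse) < 1e-3
    for tr_mse, tr_rmse, _, te_mse, te_rmse, _ in table.values()
)

# Check 2: drop in R² from training to testing for each model.
r2_gap = {name: row[2] - row[5] for name, row in table.items()}

best_test = max(table, key=lambda name: table[name][5])
largest_gap = max(r2_gap, key=r2_gap.get)
smallest_gap = min(r2_gap, key=r2_gap.get)
print(rmse_consistent, best_test, largest_gap, smallest_gap)
```

On these numbers the reported RMSE and MSE columns agree, XGBoost has the highest test R², and the models with near-perfect training fits show the largest train-to-test drop.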
Appendix B
Appendix B.1
Appendix B.2
Appendix B.3
Appendix B.4
Parameter | Abbreviation | Unit | 25th percentile | Median | 75th percentile | SD |
---|---|---|---|---|---|---|
Cement content | cement | kg/m³ | 210.00 | 234.00 | 268.00 | 52.88 |
Slag content | slag | kg/m³ | 105.00 | 148.40 | 178.00 | 50.16 |
Fly ash content | fa | kg/m³ | 35.00 | 60.00 | 70.00 | 24.16 |
Fine sand content | sand1 | kg/m³ | 244.00 | 315.00 | 488.97 | 144.54 |
Coarse sand content | sand2 | kg/m³ | 358.00 | 420.60 | 521.25 | 122.31 |
Stone aggregate content | gravel | kg/m³ | 718.00 | 785.00 | 859.85 | 110.40 |
Water content | water | kg/m³ | 96.00 | 103.00 | 107.00 | 11.92 |
Aggregate water consumption | aggregatewater | kg/m³ | 65.02 | 74.00 | 86.00 | 18.68 |
Proportion of additives | additiveratio | kg/m³ | 0.01 | 0.01 | 0.02 | 0.01 |
Apparent density | adensity | kg/m³ | 2175.00 | 2219.00 | 2255.00 | 72.01 |
Actual density | density | kg/m³ | 2125.00 | 2170.00 | 2208.00 | 66.64 |
Proportion of silt | SC_ratio | - | 0.30 | 0.40 | 0.60 | 0.15 |
Actual density of sand | SC_ddensity | kg/m³ | 1719.00 | 1816.00 | 1894.00 | 133.58 |
Stone content of sand | SC_gravelratio | - | 0.00 | 0.01 | 0.05 | 0.04 |
Water absorption rate of sand | SC_soakratio | - | 0.05 | 0.06 | 0.08 | 0.02 |
Actual density of stone | GC_ddensity | kg/m³ | 1435.00 | 1475.00 | 1538.00 | 80.73 |
Porosity of stone | GC_porosity | - | 0.41 | 0.43 | 0.44 | 0.02 |
Water absorption rate of stone | GC_soakratio | - | 0.02 | 0.02 | 0.03 | 0.01 |
Apparent density of stone | GC_adensity | kg/m³ | 2528.00 | 2603.00 | 2652.32 | 101.53 |
Curing age | age | h | 48.00 | 336.00 | 672.00 | 231.66 |
Accelerating curing (label) | accelerate | - | 0/1 | 0/1 | 0/1 | 0.44 |
Strength value | strength_value | MPa | 23.90 | 29.50 | 35.50 | 8.55 |
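Curing age is tabulated in hours, while the conclusions refer to the 28-day strength set. The short sketch below converts the three age quartiles from the table to days and shows one way records could be grouped by curing age. The `records` list and the `split_by_age` helper are hypothetical and only illustrate the idea of dividing a dataset by age.

```python
from collections import defaultdict

HOURS_PER_DAY = 24

# Quartiles of the "age" row in the table above, in hours.
age_quartiles_h = {"lower quartile": 48.0, "median": 336.0, "upper quartile": 672.0}
age_quartiles_d = {k: h / HOURS_PER_DAY for k, h in age_quartiles_h.items()}
print(age_quartiles_d)  # the three quartiles expressed in days


def split_by_age(rows):
    """Group (age_in_hours, strength) rows by curing age in whole days."""
    groups = defaultdict(list)
    for age_h, strength in rows:
        groups[int(age_h // HOURS_PER_DAY)].append(strength)
    return dict(groups)


# Hypothetical (age in hours, strength in MPa) records, not from the dataset.
records = [(48, 21.5), (336, 29.0), (672, 35.2), (672, 36.1)]
by_age = split_by_age(records)
print(by_age)
```

The upper quartile of 672 h corresponds to the familiar 28-day curing age, the median to 14 days, and the lower quartile to 2 days.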
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, J.; Deng, J.; Li, S.; Du, W.; Zhang, Z.; Liu, X. Explainable Machine Learning for Multicomponent Concrete: Predictive Modeling and Feature Interaction Insights. Materials 2025, 18, 4456. https://doi.org/10.3390/ma18194456