Abstract
Compressive strength is a fundamental indicator of the mechanical properties of multi-component High-Performance Concrete (HPC). It is traditionally measured through laboratory tests, which are time-consuming and resource-intensive. This study therefore develops a machine learning-based regression framework to predict compressive strength, with the aim of reducing experimental cost and resource usage. Under three data preprocessing strategies (raw data, standard score (z-score), and Box–Cox transformation), a selected set of high-performance ensemble models achieves a coefficient of determination (R²) and an explained variance score (EVS) above 90% across all datasets, indicating high accuracy in compressive strength prediction. In particular, the stacking ensemble ( - , - ), XGBoost regression ( - , - ), and HistGradientBoosting regression ( - , - ) trained on Box–Cox-transformed data show strong generalization capability and stability. Tree-based and boosting methods are also highly effective at capturing complex feature interactions. In addition, this study presents an analytical workflow that improves feature interpretability through visualization techniques, including Partial Dependence Plots (PDP), Individual Conditional Expectation (ICE), and SHapley Additive exPlanations (SHAP). These methods clarify the contribution of each feature and quantify the direction and magnitude of its effect on predictions. Overall, the approach supports automated concrete quality control, optimized mixture proportioning, and more sustainable construction practices.