Research on Concrete Compressive Strength Prediction Based on DE-Optimized LSSVM and Multi-Level Heterogeneous Ensemble Residual Fusion

Shi, Junfeng; Wang, Yifei; Wang, Xiongyu

doi:10.3390/eng7050250


by Junfeng Shi ¹, Yifei Wang ¹,* and Xiongyu Wang ²

¹ Department of Civil Engineering, School of Civil Engineering, Architecture and Environment, Hubei University of Technology, 28 Nanli Road, Shizishan Subdistrict, Wuhan 430068, China

² Research on Bridge Engineering Construction China Railway Major Bridge Science Research Institute Co., Ltd., 103 Jianshe Avenue, Qiaokou District, Wuhan 430034, China

* Author to whom correspondence should be addressed.

Eng 2026, 7(5), 250; https://doi.org/10.3390/eng7050250

Submission received: 16 April 2026 / Revised: 10 May 2026 / Accepted: 11 May 2026 / Published: 19 May 2026

(This article belongs to the Section Chemical, Civil and Environmental Engineering)


Abstract

Concrete compressive strength is critical to structural safety, durability, and material cost. Conventional machine learning models are often limited in their ability to capture complex nonlinear dependencies and to generalize. To address this, a residual fusion framework is proposed that combines a least squares support vector machine (LSSVM) optimized by differential evolution (DE) with a multi-level residual structure built from bagged decision trees (TreeBagger) and least squares boosting (LSBoost). The LSSVM hyperparameters are first tuned by DE; a multi-level residual scheme then compensates errors layer by layer, with LSBoost performing adaptive nonlinear fusion. Experiments with varied data splits, ablation, and multiple random seeds show that the model outperforms traditional single and ensemble methods in accuracy, generalization, and stability. The ablation results attribute the improvements to complementary residual mechanisms and the fusion architecture, rather than to the mere addition of learners. Across multiple runs, an average coefficient of determination (R²) of 0.9490, a root mean square error (RMSE) of 3.7873 MPa, a mean absolute error (MAE) of 2.4998 MPa, and an R² standard deviation of 0.0029 were obtained, confirming stability. Shapley additive explanations (SHAP) analysis further reveals that age and water–cement parameters dominate, with patterns consistent with hydration and water–binder theory. The proposed framework thus offers high accuracy, physical interpretability, and engineering applicability.

Keywords: ablation experiments; concrete compressive strength prediction; DE; LSSVM; heterogeneous ensemble residual fusion; SHAP interpretability analysis

1. Introduction

1.1. Research Background

Concrete, the most widely used construction material globally, serves as the primary structural element in projects ranging from super-high-rise buildings to large-scale infrastructure. Its compressive strength is regarded as a fundamental indicator of structural load-bearing capacity and safety, as well as a critical reference for quality control throughout the project lifecycle and for disaster prevention. Accurate prediction of this strength, however, is significantly challenged by complex environmental conditions, heterogeneous material compositions, and the limited reliability of conventional predictive models. During construction, critical decisions such as the timing of formwork removal and the application of prestressing are typically informed by test results from specimens cured for 7 or 28 days [1]. As a result, construction schedules are delayed, early detection of strength deficiencies is hindered, and the risks of quality defects and cost overruns are increased. Traditional destructive testing can compromise structural integrity, incur additional repair costs, and require lengthy curing and testing, making it unsuitable for real-time assessment. Moreover, its limited adaptability to emerging materials and inability to capture coupled multi-factor effects often lead to predictions that deviate from actual performance. Consequently, traditional empirical models and standalone machine learning approaches are increasingly unable to meet the demands of high-performance and sustainable concrete. An urgent need is thus highlighted for next-generation predictive methods that integrate explainable artificial intelligence (XAI) [2], multi-algorithm collaboration, and physically informed constraints. 
Against the backdrop of the dual-carbon strategy and ongoing advances in smart construction [3], a scientific, precise, and reliable framework for predicting concrete compressive strength is therefore of considerable practical importance for enhancing structural safety, reliability, and durability.

1.2. Literature Review

The compressive strength of concrete is governed by multiple interacting factors, giving rise to strong nonlinearity and complex coupling effects. Accurate strength prediction has therefore become a central research focus. Rapid advances in artificial intelligence and computational power have accelerated progress; for example, the performance of adaptive neuro-fuzzy inference systems (ANFISs), support vector regression (SVR), and random forests (RFs) has been compared for concrete compressive strength prediction [4]. The continuous development of novel algorithms provides a solid foundation for increasingly sophisticated and accurate predictive models.

1.2.1. Research on Predicting Concrete Compressive Strength

Predicting concrete compressive strength accurately has remained a persistent challenge. Early work was dominated by statistical methods, with linear regression widely employed. However, concrete strength is governed by multiple interacting variables and a hardening mechanism that resists simple characterization. As a result, multi-factor coupling effects and the behavior of emerging materials are poorly captured by linear models. Predictive accuracy and generalization capability are consequently limited, restricting applicability to modern performance-based mix design and the stringent demands of construction quality control.

In early studies on high-performance concrete (HPC) strength prediction, Ji et al. [5] developed a back propagation (BP) neural network to examine the effects of key factors such as the water–cement ratio and curing age. As the model was applied more widely, several limitations were identified, including slow convergence, sensitivity to network architecture and hyperparameter settings, and susceptibility to local minima. To address these issues, improved variants were later proposed. For example, Wu et al. [6] employed an elastic BP algorithm to better capture the nonlinear effects of the water-to-binder ratio and ceramsite strength in lightweight aggregate concrete. Although performance gains were achieved for specific materials, a single-model framework was still relied upon, and the inherent training instability of BP was not fully overcome. Convergence instability caused by the random initialization of network weights and thresholds has motivated architectural refinements. He [7] developed a hybrid particle swarm optimization-back propagation (PSO-BP) neural network model by integrating the PSO algorithm. Prediction accuracy was improved by 8.26% and 2.05% over conventional BP and genetic algorithm-back propagation (GA-BP) models, respectively. The effectiveness of combining optimization algorithms with BP for strength prediction was thus demonstrated. With the diversification of machine learning methods, research focus has shifted from single-model improvement toward multi-algorithm fusion, generalization enhancement, and model interpretability. Cui et al. [8] constructed an HPC strength prediction model using the RF algorithm and identified cement content as the most influential parameter. Their model achieved a coefficient of determination (R²) of 0.902, outperformed several benchmark models, and served as a reliable tool for mix design optimization. To further improve predictive performance, Liu et al. 
[9] developed a hybrid RF-LSSVM model for the chloride ion permeability resistance of HPC, for which a root mean square error (RMSE) of 0.0491 and an R² of 0.941 were obtained. Although this hybrid paradigm was developed for a durability indicator, it provides direct methodological support for the ensemble framework proposed for compressive strength. To overcome the tendency of conventional sparrow search algorithms (SSAs) to converge on local optima, Wang et al. [10] incorporated a logistic chaotic map and a nonlinear decreasing inertia weight into the algorithm. These modifications were applied to optimize the initial weights and thresholds of a BP neural network, yielding an RF-logistic chaos sparrow search algorithm (LCSSA)-BP model. This finding further confirms the applicability of improved metaheuristics in HPC strength prediction and provides methodological support for the hybrid differential evolution-least squares support vector machine (DE-LSSVM) strategy adopted in this study. Tran et al. [11] systematically compared multiple machine learning models for predicting the compressive strength of recycled aggregate concrete, including standalone gradient boosting (GB), extreme gradient boosting (XGBoost), SVR, and PSO-hybridized versions of each. The GB-PSO hybrid model was found to perform best, yielding an R² of 0.9356. Although focused on recycled aggregate concrete, this systematic validation of hybrid model superiority applies broadly to the general paradigm of concrete compressive strength prediction. To address interpretability directly, Liu [12] introduced an explainable boosting machine (EBM), which enables quantitative analysis of individual material parameter effects while maintaining high predictive accuracy (R² = 0.93, RMSE = 4.33, mean absolute error (MAE) = 3.10), and thereby offers practical guidance for mix design. 
These developments reflect a growing emphasis on balancing algorithmic sophistication with engineering applicability and computational efficiency. The focus has progressively shifted from excessive model complexity toward lightweight architectures, refined optimization strategies, and solutions of greater direct engineering relevance. As a result, the integration of machine learning with intelligent optimization algorithms has become a pivotal research direction. In this context, Shaaban et al. [13] compared the predictive performance of XGBoost and RF models on a Python-based platform using seven input variables, including cement content and curing age. The results showed that XGBoost delivered high accuracy while substantially reducing experimental cost and testing time. Cao [14] subsequently extended this approach by applying Bayesian optimization to a categorical boosting (CatBoost) model for concrete compressive strength prediction. Hyperparameter tuning through Bayesian optimization further improved performance, with an R² of 0.94 achieved. These findings confirm that the optimizer can be flexibly adapted to the base learner without sacrificing predictive accuracy. DE is known for its simplicity, fast convergence, and strong global search capability. Das [15] noted that classical DE is constrained by fixed control parameters, whereas adaptive and randomized strategies can greatly enhance population diversity and suppress premature convergence. This insight justifies the stochastically parameterized DE variant adopted in the present study. Wongsa et al. [16] improved DE’s search capability through adaptive switching of mutation strategies and crossover rate ranges; however, additional mechanisms, preset switching rules, and predefined parameter ranges were introduced. 
Metaheuristic hyperparameter optimization has greatly improved prediction accuracy, yet well-tuned single models remain inherently limited in capturing complex nonlinear mappings, and useful structural information is often left underexploited in the residuals. This has motivated a shift toward residual correction and multi-model fusion strategies.

1.2.2. Research on the Superiority of Residual Fusion Prediction Models

Residual fusion, rooted in the residual learning theory of deep learning, employs a dual-path “base prediction–residual correction” mechanism to model complex nonlinear relationships hierarchically. The predictive task is decomposed into primary modeling followed by error refinement, thereby enhancing overall performance. Residual fusion models remain relatively under-explored in concrete strength prediction. Hu et al. [17] proposed a stacking ensemble that fuses multiple base learners at the prediction level via a meta-learner for HPC compressive strength prediction. From an error-compensation perspective, the meta-learner implicitly performs residual learning by capturing systematic deviations among base predictors, though the residual structure is not explicitly modeled. In contrast, the residual correction process is made explicit and multi-level in this work: a hierarchical framework is constructed that progressively models and compensates for the primary predictor’s residuals, rather than reconciling all base predictions in a single step. Similarly, Rahchamani et al. [18] combined multiple SVR models within a fusion-learning-based optimizer, confirming that predictive uncertainty is reduced when heterogeneous outputs are synthesized. In both approaches, however, integration is performed at the level of final predictions rather than on prediction residuals. Li et al. [19] employed an improved gray wolf optimizer to tune multiple base learners and used a light gradient boosting machine (LightGBM) as a meta-learner in a stacking ensemble framework for predicting the compressive strength of hydraulic concrete. The hybrid ensemble strategy was shown to substantially outperform individual models, achieving an R² of 0.933. Wu et al. 
[20] developed a residual-corrected stacking ensemble model, in which a multilayer perceptron (MLP) network is employed to learn and compensate for the residuals of the primary predictor; important validation of the residual learning approach in this domain is thereby provided. Santos et al. [21] proposed a hybrid system that employs an ensemble of machine learning models, rather than a single model, to forecast the residuals of a linear predictor, demonstrating that generalization is improved and complex nonlinear error patterns are captured more effectively through ensemble-based residual modeling. Although developed for time series forecasting, this validated principle offers direct methodological motivation for the present study. The hybrid paradigm of base prediction and residual correction has been validated in other complex forecasting domains. For instance, Xu et al. [22] proposed a hybrid flood forecasting model that couples a process-driven hydrological model (HM) with data-driven models (DDMs). The DDM is employed as a post-processing procedure for residual correction of the HM’s original results, yielding a marked improvement in streamflow prediction accuracy. Although hybrid models often achieve high accuracy, their black-box nature limits the trust placed in them. Ablation studies have been increasingly used to validate mechanisms by analyzing model structures and component contributions, reflecting a dual focus on interpretability and performance. However, existing approaches typically remove only single components and rely on a narrow set of evaluation metrics, often producing biased results with insufficient interpretability.

1.2.3. Research on SHAP Analysis

Hybrid machine learning models achieve high predictive accuracy, yet their black-box nature limits engineering reliability. Recently, Shapley additive explanations (SHAP) have been applied to material property analysis with promising results. Abioye et al. [23] compared eight optimized machine learning algorithms for predicting the compressive strength of HPC; the best performance was attained by XGBoost and gradient boosting regressor (GBR) (R² = 0.935 and 0.921, respectively). Through SHAP analysis, cement content and curing age were identified as the most influential factors, illustrating an effective integration of optimized models and SHAP for accurate, interpretable prediction. Ghrici et al. [24] compared five Bayesian-optimized tree-based regression models for HPC strength prediction; the highest accuracy was achieved by CatBoost (R² = 0.95, RMSE = 4.05 MPa) on the UCI dataset. Through SHAP analysis, curing age and water-to-binder ratio were identified as the two dominant factors, offering an interpretable framework for data-driven mix design. Liu et al. [25] compared an XGBoost model with the empirical ACI 209 formula, and its decision mechanism was analyzed via SHAP. A unified framework for prediction, validation, and interpretation was thereby established, providing a benchmark for interpretability research in this domain. A similar framework integrating ensemble learning and SHAP was employed by Lin et al. [26] to predict the compressive strength of bagasse ash concrete, and the generalizability of this interpretability strategy across different concrete materials was further validated. An ensemble learning model for HPC compressive strength was developed by Taiwo et al. [27], achieving an R² of 0.943, and the SHAP framework was then applied to quantify the contributions of mix design factors, converting the model from a black box into an interpretable tool. For manufactured sand concrete, a high-precision predictive model was built by Li et al. 
[28] from large-scale datasets. Key factors such as curing age were accurately identified through SHAP analysis, and the underlying theoretical model was refined. A systematic framework progressing from data modeling to mechanism revelation and model optimization was thus established, providing a reliable basis for mix design. An interpretable machine learning model for recycled aggregate self-compacting concrete was subsequently developed by Miao et al. [29], where SHAP was used to quantify mix component contributions and elucidate their mechanisms, thereby linking material proportions with performance and supporting sustainable design. The combination of an explainable boosting machine (EBM) with Bayesian optimization was adopted by Liu [12]; an R² of approximately 0.93 was obtained, and both global and local explanations of mix parameter effects were provided. In general, SHAP is commonly used to quantify feature contributions and reduce the black-box nature of complex models. However, its efficiency and stability under large-scale, high-dimensional conditions still need to be improved. Future work should couple SHAP with physical models, shifting from explainable prediction toward explainable decision-making.

Existing prediction strategies can be broadly characterized as single-layer optimization or single-stage fusion: metaheuristic algorithms are employed to tune the hyperparameters of a single model, or fixed weights are applied to combine outputs from multiple base learners. Although performance gains are achieved through metaheuristic optimization, these approaches remain confined to a single-model framework. Hybrid optimization-ensemble models improve generalization by redistributing or weighting existing predictions; their ensemble process, however, is typically single-layered, and the combination is performed directly in the raw feature or prediction space without deeper modeling of error information. In this sense, existing hybrid ensembles essentially perform arithmetic on prediction outputs. Residual correction studies generally follow a fixed-residual strategy, in which one or two learners are pre-specified to capture residuals, and a predetermined fusion rule, such as simple averaging or weighted summation, is applied for correction. Although some improvement in accuracy is observed, the effect of residual ensemble depth has not been systematically examined, and sufficient adaptivity is lacking at the fusion stage. In the present work, diagnosis and repair are carried out in the residual space, and adaptive integration is accomplished through meta-learning. Overall, existing approaches have been largely confined to single-layer modeling or single-path fusion, and a multi-path, structured, and adaptive learning mechanism for prediction errors has yet to be established.

1.3. Framework and Contributions of This Study

A layered hybrid prediction framework integrating optimization, residual learning, fusion, and interpretation is proposed. First, the hyperparameters of the LSSVM primary predictor are tuned exclusively via DE, yielding a high-quality base prediction. This optimization reduces random errors and concentrates the residuals on systematic deviations and complex nonlinear structures, thus providing a more interpretable and modelable input for the subsequent multi-path residual learning. The residuals from the primary predictor are then treated as a unified learning target, and two heterogeneous ensemble learners, bagged decision trees (TreeBagger) and least squares boosting (LSBoost), are applied in parallel to capture distinct error patterns within the same residual space. A nonlinear fusion mapping is subsequently learned by an LSBoost meta-learner to adaptively combine the multiple residual corrections. In the experimental evaluation, training-set ratios ranging from 75% to 90% are systematically compared to determine the optimal data partition. Ablation experiments are conducted to validate the architectural design: by progressively removing or replacing critical components, the contributions of the LSSVM primary predictor, the TreeBagger and LSBoost residual correctors, and the fusion meta-learner are assessed, clarifying each module’s necessity and preventing ineffective or redundant modules from being retained. Model stability and robustness are evaluated using multiple random seeds. The proposed model is also compared with several traditional machine learning methods, and SHAP is employed to interpret feature contributions. Experimental results demonstrate that the proposed model achieves superior prediction accuracy and interpretability. The overall technical workflow is illustrated in Figure 1.

2. Materials and Methods

2.1. Data and Evaluation Metrics

A dataset of 1030 samples was used, consisting of eight input variables: cement, slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and age (1–365 days). The target variable was compressive strength (MPa). The data were obtained from laboratory tests and the UCI Machine Learning Repository (https://archive.ics.uci.edu/dataset/165/concrete, accessed on 10 June 2025). Descriptive statistics are summarized in Table 1.

The Pearson correlation coefficient matrix (rounded to three decimal places) and the pairwise correlation and distribution matrix are shown in Figure 2a,b, respectively. A moderate positive correlation is found between cement content and compressive strength, while a negative correlation is observed between water and superplasticizer content. The data distributions and correlation structures among the variables are also illustrated.

The 1030 × 9 dataset, including eight input features and the target compressive strength, was imported from an Excel file and randomly shuffled. The input features were separated from the target variable. Four training–testing split ratios (75%, 80%, 85%, and 90%) were evaluated to determine the optimal data partitioning strategy. The full dataset was retained for final visualization and comparison with conventional models.

For stable training, both the input features and the target were standardized [30] using the transformations given in Equations (1) and (2):

X_{norm} = \frac{X - \mu_{X}}{\sigma_{X}} \qquad (1)

Y_{norm} = \frac{Y - \mu_{Y}}{\sigma_{Y}} \qquad (2)

where μ and σ denote the mean and standard deviation of the training set, respectively.
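The split-then-standardize step can be sketched in a few lines. The study itself was carried out in MATLAB; the Python below is only an illustrative stand-in, and the function names, the sample values, the 80% ratio, and the use of the population standard deviation are assumptions rather than details taken from the paper.

```python
import random
import statistics

def fit_standardizer(values):
    """Estimate (mean, std) from training values only."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population std; sample std also works
    return mu, (sigma if sigma > 0 else 1.0)

def standardize(values, mu, sigma):
    """Equations (1)-(2): subtract the training mean, divide by the training std."""
    return [(v - mu) / sigma for v in values]

def destandardize(values, mu, sigma):
    """Map standardized values back to the original unit (MPa)."""
    return [v * sigma + mu for v in values]

# Hypothetical strength values in MPa (not taken from the dataset).
strength = [12.5, 33.4, 41.1, 25.0, 52.9, 61.2, 18.3, 44.7, 37.8, 29.6]
rng = random.Random(0)
rng.shuffle(strength)                       # random shuffle before splitting
n_train = int(0.8 * len(strength))          # 80% training ratio
y_train, y_test = strength[:n_train], strength[n_train:]

mu_y, sigma_y = fit_standardizer(y_train)
y_train_norm = standardize(y_train, mu_y, sigma_y)
y_test_norm = standardize(y_test, mu_y, sigma_y)   # test data reuse training statistics
```

Reusing the training statistics for the test portion keeps information from the held-out samples out of the model inputs, which is the reason μ and σ are tied to the training set.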

2.2. Least Squares Support Vector Machine Principles and Parameter Optimization

The LSSVM, developed by Suykens et al., is a variant of the standard support vector machine (SVM) [31]. The model was implemented using the LSSVMlab toolbox (version 1.8, ESAT-STADIUS, KU Leuven, Leuven, Belgium). Given a training dataset {(xᵢ, yᵢ)}, i = 1, …, N, LSSVM formulates a nonlinear prediction model as shown in Equation (3):

f(x) = w^{T} \phi(x) + b \qquad (3)

where ϕ(·) denotes a nonlinear mapping that projects the input into a high-dimensional feature space, w is the weight vector, and b is the bias term.

The optimization problem is defined as shown in Equation (4):

\min_{w, b, \xi} \; \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} \xi_{i}^{2} \qquad (4)

subject to the constraints given in Equation (5):

y_{i} = w^{T} \phi(x_{i}) + b + \xi_{i}, \quad i = 1, 2, \dots, N \qquad (5)

Here, γ > 0 is the regularization parameter balancing model complexity against training error, and ξᵢ is the error associated with the i-th sample.

The primal problem is transformed into its dual form via the Lagrangian and its partial derivatives, which introduces the kernel function K(xᵢ, xⱼ), as given in Equation (6):

K(x_{i}, x_{j}) = \phi(x_{i})^{T} \phi(x_{j}) \qquad (6)
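For reference, the dual form can be written out explicitly. The derivation below is the standard textbook LSSVM result rather than a quotation from the paper; the Lagrange multipliers α_i, the all-ones vector 1, and the kernel matrix K with entries K(x_i, x_j) are notation introduced here for illustration.

```latex
\mathcal{L}(w, b, \xi; \alpha) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} \xi_{i}^{2}
  - \sum_{i=1}^{N} \alpha_{i} \left( w^{T} \phi(x_{i}) + b + \xi_{i} - y_{i} \right)

\frac{\partial \mathcal{L}}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_{i} \phi(x_{i}), \qquad
\frac{\partial \mathcal{L}}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_{i} = 0, \qquad
\frac{\partial \mathcal{L}}{\partial \xi_{i}} = 0 \;\Rightarrow\; \alpha_{i} = \gamma \xi_{i}

\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix},
\qquad
f(x) = \sum_{i=1}^{N} \alpha_{i} K(x, x_{i}) + b
```

Substituting w and ξ_i into the constraint of Equation (5) gives y_i = Σ_j α_j K(x_j, x_i) + b + α_i/γ, which is the lower block of the linear system; training therefore reduces to solving one (N + 1) × (N + 1) system.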

Common kernel functions include linear, polynomial, and radial basis function (RBF) kernels. The RBF kernel, given in Equation (7), is widely adopted for nonlinear modeling because of its robust performance and strong generalization capability.

K(x_{i}, x_{j}) = \exp\left( -\frac{\| x_{i} - x_{j} \|^{2}}{2 \sigma^{2}} \right) \qquad (7)

An RBF kernel is employed in the LSSVM model, whose prediction performance is controlled by two hyperparameters: the regularization parameter γ and the kernel width σ². To determine their optimal values, a DE algorithm with stochastic control parameters is used. The DE minimizes the mean RMSE obtained from three independent runs of 5-fold cross-validation, as expressed in Equation (8):

\mathrm{fitness} = \frac{1}{3} \sum_{r=1}^{3} \left( \frac{1}{5} \sum_{k=1}^{5} \sqrt{ \frac{1}{n_{k}} \sum_{i=1}^{n_{k}} (y_{i} - \hat{y}_{i})^{2} } \right) \qquad (8)

where nₖ is the number of samples in the k-th validation fold, and yᵢ and ŷᵢ are the measured and predicted values, respectively.

The optimized LSSVM serves both as a baseline for validating the DE-based optimization and as the primary predictor whose residuals are subsequently corrected by the hierarchical ensemble framework.
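A compact illustration of an RBF-kernel LSSVM regressor follows. The paper used the LSSVMlab toolbox in MATLAB; this pure-Python sketch is not that implementation. It assumes the standard LSSVM linear system, [[0, 1ᵀ], [1, K + I/γ]] [b; α] = [0; y], and all function names, the toy data, and the values of γ and σ² are hypothetical.

```python
import math

def rbf_kernel(a, b, sigma2):
    """Equation (7): exp(-||a - b||^2 / (2 * sigma2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma2))

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                        # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def lssvm_fit(X, y, gamma, sigma2):
    """Return (alpha, b) from the standard LSSVM linear system (assumed form)."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma                         # K + I / gamma
        A.append(row)
    sol = solve_linear(A, [0.0] + list(y))
    return sol[1:], sol[0]

def lssvm_predict(X_train, alpha, b, x, sigma2):
    """f(x) = sum_i alpha_i K(x, x_i) + b."""
    return b + sum(a * rbf_kernel(xi, x, sigma2) for a, xi in zip(alpha, X_train))

# Hypothetical one-dimensional toy data.
X_demo = [[0.0], [0.5], [1.0], [1.5], [2.0]]
y_demo = [math.sin(x[0]) for x in X_demo]
gamma, sigma2 = 100.0, 0.5
alpha, b = lssvm_fit(X_demo, y_demo, gamma, sigma2)
fitted = [lssvm_predict(X_demo, alpha, b, x, sigma2) for x in X_demo]
```

In this formulation each training residual equals αᵢ/γ, so a larger γ pulls the fit closer to the training targets while a smaller γ regularizes more strongly; σ² sets how quickly the kernel similarity decays with distance.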

2.3. Differential Evolution-Based Optimization of LSSVM Hyperparameters

Differential evolution (DE), originally proposed by Storn and Price [32], is a stochastic global optimization algorithm based on differential mutation, crossover, and selection. The DE implementation used in this study is unofficial academic code; it can be obtained by contacting the authors’ institution.

The two LSSVM hyperparameters, the regularization parameter γ and the kernel width σ², are optimized using a DE algorithm. Both parameters are encoded on a logarithmic scale as θ₁ = log₂(γ) and θ₂ = log₂(σ²), with the search space bounded within θ₁ ∈ [log₂(0.01), log₂(500)] and θ₂ ∈ [log₂(0.001), log₂(100)].

The DE/rand/1/bin strategy with stochastically varied control parameters is employed. A population of 40 individuals is evolved over 300 generations. In each generation, a donor vector is generated for every target vector θᵢ via mutation, as defined in Equation (9):

v = \theta_{r_{1}} + F \cdot (\theta_{r_{2}} - \theta_{r_{3}}) \qquad (9)

where r₁, r₂, and r₃ are randomly selected, mutually distinct indices, and the scaling factor F is uniformly sampled from [0.5, 1.0] independently for each individual. A trial vector u is then formed through binomial crossover with the current target, as defined in Equation (10):

u_{j} = \begin{cases} v_{j}, & \text{if } rand \leq CR \\ \theta_{i,j}, & \text{otherwise} \end{cases} \qquad (10)

where the crossover rate CR is uniformly drawn from [0, 1] for each individual; the proportion of dimensions inherited from the donor vector is thereby determined. Boundary violations are corrected by truncation to the predefined logarithmic bounds. The target vector is replaced by the trial vector if a lower fitness value is achieved.
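The mutation, crossover, truncation, and selection steps can be sketched as follows. This is an illustrative Python reading of Equations (9) and (10), not the authors' code: the quadratic `toy_fitness` is a hypothetical stand-in for the cross-validated RMSE, the generation-wise (synchronous) population update is an assumption, and the crossover follows Equation (10) literally, so the forced-gene safeguard found in many DE implementations is omitted because the text does not mention it.

```python
import math
import random

# Log2-encoded search bounds for (gamma, sigma^2) as stated in the text.
BOUNDS = [(math.log2(0.01), math.log2(500.0)),
          (math.log2(0.001), math.log2(100.0))]

def differential_evolution(fitness, bounds, pop_size=40, generations=300, seed=0):
    """DE/rand/1/bin with F ~ U[0.5, 1.0] and CR ~ U[0, 1] drawn per individual."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        new_pop, new_scores = [], []
        for i in range(pop_size):
            candidates = [k for k in range(pop_size) if k != i]
            r1, r2, r3 = rng.sample(candidates, 3)       # mutually distinct indices
            F = rng.uniform(0.5, 1.0)
            CR = rng.uniform(0.0, 1.0)
            # Equation (9): donor vector
            donor = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # Equation (10): binomial crossover with the current target
            trial = [donor[j] if rng.random() <= CR else pop[i][j] for j in range(dim)]
            # Boundary violations are truncated to the bounds
            trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
            s = fitness(trial)
            if s < scores[i]:                            # greedy selection
                new_pop.append(trial)
                new_scores.append(s)
            else:
                new_pop.append(pop[i])
                new_scores.append(scores[i])
        pop, scores = new_pop, new_scores
    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]

def toy_fitness(theta):
    """Hypothetical smooth surrogate with its minimum at theta = (3, -1)."""
    return (theta[0] - 3.0) ** 2 + (theta[1] + 1.0) ** 2

best_theta, best_score = differential_evolution(toy_fitness, BOUNDS)
best_gamma, best_sigma2 = 2.0 ** best_theta[0], 2.0 ** best_theta[1]  # decode from log2
```

Searching in log₂ space lets a uniform initial population cover several orders of magnitude of γ and σ² evenly, which would not be the case if the raw values were sampled directly.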

Fitness is evaluated through three independent runs of 5-fold cross-validation on the standardized training data. In each fold, an LSSVM with an RBF kernel is trained, and the RMSE is computed. The overall fitness is defined as the mean RMSE across all 3 × 5 folds, as expressed in Equation (11):

\mathrm{fitness} = \frac{1}{3} \sum_{r=1}^{3} \left( \frac{1}{5} \sum_{k=1}^{5} \mathrm{RMSE}_{k}^{(r)} \right) \qquad (11)

where RMSEₖ⁽ʳ⁾ denotes the RMSE on the k-th fold of the r-th cross-validation repeat. The DE minimizes this cross-validated error to return the optimal hyperparameters (γ, σ²).
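The fitness of Equation (11) amounts to a repeated k-fold loop, sketched below in Python. The `fit` and `predict` callables are placeholders: in the setting described here they would train and apply an RBF-kernel LSSVM with γ = 2^θ₁ and σ² = 2^θ₂, whereas the mean predictor used in the demonstration is only a hypothetical stand-in so that the block is self-contained. The interleaved fold assignment is likewise an assumption.

```python
import math
import random

def repeated_kfold_rmse(X, y, fit, predict, repeats=3, folds=5, seed=0):
    """Mean RMSE over repeats x folds, following Equation (11)."""
    rng = random.Random(seed)
    n = len(y)
    repeat_means = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                                  # new partition per repeat
        fold_rmse = []
        for k in range(folds):
            test_idx = idx[k::folds]                      # every folds-th sample
            held_out = set(test_idx)
            train_idx = [i for i in idx if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            sq_err = [(y[i] - predict(model, X[i])) ** 2 for i in test_idx]
            fold_rmse.append(math.sqrt(sum(sq_err) / len(sq_err)))
        repeat_means.append(sum(fold_rmse) / folds)
    return sum(repeat_means) / repeats

def mean_fit(X_train, y_train):
    """Hypothetical stand-in model: predicts the training mean."""
    return sum(y_train) / len(y_train)

def mean_predict(model, x):
    return model

X_demo = [[float(i)] for i in range(20)]
y_demo = [2.0 * i for i in range(20)]
fitness_value = repeated_kfold_rmse(X_demo, y_demo, mean_fit, mean_predict)
```

Averaging over three repeats reduces the dependence of the fitness on any single random partition, which matters because DE compares candidates by this value alone.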

By stochastically parameterizing F and CR, population diversity is enhanced, premature convergence is mitigated, and the tedious manual tuning required by conventional DE is avoided. A complete parameter table is provided in Table 2.

2.4. Hierarchical Heterogeneous Ensemble Residual Fusion

To further improve prediction accuracy and stability, a multi-model residual fusion strategy is adopted. Baseline predictions are first generated by the DE-optimized LSSVM, and the resulting residuals are then explicitly modeled and fused. The training residual sequence r, defined as the difference between the measured values Y and the predicted values Ŷ_LSSVM, is computed via Equation (12) and serves as the basis for subsequent fusion modeling.

r = Y - \hat{Y}_{LSSVM} \qquad (12)

Specifically, the residual between the measured and the predicted strengths is learned in parallel by two heterogeneous ensemble learners, TreeBagger and LSBoost, both taken from the Statistics and Machine Learning Toolbox (version 23.2, The MathWorks, Inc., Natick, MA, USA) and both operating on the same residual. Their outputs are concatenated into a feature matrix and fed into an LSBoost meta-learner, which adaptively combines them into a fused residual correction.

TreeBagger was employed as a random forest regression model consisting of 200 regression trees trained using bootstrap resampling to learn the residual errors generated by the DE-LSSVM base predictor. Simultaneously, an LSBoost model based on the least-squares boosting strategy was constructed using regression trees as weak learners, where residual errors were iteratively reduced through gradient boosting. In the second stage, the outputs of the two first-level residual learners (TreeBagger and LSBoost) were concatenated as new input features and further processed by another LSBoost fusion model to generate the final residual correction term. The final corrected prediction Ŷ_fusion is obtained by adding this fused correction r̂_fusion to the initial LSSVM output, as expressed in Equation (13).

\hat{Y}_{fusion} = \hat{Y}_{LSSVM} + \hat{r}_{fusion} \qquad (13)

Experimental results show that this two-path residual fusion strategy leverages the complementary strengths of TreeBagger and LSBoost, reducing prediction error and improving generalization compared with conventional single-model residual correction; moreover, the dedicated meta-learner achieves a more nuanced integration than simple averaging.

The residuals of the LSSVM predictions are modeled by TreeBagger and LSBoost, and multi-model fusion is subsequently performed via an LSBoost meta-learner. Although LSBoost is employed at both stages, the two instances operate on fundamentally different inputs: the base LSBoost directly models the original residual using the concrete mixture features, whereas the meta-learner LSBoost operates on the transformed feature space formed by the outputs of the base residual learners, learning a nonlinear fusion strategy. This integrated approach reduces the bias inherent in any single model and improves predictive performance. LSBoost was implemented with the fitrensemble function in MATLAB, using regression trees as weak learners, and the default configuration was retained for reproducibility. Because LSBoost serves as a meta-learner for residual integration rather than a primary predictor, extensive hyperparameter tuning was not considered necessary. The consistent test-set performance observed across multiple data-splitting ratios indicates that this configuration is reasonably robust and does not materially affect the overall conclusions.
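The data flow of Equations (12) and (13) can be illustrated with a small pure-Python sketch. It is not the MATLAB pipeline: single-split regression stumps stand in for full regression trees, 25 bagged stumps stand in for the 200-tree TreeBagger, and `base_predict` is a hypothetical placeholder for the DE-LSSVM output. Training the meta-learner on in-sample level-1 outputs is also an assumption, since the text does not state whether out-of-fold predictions were used.

```python
import math
import random

def fit_stump(X, r):
    """Least-squares regression stump: best single split over all features."""
    n, best = len(X), None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r[i] for i in range(n) if X[i][j] <= thr]
            right = [r[i] for i in range(n) if X[i][j] > thr]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:                          # no valid split: constant prediction
        m = sum(r) / n
        return (0, float("inf"), m, m)
    return best[1:]

def stump_predict(stump, x):
    j, thr, ml, mr = stump
    return ml if x[j] <= thr else mr

def fit_lsboost(X, r, rounds=50, lr=0.1):
    """Least-squares boosting: each stump fits the current residual of the ensemble."""
    f0 = sum(r) / len(r)
    pred, stumps = [f0] * len(r), []
    for _ in range(rounds):
        res = [ri - pi for ri, pi in zip(r, pred)]
        s = fit_stump(X, res)
        stumps.append(s)
        pred = [p + lr * stump_predict(s, x) for p, x in zip(pred, X)]
    return lambda x: f0 + lr * sum(stump_predict(s, x) for s in stumps)

def fit_bagging(X, r, n_trees=25, seed=0):
    """Bootstrap-aggregated stumps (simplified stand-in for TreeBagger)."""
    rng, n, stumps = random.Random(seed), len(X), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [r[i] for i in idx]))
    return lambda x: sum(stump_predict(s, x) for s in stumps) / n_trees

def fit_residual_fusion(X, y, base_predict):
    r = [yi - base_predict(x) for x, yi in zip(X, y)]       # Equation (12)
    bag = fit_bagging(X, r)                                 # level 1, path A
    boost = fit_lsboost(X, r)                               # level 1, path B
    Z = [[bag(x), boost(x)] for x in X]                     # meta-features
    meta = fit_lsboost(Z, r)                                # level 2 fusion
    return lambda x: base_predict(x) + meta([bag(x), boost(x)])  # Equation (13)

def rmse(pred, y):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))

def base_predict(x):
    """Hypothetical stand-in for the DE-LSSVM primary predictor."""
    return 2.0 * x[0]

X_demo = [[i / 10.0, (i % 5) / 5.0] for i in range(30)]
y_demo = [x[0] ** 2 + (1.0 if x[1] > 0.5 else 0.0) for x in X_demo]
fused = fit_residual_fusion(X_demo, y_demo, base_predict)
rmse_base = rmse([base_predict(x) for x in X_demo], y_demo)
rmse_fused = rmse([fused(x) for x in X_demo], y_demo)
```

The two calls to `fit_lsboost` mirror the distinction drawn above: the first receives the mixture features, the second receives only the two level-1 outputs. The comparison of `rmse_base` and `rmse_fused` here is on training data and says nothing about generalization.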

2.5. SHapley Additive exPlanations

SHAP is a model-agnostic interpretation framework grounded in cooperative game theory. Shapley [33] introduced the Shapley value concept in 1953 to achieve fair payoff distribution among coalition members. The SHAP analysis was performed using the shapley function in the Statistics and Machine Learning Toolbox (version 23.2, The MathWorks, Inc., Natick, MA, USA). The SHAP value of a feature is defined as the weighted average of its marginal contributions across all feature subsets (Equation (14)).

ϕ_{j} (val) = \sum_{S \subseteq M ∖ \{j\}} \frac{| S |! (| M | - | S | - 1)!}{| M |!} (val (S \cup \{j\}) - val (S))

(14)

Here, $M$ denotes the complete set of all input features, $| M |$ is the number of features, $S$ represents a subset of features excluding feature $j$, and $val (S)$ is the prediction function for the given subset.

SHAP is built upon an additive model, as expressed in Equation (15).

f (x) = ϕ_{0} + \sum_{i = 1}^{| M |} ϕ_{i}

(15)

The prediction $f (x)$ for a sample $x$ is decomposed into a baseline value $ϕ_{0}$ and the sum of Shapley values $ϕ_{i}$. Here, $ϕ_{0}$ represents the model’s output for a reference baseline, typically the average prediction over the training set, while each $ϕ_{i}$ quantifies the contribution of the i-th feature to the deviation from this baseline.
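As a worked illustration of Equations (14) and (15), the following Python sketch computes exact Shapley values by enumerating every feature subset. The toy model `f`, the sample `x`, and the all-zero `baseline` are hypothetical. Defining the value of a subset by replacing absent features with baseline values is one common convention, assumed here for simplicity, and is not necessarily what the MATLAB shapley function does internally.

```python
from itertools import combinations
from math import factorial

def shapley_values(val, n_features):
    """Exact Shapley values: weighted marginal contributions over all subsets."""
    phi = []
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |S|! (|M| - |S| - 1)! / |M|! from Equation (14)
            w = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for S in combinations(others, size):
                total += w * (val(set(S) | {j}) - val(set(S)))
        phi.append(total)
    return phi

# Hypothetical toy model with one additive term and one interaction term.
def f(x):
    return 2.0 * x[0] + x[1] * x[2]

x = [1.0, 2.0, 3.0]          # sample to explain (illustrative)
baseline = [0.0, 0.0, 0.0]   # reference point (illustrative)

def val(S):
    """Model output when only the features in S take the sample's values."""
    return f([x[k] if k in S else baseline[k] for k in range(len(x))])

phi = shapley_values(val, len(x))
phi_0 = val(set())
print(phi, phi_0 + sum(phi), f(x))
```

For this toy model the additive feature receives a Shapley value of 2, and the two interacting features share the interaction term equally with a value of 3 each. The baseline plus the three values reproduces $f (x) = 8$, which is the additivity stated in Equation (15). The enumeration cost grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.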

The contribution of each input feature was quantified through the SHAP framework. The Shapley value of feature $i$, which enters the additive decomposition of the prediction for a given sample, is defined in Equation (16):

ϕ_{i} = \sum_{S \subseteq F ∖ \{i\}} \frac{| S |! (| F | - | S | - 1)!}{| F |!} [f_{S \cup \{i\}} (x_{S \cup \{i\}}) - f_{S} (x_{S})]

(16)

where $F$ is the complete feature set and $f_{S}$ is the model output obtained with the feature subset $S$.

Based on this analysis, curing age, cement content, and fly ash dosage were identified as the most influential factors affecting concrete compressive strength. A quantitative ranking of feature contributions was derived from the SHAP values, providing a direct basis for mixture-proportion optimization.

3. Results

3.1. Model Evaluation Metrics

The performance of the proposed DE-optimized LSSVM heterogeneous residual (TreeBagger and LSBoost) model was evaluated using four metrics: R², RMSE, MAE, and residual predictive deviation (RPD).

R² is defined in Equation (17) and quantifies the proportion of variance explained by the model.

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(17)

RMSE is defined in Equation (18):

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(18)

where $y_{i}$ and $\hat{y}_{i}$ are the measured and predicted compressive strengths of the i-th sample, $\bar{y}$ is the mean of the measured values, and $n$ is the total number of samples. RMSE describes the average magnitude of the prediction error in the same units as the target (MPa).

MAE is defined in Equation (19):

MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |

(19)

MAE represents the mean absolute prediction error and provides an intuitive measure of the average deviation.

RPD is defined in Equation (20) as the ratio of the standard deviation of the observed values to the standard deviation of the prediction residuals, so that larger values indicate better predictive capability.

RPD = \frac{S D (y)}{S D (y - \hat{y})}

(20)
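The four metrics in Equations (17)–(20) can be computed as in the following Python sketch. The measured and predicted values are hypothetical numbers chosen only to make the arithmetic easy to follow, and the use of the sample standard deviation (n − 1 denominator) for RPD is an assumption, since the convention is not stated here.

```python
from math import sqrt
from statistics import mean, stdev

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and RPD for paired measured/predicted values."""
    n = len(y_true)
    resid = [yt - yp for yt, yp in zip(y_true, y_pred)]
    y_bar = mean(y_true)
    ss_res = sum(e ** 2 for e in resid)                  # residual sum of squares
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)     # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,                     # Equation (17)
        "RMSE": sqrt(ss_res / n),                        # Equation (18)
        "MAE": sum(abs(e) for e in resid) / n,           # Equation (19)
        "RPD": stdev(y_true) / stdev(resid),             # Equation (20), sample SD assumed
    }

# Hypothetical strengths in MPa, for illustration only.
y_true = [30.0, 40.0, 50.0, 60.0]
y_pred = [32.0, 38.0, 51.0, 59.0]
m = regression_metrics(y_true, y_pred)
print(m)
```

For these values R² is 0.98, MAE is 1.5 MPa, RMSE is about 1.58 MPa, and RPD is about 7.07. RMSE can never be smaller than MAE for the same set of residuals, which is a useful sanity check when reading reported results.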

All computations, including the DE optimization, LSSVM modeling, TreeBagger and LSBoost ensemble learning, and SHAP analysis, were performed in MATLAB (R2023b, The MathWorks, Inc., Natick, MA, USA). Results show that the proposed model achieves a high R² (above 0.92 on the test set for all four data splits), low RMSE and MAE, and an RPD above 2.0, indicating high predictive accuracy and robustness.

3.2. Prediction Performance and Model Analysis

A two-level optimization framework was proposed: model parameters were optimized at the first level, and the model structure was refined at the second. This design integrates baseline prediction with residual correction. The dataset was randomly shuffled and split into training and test sets at ratios of 0.75:0.25, 0.80:0.20, 0.85:0.15, and 0.90:0.10. Model performance was evaluated across these partitions, and the split yielding the highest test-set R² was selected for the subsequent analyses, with the aim of improving data utilization and modeling robustness.

Data leakage was avoided by standardizing the features and the target using only the training-set statistics; the same transformation parameters were subsequently applied to the test set. Under the selected data partition, the first optimization level was executed. The LSSVM hyperparameters (regularization coefficient $γ$ and kernel parameter $σ^{2}$) were tuned via DE within a log₂ search space to ensure numerical stability. Base predictions were generated by the optimized LSSVM, and the corresponding residuals were obtained. These residuals were independently modeled by TreeBagger and LSBoost. The resulting residual predictions were assembled as inputs and fused through an LSBoost meta-model. The final prediction was obtained by adding the fused residual to the initial LSSVM output. During DE optimization, each candidate hyperparameter set was evaluated through repeated five-fold cross-validation to stabilize parameter selection. This two-level architecture substantially improved both accuracy and robustness.
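The idea of searching the two LSSVM hyperparameters in a log₂ space can be sketched with a generic differential evolution loop in Python. The DE/rand/1/bin variant, the population size, the scale factor `F`, the crossover rate `CR`, the bounds, and the seed are all illustrative assumptions, not the settings of this study. The objective `toy_cv_error` is a hypothetical smooth function standing in for the repeated five-fold cross-validation error of an LSSVM.

```python
import random

def differential_evolution(objective, bounds, pop_size=20, generations=100,
                           F=0.5, CR=0.9, seed=0):
    """Minimal DE/rand/1/bin minimizer over box-constrained real vectors."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct individuals, all different from the target i
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated coordinate
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    v = pop[a][d] + F * (pop[b][d] - pop[c][d])  # differential mutation
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)                      # clip to the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:                            # greedy selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

def toy_cv_error(log2_params):
    """Hypothetical stand-in for the cross-validated error; minimum at (5, -2)."""
    log2_gamma, log2_sigma2 = log2_params
    return (log2_gamma - 5.0) ** 2 + (log2_sigma2 + 2.0) ** 2

best_log2, best_cost = differential_evolution(
    toy_cv_error, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
# Decode from the log2 search space to the actual hyperparameters.
gamma, sigma2 = 2.0 ** best_log2[0], 2.0 ** best_log2[1]
print(best_log2, gamma, sigma2)
```

Searching exponents rather than raw values keeps both hyperparameters strictly positive after decoding and lets a single step size cover several orders of magnitude, which is the practical motivation for a logarithmic search space.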

Model performance was evaluated on the held-out test set using R², RMSE, MAE, and RPD. Model interpretability was examined through SHAP, which quantified the contribution of each input feature. The complete workflow of the proposed framework is presented in Figure 3.

3.2.1. Results and Analysis of Models with Different Training and Testing Set Scales

To enhance reproducibility, a fixed random seed of 2025 was used; the seed was chosen arbitrarily and without performance tuning. The effect of the training-test split was assessed across four ratios: 0.75:0.25, 0.80:0.20, 0.85:0.15, and 0.90:0.10. Predictive accuracy varied with the partition, reflecting differences in the size and composition of the training subset. All four configurations are compared in Figure 4a–d.
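The seeded splitting at the four ratios, together with standardization based only on training-set statistics, can be sketched in Python as follows. The sample count of 1030 is taken from the dataset description, while the function names, the integer rounding rule for the training-set size, and the use of the sample standard deviation are assumptions. Python's random generator does not reproduce MATLAB's random stream, so the same seed yields a different partition from the one used in this study.

```python
import random
from statistics import mean, stdev

def shuffle_split(n_samples, train_percent, seed=2025):
    """Shuffle indices with a fixed seed and cut them into train/test parts."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = n_samples * train_percent // 100  # integer arithmetic avoids float rounding
    return idx[:n_train], idx[n_train:]

def fit_standardizer(train_column):
    """Estimate mean and standard deviation from the training data only."""
    mu, sd = mean(train_column), stdev(train_column)
    return mu, (sd if sd > 0 else 1.0)  # guard against constant columns

def standardize(column, mu, sd):
    return [(v - mu) / sd for v in column]

# Partition sizes for the four ratios examined.
sizes = {}
for pct in (75, 80, 85, 90):
    train_idx, test_idx = shuffle_split(1030, pct)
    sizes[pct] = (len(train_idx), len(test_idx))
print(sizes)

# Leakage-free scaling on hypothetical values: the test column is
# transformed with the statistics of the training column.
train_col, test_col = [10.0, 20.0, 30.0], [40.0]
mu, sd = fit_standardizer(train_col)
train_z = standardize(train_col, mu, sd)
test_z = standardize(test_col, mu, sd)
print(train_z, test_z)
```

Because the test values never enter `fit_standardizer`, no information from the held-out samples influences the scaling applied during training.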

Model performance under each training-test split ratio is presented in Table 3. For the 0.75:0.25 partition, a training R² of 0.9964 and a test R² of 0.9282 were obtained, along with RMSE, MAE, and RPD values of 4.3482 MPa, 2.8731 MPa, and 3.7345, respectively. When the training proportion was increased to 80%, a training R² of 0.9962 and a test R² of 0.9362 were recorded, accompanied by lower RMSE and MAE values of 4.1286 MPa and 2.6129 MPa. For the 0.85:0.15 split, a training R² of 0.9956 was achieved; on the test set, an R² of 0.9434, an RMSE of 3.9476 MPa, an MAE of 2.5642 MPa, and an RPD of 4.2236 were obtained. The best overall performance was produced by the 0.90:0.10 split, for which the highest test R² of 0.9625, the lowest prediction errors (RMSE = 3.2891 MPa, MAE = 2.1580 MPa), and the largest RPD of 5.2240 were recorded. These results confirm the superior predictive accuracy and generalization capability of this split relative to the other ratios evaluated.

Test-set R² values exceeded 0.92 across all training-test splits, indicating robust predictive performance and stable generalization. The highest test-set R² (0.9625), together with a training-set R² of 0.9958, was obtained with the 0.90:0.10 split. This split was therefore adopted as the optimal partitioning scheme for subsequent analyses.

3.2.2. Ablation Results

A systematic ablation study was performed to evaluate the contribution of each component in this study. Five model configurations were built upon the DE-optimized LSSVM core: (1) the base model without residual correction; (2) the base model with a single-level residual module (TreeBagger); (3) the base model with a two-level residual structure composed of a TreeBagger and bootstrap aggregating (Bagging); (4) the base model with a three-level residual structure (TreeBagger, Bagging, and LSBoost); and (5) the base model with a two-level residual structure (TreeBagger and LSBoost). Performance metrics for each configuration are reported in Table 4.

The baseline DE-optimized LSSVM achieved a test-set R² of 0.9183, reflecting strong standalone predictive capability. With the two-level TreeBagger + LSBoost residual correction, R² increased from 0.9183 to 0.9512, while RMSE and MAE decreased from 4.8523 MPa to 3.7498 MPa and from 3.5569 MPa to 2.1819 MPa, respectively. The MAE was reduced by 38.7%, corresponding to an average prediction error lowered by about 1.4 MPa. For concrete compressive strength prediction, such a reduction is practically meaningful because it narrows uncertainty in key engineering decisions, including formwork removal timing, quality acceptance, and mix design adjustment. The RMSE reduction of 22.7% further indicates that large prediction deviations are effectively suppressed; this is particularly important in safety-critical applications where extreme errors can lead to over-conservative design or insufficient strength assessment. In addition, the RPD was raised from 3.53 to 4.56, indicating a shift from very good to excellent predictive capability for quantitative engineering applications. Overall, these results demonstrate that the proposed model not only improves statistical accuracy but also provides more reliable support for practical concrete engineering decisions.
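The relative reductions quoted above follow directly from the reported error values, as the short Python check below shows. The helper name `relative_reduction` is introduced here only for illustration.

```python
def relative_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return 100.0 * (before - after) / before

# Error values (MPa) reported for the baseline and the corrected model.
mae_drop = relative_reduction(3.5569, 2.1819)
rmse_drop = relative_reduction(4.8523, 3.7498)
mae_gain_mpa = 3.5569 - 2.1819

print(round(mae_drop, 1), round(rmse_drop, 1), round(mae_gain_mpa, 2))
```

The MAE decreases by about 1.38 MPa, which is 38.7% of the baseline MAE, and the RMSE decreases by 22.7%.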

Among all compared configurations, the two-level residual ensemble composed of TreeBagger and LSBoost achieved the best overall performance, outperforming both the single-level baseline and the three-level ensemble. Notably, the variant that incorporated bagging failed to yield further gains and even exhibited a slight degradation, indicating that simply increasing model diversity is not necessarily beneficial. The key factor appears to be the balance between ensemble diversity and structural simplicity, which avoids overfitting and redundant error compensation. Further analysis shows that the LSBoost-based meta-model plays a critical role in this architecture: it effectively integrates the complementary strengths of the base learners and reduces residual errors through iterative gradient boosting. Meanwhile, the RPD increased from approximately 3.53 to 4.56, providing additional evidence of the robustness and predictive reliability of the proposed framework for concrete compressive strength.

A performance comparison of the five experimental configurations is provided in Figure 5a–d, based on the metrics R², RMSE, MAE, and RPD.

At the 0.90:0.10 split, an R² of 0.9512 and an RMSE of 3.7859 MPa were achieved by the TreeBagger + LSBoost model (Figure 6), confirming its reliability and predictive stability. A systematic offset of approximately 1% was observed across independent runs despite identical random seeds, model architectures, and core hyperparameters. This minor discrepancy is attributed to implementation-level variations in the training pipeline, including differences in data loading order and framework-specific floating-point operations. Although the global random seed governs the primary stochastic processes, certain internal mechanisms within the ensemble learners are not fully constrained by it. Such behavior is commonly encountered in complex ensemble frameworks and falls well within acceptable reproducibility tolerances.

The experimental results demonstrate that the proposed framework reliably predicts concrete compressive strength. By employing LSBoost as a meta-learner to integrate outputs from the DE-optimized LSSVM base model and a residual-correcting ensemble, both predictive accuracy and robustness are enhanced.

3.2.3. Comparison of Results from Different Random Seed Models

To evaluate the stability and robustness of the proposed DE-optimized LSSVM residual fusion model, repeated experiments were performed using random seeds of 99, 666, 2025, 2028, 2030, and 8888. The model architecture and parameter ranges were held fixed; only the random initialization was varied. Variations in performance were assessed by comparing R², RMSE, and MAE obtained across different seeds.

The test-set performance of the model across six random seeds is presented in Table 5. The results show minimal variability: R² ranged from 0.9442 to 0.9518, RMSE from 3.4214 to 4.0221 MPa, and MAE from 2.3929 to 2.7413 MPa. The average R² was 0.9490 (±0.0029), with the maximum value of 0.9518 for seed 8888 and the minimum of 0.9442 for seed 666. The consistently low standard deviations across all metrics indicate stable predictive performance under different random seeds, and this low sensitivity to seed selection supports the robustness of the proposed framework and the reliability of the reported results.

Performance curves obtained under different random seeds are shown in Figure 7. Slight fluctuations due to random initialization were observed, yet the proposed DE-optimized LSSVM (TreeBagger + LSBoost) model consistently outperformed the conventional methods by a clear margin. The low variability across independent runs confirmed the framework’s stability. All experiments were conducted through independent executions, and the minimal performance deviations attest to the robustness of the results without affecting the reported superiority of the model.

In summary, the proposed model exhibited low sensitivity to random initialization, as reflected by the consistently small performance variations across independent runs. This insensitivity suggests that the optimization process reliably converges to a favorable and essentially similar basin of attraction regardless of the starting conditions, thereby contributing to the robust and reproducible performance reported.

3.3. Comparative Analysis of Different Model Results

A comparative analysis was conducted to evaluate the predictive performance of the proposed model against several benchmarks, including DE-LSSVM, BP, RF, SVR, linear regression (LR), and decision tree (DT). These benchmarks were selected to provide a comprehensive baseline. All experiments employed a fixed 90:10 training-test split. Data were uniformly normalized, and hyperparameters were optimized via cross-validation to avoid selection bias. Model performance was assessed using R², RMSE, and MAE, which respectively capture goodness-of-fit, error magnitude, and average deviation.

The test-set performance of the proposed DE-LSSVM heterogeneous residual ensemble model is compared with that of six benchmark models in Table 6. DE-LSSVM consistently outperformed conventional approaches such as BP and SVR, confirming the effectiveness of the DE algorithm for hyperparameter selection. LR and DT performed poorly, with low R², high RMSE, and high MAE, indicating that concrete strength is governed by a strongly nonlinear relationship with its influencing factors. SVR and BP improved prediction accuracy, yet these models were constrained by overfitting or convergence issues. RF offered greater stability, but its generalization capability was surpassed by the proposed fusion model.

The proposed DE-LSSVM heterogeneous residual ensemble model achieved an R² of 0.9512, an RMSE of 3.7498 MPa, and an MAE of 2.1819 MPa on the test set, indicating high predictive accuracy and strong fitting capability. Consistent improvements across all metrics were observed when compared to a single-optimizer or non-residual ensemble model, validating the effectiveness of the residual fusion mechanism in enhancing both generalization and prediction accuracy.

As shown in Figure 8a–c, the best overall performance was achieved by the DE-LSSVM (TreeBagger and LSBoost) heterogeneous residual ensemble model, with a test-set R² of 0.9512. The clear performance margin over the benchmarks confirms that higher accuracy and greater robustness are obtained with the proposed framework for predicting concrete compressive strength.

The scatter distributions of predicted versus measured values for the seven models are shown in Figure 9a–g. The scatter points of the proposed model are observed to cluster most tightly around the diagonal y = x, with smaller dispersion and noticeably fewer outliers, indicating close agreement with the measured values. By contrast, greater scatter is exhibited by the other models, especially LR and DT, reflecting comparatively lower predictive accuracy.

In summary, significant and consistent performance gains were demonstrated by the proposed model across all validation settings, including multi-scale evaluations, ablation studies, random-seed tests, and systematic comparisons with baseline models. These gains are traced not to any single algorithmic component, but to the integrated design of multi-model collaboration and heterogeneous residual ensemble. Through the combination of complementary predictive strengths and a structured error-correction mechanism, substantial improvements in both accuracy and robustness are achieved. A reliable and practical solution for high-precision forecasting in complex engineering applications is thus provided.

3.4. SHAP Model Interpretability Analysis

To interpret the DE-LSSVM heterogeneous residual (TreeBagger + LSBoost) fusion model for concrete compressive strength prediction, the SHAP method was employed. SHAP quantifies the contribution of each input variable, providing both global and local explanations of this complex nonlinear model.

The mean absolute SHAP values and the beeswarm plot are presented in Figure 10a,b. The overall contribution of each input variable is reflected by the former, while the distribution of SHAP values across all test samples is shown by the latter, revealing how contribution directions vary across samples. A greater overall influence of a feature on the model prediction is indicated by a larger mean |SHAP| value.

Cement was identified by the SHAP-based ranking as the most influential variable in the fusion model, followed closely by age, indicating that the learning of compressive strength patterns relied primarily on binder content and curing time. Among supplementary cementitious materials, a higher importance was assigned to slag than to fly ash, possibly reflecting differences in their learned contributions under the current data distribution. Water was also found to be an important predictor, which is consistent with its well-known role in hydration and pore-structure development. In contrast, relatively smaller mean absolute SHAP values were observed for superplasticizer and the aggregate-related variables. Overall, this ranking is broadly consistent with established concrete behavior, suggesting that the dominant predictive patterns in the data have been captured by the model. However, these SHAP results should be interpreted as model-level contribution tendencies rather than strict causal physical coefficients.

In the SHAP beeswarm plot, high-value samples of age and cement (red dots) were observed more frequently in the positive SHAP region, indicating that larger curing ages and greater cement contents typically raised the predicted compressive strength. By contrast, high-value water samples were concentrated in the negative SHAP region, suggesting that higher water content was associated with lower predicted strength. This finding aligns with the classical understanding that a higher water–cement ratio leads to greater porosity and reduced structural density. For slag and fly ash, both positive and negative SHAP contributions were observed across samples, revealing a nonlinear influence governed by feature interactions and sample conditions.

Only the dataset-level average contribution trend is captured by the mean SHAP value; it should not be interpreted as a strict physical causal relationship. Because of the strong coupling among mix proportions and the nonlinear interactions between variables in concrete, both positive and negative SHAP contributions can be shown by the same feature in different samples, leading to cancelation when averaged. The statistical results of Mean |SHAP| and Mean SHAP for each input variable are listed in Table 7. For instance, although high global importance is assigned to cement, its mean SHAP approaches zero, indicating that its influence direction is not consistent across samples. This finding suggests that complex synergistic relationships in the concrete mix are captured by the model, rather than simple single-factor linear laws.
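The difference between Mean |SHAP| and Mean SHAP can be made concrete with a small Python sketch. The per-sample SHAP values below are hypothetical and are not taken from Table 7; they are chosen only to show how contributions of opposite sign cancel in the signed mean while still producing a large mean absolute value.

```python
def shap_summary(values):
    """Return (mean |SHAP|, mean SHAP) for one feature across samples."""
    n = len(values)
    return sum(abs(v) for v in values) / n, sum(values) / n

# Hypothetical per-sample SHAP values (MPa), for illustration only.
mixed_sign_feature = [4.0, -4.0, 3.0, -3.0]   # direction changes from sample to sample
same_sign_feature = [5.0, 4.0, 6.0, 5.0]      # direction is stable across samples

print(shap_summary(mixed_sign_feature))
print(shap_summary(same_sign_feature))
```

The first feature has a mean |SHAP| of 3.5 but a mean SHAP of 0, so it is influential without a consistent direction. The second has both statistics equal to 5, so it is influential and directionally stable.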

Age was assigned one of the highest Mean |SHAP| values, and its Mean SHAP was significantly positive, confirming that age is not only a key predictor of compressive strength but also exhibits high directional stability. This pattern is consistent with the ongoing hydration and later-age strength development of concrete. In contrast, a negative mean SHAP value was observed for water, indicating that greater water content generally reduced the predicted strength. This aligns with the classical understanding that a higher water–cement ratio increases porosity and lowers structural density. For variables such as slag, fly ash, and superplasticizer, the average contribution direction was influenced by the substitution rate, water–cement ratio, age, and sample distribution. Therefore, their positive or negative signs should be interpreted as data-driven average contributions rather than fixed physical laws.

To further clarify the model’s prediction mechanism at the sample level, local SHAP interpretation results for a single test sample are presented in Figure 11. The specific contribution of each input variable relative to the baseline prediction is captured by the local SHAP value: a positive value indicates that the prediction was raised by the feature for that sample, whereas a negative value indicates it was lowered. Because local SHAP is strongly sample-dependent, the local contribution direction may differ from the global average trend. For example, negative SHAP values may be assigned to cement in some samples and positive values in others. This variation is attributed mainly to the complex nonlinear interactions among features under different mix conditions, rather than to fixed positive or negative physical effects of the variables themselves.

Overall, the SHAP analysis demonstrated that the key factors governing concrete compressive strength could be accurately identified, and the complex nonlinear couplings among variables were effectively captured by the established fusion model. The resulting interpretations were found to be broadly consistent with the fundamental hydration mechanisms and mechanical behavior of concrete, and the reliability and interpretability of the prediction model for engineering applications were thereby verified.

4. Discussion

A two-level heterogeneous ensemble residual fusion model based on DE-LSSVM is proposed for predicting concrete compressive strength.

The superior performance of the model can be attributed to the following factors:

(1): The introduction of DE enhanced the hyperparameter optimization capability of the LSSVM for complex nonlinear relationships, enabling the base predictor to obtain more stable hyperparameter combinations. Since DE optimization was confined to the LSSVM base predictor alone, hyperparameter tuning was effectively decoupled from ensemble construction, thereby reducing computational cost while improving training stability and generalization performance under different dataset partition conditions.
(2): Rather than a single-layer correction, a structured multi-path residual learning mechanism is adopted. The model’s performance is largely driven by a shift of the modeling target from the output space to the residual space. In most existing studies, residual correction is implemented as a single-layer serial structure, where a primary prediction is generated by one model and the residual is corrected by another. In the proposed framework, residual learning is organized into a multi-path architecture: the same residual is processed in parallel by two heterogeneous ensemble learners, and their outputs are adaptively integrated by a meta-learner.
(3): Ablation experiments revealed that the performance improvement was driven primarily by the complementarity among residual learning mechanisms, rather than by simply increasing the number of learners. The superiority of the two-stage residual fusion structure further indicates that a well-designed fusion architecture combined with an adaptive nonlinear ensemble strategy is far more critical to generalization than merely stacking models. Moreover, the stability observed across multiple random seeds confirms the robustness and reliability of the fusion framework in predicting concrete compressive strength.
(4): The proposed model’s insensitivity to random initialization is confirmed by the low standard deviations observed across all metrics (e.g., 0.0029 for R²). This limited variance indicates that the DE-optimized LSSVM framework converges stably to a favorable solution regardless of starting conditions. Such stability not only ensures that research outcomes are reproducible but also demonstrates that the observed performance improvements are systematic, rather than artifacts of seed selection. Minor fluctuations across independent runs did not affect the overall conclusions, further confirming the robustness of the reported results.
(5): SHAP analysis reveals that the fusion model can effectively capture the nonlinear interactions between concrete mix parameters and strength development. The contribution patterns of the input variables align well with the theory of concrete hydration and the water–binder ratio, suggesting that the model not only predicts accurately but also makes physical sense. In addition, the way SHAP contributions vary across different samples further shows that the development of concrete compressive strength is driven by the coupled effects of multiple variables, rather than being controlled by any single factor alone.

From an engineering perspective, the proposed fusion framework can be utilized as a reliable tool for predicting concrete compressive strength and as a reference for practical tasks such as mix proportion optimization, construction quality control, and early-age strength evaluation. The transparency of the model’s decision-making process is further enhanced by the incorporation of SHAP interpretability, which helps satisfy the dual demands for prediction accuracy and physical interpretability in engineering applications.

5. Conclusions

Based on a dataset comprising 1030 concrete mix proportions and compressive strength samples, a heterogeneous residual ensemble (TreeBagger + LSBoost) prediction model integrating a DE-optimized LSSVM was developed, and its interpretability was further examined using the SHAP method. Through systematic model development and validation, the following principal conclusions were drawn:

(1): On the test set, under multi-seed evaluation, a mean R² of 0.9490 was achieved, along with an RMSE of 3.7873 MPa and an MAE of 2.4998 MPa. The consistently small standard deviations (e.g., 0.0029 for R²) confirmed the stability of the proposed model.
(2): Ablation experiments demonstrated that the two-stage residual design outperformed both single-corrector and three-learner configurations, confirming that model complementarity and adaptive nonlinear combination are more critical than simply adding more learners. Moreover, the robustness and reliability of the fusion framework for predicting concrete compressive strength are further demonstrated by the stability observed across multiple random seeds.
(3): SHAP analysis shows that the fusion model can accurately pinpoint the key factors influencing concrete compressive strength, capture the complex nonlinear interactions among multiple variables, and yield interpretations that are largely consistent with the hydration mechanisms of concrete materials. Furthermore, the key factors and contribution patterns identified by SHAP provide theoretical insight and practical reference for concrete mix design, water–binder ratio control, and curing management.

In summary, a reliable data-driven framework for concrete mix design is developed. SHAP-based interpretability quantifies the contributions of key factors, including curing age and cement content, transforming the model from a ‘black box’ into a transparent analytical tool and offering engineers theoretical insights into material behavior. By integrating high accuracy with explainability, this framework is expected to reduce experimental costs, improve design efficiency, and promote the adoption of interpretable machine learning in civil engineering materials.

Future research may be directed toward the following areas: model generalization can be improved through the integration of multi-source mix design data; material mechanisms may be better elucidated by constructing physically meaningful composite features such as effective water–cement ratio; model robustness can be enhanced via ensemble optimization and data distribution refinement; and synergistic effects among variables can be uncovered using SHAP interaction values or local sensitivity analysis.

Author Contributions

Conceptualization, J.S. and Y.W.; methodology, J.S. and Y.W.; software, Y.W.; validation, J.S., Y.W. and X.W.; formal analysis, Y.W.; investigation, Y.W.; data curation, Y.W.; resources, J.S.; writing—original draft preparation, Y.W.; writing—review and editing, J.S., Y.W. and X.W.; visualization, Y.W.; supervision, J.S. and X.W.; project administration, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found in the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/165/concrete, accessed on 10 June 2025. The datasets used during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

Author Xiongyu Wang was employed by the company Research on Bridge Engineering Construction China Railway Major Bridge Science Research Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

GB 50666-2011; Ministry of Housing and Urban-Rural Development of the People’s Republic of China. China Architecture & Building Press: Beijing, China, 2011.
Yuan, J.; Wang, J.B.; Chen, X.; Huang, X.; Zhang, A.X.; Cui, A.Q. Research progress on the application of artificial intelligence in ultra-high performance concrete. J. Jilin Univ. (Eng. Technol. Ed.) 2025, 55, 771–789.
Xiao, J.Z.; Deng, Q.; Xia, B. Evolution and Prospects of Low-Carbon Concrete Preparation. J. Archit. Civ. Eng. 2022, 39, 1–12.
Cihan, M.T. Comparison of artificial intelligence methods for predicting compressive strength of concrete. Građevinar 2021, 73, 617–632.
Ji, T.; Lin, T.W.; Lin, X.J. Prediction method of concrete compressive strength based on artificial neural network. J. Build. Mater. 2005, 8, 677–681.
Wu, T.; Huang, K.; Yao, D.F. Prediction of compressive strength and analysis of influencing factors of ceramsite lightweight aggregate concrete based on BP-ANN. Bull. Chin. Ceram. Soc. 2015, 34, 2476–2481.
He, X.F. Prediction of concrete compressive strength based on PSO-BP neural network. Microcomput. Appl. 2011, 30, 87–90.
Cui, X.N.; Wang, Q.C.; Zhang, R.L.; Dai, J.P.; Xie, C. Prediction of compressive strength of high-performance concrete based on random forest. J. Lanzhou Jiaotong Univ. 2021, 40, 14.
Liu, Y.; Cao, Y.; Wang, L.; Chen, Z.; Qin, Y. Prediction of the durability of high-performance concrete using an integrated RF-LSSVM model. Constr. Build. Mater. 2022, 356, 129232.
Wang, K.; Ren, J.; Yan, J.; Wu, X.; Dang, F. Research on a concrete compressive strength prediction method based on the random forest and LCSSA-improved BP neural network. J. Build. Eng. 2023, 76, 107150.
Tran, V.; Dang, V.Q.; Ho, L.S. Evaluating compressive strength of concrete made with recycled concrete aggregates using machine learning approach. Constr. Build. Mater. 2022, 323, 126578.
Liu, G.; Sun, B. Concrete compressive strength prediction using an explainable boosting machine model. Case Stud. Constr. Mater. 2023, 18, e01845.
Shaaban, M.; Amin, M.; Selim, S.; Riad, I.M. Machine learning approaches for forecasting compressive strength of high-strength concrete. Sci. Rep. 2025, 15, 25567.
Cao, Y. Research on the prediction of concrete compressive strength based on Bayesian optimization CatBoost model. Concrete 2025, 7, 87–94.
Das, S.; Suganthan, P.N. Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans. Evol. Comput. 2011, 15, 4–31.
Wongsa, W.; Puphasuk, P.; Wetweerapong, J. Differential Evolution with Adaptive Mutation and Crossover Strategies for Nonlinear Regression Problems. Bull. Electr. Eng. Inform. 2024, 13, 3503–3514.
Hu, Y.C.; Liang, M.; Xie, C.R.; Xie, W.W.; Weng, Y.L.; Chi, H.; Peng, H.; Luo, X.S. Strength Prediction Method of High Performance Concrete Based on Stacking Model Fusion. Bull. Chin. Ceram. Soc. 2023, 42, 3914–3926. Available online: http://gsytb.jtxb.cn/EN/Y2023/V42/I11/3914 (accessed on 5 January 2026).
Rahchamani, G.; Movahedifar, S.M.; Honarbakhsh, A. Fusion-Learning-Based Optimization: A Modified Metaheuristic Method for Lightweight High-Performance Concrete Design. Complexity 2022, 2022, 6322834. [Google Scholar] [CrossRef]
Li, T.; Hu, X.; Li, T.; Liao, J.; Mei, L.; Tian, H.; Gu, J. Enhanced Prediction and Evaluation of Hydraulic Concrete Compressive Strength Using Multiple Soft Computing and Metaheuristic Optimization Algorithms. Buildings 2024, 14, 3461. [Google Scholar] [CrossRef]
Wu, Y.; Wang, S.; Gao, C.; Sun, R.; Xia, C. Residual-Corrected Stacking Ensemble Learning for Concrete Strength Prediction and Optimization of Low-Carbon Mix Design. AIP Adv. 2025, 15, 115237. [Google Scholar] [CrossRef]
Santos Júnior, D.S.O.; de Mattos Neto, P.S.G.; de Oliveira, J.F.L.; Cavalcanti, G.D.C. A hybrid system based on ensemble learning to model residuals for time series forecasting. Inf. Sci. 2023, 649, 119614. [Google Scholar] [CrossRef]
Xu, C.; Zhong, P.; Zhu, F.; Xu, B.; Wang, Y.; Yang, L.; Wang, S.; Xu, S. A hybrid model coupling process-driven and data-driven models for improved real-time flood forecasting. J. Hydrol. 2024, 638, 131494. [Google Scholar] [CrossRef]
Abioye, S.O.; Babatunde, Y.O.; Abikoye, O.A.; Shaibu, A.N.; Bankole, B.J. Optimized Machine Learning Algorithms with SHAP Analysis for Predicting Compressive Strength in High-Performance Concrete. AI Civ. Eng. 2025, 4, 16. [Google Scholar] [CrossRef]
Ghrici, A.A.; Benzaamia, A.; Mezzoudj, F.; Medjahed, C.; Ghrici, M. SHAP-Enhanced Tree-Based Regression for Predicting the Compressive Strength of High Performance Concrete. Constr. Build. Mater. 2025, 495, 143602. [Google Scholar] [CrossRef]
Liu, C.L.; Li, S.; Cui, X.N.; Cai, L.; Zhang, J.G. A predictive analysis model for concrete compressive strength integrating XGBoost and SHAP. Water Resour. Hydropower Eng. 2025, 56, 246–258. [Google Scholar] [CrossRef]
Lin, X.; Liang, S.X.; Feng, S.Y. Prediction model of compressive strength of bagasse ash concrete based on ensemble learning. J. Zhejiang Sci.-Tech. Univ. (Nat. Sci.) 2024, 51, 507–517. [Google Scholar] [CrossRef]
Taiwo, R.; Yussif, A.M.; Adegoke, A.H.; Zayed, T. Prediction and Deployment of Compressive Strength of High-Performance Concrete Using Ensemble Learning Techniques. Constr. Build. Mater. 2024, 451, 138808. [Google Scholar] [CrossRef]
Li, S.; Ailifeila, A.; Luo, W.B.; Chen, J.J. Interpretable prediction of compressive strength of ultra-high performance concrete based on AutoML-SHAP. Bull. Chin. Ceram. Soc. 2024, 43, 3634–3644. [Google Scholar] [CrossRef]
Miao, Q.; Gao, Z.; Zhu, K.; Guo, Z.; Sun, Q.; Zhou, L. Interpretable machine learning model for compressive strength prediction of self-compacting concrete with recycled concrete aggregates and scms. J. Build. Eng. 2025, 108, 112965. [Google Scholar] [CrossRef]
Singh, D.; Singh, B. Investigating the impact of data normalization on classification performance. Appl. Soft Comput. 2020, 97, 105524. [Google Scholar] [CrossRef]
Xue, X. Evaluation of concrete compressive strength based on an improved pso-lssvm model. Comput. Concr. 2018, 21, 505–511. [Google Scholar] [CrossRef]
Storn, R.; Price, K. Differential Evolution-A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
Wu, L. A review of the transition from Shapley values and SHAP values to RGE. Statistics 2025, 59, 1161–1183. [Google Scholar] [CrossRef]

Figure 1. Technology roadmap.

Figure 2. (a) Pearson correlation coefficient matrix; (b) pairwise feature relationship and distribution diagram.

Figure 3. DE-LSSVM heterogeneous residual ensemble and SHAP flowchart.

Figure 4. Results of models with different training-set ratios. (a) Training set: fusion prediction (ratio = 0.75); (b) training set: fusion prediction (ratio = 0.80); (c) training set: fusion prediction (ratio = 0.85); (d) training set: fusion prediction (ratio = 0.90).

Figure 5. Comparison of ablation results. (a) R² comparison; (b) RMSE comparison; (c) MAE comparison; (d) RPD comparison.

Figure 6. Comparison of five fusion strategies (optimal ratio = 0.9).

Figure 7. Performance line chart for different random seeds.

Figure 8. Performance comparison of seven predictive models for concrete compressive strength. (a) Test R² values; (b) test RMSE values; and (c) test MAE values.

Figure 9. Scatter plots of predicted versus actual compressive strength values for the seven compared models, including (a) DE-LSSVM, (b) BP, (c) RF, (d) SVR, (e) LR, (f) DT, and (g) DE-LSSVM (TB + LS).

Figure 10. SHAP-based interpretability analysis of the prediction model. (a) Average feature importance; (b) SHAP beeswarm plot.

Figure 11. Local SHAP contributions of input features for a test sample.

Table 1. Descriptive statistics of the dataset.

Variable	Type	Unit	Minimum Value	Maximum Value	Mean	Standard Deviation	Skewness
Cement content	Input	kg/m³	102.00	540.00	281.17	104.51	0.51
Slag content	Input	kg/m³	0.00	359.40	73.90	86.28	0.80
Fly ash content	Input	kg/m³	0.00	200.10	54.19	64.00	0.54
Water content	Input	kg/m³	121.75	247.00	181.57	21.36	0.07
Superplasticizer	Input	kg/m³	0.00	32.20	6.20	5.97	0.91
Coarse aggregate	Input	kg/m³	801.00	1145.00	972.92	77.75	−0.04
Fine aggregate	Input	kg/m³	594.00	992.60	773.58	80.18	−0.25
Age	Input	day	1.00	365.00	45.66	63.17	3.27
Compressive strength	Target	MPa	2.33	82.60	35.82	16.17	0.42

Note: The dataset contains 1030 samples. “Input” and “Target” denote predictor variables and the response variable, respectively.

Table 2. Parameter settings for DE and fitness.

Component	Parameter	Value
DE	Population size NP	40
	Maximum iterations $T_{\max}$	300
	Mutation strategy	DE/rand/1/bin
	Scaling factor F	Random-U(0.5,1.0) per individual
	Crossover rate CR	Random-U(0,1) per individual
Fitness	Cross-validation repeats	3
	Folds per repeat	5
	Metric	Mean RMSE
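
The settings in Table 2 can be illustrated with a minimal DE/rand/1/bin sketch. This is not the authors' implementation: the function names `de_rand_1_bin` and `toy_fitness` are hypothetical, the bound clipping is an assumption, and the fitness here is a toy quadratic standing in for the mean RMSE over 3 × 5 repeated cross-validation of the LSSVM.

```python
import random


def de_rand_1_bin(fitness, bounds, pop_size=40, max_iter=300, seed=0):
    """Minimise `fitness` over the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [fitness(x) for x in pop]
    for _ in range(max_iter):
        for i in range(pop_size):
            # Per-individual control parameters, as listed in Table 2.
            f = rng.uniform(0.5, 1.0)
            cr = rng.uniform(0.0, 1.0)
            # rand/1 mutation: three distinct individuals, none equal to i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    v = pop[r1][j] + f * (pop[r2][j] - pop[r3][j])
                    v = min(max(v, lo), hi)  # clip to the box (assumption)
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: keep the trial only if it is no worse.
            c = fitness(trial)
            if c <= cost[i]:
                pop[i], cost[i] = trial, c
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def toy_fitness(x):
    # Placeholder objective with its minimum at (1.0, 2.0).
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2


best_x, best_cost = de_rand_1_bin(
    toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)], pop_size=20, max_iter=100, seed=1
)
```

In the setting of this study, `toy_fitness` would be replaced by a function that trains the LSSVM with the candidate hyperparameters and returns the cross-validated mean RMSE, with `pop_size=40` and `max_iter=300`.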

Table 3. Performance metrics for different training/test split ratios.

Train:Test Ratio	Training-Set R²	Test-Set R²	RMSE	MAE	RPD
0.75:0.25	0.9964	0.9282	4.3482	2.8731	3.7345
0.80:0.20	0.9962	0.9362	4.1286	2.6129	3.9607
0.85:0.15	0.9956	0.9434	3.9476	2.5642	4.2236
0.90:0.10	0.9958	0.9625	3.2891	2.1580	5.2240
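
As a rough illustration of how the four metrics reported in Tables 3 and 4 relate to one another, the sketch below computes them from observed and predicted values. It assumes the common definition RPD = (sample standard deviation of the observations) / RMSE; the function name `regression_metrics` and the toy values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and RPD for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    sd = math.sqrt(ss_tot / (n - 1))  # sample SD of the observations (assumed for RPD)
    return {"R2": 1.0 - ss_res / ss_tot, "RMSE": rmse, "MAE": mae, "RPD": sd / rmse}


# Toy strengths in MPa, for illustration only.
y_true = [20.0, 30.0, 40.0, 50.0]
y_pred = [22.0, 29.0, 41.0, 48.0]
metrics = regression_metrics(y_true, y_pred)
```

Under this definition RPD is approximately 1/sqrt(1 − R²), which is why the RPD column rises together with the test-set R².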

Table 4. Performance comparison of ablation results on the test set.

Model	Test-Set R²	RMSE	MAE	RPD
LSSVM Baseline	0.9183	4.8523	3.5569	3.5328
LSSVM and TreeBagger	0.9399	4.1626	2.8467	4.1481
LSSVM and TreeBagger with Bagging	0.9372	4.2549	2.8398	4.0535
LSSVM and Three-Level Ablation	0.9500	3.7948	2.2666	4.5002
LSSVM and TreeBagger with LSBoost	0.9512	3.7498	2.1819	4.5626

Table 5. Test-set performance metrics for different random seeds.

Random Seed	99	666	2025	2028	2030	8888	Mean ± Standard Deviation
R²	0.9514	0.9442	0.9503	0.9518	0.9492	0.9471	0.9490 ± 0.0029
RMSE	3.7450	3.7849	3.7859	3.4214	3.9643	4.0221	3.7873 ± 0.2108
MAE	2.6003	2.5743	2.3929	2.3983	2.7036	2.7413	2.4998 ± 0.3000
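
The summary column for the R² and RMSE rows can be reproduced from the per-seed values with the sample (n − 1) standard deviation. The short check below is a sketch; the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Per-seed test-set values copied from the R² and RMSE rows of Table 5.
r2 = [0.9514, 0.9442, 0.9503, 0.9518, 0.9492, 0.9471]
rmse = [3.7450, 3.7849, 3.7859, 3.4214, 3.9643, 4.0221]


def summarize(values):
    # statistics.stdev uses the n - 1 (sample) denominator.
    return round(mean(values), 4), round(stdev(values), 4)


r2_summary = summarize(r2)      # mean and SD of R² across seeds
rmse_summary = summarize(rmse)  # mean and SD of RMSE across seeds
```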

Table 6. Performance comparison of traditional models and the proposed model.

Model	R²	RMSE	MAE
DE-LSSVM	0.9181	4.8590	3.5830
BP	0.8324	6.9507	4.8498
RF	0.8705	6.1110	4.4591
SVR	0.8602	6.3474	4.4525
LR	0.5873	10.9080	8.1877
DT	0.7675	8.1877	5.3680
DE-LSSVM Residual Ensemble	0.9512	3.7498	2.1819

Table 7. Mean absolute SHAP values and mean signed SHAP values of input features.

Feature	Cement	Age	Slag	Water	Fly Ash	Superplasticizer	Fine Aggregate	Coarse Aggregate
Mean |SHAP|	7.37513	7.24585	6.26928	4.12195	3.94865	2.25767	1.86108	1.82645
Mean SHAP	−0.038	2.164	−0.557	−0.238	−0.371	0.908	−0.105	0.028

Note: Mean |SHAP| represents the global importance of each feature, whereas Mean SHAP represents the average signed contribution relative to the model baseline prediction.
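
The two rows of Table 7 are column-wise aggregates of a samples × features matrix of SHAP values. The sketch below shows the aggregation on a small made-up matrix; `shap_summary` and the toy numbers are hypothetical, and in practice the matrix would come from a SHAP explainer applied to the fitted model.

```python
def shap_summary(shap_matrix, feature_names):
    """Column-wise mean |SHAP| (importance) and mean signed SHAP (direction)."""
    n = len(shap_matrix)
    summary = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in shap_matrix]
        summary[name] = {
            "mean_abs": sum(abs(v) for v in column) / n,  # global importance
            "mean_signed": sum(column) / n,  # average push relative to the baseline
        }
    return summary


# Made-up SHAP values (MPa) for three samples and two features.
toy_shap = [[4.0, -1.0], [-2.0, 3.0], [6.0, -2.0]]
summary = shap_summary(toy_shap, ["Age", "Water"])
# Rank features by global importance, largest first.
ranking = sorted(summary, key=lambda k: summary[k]["mean_abs"], reverse=True)
```

A feature can have a large mean |SHAP| and a near-zero mean signed SHAP when its positive and negative contributions cancel across samples, as with cement in Table 7.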

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shi, J.; Wang, Y.; Wang, X. Research on Concrete Compressive Strength Prediction Based on DE-Optimized LSSVM and Multi-Level Heterogeneous Ensemble Residual Fusion. Eng 2026, 7, 250. https://doi.org/10.3390/eng7050250

