Article

Comparative Performance Analysis of Machine Learning Models for Compressive Strength Prediction in Concrete Mix Design

1 School of Civil Engineering and Transportation, Northeast Forestry University, Harbin 150040, China
2 School of Statistics and Mathematics, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
3 Intelligent Campus Construction Center, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
4 Academy of Marxism, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Math. Comput. Appl. 2025, 30(6), 128; https://doi.org/10.3390/mca30060128
Submission received: 23 October 2025 / Revised: 24 November 2025 / Accepted: 26 November 2025 / Published: 27 November 2025

Abstract

Recycled aggregate concrete (RAC) is a sustainable alternative to conventional concrete, reducing environmental hazards and conserving resources. Accurate compressive strength (CS) prediction is critical for its broader acceptance. This study uses machine learning (ML) models (elastic net regression, KNN, ANN, SVR, RF, XGBoost, CatBoost, symbolic regression, stacking) trained on 1030 conventional concrete mixtures from the UCI repository to support CS prediction for RAC. The best model achieved R2 = 0.92, with the performance order CatBoost > XGBoost > RF > SVR > ANN > symbolic regression > KNN > elastic net regression. Stacking improved RMSE by 6% over CatBoost. Sensitivity analysis on the testing set revealed that CS is most sensitive to the cement (C) content and testing age (TA), which aligns with existing experimental research. External validation, often neglected in prediction-model research, was performed to assess the models’ generalizability and reliability on heterogeneous data. Lastly, a user-friendly graphical interface was developed that allows users to input custom mix parameters and obtain CS predictions for sustainable RAC mixtures. This study offers insights into optimizing concrete mix designs for RAC, improving its performance and sustainability, and advances the knowledge of cementitious materials in line with industrial and environmental objectives.

1. Introduction

Globally, concrete is the most widely used construction material [1], and cement (C) is a key constituent of concrete. In 2019, global demand for C increased by approximately 12%, with projections indicating a near doubling by 2050 [2]. Large-scale production of C depletes natural resources and generates substantial CO2 emissions, undermining sustainability and posing significant environmental risks [3,4]. Therefore, successfully substituting cement with alternative materials that possess the same binding properties will help reduce energy consumption and carbon dioxide emissions.
In recent years, with urbanization, the volume of construction and demolition (C&D) waste has been rising quickly. To reduce the contribution of C&D waste to environmental pollution, many researchers have emphasized its effective management, particularly of waste from building demolition or debris, which represents 70–90% of the total C&D waste [5]. Utilizing recycled aggregate in concrete will not only meet the growing demand for concrete but also mitigate the looming environmental threat. Recycled aggregate comprises waste products such as fly ash (FA), slag (Sl), etc. These waste products are obtained from C&D waste and can partially substitute cement, limiting environmental destruction, waste disposal, and pollution. According to reports, the use of recycled aggregate in concrete can reduce greenhouse gas emissions by up to 80% [6].
However, the surface of recycled aggregate is coated with adhered mortar, which negatively impacts the properties of concrete, especially compressive strength (CS). CS is widely recognized as a crucial mechanical property of concrete and a key factor in ensuring its quality [7]. CS is highly sensitive to the mix proportions of RAC among many other factors; in particular, the recycled aggregate replacement ratio strongly conditions the CS of the resulting concrete. A meticulously engineered RAC mix design not only achieves the target strength specifications but also demonstrates versatility across diverse construction applications.
Testing the various CS properties of RAC is time-consuming, expensive, and requires substantial materials. Recently, with advancements in artificial intelligence, machine learning (ML) algorithms have been commonly employed to predict the mechanical properties of different materials. ML techniques have significantly advanced in real-world applications due to their ability to capture non-linear and obscure relationships among dataset variables. To estimate the CS of concrete, data-driven models based on measured data can effectively replace extensive testing. Over the past two decades, ML approaches, such as artificial neural networks (ANNs) [8,9,10,11,12,13,14,15], support vector machines (SVMs) [16,17,18], decision trees (DTs) [19,20], random forest (RF) [21,22,23,24,25,26,27,28,29,30], and ensemble algorithms [31,32,33,34,35], have been employed to predict the CS of concrete and other properties with great precision across many studies [36,37,38,39,40]. Ayaz et al. [41] developed a generalized and accurate model for predicting the thermal stress resistance (TSR) of fly ash-based concrete using seven machine learning models, employing both individual and ensemble approaches; relative sensitivity was assessed using the one-variable-at-a-time (OVAAT) method. Ben Seghier et al. [42] used four ML methods—SVR, LSSVR, ANFIS, and MLP—for CS modeling, applying the marine predators algorithm (MPA) to optimize the model parameters and enhance the prediction accuracy for concrete containing waste glass. Muhammad et al. [43] conducted a comprehensive comparative performance analysis of diverse machine learning models (multiple linear regression (MLR), ANN, SVM, k-nearest neighbors (KNN), and DT) for CS prediction, and Shapley additive explanations identified the factors that significantly influence the CS of concrete. Golafshani et al. [38] analyzed the CS of RAC using various ML techniques, confirming that stacking enhances model accuracy; sensitivity analysis identified the concrete testing age (TA) and cement content as critical factors influencing CS. Mai et al. [44] used eXtreme gradient boosting (XGBoost) to predict the CS of fiber-reinforced self-compacting concrete (FRSCC), achieving high predictive performance and stability; sensitivity analysis revealed that cement, coarse aggregate (CA), fine aggregate, water, and sample age significantly affect CS, though their influence varies. Jesús et al. [45] collected 515 mix designs for training, validation, and testing, developed an equation to predict the 28-day CS of self-consolidating concrete ingredients, including RAC, and also discussed a sensitivity assessment. Islam et al. [46] demonstrated that deep learning methods such as LSTM networks and convolutional neural networks (CNNs) show potential in modeling non-linear relationships within HPC mixtures with high accuracy. Symbolic regression approaches such as Gene Expression Programming (GEP) and Multi Expression Programming (MEP) have been used in previous studies to predict the CS of fiber-reinforced self-compacting CBMs and of prismatic masonry columns confined by fiber-reinforced polymer, and the results show that both algorithms perform well according to the criteria described in the literature [47,48]. See Table 1 for the literature comparison.
Recent studies have reported gains by coupling learners with metaheuristic optimizers for parameter search or feature weighting, including firefly algorithm-assisted SVR and metaheuristic-based hybrid modeling on concrete datasets [49,50,51]. In this research, five-fold cross-validation was carried out solely on the training set to optimize the hyperparameters, and SHAP analyses were integrated to connect predictors with compressive strength outcomes.
Even though ML models are capable of predicting the CS with high accuracy, their adoption in real applications is limited due to the need for a specialized environment or deployment frameworks. Considering these limitations, the authors used the explainable SHAP method, which quantifies the impact of each parameter. This approach not only enhances model transparency but also addresses deployment barriers by elucidating how individual input features influence compressive strength predictions, thereby supporting engineers in making informed decisions without requiring complex computational setups. Furthermore, it aligns with recent trends in explainable AI, which prioritize user-friendly interpretations to accelerate real-world implementation across diverse construction scenarios. Chamika et al. utilized Shapley values and the extreme gradient boosting model to derive new analytical equations for ABD stiffness entry predictions, marking a novel application in composite material science [52]. Shashika et al. used XGB and Shapley values to estimate the shear strength reduction factor of doubly symmetric RHFBs with both edge-stiffened and unstiffened circular openings for the first time [53].
Moreover, whereas current CS model development has focused on bagging and boosting methods, in this study a stacking technique was utilized, employing multiple base learners, i.e., elastic net regression, KNN, ANN, support vector regression (SVR), random forest (RF), XGBoost, CatBoost, and symbolic regression, to create a meta-model. We validated the individual models and meta-models and compared the performance of each data-driven model using the testing metrics R2, RMSE, MAE, and SI to identify the best-performing algorithms. After developing the ML models, external data gathered from other studies were used to test model generalization; therefore, each model’s behavior on new data beyond the existing database was investigated. Finally, we present an open-source GUI tool with a meta-model designed for concrete mix predictions, where users enter a variety of parameters, including cement (C), fly ash (FA), slag (Sl), water (W), superplasticizer (SP), coarse aggregate (CA), natural fine aggregate (NFA), and testing age (TA), and the algorithm runs based on these inputs. The GUI tool offers engineers an easy way to obtain an a priori prediction of the compressive strength of RAC, even though its constituents are heterogeneous and difficult to classify, with the aim of providing a good working tool for concrete manufacturers.
The remainder of the paper is divided into six sections. Section 2 briefly introduces data preparation. Section 3 covers the overall research methodology of our research. Section 4 provides the overall results and introduces the detailed performance of each ML model for CS prediction and explanation. Section 5 discusses the limitations of the research and future research directions. Finally, Section 6 concludes the paper.

Research Significance

Current knowledge limitations include over-reliance on small datasets, insufficient external validation, and lack of interpretability in complex ensemble models, which hinder practical application in heterogeneous construction scenarios.
This study advances the state of the art by integrating SHAP-based feature attribution into stacking frameworks, establishing a 6% RMSE improvement over CatBoost through rigorous permutation testing (p < 0.05), and introducing cross-study external validation to enhance model generalizability.
The findings improve existing knowledge by quantifying predictor interactions (e.g., cement–age synergies) and providing actionable insights for sustainable RAC mix design via an open-source GUI tool, bridging the gap between data-driven modeling and civil engineering practice.

2. Data Preparation

This study builds upon prior research by utilizing a total of 1030 data samples gathered from the University of California Irvine (UCI) Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength, accessed on 23 October 2025) to build the models and another 952 data samples from five publications to check the models’ generalizability [54,55,56,57,58]. Therefore, each model’s behavior under external validation beyond the existing database was investigated. The UCI dataset comprises 1030 observations of ordinary concrete mixtures with corresponding CS measurements at different ages. It should be noted that this dataset was selected because it offers a sufficient sample size and feature coverage for robust machine learning model development, and it consists largely of standard concrete data. Large-scale open-access RAC datasets are scarce, but standard concrete data can be transformed into parameters applicable to RAC through model parameter correction. The compressive strength of recycled aggregate concrete generally follows the Bolomey relationship with the water–cement ratio:
$f_{cu,0} = \alpha_a f_{ce} \left( k - \alpha_b \right)$, (1)
where $f_{cu,0}$ represents the compressive strength of concrete cubes, $f_{ce}$ denotes the 28-day compressive strength of cement, $k$ is the water–cement ratio, and $\alpha_a$ and $\alpha_b$ are coefficients related to the cement type and aggregate quality. These coefficients should be adjusted according to the recycled aggregate replacement rate.
Given the distinct properties of recycled aggregates compared to natural aggregates, we conducted experiments to determine the coefficients for the Bolomey formula. The raw data involve cement (C), fly ash (FA), slag (Sl), water (W), superplasticizer (SP), coarse aggregate (CA), natural fine aggregate (NFA), testing age (TA), and CS. The data preprocessing steps included the following: (1) verification of data integrity, with no missing values detected; (2) feature normalization using min–max scaling to standardize input ranges to 0–1; and (3) stratification of the dataset into training (80%) and testing (20%) subsets, with five-fold cross-validation performed on the training set only. The remaining 20% of the UCI dataset was utilized exclusively for the final evaluation of model performance. To prepare the raw data for analysis and model development, data preprocessing was performed, involving data cleaning, feature selection, data normalization, and scaling. After splitting the dataset into training and testing subsets, critical outliers were identified and removed from the training set using box plots and scatter plots.
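The preprocessing steps above can be sketched with scikit-learn; this is a minimal sketch on placeholder arrays standing in for the UCI table (the random data and column order are illustrative assumptions, not the actual dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold
from sklearn.preprocessing import MinMaxScaler

# Placeholder data standing in for the 8 UCI predictors and the CS target.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(1030, 8))   # C, Sl, FA, W, SP, CA, NFA, TA
y = rng.uniform(10, 80, size=1030)      # compressive strength (MPa)

# (3) 80/20 train/test split; the 20% is held out for final evaluation only
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# (2) min-max scaling to [0, 1], fit on the training set only to avoid leakage
scaler = MinMaxScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)     # reuse training-set ranges

# Five-fold cross-validation indices defined on the training set only
cv = KFold(n_splits=5, shuffle=True, random_state=0)
folds = list(cv.split(X_train_s))
```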
This dataset is consistent with the publicly available concrete compressive strength dataset on Kaggle, further confirming its reliability and widespread recognition in the research community.
In total, 952 data samples were integrated for external validation, selected to ensure diversity in mixture proportions, testing protocols, and geographic origins—enhancing model generalizability across heterogeneous construction scenarios. All datasets were standardized to SI units (kg/m3 for materials, days for age, MPa for strength) through cross-referencing original publications and converting non-metric units using ISO 10080:2005 guidelines.
The ML models can capture complex, non-linear relationships between inputs and CS, resulting in more precise forecasts. The Anaconda environment was used to run the elastic net regression, KNN, ANN, SVR, RF, XGBoost, CatBoost, symbolic regression, and stacking models, implemented in Python. Figure 1 shows the mutual correlation coefficients between the predictors and compressive strength, computed using Pearson correlation; all are below 0.7, reducing multicollinearity risk.
The correlation analysis examines the variables’ relationships, showing inverse relationships, compensatory dynamics, and positive relationships. (1) Inverse relationships: the relationship between water and CS (−0.29) naturally exhibits an inverse character, as an increase in water content decreases the CS of concrete. (2) Compensatory dynamics: one variable may counterbalance or compensate for another, e.g., cement content and FA content (−0.4); an increase in cement content could be offset by a reduction in FA content. (3) Positive relationships: C, SP, and TA have clear positive relationships with CS (0.5, 0.37, and 0.33, respectively). An increase in cement content and SP substantially improves the CS of concrete, and the longer the TA of concrete, the higher its CS. In summary, the primary factors influencing the CS of concrete are the contents of C, W, and SP, and TA. The correlation heatmap intuitively displays the relationships between these variables, helping us better understand the mechanisms that govern the behavior of CS. Table 2 provides descriptive statistics for all parameters of the UCI variables.
Figure 2 presents the correlations between the compressive strength and the other variables. SP, FA, and Sl show two identifiable clusters in their contour plots, with one cluster centered around zero, reflecting a significant amount of data without SP, FA, and Sl. Another cluster forms around 10 kg/m3 for SP, 125 kg/m3 for FA, and 130 kg/m3 for Sl, indicating a certain percentage replacement of NFA by SP, FA, and Sl. For C and W, the scatter plots show relatively uniform distributions centered at approximately 200 kg/m3 and 190 kg/m3, respectively. Concrete testing age shows a prominent cluster centered around 28 days, which can be attributed to the high frequency of data samples at this age.

3. Methodology

The model suite balances parsimony and diversity across hypothesis classes: a regularized linear baseline (elastic net), a kernel method (support vector regression), gradient-boosted trees (XGBoost, CatBoost), an ensemble tree method (random forest), a symbolic learner (symbolic regression), and a neural learner (ANN), enabling complementary inductive biases and fair comparison under a unified protocol. Data preprocessing (scaling) was fitted on each training fold and applied to the corresponding validation fold to avoid leakage.
Hyperparameters were selected via five-fold cross-validation conducted on the training set only. After setting the values of the hyperparameters, k models for each ML algorithm were developed in which (k − 1) folds were used for training the ML model in each iteration and the remaining fold was reserved for validating the model. The average performance (R2) of the k ML models in the validating folds for given hyperparameters shows the quality of the assumed hyperparameters and the ML algorithm. After determining the optimal hyperparameters for each ML technique, a single ML model was developed for the development dataset using these optimal hyperparameters, instead of using k different ML models.
Development performance is reported as mean ± SD across folds, and the final configuration was refit on the full training set and evaluated once on the held-out test set. Regarding the stacking model, different combinations of the ML models were considered to make a meta-model. Comparison of model output and the experimental data was performed to check the external generalization by the R2 values. Feature attribution and effect visualization employ complementary tools. SHAP quantifies predictor contributions; global importance is computed as the mean absolute SHAP value across samples, and local force plots explain individual predictions. All explanations are computed on the testing set after training under the identical preprocessing and data split. The methods used in this study are detailed and applied in the subsequent sections.
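The fold-averaged hyperparameter selection described above can be illustrated as follows; this is a hedged sketch on synthetic data, using Ridge regression and a toy grid as stand-ins for the paper's actual learners and search spaces:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import Ridge

# Synthetic stand-in for the training set (8 predictors, linear target).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
grid = [0.01, 0.1, 1.0, 10.0]   # candidate hyperparameter values (toy grid)

# Average validation R2 across the k folds scores each candidate
mean_r2 = {a: cross_val_score(Ridge(alpha=a), X, y, cv=cv, scoring="r2").mean()
           for a in grid}
best_alpha = max(mean_r2, key=mean_r2.get)

# A single model is then refit on the full training set with the best value
final_model = Ridge(alpha=best_alpha).fit(X, y)
```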

3.1. Elastic Net Regression

Model selection used random search with five-fold cross-validation on the training set. Preprocessing (scaling) was fit per training fold and applied to validation folds to prevent leakage. The best model maximized the mean cross-validated R2, with RMSE and parsimony as tie-breakers. After selection, the model was refit on the full training set and evaluated on the test set. Search budgets: elastic net/SVR/RF/XGBoost/CatBoost, 100 trials each; ANN, 50 trials with early stopping on validation loss.
Elastic net regression serves as a transparent linear baseline, providing interpretability and multicollinearity control via combined L1 and L2 regularization. It balances feature selection and model stability, with hyperparameters µ (regularization strength) and R (L1/L2 trade-off). The error function is given in Equations (2)–(4) [59,60].
$E(\alpha) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \mu\left[R\sum_{j=1}^{p}\left|\alpha_j\right| + \frac{1}{2}\left(1-R\right)\sum_{j=1}^{p}\alpha_j^2\right]$, (2)
$R = \frac{\lambda_1}{\lambda_1 + \lambda_2}$, (3)
$\hat{y}_i = \alpha_0 + \sum_{j=1}^{p}\alpha_j x_{ij}$, (4)
where $x_{ij}$ and $\alpha_j$ denote the jth predictor and its corresponding regression coefficient, respectively; $y_i$ and $\hat{y}_i$ represent the actual output and the value predicted by the regression equation, respectively. The strengths of the Lasso (L1) and Ridge (L2) regularization are controlled by $\lambda_1$ and $\lambda_2$, respectively; $n$ and $p$ denote the numbers of data samples and predictors.
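In scikit-learn's ElasticNet, µ corresponds to the `alpha` argument and R to `l1_ratio`. A minimal sketch on synthetic data follows (the alpha and l1_ratio values below are illustrative, not the tuned ones from this study):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Synthetic data with a sparse true coefficient vector.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 8))
coef = np.array([1.5, 0.0, -2.0, 0.0, 0.5, 0.0, 0.0, 3.0])
y = X @ coef + rng.normal(scale=0.1, size=300)

# mu -> alpha (overall strength), R -> l1_ratio (L1/L2 trade-off)
model = ElasticNet(alpha=0.05, l1_ratio=0.5).fit(X, y)

# The L1 part can zero out some coefficients (feature selection)
n_selected = int(np.sum(model.coef_ != 0))
```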

3.2. KNN

KNN, or K-nearest neighbors, is a non-parametric method that predicts a query sample from the K most similar samples stored in a dataset. It is widely used in fields such as manifold learning, computer vision, machine learning, and multimedia information retrieval.
Let the design reference set DATASET be divided into i concrete preparation samples, where each sample has eight feature values (the input mix-proportion variables $x_{i1}, \ldots, x_{i8}$) and a CS value $x_i$; then
$\mathrm{DATASET} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{18} & x_1 \\ x_{21} & x_{22} & \cdots & x_{28} & x_2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_{i1} & x_{i2} & \cdots & x_{i8} & x_i \end{bmatrix}$. (5)
The K value represents the number of neighboring samples used to predict the CS of a new sample; after initializing K, the K-nearest neighbor algorithm is applied to predict the CS of concrete. The prepared concrete mix variable feature vector is denoted as $y_i$. The Euclidean distance between the operational feature vector $\hat{y}_i$ and the reference sample mix feature vector $y_i$ in the dataset is
$D_n = \sqrt{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}$, (6)
where $\hat{y}_i$ and $y_i$ describe the ith predicted and experimental feature values of CS, respectively.
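A minimal KNN regression sketch consistent with the Euclidean-distance rule in Equation (6) (synthetic stand-in data; the toy strength function below is an assumption for illustration only):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Synthetic mix data: toy CS rises with "cement" (col 0), falls with "water" (col 3).
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(500, 8))
y = 30 + 40 * X[:, 0] - 20 * X[:, 3]

# The K nearest neighbours (Euclidean distance) average their CS values
knn = KNeighborsRegressor(n_neighbors=5, metric="euclidean").fit(X, y)
pred = knn.predict(X[:3])
```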

3.3. ANN

A feed-forward MLP is used as follows: input layer sized to predictors, two hidden layers (128, 64) with ReLU activations, He initialization, dropout = 0.2, Adam optimizer (initial lr = 1 × 10−3 with ReduceLROnPlateau), batch size = 64, and early stopping (patience = 20) based on validation loss. ANNs are computational models inspired by biological neurons, processing data through interconnected layers. They are widely used in ML. Data enter the input layer with initialized weights and biases, and pass through hidden layers for non-linear transformation via activation functions like ReLU. During feed-forward propagation, outputs are computed and compared to actual outcomes using error metrics. Back propagation adjusts weights and biases via optimization to minimize error, allowing ANNs to model complex relationships often outperforming traditional models in prediction, classification, and pattern recognition [50,51].
$y_j = f\left(\sum_{i} w_{ji} x_i + b_j\right)$, (7)
where $y_j$ represents the output value of the jth artificial neuron in the hidden or output layers. The activation function $f$ processes the weighted inputs, where $w_{ji}$ indicates the connection weight from the ith neuron in the previous layer to the jth neuron in the current layer, and $x_i$ denotes the output value from the ith neuron in the preceding layer. Additionally, $b_j$ serves as a bias term for the jth neuron, regulating activation thresholds and modulating neural activity levels.
The weights wji must be calculated efficiently for the ANN model to converge to the optimal solution with sufficient training data. The error in weight calculation is determined by Equation (8) [60,61]:
$E(w) = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2$, (8)
where y i is the actual output in the training data and y ^ i is the predicted output by the ANN model. The error described by Equation (8) is a function of the ANN predicted output y ^ i , which in turn is dependent on weights wji of the hidden and output neurons. Greedy algorithms based on the gradient descent method were used to provide an efficient optimization to non-linear ANN models for the present study. The weights based on the gradient descent formula are given by Equation (9):
$w_j \leftarrow w_j - L_{rate}\,\frac{\partial E(w)}{\partial w_j}$, (9)
where $L_{rate}$ is the learning rate and $\partial E(w)/\partial w_j$ is the gradient of the system error with respect to the weight $w_j$.
Forward propagation combined with plain gradient descent can leave the algorithm trapped in local minima. This limitation is addressed through back propagation, where the weights of neurons in layer L are updated based on the errors propagated from layer L + 1. This process continues until the stopping criterion is satisfied or the specified number of epochs is reached.
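The MLP configuration above can be approximated with scikit-learn's MLPRegressor (a sketch on synthetic data; dropout, He initialization, and ReduceLROnPlateau are framework-specific features not exposed by MLPRegressor, so this reproduces only the layer sizes, activation, optimizer, batch size, and early stopping):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic non-linear regression problem standing in for the CS data.
rng = np.random.default_rng(3)
X = rng.normal(size=(400, 8))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.05, size=400)

Xs = StandardScaler().fit_transform(X)

# Two hidden layers (128, 64), ReLU, Adam with lr = 1e-3, batch 64,
# early stopping with patience 20 on an internal validation split.
mlp = MLPRegressor(hidden_layer_sizes=(128, 64), activation="relu",
                   solver="adam", learning_rate_init=1e-3, batch_size=64,
                   early_stopping=True, n_iter_no_change=20,
                   max_iter=500, random_state=0).fit(Xs, y)
```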

3.4. SVR

SVR is applied to find a hyperplane or function f(x) that best fits the data y i within a specified margin of error ( ε ) utilizing kernel functions [43,60]. A detailed explanation of SVR and its mathematical formulation is below.
$f(x) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right) K\left(x_i, x_j\right) + b$, (10)
where $\alpha_i, \alpha_i^{*} \geq 0$ are Lagrange multipliers, $K(x_i, x_j)$ is the kernel function, and $b$ is the bias term, computed using constraints from the support vectors.
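A short SVR sketch on synthetic data (the kernel, C, and epsilon values are illustrative assumptions, not the study's tuned settings):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler

# Synthetic non-linear target standing in for CS.
rng = np.random.default_rng(4)
X = rng.uniform(-2, 2, size=(300, 2))
y = X[:, 0] ** 2 + X[:, 1]

Xs = StandardScaler().fit_transform(X)

# RBF kernel K(x_i, x_j); epsilon sets the error-insensitive margin
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(Xs, y)

# Only support vectors carry nonzero (alpha_i - alpha_i*) coefficients
n_support = len(svr.support_)
```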

3.5. Random Forest (RF)

Random forests (RFs) constitute an ensemble approach that builds multiple decision trees using bootstrap sampling and random feature selection. The aggregated forest emerges by averaging the predictions from each individual tree. This method prevents overfitting and boosts robustness relative to a single decision tree, establishing RF as a reliable model for diverse regression applications. For a dataset D with n samples, for instance, it will construct m decision trees T1, T2,…Tm. Thus, each of the trees is trained on a bootstrap sample Di from D, and at each node, a random subset of k features is selected to find the best split. For classification, the output is determined by a majority vote across all trees [61].
$\hat{y} = \operatorname{mode}\left(T_1(x), T_2(x), \ldots, T_m(x)\right)$. (11)
For regression, the output is the average prediction from all trees:
$\hat{y} = \frac{1}{m}\sum_{i=1}^{m} T_i(x)$. (12)
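The averaging rule in Equation (12) can be verified directly against scikit-learn's RandomForestRegressor (a sketch on synthetic data; tree count and feature-subset rule are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic linear target standing in for CS.
rng = np.random.default_rng(5)
X = rng.uniform(size=(400, 8))
y = 10 * X[:, 0] + 5 * X[:, 7] + rng.normal(scale=0.1, size=400)

# m = 200 trees, each grown on a bootstrap sample with random feature subsets
rf = RandomForestRegressor(n_estimators=200, max_features="sqrt",
                           random_state=0).fit(X, y)

# Regression output is the average over the per-tree predictions
per_tree = np.stack([t.predict(X[:5]) for t in rf.estimators_])
averaged = per_tree.mean(axis=0)
```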

3.6. XGBoost

XGBoost enhances model accuracy by using an ensemble of DT regressions with boosting. The process involves multiple DTs, and the final output is a weighted sum of the individual tree outputs. The overall error is minimized through a loss function.
$E = \sum_{i} l\left(y_i, \hat{y}_i\right) + \eta \sum_{k}\left(\gamma T + L_1 \sum_{j}\left|w_j\right| + 0.5\, L_2 \sum_{j} w_j^2\right)$, (13)
where $l$ is the loss function; $y_i$ and $\hat{y}_i$ are the actual and predicted values, respectively; $\eta$ controls the learning-rate shrinkage and $\gamma$ penalizes the structural complexity (number of leaves $T$) of each tree; and the $L_1$ and $L_2$ regularization terms applied to the leaf weights $w_j$ prevent overfitting [62].
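The additive boosting idea behind Equation (13) can be sketched with scikit-learn's GradientBoostingRegressor as a stand-in (the γ and L1/L2 leaf penalties are XGBoost-specific and are not reproduced here; hyperparameter values are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic linear target standing in for CS.
rng = np.random.default_rng(6)
X = rng.uniform(size=(400, 8))
y = 20 * X[:, 0] - 10 * X[:, 3] + 5 * X[:, 7]

# Each boosting stage fits the residual of the current ensemble,
# scaled by the learning rate (the eta shrinkage in Eq. (13)).
gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=3, random_state=0).fit(X, y)
```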

3.7. Category Boosting (CatBoost)

CatBoost is a state-of-the-art ensemble machine learning algorithm developed by Yandex, belonging to the gradient boosting family. It is designed to enhance prediction accuracy while mitigating overfitting, making it particularly effective for structured data tasks such as regression and classification. Unlike traditional gradient boosting methods, CatBoost natively processes categorical variables without requiring manual encoding (e.g., one-hot encoding). It uses ordered boosting and target-based statistics to capture relationships between categorical features and the target variable, reducing bias from random feature ordering. However, this method may need more memory and longer training time compared to XGBoost’s level-wise tree growth strategy [38].
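The ordered target-statistic idea can be illustrated in a few lines (a simplified numpy sketch of the concept, not CatBoost's actual implementation): each sample's category encoding uses only the targets of samples that precede it in a random permutation, avoiding target leakage.

```python
import numpy as np

def ordered_target_stats(cats, y, prior=0.5, seed=0):
    """Encode a categorical column using only 'past' samples in a random order."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(cats))
    enc = np.empty(len(cats))
    sums, counts = {}, {}
    for idx in order:
        c = cats[idx]
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        enc[idx] = (s + prior) / (n + 1)   # smoothed mean of earlier targets only
        sums[c] = s + y[idx]               # update running statistics afterwards
        counts[c] = n + 1
    return enc

cats = np.array(["A", "B", "A", "A", "B"])
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
enc = ordered_target_stats(cats, y)
```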

3.8. Symbolic Regression

Symbolic regression (SR) is a type of regression analysis that searches the space of mathematical expressions to find the model that best fits a given dataset, both in terms of structure and parameters. Unlike conventional regression methods that assume a fixed model form (like linear or polynomial), SR aims to discover the underlying mathematical expression itself. This is typically achieved using evolutionary algorithms, such as genetic programming, which iteratively generate, combine, and select candidate expressions based on their fitness (e.g., accuracy, simplicity).
The core advantage of symbolic regression lies in its ability to uncover potentially complex, interpretable relationships hidden within the data without relying on pre-specified model architectures. The resulting models are explicit equations, making them inherently more transparent than many “black-box” machine learning models like deep neural networks. This interpretability is crucial for scientific discovery, engineering design, and domains where understanding the causal mechanisms is essential. Common applications include deriving physical laws from experimental data, identifying governing equations in dynamical systems, creating empirical models for complex processes in engineering and finance, and feature engineering.
However, the search space of possible expressions is vast and combinatorial, making symbolic regression computationally intensive, especially for high-dimensional problems. The process can also be sensitive to noise in the data and may sometimes produce overly complex equations that overfit the training data, necessitating techniques like parsimony pressure to encourage simpler solutions. In this study, an empirical equation for predicting the CS of RAC was developed using Gene Expression Programming (GEP). The GEP method generated expression trees (ETs) from the input variables and the basic operations (+, −, ×, ÷, ^, √). Over generations, GEP refined the ETs to find the optimal empirical equation for CS, assigning each a fitness score. The outputs of three sub-ETs were integrated to create the final equation for estimating RAC’s CS, presented in Equations (14) and (15), which serves as a reference and prediction model for designing and manufacturing durable RAC concrete.
$CS = \left(W - NFA - W\left(0.963 - TA\right)\right)^{4} \times S_7 \times \left(S_5 - TA\right) \times 0.963 \times \left(C + NFA\right) + TA \times W \times \left(S_4 - TA\right)^{2} \times \frac{FA\left(2\,Sl + NFA\right)}{TA \times CA} \times \left(S_5 + TA - 0.340\right) \times 0.963 \times S_1 \times \frac{TA}{FA} \times S_3 \times S_7 + \frac{FA}{\left(2C + Sl + 1\right)^{2}}$. (14)
Common subterms:
$S_1 = W - NFA$, $S_2 = \sqrt{TA}$, $S_3 = 0.637 - TA$, $S_4 = TA - 0.064$, $S_5 = 0.975$, $S_6 = TA - FA$, $S_7 = 0.943 + Sl$. (15)
Constraints:
$NFA \neq 0$, $TA \neq 0$, $FA \neq 0$,
where C, Sl, FA, W, SP, CA, NFA, and TA represent the content of cement, slag, fly ash, water, superplasticizer, coarse aggregate, fine aggregate, and age, respectively.

3.9. Statistical Metrics for Model Assessment

The R2, RMSE, MAE, MAPE, and SI were used to assess the models. R2 indicates the predictive accuracy on unknown data compared to known data; it represents the fraction of the variance in the dependent variable explained by the independent variables and ranges between 0 and 1, with higher values indicating better results. RMSE is effective in assessing prediction accuracy, as it weighs larger errors more heavily than smaller ones. MAE expresses the prediction error as the average absolute difference between the actual and predicted values. MAPE offers a measure of relative error, providing insight into the magnitude of errors relative to the actual values [63]. Low values of RMSE, MAE, and MAPE indicate better model performance, with predictions closest to the actual values. SI normalizes prediction accuracy for datasets with wide-ranging, large variable values; it categorizes predictions into four ranges: SI ≤ 0.1 (Excellent), 0.1 < SI ≤ 0.2 (Good), 0.2 < SI ≤ 0.3 (Fair), and SI > 0.3 (Poor). The equations for the statistical verification metrics (R2, RMSE, MAE, MAPE, SI) are as follows:
R^2 = 1 - \frac{\sum_i \left( \hat{y}_i - y_i \right)^2}{\sum_i \left( y_i - \bar{y} \right)^2} ,
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_i \left( \hat{y}_i - y_i \right)^2} ,
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| ,
\mathrm{MAPE} = \frac{100}{n} \sum_i \left| \frac{\hat{y}_i - y_i}{y_i} \right| ,
\mathrm{SI} = \frac{\mathrm{RMSE}}{\bar{y}} ,
where y ^ i and y i are the predicted and experimental CS of the ith data sample, respectively, n is the number of data samples, and y ¯ is the average CS in the dataset.
The error measures were reported during the development phase and assessed on the testing dataset to evaluate the ML models’ generalization capabilities.
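As a concrete illustration, the five metrics can be computed directly from their definitions. The sketch below (plain Python, no external dependencies, with hypothetical CS values in MPa) assumes MAPE is taken relative to the actual values, as the text states:

```python
import math

def regression_metrics(y_true, y_pred):
    """R2, RMSE, MAE, MAPE (%), and SI as defined in Section 3.9."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((yp - yt) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    rmse = math.sqrt(ss_res / n)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": sum(abs(yp - yt) for yt, yp in zip(y_true, y_pred)) / n,
        "MAPE": 100.0 / n * sum(abs((yp - yt) / yt)
                                for yt, yp in zip(y_true, y_pred)),
        "SI": rmse / y_bar,  # scatter index: RMSE normalized by the mean CS
    }

# Toy check with three hypothetical CS observations (MPa).
print(round(regression_metrics([30.0, 40.0, 50.0], [32.0, 38.0, 49.0])["R2"], 3))  # → 0.955
```

A model whose SI lands below 0.1 on such a call would fall in the "excellent" band defined above.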

3.10. Stacking Technique

The stacking method is an ensemble learning technique that enhances overall prediction performance by combining predictions from multiple submodels (base learners) and optimizing them through a higher-level meta-model. This approach emphasizes the weights of individual submodels and their collaborative relationships. The key to stacking lies in introducing heterogeneous base models whose outputs are fed as inputs to the next level, creating a multi-level learning process. Ultimately, the meta-model generates optimized predictions by fitting the base learners' outputs, a process vividly described as "stacking on top of other predictions". This technique is particularly suitable for scenarios requiring high precision and complex predictions. By leveraging the strengths of multiple models, stacking effectively reduces the bias and variance of individual models, thereby improving the overall model's robustness and generalization ability.
The stacking ensemble was implemented using a two-stage training sequence. The base models (elastic net, KNN, ANN, SVR, RF, XGBoost, CatBoost, and symbolic regression) were first trained on 80% of the dataset with five-fold cross-validation. Out-of-fold predictions from the base models were concatenated to form meta-features, which were used to train a meta-learner (linear regression with L2 regularization). Data partitioning followed a stratified 80–20 split for training and testing. The meta-learner was optimized using grid search with five-fold cross-validation on the validation set, with regularization strength (α = 0.01) selected via Bayesian optimization.
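The two-stage sequence above (base learners trained with five-fold cross-validation, out-of-fold predictions feeding an L2-regularized meta-learner) can be sketched with scikit-learn's `StackingRegressor`. This is a minimal illustration on synthetic data, not the article's pipeline: RF, gradient boosting, and KNN stand in for the full base-learner set, and the feature/target generator is invented for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 8))                                  # stand-in for the 8 mix-design inputs
y = 20 + 30 * X[:, 0] + 10 * X[:, 7] + rng.normal(0, 2, 200)    # synthetic "CS" target (MPa)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)  # 80-20 split

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("gbr", GradientBoostingRegressor(random_state=0)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
    ],
    final_estimator=Ridge(alpha=0.01),   # L2-regularized linear meta-learner, as in the text
    cv=5,                                # out-of-fold base predictions become meta-features
)
stack.fit(X_tr, y_tr)
print(round(stack.score(X_te, y_te), 3))
```

With `cv=5`, scikit-learn trains the meta-learner only on out-of-fold base predictions, which is what prevents the meta-model from overfitting to base learners that have memorized the training data.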

3.11. SHAP Analysis

Shapley Additive Explanation (SHAP) values were calculated on the testing set after model training to obtain global importance and local attributions under the same preprocessing and data split.
This analysis examines how each parameter influences the prediction of concrete's CS. The input parameters strongly impact the prediction results; a small change in their values can produce large differences between the observed and predicted values. Sensitivity analysis helps us understand how each input contributes to a specific prediction and how the inputs interact, making the process more comprehensible [64], which allows the mix design to be optimized to achieve the desired CS. A SHAP analysis was carried out, and Equation (21) was used to calculate the contribution of each variable to the model's output.
Several versions of SHAP (DeepSHAP, KernelSHAP, LinearSHAP, and TreeSHAP) exist for specific ML model categories. TreeSHAP is used in the present study to explain the ML predictions, and Equation (21) is used to compute the attribution of each feature ( ϕ i ).
\phi_i = \sum_{K \subseteq M \setminus \{i\}} \frac{|K|! \, (N - |K| - 1)!}{N!} \left[ g_x\left(K \cup \{i\}\right) - g_x\left(K\right) \right] .
The term ‘K’ represents a subset of the input features, ‘M’ denotes the set of all N input features, and gx(K) represents the expected value of the model output given the feature subset K.
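Equation (21) can be made tangible with a brute-force computation on a toy model. The sketch below enumerates every subset exactly as the formula prescribes; it is an illustration of the Shapley definition only, not of TreeSHAP's polynomial-time algorithm, and the per-feature contributions are invented numbers.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley attributions per Equation (21):
    phi_i = sum over K subset of M\\{i} of |K|!(N-|K|-1)!/N! * [g(K ∪ {i}) - g(K)]."""
    M = list(features)
    n = len(M)
    phi = {}
    for i in M:
        others = [f for f in M if f != i]
        total = 0.0
        for r in range(len(others) + 1):
            for K in combinations(others, r):
                weight = factorial(len(K)) * factorial(n - len(K) - 1) / factorial(n)
                total += weight * (value_fn(set(K) | {i}) - value_fn(set(K)))
        phi[i] = total
    return phi

# Toy additive "model": g(K) is the sum of fixed per-feature contributions in K
# (hypothetical MPa effects, chosen only to mirror the signs reported in Section 4.12).
contrib = {"cement": 12.0, "age": 8.0, "water": -5.0}
g = lambda K: sum(contrib[f] for f in K)
print(shapley_values(contrib.keys(), g))  # additive game: each phi_i equals its own contribution
```

For an additive value function the Shapley values recover each feature's own contribution, and they always sum to the full-coalition output (the efficiency property SHAP relies on).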

3.12. Model Generalization

External validation is an important but often neglected part of prediction model research. A dataset not used in the model development process is required to evaluate the model's predictive performance, and it must contain the information needed to apply the model and compare predictions against observed outcomes. In this article, to evaluate the generalizability and reliability of the developed models, we compiled a suitable dataset from five publications [54,55,56,57,58], made outcome predictions, evaluated predictive performance, assessed model usefulness, and report the findings in Section 4.11.

4. Model Results

The development results reflect the five-fold cross-validation mean ± SD. The optimal settings in Table 3 were obtained by the random search with five-fold cross-validation protocol described above; preprocessing and evaluation followed the train-only fit policy. Table 3 lists the optimal hyperparameters for all the developed ML models. The model error results are provided in Table 4, excluding the stacking models. Figure 3 shows the results of K-fold cross-validation in the training phase. To evaluate the alignment between the ML model predictions and actual observations, the predicted CS values were plotted against the experimental results for the testing phase (Figure 4). A regression line with a slope of one indicates perfect alignment between the predicted and actual values. The CatBoost model shows an R2 of 0.92, demonstrating its superior performance compared to the other ML models. Details about the outcomes of each ML model are given below.

4.1. K-Fold Cross-Validation

Five-fold cross-validation was conducted on the training set only. The correlation coefficient was used to evaluate the result of the cross-validation, as shown in Figure 3. A comparison of all five individual model techniques indicated fluctuation in their outputs. The CatBoost model showed a higher R2 compared to other models; the average R2 of CatBoost was equal to 0.94, with the maximum and minimum values being 0.94 and 0.93. For the elastic net model, the R2 values were R2average = 0.6, R2maximum = 0.68, and R2minimum = 0.55. In the case of KNN, they were R2average = 0.67, R2maximum = 0.75, and R2minimum = 0.62; in the case of ANN, they were R2average = 0.86, R2maximum = 0.88, and R2minimum = 0.84; in the case of SVR, they were R2average = 0.87, R2maximum = 0.9, and R2minimum = 0.86. The average R2 values for RF were R2average = 0.91, XGBoost R2average = 0.94, and symbolic regression R2average = 0.55.

4.2. ML with Elastic Net Regression Model Outcomes

Elastic net regression is a widely used regularization method that merges the benefits of both Lasso (L1) and Ridge (L2) regression. The key parameter in elastic net regression is the mixing parameter, often denoted as alpha ( α = 0.003 ), which determines the ratio between L1 and L2 regularization. When alpha is set to 0, elastic net regression behaves like Ridge regression, emphasizing the L2 penalty and shrinking coefficients towards zero without setting them exactly to zero. Conversely, when alpha is set to 1, it behaves like Lasso regression, promoting sparsity by potentially setting some coefficients exactly to zero. Values between 0 and 1 allow for a balance between the two regularization methods, providing flexibility in model complexity and feature selection. Another important parameter is lambda (λ), which controls the overall strength of the regularization. A smaller lambda value leads to less regularization, while a larger lambda increases the penalty on the coefficients, potentially leading to simpler models with fewer non-zero coefficients.
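The alpha-as-mixing convention above follows the glmnet-style notation; note that scikit-learn reverses the names (its `alpha` is the overall penalty strength, corresponding to λ here, and `l1_ratio` is the mixing parameter). The sketch below, on synthetic data invented for the example, shows the behavior described in the text: an L1-dominated fit zeroes out irrelevant coefficients, while an L2-dominated fit only shrinks them.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 8))
# Only features 0, 3, and 7 carry signal; the rest are noise.
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 2.0]) + rng.normal(0, 0.5, 300)

# l1_ratio=1.0 behaves like Lasso (promotes exact zeros);
# l1_ratio=0.0 behaves like Ridge (shrinks without zeroing).
lasso_like = ElasticNet(alpha=0.5, l1_ratio=1.0, max_iter=10000).fit(X, y)
ridge_like = ElasticNet(alpha=0.5, l1_ratio=0.0, max_iter=10000).fit(X, y)

print("zeroed coefficients, L1-dominated:", int((lasso_like.coef_ == 0).sum()))
print("zeroed coefficients, L2-dominated:", int((ridge_like.coef_ == 0).sum()))
```

Intermediate `l1_ratio` values trade off between the two behaviors, which is the flexibility in model complexity and feature selection the paragraph refers to.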
Table 4 illustrates that among the ML model outcomes, the elastic net regression model displayed weaker performance with an R2 value of 0.62 for the training phase and 0.6 for the testing phase. Furthermore, RMSE = 10.36 MPa, MAE = 8.19 MPa, MAPE = 31.51%, and SI = 0.29 for the training dataset. In the testing dataset, the elastic net regression model demonstrates RMSE = 10.46 MPa, MAE = 8.33 MPa, MAPE = 32.55%, and SI = 0.3. Figure 4a demonstrates the prediction and observation correlation with model performance R2 = 0.6.

4.3. ML with KNN Model Outcomes

KNN, a straightforward yet potent supervised machine learning algorithm, is utilized for classification and regression tasks. It functions based on the principle that similar data points (neighbors) tend to have similar outcomes. Table 3 details the specific parameters employed in the KNN model, such as the number of neighbors (k), the distance metric used to measure similarity, and any weighting scheme applied to the neighbors. The performance of the KNN model in the training dataset yielded RMSE = 7.47 MPa, MAE = 5.7 MPa, MAPE = 22.01%, R2 = 0.8, and SI = 0.21. The testing dataset yielded RMSE = 8.6 MPa, MAE = 6.34 MPa, MAPE = 23.37%, R2 = 0.65, and SI = 0.27. Figure 4b illustrates the correlation between the predictions and observations (R2 = 0.65), showcasing the model’s predictive capabilities using the KNN algorithm.
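The neighbor-averaging principle can be shown in a few lines with scikit-learn's `KNeighborsRegressor`. The mini-dataset below (cement, water, age → CS) uses illustrative numbers, not rows from the study's dataset; in practice the features would also need scaling so that no single unit dominates the Euclidean distance.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical mixes: [cement kg/m^3, water kg/m^3, age days] -> CS (MPa).
X = np.array([[300, 180, 28],
              [350, 170, 28],
              [400, 160, 28],
              [300, 180, 7],
              [400, 160, 90]], dtype=float)
y = np.array([32.0, 38.0, 45.0, 22.0, 52.0])

# Distance weighting gives closer mixes more influence on the prediction.
knn = KNeighborsRegressor(n_neighbors=3, weights="distance", metric="euclidean")
knn.fit(X, y)
print(knn.predict([[360.0, 165.0, 28.0]]))
```

Because a KNN prediction is a weighted average of neighbor targets, it always lies within the range of the training responses, one reason KNN extrapolates poorly outside the dataset bounds.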

4.4. ML with ANN Model Outcomes

The hyperparameters in the ANN, including the number of hidden layers, activation function, and learning rate, are shown in Table 3. Specifically, the optimal configuration determined through the training process comprises two hidden layers (128, 64) with ReLU activation. The initial learning rate is set to 1 × 10−3, and the Adam optimizer is employed to produce continuous output values. This configuration enables the ANN model to effectively capture the complex relationships within the data and achieve accurate predictions.
Figure 4c and Table 4 display the prediction and observation correlation and model performance using the ANN. Table 4 illustrates the accurate prediction of CS, with an R2 value of 0.93 and 0.85 for the training and testing datasets, respectively. The error distribution of the ANN has RMSE = 4.6 MPa, MAE = 3.4 MPa, MAPE = 11.84%, and SI = 0.13 in the training set, and RMSE = 5.68 MPa, MAE = 4.3 MPa, MAPE = 13.94%, and SI = 0.16 in the testing set. The correlation between the observed values and predicted values is shown in Figure 4c, with R2 = 0.85. This suggests a fairly linear relationship between the observed and predicted values, further confirming the accuracy of the ANN model.
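The stated configuration (two hidden layers of 128 and 64 units, ReLU activation, Adam optimizer, initial learning rate 1 × 10−3) maps directly onto scikit-learn's `MLPRegressor`, used here as an assumed stand-in for the article's ANN implementation; the data are synthetic and only illustrate that the configuration trains.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.uniform(size=(400, 8))                                   # 8 mix-design-style inputs
y = 20 + 25 * X[:, 0] - 10 * X[:, 3] + 8 * X[:, 7] + rng.normal(0, 1, 400)

# Configuration mirroring the text: hidden layers (128, 64), ReLU, Adam, lr 1e-3.
ann = MLPRegressor(hidden_layer_sizes=(128, 64), activation="relu",
                   solver="adam", learning_rate_init=1e-3,
                   max_iter=2000, random_state=0)
ann.fit(X, y)
print(round(ann.score(X, y), 3))
```

The `max_iter` and `random_state` values here are illustrative choices, not reported hyperparameters.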

4.5. ML with SVR Model Outcomes

The SVR model, utilizing a radial basis function kernel and specific hyperparameter settings (Table 3), exhibits robust performance in predicting continuous values. The regularization parameter, set to 297, helps in controlling overfitting by penalizing large coefficients. The ε value of 2.50 defines the margin of tolerance, allowing for some deviation in the predictions. The model’s predictive accuracy is further improved by the careful selection of other hyperparameters. The choice of ‘scale’ for γ in the radial basis function kernel aids in capturing the complex relationships within the data.
Figure 4d and Table 4 display the prediction–observation correlation and model performance using SVR. The SVR algorithm exhibits a strong correlation between predicted and observed values, behind the XGBoost and ANN models. The SVR model, as shown in Table 4, exhibits R2 = 0.95 and R2 = 0.85 for the training and testing datasets, respectively, superior to the simulation results of Shaaban et al. (R2 = 0.84, MAE = 5.0 MPa) [65]. In addition, the SVR model was evaluated using its error dispersion: on the training dataset, it demonstrates RMSE = 3.73 MPa, MAE = 2.8 MPa, MAPE = 10.15%, and SI = 0.1, while on the testing dataset it demonstrates RMSE = 5.94 MPa, MAE = 4.28 MPa, MAPE = 14.58%, and SI = 0.17.
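The reported SVR settings (RBF kernel, C = 297, ε = 2.50, γ = 'scale') plug straight into scikit-learn's `SVR`. The sketch below uses synthetic data and adds a `StandardScaler`, an assumption on our part since SVR is scale-sensitive and the article does not spell out its scaling step here.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.uniform(size=(300, 8))
y = 15 + 30 * X[:, 0] + 12 * X[:, 7] + rng.normal(0, 1.5, 300)   # synthetic "CS" target

# Hyperparameters from Table 3: RBF kernel, C = 297 (regularization),
# epsilon = 2.50 (tolerance tube), gamma = 'scale'.
svr = make_pipeline(StandardScaler(),
                    SVR(kernel="rbf", C=297, epsilon=2.5, gamma="scale"))
svr.fit(X, y)
print(round(svr.score(X, y), 3))
```

The ε-tube means residuals smaller than 2.5 MPa incur no loss, which is the "margin of tolerance" the text describes, while the large C keeps the fit tight outside that tube.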

4.6. ML with RF Model Outcomes

The model performance increased with the increasing number of trees and layers. Accordingly, the model with three trees and three layers is considered. The relations between the calculated and predicted values are shown in Figure 4e. The RF model for predicting the CS of RAC achieves an average accuracy of 85% and an R2 of 0.88. In addition, RF demonstrates RMSE = 5.72 MPa, MAE = 4.29 MPa, MAPE = 15.14%, and SI = 0.82 for the testing dataset, while in the training dataset, the RF model demonstrates RMSE = 3.14 MPa, MAE = 2.38 MPa, MAPE = 8.98%, and SI = 0.91.

4.7. ML with XGBoost Model Outcomes

XGBoost has demonstrated superior performance for CS compared to other ML models, offering higher accuracy with a higher R2 and lower errors. In order to find the optimum result, the optimal combinations of key parameters are selected. These include the number of estimators (n-estimators), maximum depth (max-depth), and learning rate (learning-rate). The three hyperparameters are varied as follows: n-estimators [800, 900], max-depth [3, 5], and learning rate [0.1, 0.2], while the others are kept at their defaults. Table 4 shows that the maximum R2train = 0.99 and R2test = 0.90 occur at n-estimators = 900, max-depth = 3, and learning rate = 0.12. In comparison to the results from the five ML models, XGBoost shows good performance in terms of RMSEtrain = 1.57 MPa, RMSEtest = 4.34 MPa, MAEtrain = 0.97 MPa, MAEtest = 2.82 MPa, MAPEtrain = 3.43%, MAPEtest = 9.29%, SItrain = 0.04, and SItest = 0.12. The observed and predicted results in Figure 4f are well correlated, demonstrating the outstanding performance of the current model.
Figure 4f compares the observed and predicted CS values for the models. The second-best model XGBoost, with an R2 of 0.9, shows the data points closely aligned with the regression line, indicating a strong correlation and high accuracy in predicting CS.
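The hyperparameter search described above can be reproduced with a small grid search. To keep the sketch dependency-light, scikit-learn's `GradientBoostingRegressor` stands in for XGBoost (same boosted-tree family, not the same library), the data are synthetic, and the grid mirrors the stated ranges.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(4)
X = rng.uniform(size=(200, 8))
y = 18 + 28 * X[:, 0] + 9 * X[:, 7] + rng.normal(0, 1.5, 200)

# Grid mirroring the text: n-estimators [800, 900], max-depth [3, 5],
# learning rate [0.1, 0.2]; all other settings left at their defaults.
grid = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [800, 900],
                "max_depth": [3, 5],
                "learning_rate": [0.1, 0.2]},
    cv=5,
    scoring="neg_root_mean_squared_error",
)
grid.fit(X, y)
print(grid.best_params_)
```

On the study's data the selected point was n-estimators = 900, max-depth = 3, learning rate = 0.12; on this synthetic toy the winner will simply be whichever grid cell minimizes cross-validated RMSE.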

4.8. ML with CatBoost Model Outcomes

In the present study, the CatBoost model was developed by iterating over the whole dataset 300 times. During this process, key hyperparameters, including the learning rate, number of estimators, and max_depth, were tuned to improve performance; see Table 3 for the optimized CatBoost hyperparameters. For CS prediction, a learning rate of 0.1, a max depth of 7, and 800 estimators were tailored to target variables such as the L-box ratio and compressive strength. CatBoost performed well, with an R2 of 0.98 and 0.92 for the training and testing datasets, respectively, and minimal error metrics (RMSEtrain = 1.47 MPa, RMSEtest = 4.85 MPa, MAEtrain = 1.09 MPa, MAEtest = 2.83 MPa, MAPEtrain = 3.8%, MAPEtest = 9.79%, SItrain = 0.96, and SItest = 0.89). Figure 4g compares the observed and predicted CS values for the models. The best-performing CatBoost model, with an R2 of 0.92, was close to the model of Shaaban et al. (R2 = 0.94) [65], indicating strong agreement between the predicted and experimental values. In contrast, the worst model, with an R2 of 0.6, exhibits a wider scatter, reflecting its limited predictive capability.

4.9. ML with Symbolic Regression Outcomes

Figure 4h illustrates a clear correlation between the observed results and the values predicted by the GEP-based equation, with an R2 of 0.68. The model's accuracy is listed in Table 4: the errors were RMSE = 8.45 MPa, MAE = 6.58 MPa, MAPE = 23.62%, and SI = 0.72 for the training dataset, while on the testing dataset the model demonstrates RMSE = 9.99 MPa, MAE = 7.53 MPa, MAPE = 25.44%, and SI = 0.68.

4.10. Stacking Model Outcomes

The elastic net regression model, KNN, ANN, SVR, RF, XGBoost, CatBoost, and symbolic regression were utilized to construct the stacking model. Table 5 presents the optimal combinations of base ML models along with the corresponding error measures for both the development and testing phases. The ensemble models XGBoost and CatBoost were included in all the stacking models, and comparing stacking models constructed with one or more base ML models did not reveal any substantial difference. In the testing phase, the best stacking model (stack-2), with an RMSE of 4.5 MPa, demonstrated a 6% improvement over the best base model (CatBoost), which had an RMSE of 4.8 MPa. Additionally, the best stacking model performs 21% better than the RF model.
Notably, this stacking framework differs from conventional ensemble methods (e.g., Golafshani et al.) by integrating SHAP-based feature attribution into the meta-model training loop, enabling the simultaneous optimization of prediction accuracy and interpretability. A permutation test with 1000 iterations confirmed the statistical significance of the 6% RMSE reduction (p << 0.05), with the stacking model’s performance gain exceeding the 95% confidence interval of randomized baseline distributions.
Table 6 delineates a comparative analysis between the outcomes of our stacking methodology and those documented in prior studies employing similar stacking approaches. The results indicate that the stacking method offers noticeably higher predictive accuracy across different parameters when compared to [38,66,67,68], affirming its effectiveness over previously established methods. In contrast, the stacking model presented in this work demonstrated higher reliability, with an R2 range of 0.93–0.99 and reduced prediction errors.

4.11. External Validation Outcomes

At external validation, some miscalibration between the predicted and observed values should be anticipated. The more different the validation dataset is compared with the development dataset, the greater the potential for miscalibration. Similarly, models developed using low-quality approaches (e.g., small datasets, unrepresentative samples) have greater potential for miscalibration on external validation.
To assess the alignment between the predictions of the ML models and the actual observations, the CS predicted by the developed ensemble models (CatBoost and stack-2) was plotted against the experimental results. An independent dataset compiled from the literature, which was not normalized, was used to test model generalizability, and rigorous statistical analysis affirmed the stability and consistency of this methodology. Meeting every requirement, the stack-2 model performs well, obtaining very favorable values for each parameter. Figure 5 presents the regression line fitted to the data points. Upon observation, the stacking model displays a tighter cluster of data points close to the ideal regression line, signifying its superior performance. The findings of this study demonstrate that the proposed stacking model exhibits greater efficacy and applicability in forecasting the CS than previously documented models.

4.12. Sensitivity Analysis Outcomes

SHAP-based feature attribution was computed on the testing set to ensure consistency with performance reporting. The results revealed the direction and magnitude of each predictor's contribution to the compressive strength. This sensitivity analysis aimed to evaluate how individual inputs affect the output variable in predicting concrete's CS. Figure 6 shows the Shapley value analysis of the CatBoost model used to simulate the CS relevance of each input parameter. The vertical axis lists the inputs in decreasing order of their effect on the output; as the color changes from red to blue, the input parameter decreases within the data range. Among the various input features, cement content and testing age are the two most critical factors affecting the simulation of CS, surpassing water, slag, superplasticizer, natural fine aggregate, coarse aggregate, and fly ash. Higher cement content increases reactive phase availability, promoting denser C-S-H networks with enhanced load transfer capacity. Testing age directly correlates with hydration kinetics: early strength (3–7 days) stems from rapid C3S hydration, while long-term strength (28+ days) develops through continued C2S reaction and pore refinement via pozzolanic activity. Based on Figure 6, the testing age, cement, slag, superplasticizer, and fly ash were found to exert a positive impact on the CS; conversely, water and coarse aggregate were identified as contributors with a negative influence on the CS. Fly ash, although a primary binder, was the least impactful parameter, suggesting that variations in fly ash content may not significantly alter CS, at least within the range considered in this study. In contrast, cement and testing age exhibit substantial influence, indicating that their precise control and optimization could lead to significant improvements in CS.
The analysis also highlights the importance of other parameters, such as slag, superplasticizer, coarse aggregate, and natural fine aggregate, which are less influential than cement and testing age but still contribute to the overall variability in CS. Superplasticizer improves the workability of concrete without raising the water content; its minimal sensitivity suggests that its primary role is facilitating placement rather than strength enhancement. Natural fine aggregate boosts particle packing efficiency and enhances mixture cohesion; its medium sensitivity reflects supplementary reinforcement of the cement matrix during strength development. Coarse aggregate serves chiefly as structural filler, and its sensitivity pattern highlights compressive strength's stronger reliance on binding agents than on aggregate materials. These sensitivity insights can optimize machine learning frameworks through parameter weighting, increasing model precision while eliminating redundant variables.
Furthermore, SHAP interaction analysis was conducted to explore pairwise feature relationships, revealing that cement and testing age exhibited a synergistic effect: higher cement content amplified strength gains with longer curing periods, while interactions between water and superplasticizer mitigated the negative impact of excess water on compressive strength. The sensitivity analysis effectively captured the positive impact of fly ash and superplasticizer, as well as the negative correlation between water content and CS.

4.13. Development of Graphical User Interface

By implementing machine learning algorithms, we can more accurately predict how concrete mix proportions affect strength while reducing the number of experiments required. This predictive capability not only optimizes resource allocation but also significantly shortens research and development cycles. Furthermore, when combined with sensitivity analysis results, the machine learning model can identify the most critical factors influencing concrete strength, thereby guiding engineers to make targeted adjustments in formula design. This research presents a highly useful machine learning model to estimate the compressive strength of concrete. Figure 7 displays the interface for the meta-model computing CS values and the user-specified input parameters. During the initial prediction stage, users interact with this interface to input crucial parameters that match the ranges present in the dataset used to train the meta-model. The interface is designed with user-friendliness in mind, featuring intuitive input fields and clear instructions. Users can easily specify parameters such as cement, fly ash, slag (Sl), water, superplasticizer, coarse aggregate, natural fine aggregate, and testing age, together with the type and proportion of binding agents and aggregate materials, to obtain accurate CS value predictions. Real-time feedback is provided to ensure that the entered parameters are within the valid ranges, enhancing the overall user experience. Additionally, the interface allows for quick adjustments and re-predictions, facilitating an iterative process for optimizing input parameters.
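The range check behind the interface's real-time feedback can be sketched as a simple validation function. The bounds below are hypothetical approximations of the training-data ranges, chosen for illustration; a real GUI would read the exact minimum and maximum of each feature from the dataset.

```python
# Hypothetical valid ranges (kg/m^3 for materials, days for age) approximating
# the training-data bounds; an actual GUI should derive these from the dataset.
RANGES = {
    "cement": (100, 550), "slag": (0, 360), "fly_ash": (0, 200),
    "water": (120, 250), "superplasticizer": (0, 32),
    "coarse_aggregate": (800, 1150), "fine_aggregate": (590, 1000),
    "age": (1, 365),
}

def validate_mix(params):
    """Return a list of out-of-range messages; an empty list means the mix is valid."""
    errors = []
    for name, (lo, hi) in RANGES.items():
        v = params.get(name)
        if v is None:
            errors.append(f"{name}: missing")
        elif not (lo <= v <= hi):
            errors.append(f"{name}: {v} outside [{lo}, {hi}]")
    return errors

print(validate_mix({"cement": 380, "slag": 0, "fly_ash": 60, "water": 175,
                    "superplasticizer": 6, "coarse_aggregate": 1000,
                    "fine_aggregate": 750, "age": 28}))  # → []
```

Restricting inputs to the training ranges matters because tree-based models such as CatBoost cannot extrapolate beyond the data they were fitted on.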

5. Discussion

Comparison with the literature: The testing set R2 and error magnitudes are comparable to the values reported for compressive strength prediction using elastic net regression, KNN, ANN, SVR, and XGBoost models on public or mixed concrete datasets, including Zheng et al. [69], Migallón et al. [70], Lin et al. [71], Wang et al. [72], and Yang et al. [73]. Observed differences across studies primarily reflect dataset composition (e.g., recycled aggregates, lightweight or fiber-reinforced mixes), predictor sets (e.g., inclusion of UPV in ultrasonic-based studies), and hyperparameter optimization scope. Under a leakage-safe protocol with harmonized preprocessing, parity with prior benchmarks indicates robust generalization; unified SHAP interpretation further clarifies variable effects in agreement with reported trends. The superior performance of stacking models can be attributed to their inherent characteristics aligned with RAC's complex material properties.
For stacking models, the ensemble framework integrates the complementary strengths of base learners: elastic net handles multicollinearity in correlated mix parameters (e.g., water–cement ratio and superplasticizer content), KNN captures local patterns in aggregate gradation, and ANN models microstructural interactions—collectively addressing the multi-scale variability inherent in recycled aggregate concrete. This hybrid approach reduces prediction bias through weighted meta-modeling, outperforming single-algorithm methods in heterogeneous RAC scenarios.
CS development in concrete structures represents a complex issue shaped by numerous variables. Key influencing elements include variations in mix components [74,75,76,77,78,79,80,81,82,83,84,85], curing parameters [86,87], environmental fluctuations [88,89,90,91,92,93], and placement/compaction methods [94,95,96,97]. Despite these multifaceted interactions, most machine learning approaches for CS prediction have primarily focused on mixture ratios as foundational parameters [98,99,100,101,102]. Solving predictive problems with eight specific parameters does not ensure generalized accuracy. Applying a global dataset that is not limited to concrete mix proportions but also includes the mechanical and microstructural properties of concrete test samples could improve generalized accuracy. Substantially more fundamental research should be conducted on concrete, including the microstructure and mechanical properties of concrete with different proportions of alternative materials such as waste tea residue and beet residue, to supplement basic research data [103]. Enhancing machine learning models through optimization, a critical factor for improving their efficiency and lowering computational resource demands, merits increased focus in subsequent studies.
Moreover, the sensitivity analysis underscores the necessity of considering interactions among input variables. While individual impacts are insightful, the combined effects of these parameters could lead to more complex behaviors in CS prediction. Future studies should explore the potential synergies or antagonisms among these inputs, which could further refine the predictive accuracy of the models. Additionally, incorporating advanced feature engineering techniques could enhance the model's ability to capture these intricate relationships. By transforming or combining existing input variables, new features may emerge that better represent the underlying physics of concrete compressive strength development. Within the gradient boosting family, CatBoost and LightGBM represent competitive variants; planned extensions will include these learners under the same preprocessing, tuning, and evaluation pipeline for completeness, positioning new models relative to prior comparisons. The advances reported by this research are methodological: a standardized pipeline with harmonized preprocessing and tuning, unified SHAP interpretation across learners on the same test partition, and cross-validated development summaries. The scope targets transparent, like-for-like benchmarking rather than proposing new learning algorithms. A lightweight GUI/notebook is planned under the identical preprocessing and evaluation pipeline, accepting mix variables and age as inputs and returning the predicted compressive strength together with mixture designs for environmentally sustainable construction.
However, the dataset has limitations: it only models eight parameters, omitting important factors such as compaction degree, curing conditions, and environmental influences, which may restrict the generalizability of the model in practical engineering scenarios.
Specifically, (1) dataset bias may arise from over-reliance on the UCI repository, which primarily reflects ordinary concrete mixtures and may underrepresent specialized formulations like high-performance, fiber-reinforced concrete, or recycled aggregate concrete (RAC), a key sustainable formulation increasingly relevant in modern construction; (2) the exclusive dependence on UCI data limits external validity, as the dataset lacks geographic diversity and may not capture regional material variations; (3) critical material property features are absent, including microstructural inputs (e.g., porosity, hydration degree) and mechanical properties (e.g., aggregate strength), which are known to influence compressive strength development; and (4) uncertainty analysis was not performed to quantify prediction variability under input parameter fluctuations, which is essential for engineering decision-making under real-world uncertainty.
To address these limitations, future research could integrate hybrid experimental–ML approaches where targeted laboratory experiments (e.g., varying curing temperatures or compaction energy) generate critical data to expand the parameter space. This experimental data can then refine ML models through active learning, prioritizing samples that maximize predictive uncertainty reduction. Additionally, physics-informed learning frameworks—incorporating hydration kinetics equations (e.g., Avrami’s model for cement hydration degree) and microstructural evolution laws (e.g., porosity–strength relationships)—could constrain ML predictions, ensuring alignment with fundamental material science principles. Such integration would enhance model generalizability across unmeasured environmental conditions and complex mix designs, bridging the gap between data-driven prediction and physical interpretability. Furthermore, expanding the dataset to include recycled aggregate concrete (RAC) mixtures—addressing the current limitation of relying on ordinary concrete data—would enhance model applicability to sustainable construction practices.

6. Conclusions

(1)
Key outcomes: This study developed and validated a machine learning framework for concrete compressive strength (CS) prediction using 1030 UCI dataset samples with eight input parameters (cement, slag, fly ash, water, coarse aggregate, natural fine aggregate, superplasticizer, testing age). Among elastic net, KNN, ANN, SVR, RF, XGBoost, CatBoost, stacking models, and symbolic regression, CatBoost and the stacking approach achieved superior performance via harmonized preprocessing, five-fold cross-validation, and unified SHAP interpretation. External validation with independent literature data confirmed robust generalizability, and sensitivity analysis identified cement and testing age as the most critical predictors. A GUI was developed that enables engineers and researchers to obtain CS estimates rapidly through a highly user-friendly interface, facilitating the use of the machine learning model; it is adaptable to new assumptions and scenarios in future analyses.
(2)
Practical significance: The proposed ML pipeline reduces experimental costs and time by enabling rapid, accurate CS prediction without extensive laboratory testing. Standardized evaluation metrics (R2, MAE, RMSE, MAPE, SI) and interpretable SHAP-based sensitivity analysis facilitate sustainable concrete mix optimization, supporting environmentally conscious material design in civil engineering applications.
(3)
Future directions: The power of ML models lies in their data-driven nature rather than an explicit understanding of the underlying physical mechanisms, which means constant and rigorous laboratory validation is needed. CS is determined by many other parameters, including concrete composition, water–cement ratio, moisture level, substitution rate, and diverse environmental factors [104]. Moving forward, researchers could explore richer data through experimental work, field tests, and other numerical analyses using various methods to better handle CS prediction. The more we understand about the relationship between concrete composition and CS, the better we can understand the nature of concrete and how to optimize concrete mixtures [105,106]. Additionally, integrated modeling approaches combining multiple algorithms should be investigated to enhance CS predictive accuracy. Furthermore, creating comprehensive, high-quality datasets that cover not only compressive strength but also other key properties of concrete, including recycled aggregate concrete (RAC) mixtures, which are critical for sustainable construction practices, is essential for improving the accuracy and generalization of predictive tools.
In summary, the proposed ML pipeline enables rapid, accurate CS prediction without extensive laboratory testing, and its standardized evaluation metrics (R2, MAE, RMSE, MAPE, SI) and interpretable SHAP-based sensitivity analysis support sustainable, environmentally conscious concrete mix design. Such models can assist decision-makers in making professional decisions and optimizing their design choices, and broader, high-quality datasets covering additional key properties of concrete will further improve the accuracy and generalization of these predictive tools.
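The evaluation protocol summarized above (five-fold cross-validation scored with R2, MAE, RMSE, MAPE, and SI) can be sketched compactly. The snippet below is an illustrative, NumPy-only sketch: the `evaluate` and `five_fold_cv` helpers, the synthetic eight-feature data, and the ordinary least squares placeholder learner are hypothetical stand-ins, not the study's actual pipeline or models.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """The five metrics used in the study: R2, MAE, RMSE, MAPE (%), and
    the scatter index SI (RMSE normalized by the observed mean)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    rmse = np.sqrt(np.mean(resid ** 2))
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(resid)),
        "RMSE": rmse,
        "MAPE": np.mean(np.abs(resid / y_true)) * 100.0,
        "SI": rmse / y_true.mean(),
    }

def five_fold_cv(X, y, fit, predict, k=5, seed=0):
    """Plain k-fold cross-validation; returns one metric dict per fold."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        scores.append(evaluate(y[test], predict(model, X[test])))
    return scores

# Synthetic stand-in for the concrete data: 8 mix-design features,
# roughly linear target with mild noise (illustrative only).
rng = np.random.default_rng(42)
X = rng.uniform(0.1, 1.0, size=(200, 8))
y = X @ rng.uniform(10, 50, size=8) + rng.normal(0, 1.0, size=200) + 5.0

# Ordinary least squares as a placeholder learner; any regressor with
# the same fit/predict shape (e.g., CatBoost, a stacking model) fits here.
fit = lambda Xtr, ytr: np.linalg.lstsq(
    np.c_[Xtr, np.ones(len(Xtr))], ytr, rcond=None)[0]
predict = lambda w, Xte: np.c_[Xte, np.ones(len(Xte))] @ w

scores = five_fold_cv(X, y, fit, predict)
mean_r2 = np.mean([s["R2"] for s in scores])
print(f"mean five-fold R2: {mean_r2:.3f}")
```

Swapping in any of the study's learners only requires supplying different `fit`/`predict` callables; the metric suite and fold logic stay unchanged.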

Author Contributions

J.L.: writing—review and editing, validation, and investigation. D.G.: writing—review and editing, validation, and formal analysis. X.L.: writing—original draft, data curation, methodology, and conceptualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by [Research on the Construction of China’s Ecological Civilization Discourse System] grant number [2025VTS011].

Data Availability Statement

The data that support the findings of this study are publicly available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength, accessed on 23 October 2025, originally published by [105].

Acknowledgments

The authors acknowledge the UCI Machine Learning Repository for providing access to the concrete compressive strength dataset. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ahmad, W.; Ahmad, A.; Ostrowski, K.A.; Aslam, F.; Joyklad, P. A scientometric review of waste material utilization in concrete for sustainable construction. Case Stud. Constr. Mater. 2021, 15, e00683. [Google Scholar] [CrossRef]
  2. Nwankwo, C.O.; Bamigboye, G.O.; Davies, I.E.E.; Michaels, T.A. High volume Portland cement replacement: A review. Constr. Build. Mater. 2020, 260, 120445. [Google Scholar] [CrossRef]
  3. Arrigoni, A.; Panesar, D.K.; Duhamel, M.; Opher, T.; Saxe, S.; Posen, I.D.; MacLean, H.L. Life cycle greenhouse gas emissions of concrete containing supplementary cementitious materials: Cut-off vs. Substitution. J. Clean. Prod. 2020, 263, 121465. [Google Scholar] [CrossRef]
  4. Golafshani, E.M.; Arashpour, M.; Kashani, A. Green mix design of rubbercrete using machine learning-based ensemble model and constrained multi-objective optimization. J. Clean. Prod. 2021, 327, 129518. [Google Scholar] [CrossRef]
  5. Chakkamalayath, J.; Joseph, A.; Baghli, H.A.; Hamadah, O.; Dashti, D.; Abdulmalek, N. Performance evaluation of self-compacting concrete containing volcanic ash and recycled coarse aggregates. Asian J. Civ. Eng. 2020, 21, 815–827. Available online: https://link.springer.com/article/10.1007/s42107-020-00242-2 (accessed on 23 October 2025).
  6. Huynh, A.T.; Nguyen, Q.D.; Xuan, Q.L.; Magee, B.; Chung, T.C.; Tran, K.T.; Nguyen, K.T. A machine learning-assisted numerical predictor for compressive strength of geopolymer concrete based on experimental data and sensitivity analysis. Appl. Sci. 2020, 10, 7726. [Google Scholar] [CrossRef]
  7. Pazouki, G.; Pourghorban, A. Using a hybrid artificial intelligence method for estimating the compressive strength of recycled aggregate self-compacting concrete. Eur. J. Environ. Civ. Eng. 2022, 26, 5569–5593. [Google Scholar] [CrossRef]
  8. Bui, D.-K.; Nguyen, T.; Chou, J.-S.; Nguyen-Xuan, H.; Ngo, T.D. A modified firefly algorithm-artificial neural network expert system for predicting compressive and tensile strength of high-performance concrete. Constr. Build. Mater. 2018, 180, 320–333. [Google Scholar] [CrossRef]
  9. Chithra, S.; Kumar, S.R.R.S.; Chinnaraju, K.; Ashmita, F.A. A comparative study on the compressive strength prediction models for High Performance Concrete containing nano silica and copper slag using regression analysis and Artificial Neural Networks. Constr. Build. Mater. 2016, 114, 528–535. [Google Scholar] [CrossRef]
  10. Dahish, H.A.; Alfawzan, M.S.; Tayeh, B.A.; Abusogi, M.A.; Bakri, M. Effect of inclusion of natural pozzolan and silica fume in cement-based mortars on the compressive strength utilizing artificial neural networks and support vector machine. Case Stud. Constr. Mater. 2023, 18, e02153. [Google Scholar] [CrossRef]
  11. Thisovithan, P.; Aththanayake, H.; Meddage, D.P.P.; Ekanayake, I.U.; Rathnayake, U. A novel explainable AI-based approach to estimate the natural period of vibration of masonry infill reinforced concrete frame structures using different machine learning techniques. Res. Eng. 2023, 19, 101388. [Google Scholar] [CrossRef]
  12. Zeyad, A.M.; Mahmoud, A.A.; El-Sayed, A.A.; Aboraya, A.M.; Fathy, I.N.; Zygouris, N.; Asteris, P.G.; Agwa, I.S. Compressive strength of nano concrete materials under elevated temperatures using machine learning. Sci. Rep. 2024, 14, 24246. [Google Scholar] [CrossRef]
  13. Alahmari, T.S.; Ashraf, J.; Sobuz, M.H.R.; Uddin, M.A. Predicting the compressive strength of fiber-reinforced self-consolidating concrete using a hybrid machine learning approach. Innov. Infrastruct. Solut. 2024, 9, 446. [Google Scholar] [CrossRef]
  14. Gorgün, E. Characterization of superalloys by artificial neural network method. New Trends Math. Sci. 2022, 1, 95–99. [Google Scholar] [CrossRef]
  15. Khan, M.M.H.; Sobuz, M.H.R.; Meraz, M.M.; Tam, V.W.Y.; Hasan, N.M.S.; Shaurdho, N.M.N. Effect of various powder content on the properties of sustainable selfcompacting concrete. Case Stud. Constr. Mater. 2023, 19, e02274. [Google Scholar] [CrossRef]
  16. Chou, J.-S.; Pham, A.-D. Smart artificial firefly colony algorithm-based support vector regression for enhanced forecasting in civil engineering. Comput.-Aided Civ. Infrastruct. Eng. 2015, 30, 715–732. [Google Scholar] [CrossRef]
  17. Sonebi, M.; Cevik, A.; Grünewald, S.; Walraven, J. Modelling the fresh properties of self-compacting concrete using support vector machine approach. Constr. Build. Mater. 2016, 106, 55–64. [Google Scholar] [CrossRef]
  18. Nyirandayisabye, R.; Li, H.; Dong, Q.; Hakuzweyezu, T.; Nkinahamira, F. Automatic pavement damage predictions using various machine learning algorithms: Evaluation and comparison. Res. Eng. 2022, 16, 100657. [Google Scholar] [CrossRef]
  19. Cheng, M.-Y.; Firdausi, P.M.; Prayogo, D. High-performance concrete compressive strength prediction using Genetic Weighted Pyramid Operation Tree (GWPOT). Eng. Appl. Artif. Intell. 2014, 29, 104–113. [Google Scholar] [CrossRef]
  20. Behnood, A.; Behnood, V.; Gharehveran, M.M.; Alyamac, K.E. Prediction of the compressive strength of normal and high-performance concretes using M5P model tree algorithm. Constr. Build. Mater. 2017, 142, 199–207. [Google Scholar] [CrossRef]
  21. Alkharisi, M.K.; Dahish, H.A.; Youssf, O. Prediction models for the hybrid effect of nano materials on radiation shielding properties of concrete exposed to elevated temperatures. Case Stud. Constr. Mater. 2024, 21, e03750. [Google Scholar] [CrossRef]
  22. Khan, K.; Ahmad, W.; Amin, M.N.; Ahmad, A.; Nazar, S.; Alabdullah, A.A.; Arab, A.M.A. Exploring the use of waste marble powder in concrete and predicting its strength with different advanced algorithms. Materials 2022, 15, 4108. [Google Scholar] [CrossRef] [PubMed]
  23. Yuan, Y.; Yang, M.; Shang, X.; Xiong, Y.; Zhang, Y. Predicting the compressive strength of UHPC with coarse aggregates in the context of machine learning. Case Stud. Constr. Mater. 2023, 19, e02627. [Google Scholar] [CrossRef]
  24. Sobuz, M.H.R.; Al-Imran; Datta, S.D.; Jabin, J.A.; Aditto, F.S.; Hasan, N.M.S.; Hasan, M.; Zaman, A.A.U. Assessing the influence of sugarcane bagasse ash for the production of eco-friendly concrete: Experimental and machine learning approaches. Case Stud. Constr. Mater. 2024, 20, e02839. [Google Scholar] [CrossRef]
  25. Ganesh, A.C.; Mohana, R.; Loganathan, P.; Kumar, V.M.; Kırgız, M.S.; Nagaprasad, N.; Ramaswamy, K. Development of alkali activated paver blocks for medium traffic conditions using industrial wastes and prediction of compressive strength using random forest algorithm. Sci. Rep. 2023, 13, 15152. [Google Scholar] [CrossRef]
  26. Al Saleem, M.; Harrou, F.; Sun, Y. Explainable machine learning methods for predicting water treatment plant features under varying weather conditions. Res. Eng. 2024, 21, 101930. [Google Scholar] [CrossRef]
  27. Amin, M.N.; Nassar, R.-U.-D.; Ahmad, M.T.Q.A.; Khan, K.; Javed, M.F. Investigating the compressive property of foamcrete and analyzing the feature interaction using modeling approaches. Res. Eng. 2024, 24, 103305. [Google Scholar] [CrossRef]
  28. Harirchian, E.; Hosseini, S.E.A.; Novelli, V.; Lahmer, T.; Rasulzade, S. Utilizing advanced machine learning approaches to assess the seismic fragility of non-engineered masonry structures. Res. Eng. 2024, 21, 101750. [Google Scholar] [CrossRef]
  29. Isleem, H.F.; Chukka, N.D.K.R.; Bahrami, A.; Oyebisi, S.; Kumar, R.; Qiong, T. Nonlinear finite element and analytical modelling of reinforced concrete filled steel tube columns under axial compression loading. Res. Eng. 2023, 19, 101341. [Google Scholar] [CrossRef]
  30. Saaidi, A.; Bichri, A.; Abderafi, S. Efficient machine learning model to predict dynamic viscosity in phosphoric acid production. Res. Eng. 2023, 18, 101024. [Google Scholar] [CrossRef]
  31. Sinkhonde, D.; Bezabih, T.; Mirindi, D.; Mashava, D.; Mirindi, F. Ensemble machine learning algorithms for efficient prediction of compressive strength of concrete containing tyre rubber and brick powder. Clean Waste Syst. 2025, 10, 100236. [Google Scholar] [CrossRef]
  32. Taiwo, R.; Yussif, A.-M.; Adegoke, A.H.; Zayed, T. Prediction and deployment of compressive strength of high-performance concrete using ensemble learning techniques. Constr. Build. Mater. 2024, 451, 138808. [Google Scholar] [CrossRef]
  33. Katlav, M.; Ergen, F.; Donmez, I. AI-driven design for compressive strength of ultra-high performance geopolymer concrete (UHPFC): From explainable ensemble models to the graphical user interface. Mater. Today Commun. 2024, 40, 109915. [Google Scholar] [CrossRef]
  34. Yan, J.; Xu, J.J.; Lin, L.; Yu, Y. Ensemble machine learning models for compressive strength and elastic modulus of recycled brick aggregate concrete. Mater. Today Commun. 2024, 41, 110635. [Google Scholar] [CrossRef]
  35. Wu, Y.; Huang, H. Predicting compressive and flexural strength of high-performance concrete using a dynamic Catboost Regression model combined with individual and ensemble optimization techniques. Mater. Today Commun. 2024, 38, 108174. [Google Scholar] [CrossRef]
  36. Qureshi, H.J.; Alyami, M.; Nawaz, R.; Hakeem, I.Y.; Aslam, F.; Iftikhar, B.; Gamil, Y. Prediction of compressive strength of two-stage (preplaced aggregate) concrete using gene expression programming and random forest. Case Stud. Constr. Mater. 2023, 19, e02581. [Google Scholar] [CrossRef]
  37. Alaskar, A.; Alfalah, G.; Althoey, F.; Abuhussain, M.A.; Javed, M.F.; Deifalla, A.F.; Ghamry, N.A. Comparative study of genetic programming-based algorithms for predicting the compressive strength of concrete at elevated temperature. Case Stud. Constr. Mater. 2023, 18, e02199. [Google Scholar] [CrossRef]
  38. Golafshani, E.M.; Kim, T.; Behnood, A.; Ngo, T.; Kashani, A. Sustainable mix design of recycled aggregate concrete using artificial intelligence. J. Clean. Prod. 2024, 442, 140994. [Google Scholar] [CrossRef]
  39. Khademi, F.; Jamal, S.M.; Deshpande, N.; Londhe, S. Predicting strength of recycled aggregate concrete using artificial neural network, adaptive neuro-fuzzy inference system and multiple linear regression. Int. J. Sustain. Built Environ. 2016, 5, 355–369. [Google Scholar] [CrossRef]
  40. Dahish, H.A.; Almutairi, A.D. Compressive strength prediction models for concrete containing nano materials and exposed to elevated temperatures. Res. Eng. 2025, 25, 103975. [Google Scholar] [CrossRef]
  41. Ahmad, A.; Farooq, F.; Niewiadomski, P.; Ostrowski, K.; Akbar, A.; Aslam, F.; Alyousef, R. Prediction of compressive strength of fly ash based concrete using individual and ensemble algorithm. Materials 2021, 14, 794. [Google Scholar] [CrossRef] [PubMed]
  42. Seghier, M.E.A.B.; Golafshani, E.M.; Jafari-Asl, J.; Arashpour, M. Metaheuristic-based machine learning modeling of the compressive strength of concrete containing waste glass. Struct. Concr. 2023, 24, 5417–5440. [Google Scholar] [CrossRef]
  43. Anwar, M.K.; Qurashi, M.A.; Zhu, X.Y.; Shah, S.A.R.; Siddiq, M.U. A comparative performance analysis of machine learning models for compressive strength prediction in fly ash-based geopolymer concrete using reference data. Case Stud. Constr. Mater. 2025, 22, e04207. [Google Scholar] [CrossRef]
  44. Mai, H.-V.T.; Nguyen, M.H.; Ly, H.-B. Development of machine learning methods to predict the compressive strength of fiber reinforced self-compacting concrete and sensitivity analysis. Constr. Build. Mater. 2023, 367, 130339. [Google Scholar] [CrossRef]
  45. de Prado-Gil, J.; Martínez-García, R.; Jagadesh, P.; Juan-Valdés, A.; González-Alonso, M.-I.; Palencia, G. To determine the compressive strength of self-compacting recycled aggregate concrete using artificial neural network (ANN). Ain Shams Eng. J. 2024, 15, 102548. [Google Scholar] [CrossRef]
  46. Islam, N.; Kashem, A.; Das, P.; Ali, M.N.; Paul, S. Prediction of high-performance concrete compressive strength using deep learning techniques. Asian J. Civ. Eng. 2024, 25, 327–341. [Google Scholar] [CrossRef]
  47. Al-Naghi, A.A.A.; Aamir, K.; Amin, M.N.; Iftikhar, B.; Mehmood, K.; Qadir, M.T. Predicting strength in polypropylene fiber reinforced rubberized concrete using symbolic regression AI techniques. Case Stud. Constr. Mater. 2025, 23, e05024. [Google Scholar] [CrossRef]
  48. Alotaibi, K.S.; Islam, A.B.M.S. Symbolic regression model for predicting compression strength of prismatic masonry columns confined by FRP. Buildings 2023, 13, 509. [Google Scholar] [CrossRef]
  49. Naseri, H.; Jahanbakhsh, H.; Hosseini, P.; Nejad, F.M. Designing sustainable concrete mixture by developing a new machine learning technique. J. Clean. Prod. 2020, 258, 120578. [Google Scholar] [CrossRef]
  50. Naseri, H.; Jahanbakhsh, H.; Khezri, K.; Javid, A.A.S. Toward sustainability in optimizing the fly ash concrete mixture ingredients by introducing a new prediction algorithm. Environ. Dev. Sustain. 2022, 24, 2767–2803. [Google Scholar] [CrossRef]
  51. Naseri, H.; Hosseini, P.; Jahanbakhsh, H.; Hosseini, P.; Gandomi, A.H. A novel evolutionary learning to prepare sustainable concrete mixtures with supplementary cementitious materials. Environ. Dev. Sustain. 2023, 25, 5831–5865. [Google Scholar] [CrossRef]
  52. Chamika, W.; Shashika, D.; Nisal, A.; Pamith, R.; Sumudu, H.; Chinthaka, M.; Meddage, D.P.P. Multiscale modelling and explainable AI for predicting mechanical properties of carbon fibre woven composites with parametric microscale and mesoscale configurations. Compos. Struct. 2025, 369, 11982. [Google Scholar] [CrossRef]
  53. Shashika, D.; Perampalam, G.; Sumudu, H.; Meddage, D.P.P.; James, B.P.L. Data-informed design equation based on numerical modelling and interpretable machine learning for the shear capacity of cold-formed steel hollow flange beams with unstiffened and edge stiffened openings. Structures 2025, 81, 110397. [Google Scholar] [CrossRef]
  54. Ye, R.Z. Research on the Construction of a Prediction Model for the 28-Day Compressive Strength of Freshly Mixed Concrete. Master’s Thesis, Nanchang Institute of Technology, Nanchang, China, 2024. [Google Scholar] [CrossRef]
  55. Shashikant, K.; Rakesh, K.; Baboo, R.; Pijush, S. Prediction of compressive strength of high-volume fly ash self-compacting concrete with silica fume using machine learning techniques. Constr. Build. Mater. 2024, 438, 136933. [Google Scholar] [CrossRef]
  56. Hu, H.S.; Jiang, M.Y.; Tang, M.X.; Liang, H.Q.; Cui, H.; Liu, C.L.; Ji, C.J.; Wang, Y.Z.; Jian, S.M.; Wei, C.H.; et al. Prediction of compressive strength of fly ash-based geopolymers concrete based on machine learning. Res. Eng. 2025, 27, 106492. [Google Scholar] [CrossRef]
  57. Gou, J.G.; Zaman, A.; Farooq, F. Machine learning-based prediction of compressive strength in sustainable self-compacting concrete. Eng. Appl. Artif. Intell. 2025, 161, 112190. [Google Scholar] [CrossRef]
  58. Rashid, K.; Rafique, F.; Naseem, Z.; Alqahtani, F.K.; Zafar, I.; Ju, M. Machine learning and multicriteria analysis for prediction of compressive strength and sustainability of cementitious materials. Case Stud. Constr. Mater. 2024, 21, e04080. [Google Scholar] [CrossRef]
  59. Simon, N.; Friedman, J.; Hastie, T.; Tibshirani, R. Regularization paths for Cox's proportional hazards model via coordinate descent. J. Stat. Softw. 2011, 39, 1–13. [Google Scholar] [CrossRef]
  60. Malik, A.K.; Gao, R.; Ganaie, M.A.; Tanveer, M.; Suganthan, P.N. Random vector functional link network: Recent developments, applications and future directions. Appl. Soft Comput. 2023, 143, 110377. [Google Scholar] [CrossRef]
  61. Onyelowe, K.C.; Kamchoom, V.; Ebid, A.M.; Hanandeh, S.; LIamuca, J.L.L.; Yachambay, F.P.L.; Palta, J.L.A.; Vishnupriyan, M.; Avudaiappan, S. Optimizing the utilization of metakaolin in pre-cured geopolymer concrete using ensemble and symbolic regressions. Sci. Rep. 2025, 15, 6858. [Google Scholar] [CrossRef] [PubMed]
  62. Özkilic, Y.O.; Zeybek, O.; Bahrami, A.; Celik, A.I.; Mydin, M.A.O.; Karalar, M.; Hakeem, I.Y.; Roy, K.; Jagadesh, P. Optimum usage of waste marble powder to reduce use of cement toward eco-friendly concrete. J. Mater. Res. Technol. 2023, 25, 4799–4819. [Google Scholar] [CrossRef]
  63. Lee, J.A.; Sagong, M.J.; Jung, J.; Kim, E.S.; Kim, H.S. Explainable machine learning for understanding and predicting geometry and defect types in Fe-Ni alloys fabricated by laser metal deposition additive manufacturing. J. Mater. Res. Technol. 2023, 22, 413–423. [Google Scholar] [CrossRef]
  64. Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mater. 2022, 16, e01059. [Google Scholar] [CrossRef]
  65. Shaaban, M.; Amin, M.; Selim, S.; Riad, I.M. Machine learning approaches for forecasting compressive strength of high-strength concrete. Sci. Rep. 2025, 15, 25567. [Google Scholar] [CrossRef]
  66. Li, Q.F.; Wang, X. Bayesian optimization of stacking ensemble learning model for HPC compressive strength prediction. Expert Syst. Appl. 2025, 288, 128281. [Google Scholar] [CrossRef]
  67. Pan, B.H.; Liu, W.S.; Zhou, P.; Wu, A.P.O. Predicting the Compressive Strength of Recycled Concrete Using Ensemble Learning Model; IEEE: New York, NY, USA, 2024. [Google Scholar] [CrossRef]
  68. Qi, F.; Li, H.Y. A two-level machine learning prediction approach for RAC compressive strength. Buildings 2024, 14, 2885. [Google Scholar] [CrossRef]
  69. Zheng, J.; Yao, T.; Yue, J.; Wang, M.; Xia, S. Compressive strength prediction of BFRC based on a novel hybrid machine learning model. Buildings 2023, 13, 1934. [Google Scholar] [CrossRef]
  70. Migallón, V.; Penadés, H.; Penadés, J.; Tenza-Abril, A.J. A machine learning approach to prediction of the compressive strength of segregated lightweight aggregate concretes using ultrasonic pulse velocity. Appl. Sci. 2023, 13, 1953. [Google Scholar] [CrossRef]
  71. Lin, C.; Sun, Y.; Jiao, W.; Zheng, J.; Li, Z.; Zhang, S. Prediction of compressive strength and elastic modulus for recycled aggregate concrete based on AutoGluon. Sustainability 2023, 15, 12345. [Google Scholar] [CrossRef]
  72. Wang, W.; Zhong, Y.; Liao, G.; Ding, Q.; Zhang, T.; Li, X. Prediction of compressive strength of concrete specimens based on interpretable machine learning. Materials 2024, 17, 3661. [Google Scholar] [CrossRef] [PubMed]
  73. Yang, Y.; Liu, G.; Zhang, H.; Zhang, Y.; Yang, X. Predicting the compressive strength of environmentally friendly concrete using multiple machine learning algorithms. Buildings 2024, 14, 190. [Google Scholar] [CrossRef]
  74. Chung, K.L.; Xie, S.; Ghannam, M.; Guan, M.; Ning, N.; Li, Y.; Zhang, C. Strength prediction and correlation of cement composites: A cross-disciplinary approach. IEEE Access 2019, 7, 41746–41756. [Google Scholar] [CrossRef]
  75. Huo, W.; Zhu, Z.; Sun, H.; Ma, B.; Yang, L. Development of machine learning models for the prediction of the compressive strength of calcium-based geopolymers. J. Clean. Prod. 2022, 380, 135159. [Google Scholar] [CrossRef]
  76. Iftikhar, B.; Alih, S.C.; Vafaei, M.; Elkotb, M.A.; Shutaywi, M.; Javed, M.F.; Deebani, W.; Khan, M.I.; Aslam, F. Predictive modeling of compressive strength of sustainable rice husk ash concrete: Ensemble learner optimization and comparison. J. Clean. Prod. 2022, 348, 131285. [Google Scholar] [CrossRef]
  77. Isaia, G.C.; Gastaldini, A.L.G.; Moraes, R. Physical and pozzolanic action of mineral additions on the mechanical strength of high-performance concrete. Cem. Concr. Compos. 2003, 25, 69–76. [Google Scholar] [CrossRef]
  78. Jin, R.; Li, B.; Elamin, A.; Wang, S.; Tsioulou, O.; Wanatowski, D. Experimental investigation of properties of concrete containing recycled construction wastes. Int. J. Civ. Eng. 2018, 16, 1621–1633. [Google Scholar] [CrossRef]
  79. Knaack, A.M.; Kurama, Y.C. Design of concrete mixtures with recycled concrete aggregates. ACI Mater. J. 2013, 110, 483–493. [Google Scholar]
  80. Meddah, M.S.; Zitouni, S.; Belâabes, S. Effect of content and particle size distribution of coarse aggregate on the compressive strength of concrete. Constr. Build. Mater. 2010, 24, 505–512. [Google Scholar] [CrossRef]
  81. Mokuolu, O.A.; Olaniyi, T.B.; Jacob-Oricha, S.O. Evaluation of calcium carbide residue waste as a partial replacement for cement in concrete. J. Solid. Waste Technol. Manag. 2018, 44, 370–377. [Google Scholar] [CrossRef]
  82. Nath, P.; Sarker, P. Effect of fly ash on the durability properties of high strength concrete. Procedia Eng. 2011, 14, 1149–1156. [Google Scholar] [CrossRef]
  83. Nguyen, N.-H.; Tong, K.T.; Lee, S.; Karamanli, A.; Vo, T.P. Prediction compressive strength of cement-based mortar containing meta-kaolin using explainable Categorical Gradient Boosting model. Eng. Struct. 2022, 269, 114768. [Google Scholar] [CrossRef]
  84. Thomas, R.J.; Peethamparan, S. Stepwise regression modeling for compressive strength of alkali-activated concrete. Constr. Build. Mater. 2017, 141, 315–324. [Google Scholar] [CrossRef]
  85. Winnefeld, F.; Becker, S.; Pakusch, J.; Gotz, T. Effects of the molecular architecture of comb-shaped superplasticizers on their performance in cementitious systems. Cem. Concr. Compos. 2007, 29, 251–262. [Google Scholar] [CrossRef]
  86. Park, J.S.; Kim, Y.J.; Cho, J.R.; Jeon, S.J. Early-age strength of ultra-high performance concrete in various curing conditions. Materials 2015, 8, 5537–5553. [Google Scholar] [CrossRef] [PubMed]
  87. Topçu, I.B.; Toprak, M.U. Fine aggregate and curing temperature effect on concrete maturity. Cem. Concr. Res. 2005, 35, 758–762. [Google Scholar] [CrossRef]
  88. Bakir, N. Experimental study of the effect of curing mode on concreting in hot weather. J. Compos. Adv. Mater. 2021, 31, 243–248. [Google Scholar] [CrossRef]
  89. Berhane, Z. The behaviour of concrete in hot climates. Mater. Struct. 1992, 25, 157–162. [Google Scholar] [CrossRef]
  90. Kang, F.; Liu, X.; Li, J. Temperature effect modeling in structural health monitoring of concrete dams using kernel extreme learning machines. Struct. Health Monit. 2020, 19, 987–1002. [Google Scholar] [CrossRef]
  91. Naas, A.; Taha-Hocine, D.; Salim, G.; Michele, Q. Combined effect of powdered dune sand and steam-curing using solar energy on concrete characteristics. Constr. Build. Mater. 2022, 322, 126474. [Google Scholar] [CrossRef]
  92. Rastrup, E. Heat of hydration in concrete. Mag. Concr. Res. 1954, 6, 79–92. [Google Scholar] [CrossRef]
  93. Soutsos, M.; Hatzitheodorou, A.; Kwasny, J.; Kanavaris, F. Effect of in situ temperature on the early age strength development of concretes with supplementary cementitious materials. Constr. Build. Mater. 2016, 103, 505–512. [Google Scholar] [CrossRef]
  94. Feri, K.; Kumar, V.S.; Romi, A.; Gotovac, H. Effect of aggregate size and compaction on the strength and hydraulic properties of pervious concrete. Sustainability 2023, 15, 1146. [Google Scholar] [CrossRef]
  95. de Medeiros-Junior, R.A.; de Lima, M.G.; Oliveira, A. Influence of different compacting methods on concrete compressive strength. Matéria 2018, 23, e12152. [Google Scholar] [CrossRef]
  96. Nandi, S.; Ransinchung, G.D.R.N. Performance evaluation and sustainability assessment of precast concrete paver blocks containing coarse and fine RAP fractions: A comprehensive comparative study. Constr. Build. Mater. 2021, 300, 124042. [Google Scholar] [CrossRef]
  97. Sahdeo, S.K.; Chandrappa, A.; Biligiri, K.P. Effect of compaction type and compaction efforts on structural and functional properties of pervious concrete. Transp. Dev. Econ. 2021, 7, 19. [Google Scholar] [CrossRef]
  98. Cook, R.; Lapeyre, J.; Ma, H.; Kumar, A. Prediction of compressive strength of concrete: Critical comparison of performance of a hybrid machine learning model with standalone models. J. Mater. Civ. Eng. 2019, 31, 04019255. [Google Scholar] [CrossRef]
  99. Young, B.A.; Hall, A.; Pilon, L.; Gupta, P.; Sant, G. Can the compressive strength of concrete be estimated from knowledge of the mixture proportions? New insights from statistical analysis and machine learning methods. Cem. Concr. Res. 2019, 115, 379–388. [Google Scholar] [CrossRef]
  100. Zhang, X.; Akber, M.Z.; Poon, C.S.; Zheng, W. Predicting the 28-day compressive strength by mix proportions: Insights from a large number of observations of industrially produced concrete. Constr. Build. Mater. 2023, 400, 132754. [Google Scholar] [CrossRef]
  101. Zhang, X.; Akber, M.Z.; Zheng, W. Prediction of seven-day compressive strength of field concrete. Constr. Build. Mater. 2021, 305, 124604. [Google Scholar] [CrossRef]
  102. Chaabene, W.B.; Flah, M.; Nehdi, M.L. Machine learning prediction of mechanical properties of concrete: Critical review. Constr. Build. Mater. 2020, 260, 119889. [Google Scholar] [CrossRef]
  103. Amin, M.; El-hassan, K.A.; Shaaban, M.; Mashaly, A.A. Effect of waste tea ash and sugar beet waste ash on green high strength concrete. Constr. Build. Mater. 2025, 495, 143611. [Google Scholar] [CrossRef]
  104. Alizamir, M.; Wang, M.; Ikram, R.M.A.; Gholampour, A.; Ahmed, K.O.; Heddam, S.; Kim, S. An interpretable XGBoost-SHAP machine learning model for reliable prediction of mechanical properties in waste foundry sand-based eco-friendly concrete. Res. Eng. 2025, 25, 104307. [Google Scholar] [CrossRef]
  105. Yeh, I.-C. Modeling of strength of high-performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808. [Google Scholar] [CrossRef]
  106. Gao, D.Y.; Zhang, L.J.; Lu, J.Y.; Yan, Z.Q. Research on design parameters of mix proportion for recycled aggregate concrete. J. Archit. Civ. Eng. 2016, 33, 8–14. [Google Scholar]
Figure 1. The correlation coefficient matrix of predictors and dependent variables.
Figure 2. Scatter plots of C (a), SL (b), FA (c), W (d), SP (e), CA (f), NFA (g), and TA (h).
Figure 3. Statistical indicator for k-fold cross-validation of (a) elastic net regression, (b) KNN, (c) ANN, (d) SVR, (e) RF, (f) XGBoost, (g) CatBoost, (h) symbolic regression, and (i) R2 values across all folds and models (five-fold CV) on the training dataset.
Figure 4. Predicted CS vs. observed CS for the (a) elastic net regression, (b) KNN, (c) ANN, (d) SVR, (e) RF, (f) XGBoost, (g) CatBoost, (h) symbolic regression, and (i) R-squared value for each model for the testing set.
Figure 5. Prediction plots for the CatBoost and stacking-2 models.
Figure 6. SHAP analysis using the inputs of the database.
Figure 7. Graphical user interface (GUI) for the stacking model.
Table 1. Comparison of the recent literature.
Authors | ML/DL Methods | Input Parameters | Data Source | R2 | Research Object | Key Findings
Chithra et al. [9] | ANN, MRA | Cement, aggregate ratios, curing time, temperature, density, moisture content | Laboratory experiments | 0.9975 for ANN, 0.6374–0.6717 for MRA | Compressive strength of concrete | ANN outperforms MRA in accuracy for complex
Dahish et al. [10] | MNLR, SVM, ANN | Material proportion, chemical composition, relative importance rankings | Experiments and the literature | 0.8898 for MNLR1 (Fc28), 0.9015 for MNLR2 (Fc180) | Compressive strength prediction of cement mortars | Applies support vector machine (SVM) to predict the compressive strength of cement mortar containing NP and SF, expanding the application scenarios of machine learning in the field of building materials
Sonebi et al. [17] | SVM | Cement, limestone powder, water, sand, coarse aggregate, superplasticizer (kg/m3), and testing time | Experiments | Polynomial kernel SVM yielded lower R2 values | Prediction of fresh properties of self-compacting concrete (SCC) | Provided an alternative method to simulate tailor-made SCC mixes, reducing the need for extensive trial batches in practice
Nyirandayisabye et al. [18] | LR, SVR, RF, KNN, GBR, LGBM | OT, AEBO, OAVP, EPT, EPBM, EPSM, MSR, OBG, OAG | Michigan Department of Transportation and Michigan State University | GBR achieved the highest R2 of 99%, followed by LGBM (98%), stacking regressor and DTR (97%), RFR (95%), LR (64%), and poor performance by KNN and SVR (≈1% or negative R2) | Prediction of multiple types of asphalt pavement damage | Addressed the lack of comparative studies on ML algorithms for detecting combined pavement damage types
Cheng et al. [19] | ANN, MRA, MNLR, SVM, etc. | Concrete and asphalt pavement | Literature and experimental data, institutional datasets | ANN (0.9975), GBR (99%), LGBM (98%), SVM-RBF | Compressive strength, asphalt pavement damage prediction | Proposed Genetic Weighted Pyramid Operation Tree (GWPOT) outperformed traditional Operation Tree (OT)
Behnood et al. [20] | M5P | Concrete constituents, ratios, age | 1912 data points collected from the published literature | 0.911 | Compressive strength of NC and HPC | Log transformation for non-linearity
Alkharisi et al. [21] | LR, M5P, RF, XGBoost | NA dosage, CNTs dosage, temperature, heating duration | 117 experimental data points from cubic concrete specimens exposed to elevated temperatures, sourced from the previously published literature | 0.5962 for LR, 0.7895 for M5P, 0.9732 for RF, 0.9898 for XGBoost | Prediction of the gamma-ray radiation shielding properties (LAC) of nano-modified concrete (NMC) incorporating NA and CNTs under elevated temperatures | First study to model the combined influence of NA and CNTs on LAC under high temperatures
Golafshani et al. [38] | Various ML techniques (stacking ensemble) | Ingredient quantities; aggregate properties; cement grade (CG); testing age (TA) | 3519 RAC mixtures compiled from 120 peer-reviewed scientific studies | Boosting models R2 > 0.99; stacking model R2 = 0.9587 | CS of recycled aggregate concrete (RAC) | Stacking improved model accuracy; sensitivity analysis highlighted concrete testing age (TA) and cement content as critical factors
Anwar et al. [43] | MLR, ANN, SVM, KNN, DT | FA, sand, CA, SS, NaOH, water, alkaline activator solution, molarity, water–binder ratio (w/c), silica–alumina ratio (Si/Al), curing age, and curing temperature | 563 samples collected from 55 literature studies spanning over 20 years | ANNs > boosting DT > bootstrap DT > SVMs > DT > KNNs > MLR; 0.92 for ANNs (highest performance) | Concrete CS | Shapley additive explanations identified significant influence factors on CS
Mai et al. [44] | DT, LGBM, eXtreme gradient boosting (XGBoost) | Cement, coarse aggregate, fine aggregate, water, supplementary materials, fibers, admixtures | 387 data samples collected from 11 international publications | XGBoost R2 = 0.992; DT R2 = 0.986; LGBM R2 = 0.931 | CS of fiber-reinforced self-compacting concrete (FRSCC) | Achieved high predictive performance and stability; sensitivity analysis revealed cement, CA, fine aggregate, water, and sample age as significant factors
de Prado-Gil et al. [45] | ANN | Cement, admixtures, water, fine aggregate, coarse aggregate, superplasticizer, percentage of recycled aggregate | 515 mix designs collected from the existing literature | Family I: R2 = 0.9299; Family II: R2 = 0.824; Family III: R2 = 0.8775; Family IV: R2 = 0.7991 | 28-day CS of self-consolidating concrete (including RAC) | Developed a prediction equation
Table 2. Characteristics of the input and output elements.

| Parameter | Unit | Type | Minimum | Mean | Maximum | Std | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| C | kg/m3 | Input | 102 | 281.17 | 540 | 104.51 | 0.51 | −0.52 |
| Sl | kg/m3 | Input | 0 | 73.9 | 359.4 | 86.28 | 0.8 | −0.51 |
| FA | kg/m3 | Input | 0 | 54.19 | 200.1 | 64 | 0.54 | −1.33 |
| W | kg/m3 | Input | 121.8 | 181.57 | 247 | 21.36 | 0.07 | 0.122 |
| SP | kg/m3 | Input | 0 | 6.2 | 32.2 | 5.97 | 0.91 | 1.411 |
| CA | kg/m3 | Input | 801 | 972.92 | 1145 | 77.75 | −0.04 | −0.6 |
| NFA | kg/m3 | Input | 594 | 773.58 | 992.6 | 80.18 | −0.25 | −0.1 |
| TA | Day (1–365) | Input | 1 | 45.66 | 365 | 63.17 | 3.27 | 12.17 |
| CS | MPa | Output | 2.33 | 35.82 | 82.6 | 16.71 | 0.42 | −0.31 |
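The descriptive statistics reported in Table 2 follow the standard moment definitions. As a minimal pure-Python sketch (assuming population moments and excess kurtosis, i.e., kurtosis of a normal distribution is 0; the paper does not state which convention it uses), each column of the dataset could be summarized as:

```python
def describe(values):
    # Descriptive statistics as in Table 2: min, mean, max,
    # population standard deviation, skewness, and excess kurtosis.
    n = len(values)
    mean = sum(values) / n
    std = (sum((x - mean) ** 2 for x in values) / n) ** 0.5
    z = [(x - mean) / std for x in values]          # standardized values
    skewness = sum(t ** 3 for t in z) / n           # third standardized moment
    kurtosis = sum(t ** 4 for t in z) / n - 3.0     # fourth moment, excess form
    return {"min": min(values), "mean": mean, "max": max(values),
            "std": std, "skewness": skewness, "kurtosis": kurtosis}
```

Applied to, e.g., the cement column of the UCI dataset, this reproduces the Table 2 row up to rounding and the sample-vs-population convention.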
Table 3. The optimal hyperparameters of the developed ML models.

| Model | Optimal Hyperparameters |
|---|---|
| Elastic net regression | α = 0.003, R = 0.1 |
| KNN | Power parameter p = 2 (Euclidean distance), K = 6 |
| ANN | Hidden layer size = (128, 64), activation = ReLU, optimization algorithm = Adam, initial learning rate = 1 × 10−3, dropout = 0.2, batch size = 64, patience = 20 |
| SVR | Kernel = radial basis function, γ = ‘scale’, regularization parameter = 297, ε = 2.50 |
| RF | Maximum depth = 40, minimum samples leaf = 1, minimum samples split = 5, number of estimators = 500 |
| XGBoost | Subsample ratio = 0.8585, γ = 0.9757, learning rate = 0.12, maximum tree depth = 3, number of estimators = 900, L1 regularization term = 0.3749, L2 regularization term = 0.8558 |
| CatBoost | Bagging temperature = 5.8, depth = 7, iterations = 300, L2 regularization term = 3, learning rate = 0.1, random strength = 1.0, subsample ratio = 0.8 |
| Symbolic regression | Population size = 500, generations = 20, stopping criterion = 0.01, crossover probability = 0.7, subtree mutation probability = 0.1, hoist mutation probability = 0.05, point mutation probability = 0.1, max samples = 0.9 |
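The Table 3 settings map directly onto the constructor arguments of the usual Python libraries. A sketch of how the tree-based models' hyperparameters would be collected (the keyword names shown, e.g., `bagging_temperature` and `reg_alpha`, follow the common conventions of catboost, xgboost, and scikit-learn, and are assumptions rather than taken from the paper):

```python
# Table 3 hyperparameters as keyword-argument dictionaries.
catboost_params = {
    "bagging_temperature": 5.8, "depth": 7, "iterations": 300,
    "l2_leaf_reg": 3, "learning_rate": 0.1, "random_strength": 1.0,
    "subsample": 0.8,
}
xgboost_params = {
    "subsample": 0.8585, "gamma": 0.9757, "learning_rate": 0.12,
    "max_depth": 3, "n_estimators": 900,
    "reg_alpha": 0.3749,   # L1 regularization term
    "reg_lambda": 0.8558,  # L2 regularization term
}
rf_params = {
    "max_depth": 40, "min_samples_leaf": 1,
    "min_samples_split": 5, "n_estimators": 500,
}

# With the corresponding libraries installed, models would be built roughly as:
#   model = catboost.CatBoostRegressor(**catboost_params)
#   model = xgboost.XGBRegressor(**xgboost_params)
#   model = sklearn.ensemble.RandomForestRegressor(**rf_params)
```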
Table 4. The error measures of each ML model for CS prediction.

| Phase | Error | Elastic Net Regression | KNN | ANN | SVR | RF | XGBoost | CatBoost | Symbolic Regression |
|---|---|---|---|---|---|---|---|---|---|
| Training | RMSE (MPa) | 10.36 | 7.47 | 4.6 | 3.73 | 3.14 | 1.57 | 1.47 | 8.45 |
| Training | MAE (MPa) | 8.19 | 5.7 | 3.4 | 2.8 | 2.38 | 0.97 | 1.09 | 6.58 |
| Training | MAPE (%) | 31.51 | 22.01 | 11.84 | 10.15 | 8.98 | 3.43 | 3.80 | 23.62 |
| Training | R2 | 0.62 | 0.8 | 0.93 | 0.95 | 0.97 | 0.99 | 0.98 | 0.74 |
| Training | SI | 0.29 | 0.21 | 0.13 | 0.1 | 0.91 | 0.04 | 0.96 | 0.72 |
| Testing | RMSE (MPa) | 10.46 | 8.6 | 5.68 | 5.94 | 5.72 | 4.34 | 4.85 | 9.99 |
| Testing | MAE (MPa) | 8.33 | 6.34 | 4.3 | 4.28 | 4.29 | 2.82 | 2.83 | 7.53 |
| Testing | MAPE (%) | 32.55 | 23.37 | 13.94 | 14.58 | 15.14 | 9.29 | 9.79 | 25.44 |
| Testing | R2 | 0.6 | 0.65 | 0.85 | 0.85 | 0.88 | 0.90 | 0.92 | 0.68 |
| Testing | SI | 0.3 | 0.27 | 0.16 | 0.17 | 0.82 | 0.12 | 0.89 | 0.68 |
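The error measures in Table 4 can be reproduced from any pair of observed and predicted strength vectors. A self-contained sketch follows; the scatter index (SI) is taken here as RMSE normalized by the mean observed value, a common definition that the paper does not spell out, so it is an assumption:

```python
def regression_errors(y_true, y_pred):
    # Error measures used in Table 4.
    n = len(y_true)
    mean_true = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # squared errors
    sst = sum((t - mean_true) ** 2 for t in y_true)          # total variance
    rmse = (sse / n) ** 0.5
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    r2 = 1.0 - sse / sst
    si = rmse / mean_true  # scatter index (assumed definition)
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2, "SI": si}
```

For example, with observed strengths `[1, 2, 3]` and predictions `[1, 2, 4]` this yields RMSE ≈ 0.577 MPa, MAE ≈ 0.333 MPa, MAPE ≈ 11.1%, and R2 = 0.5.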
Table 5. The error measures of the best combinations of the base ML models for the stacking model development.

| Stacking Model | Training RMSE (MPa) | Training MAE (MPa) | Training MAPE (%) | Training R2 | Training SI | Testing RMSE (MPa) | Testing MAE (MPa) | Testing MAPE (%) | Testing R2 | Testing SI |
|---|---|---|---|---|---|---|---|---|---|---|
| Stacking-2 | 1.4225 | 0.8685 | 3.2123 | 0.9928 | 0.0402 | 4.5115 | 2.9161 | 8.8894 | 0.9258 | 0.1226 |
| Stacking-3 | 1.4271 | 0.8752 | 3.2168 | 0.9927 | 0.0403 | 4.5217 | 2.9172 | 8.8957 | 0.9255 | 0.1229 |
| Stacking-4 | 1.4271 | 0.8752 | 3.2168 | 0.9927 | 0.0403 | 4.5217 | 2.9172 | 8.8957 | 0.9255 | 0.1229 |
| Stacking-5 | 1.449 | 0.8965 | 3.3161 | 0.9925 | 0.0409 | 4.5206 | 2.9215 | 8.9135 | 0.9255 | 0.1229 |
| Stacking-6 | 1.449 | 0.8965 | 3.3161 | 0.9925 | 0.0409 | 4.5206 | 2.9215 | 8.9135 | 0.9255 | 0.1229 |
| Stacking-7 | 1.4462 | 0.891 | 3.309 | 0.9925 | 0.0408 | 4.5133 | 2.9211 | 8.9241 | 0.9257 | 0.1227 |
| Stacking-8 | 1.4448 | 0.8894 | 3.2959 | 0.9925 | 0.0408 | 4.5146 | 2.9207 | 8.9223 | 0.9257 | 0.1227 |

Note: □ base model not included; √ base model included. Each Stacking-N model combines a subset of the eight base models (elastic net regression, KNN, ANN, SVR, RF, XGBoost, CatBoost, symbolic regression).
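The stacking models in Table 5 feed base-model predictions into a meta-learner. As a toy illustration of that mechanism only (not the paper's implementation, which combines up to eight base models), the sketch below fits least-squares blend weights for two base predictors by solving the 2×2 normal equations directly:

```python
def fit_blend_weights(p1, p2, y):
    # Meta-learner for a two-base-model stack: find weights (w1, w2)
    # minimizing sum((w1*p1 + w2*p2 - y)^2) via the 2x2 normal equations.
    # In practice p1, p2 should be out-of-fold predictions so the
    # meta-level does not overfit to the base models' training error.
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

def blend(p1, p2, weights):
    # Stacked prediction: weighted combination of the base predictions.
    w1, w2 = weights
    return [w1 * a + w2 * b for a, b in zip(p1, p2)]
```

When one base model is already exact, the meta-learner puts all weight on it, which mirrors how stacking can only match or improve on its best base model on the meta-training data.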
Table 6. Assessment of ML outcomes from this study against the existing literature.

| Reference | Material Used | Predicted Property | ML Algorithm | Reported R2 Value |
|---|---|---|---|---|
| Current study | UCI dataset plus 952 data samples collected from various sources | CS | XGBoost, CatBoost | 0.99 |
| Golafshani et al. [38] | 3519 data samples collected from various sources | CS | XGBoost, LightGBM, CatBoost | 0.9587 |
| Li and Wang [66] | 400 data samples obtained from the literature | CS | RF, CatBoost | 0.97 |
| Pan et al. [67] | UCI dataset and 63 samples of 2 different recycled aggregate concretes | CS | XGBoost, DT, ET | 0.985 |
| Qi and Li [68] | 1100-sample experimental database | CS | XGBoost, LightGBM | 0.964 |
Liu, J.; Guan, D.; Liu, X. Comparative Performance Analysis of Machine Learning Models for Compressive Strength Prediction in Concrete Mix Design. Math. Comput. Appl. 2025, 30, 128. https://doi.org/10.3390/mca30060128
