1. Introduction
Concrete is the most widely used man-made material in civil engineering, with a significant impact on both the economy and the environment [1]. The construction sector contributes substantially to economic development (approximately 9% of the European Union’s GDP), generates millions of direct and indirect jobs (around 18 million in the EU), and fulfills the infrastructure and building needs of the population [2,3]. However, it is also one of the main consumers of natural resources, accounting for nearly 50% of total raw material use and 36% of global final energy consumption [4,5], making it a major source of greenhouse gas emissions [4,6].
In response, the scientific and technical community has focused on developing strategies aimed at reducing the consumption of non-renewable resources and emissions associated with concrete production, to promote a more sustainable and environmentally responsible industry [7,8]. Among the most promising alternatives are the recycling of materials and the incorporation of mineral additives that partially replace traditional Portland cement, a material with high environmental impact due to emissions generated during its production [9,10,11]. In this context, eco-friendly concrete emerges as an innovative solution, where a fraction of cement is replaced with recycled glass powder derived from glass waste that, due to its physical or chemical properties, cannot be reintroduced into conventional recycling processes and would otherwise end up in landfills. This approach valorizes a difficult-to-manage byproduct, reduces the need for disposal, and actively contributes to circular economy principles and the mitigation of the sector’s environmental impact.
In civil engineering, structures are designed to meet specific requirements based on their intended function, expected service life, and exposure conditions [12]. Within the design process, compressive strength is one of the most relevant properties, as it determines the quality and mechanical performance of concrete [13]. The type of concrete is selected according to its strength class (non-structural, ordinary structural, high-strength, or very high-strength) based on project requirements and exposure conditions [14]. This classification is based on criteria established in international standards such as Eurocode 2 [15], ACI 318 [16], ACI 239 for UHPC [17], and the Spanish Structural Code [18], which specify minimum required strength levels and applicable conditions for each category.
The experimental determination of concrete compressive strength is technically complex due to the nonlinear relationship between the composition of the constituent materials and the resulting mechanical behavior [19,20,21]. Traditionally, this involves casting multiple specimens cured under controlled conditions over periods ranging from 1 to 365 days [22], followed by destructive laboratory testing to determine compressive strength, which entails a significant cost in terms of time, materials, and labor. In this context, artificial intelligence (AI) and machine learning (ML) models offer a non-destructive and efficient approach for predicting and classifying compressive strength, particularly in eco-friendly concretes with partial cement replacement using non-recyclable glass powder. This represents a significant step toward digitalization and sustainability in the construction sector.
Over the past three decades, artificial neural networks (ANNs) have been successfully applied across a wide range of scientific and engineering disciplines, demonstrating remarkable capabilities for modeling complex phenomena and predicting non-linear behaviors [23,24,25]. Although their adoption in civil engineering has been relatively limited, several studies have shown their potential for estimating the compressive strength of concrete with accuracy comparable to or exceeding that of traditional empirical methods [26,27,28,29]. The quality of the predictions depends heavily on the network design, training parameters, and selection of input variables, enabling continuous optimization of the models and their adaptation to the specific requirements of each application. Beyond ANNs, a wide variety of AI techniques have been applied to concrete behavior studies, including Adaptive Neuro-Fuzzy Inference Systems (ANFIS), Genetic Programming (GP), Support Vector Machines (SVM), Classification and Regression Trees (CART), and Biogeography-Based Programming (BBP), among others [30]. These tools broaden the methodological spectrum for predictive modeling and mix optimization, reducing reliance on exhaustive experimental testing.
Recent studies have proposed increasingly sophisticated approaches combining ML models with metaheuristic optimization algorithms. For example, Yasen et al. [31] developed an Extreme Learning Machine (ELM) model to predict the strength of cellular concrete, achieving greater accuracy than Multivariate Adaptive Regression Splines (MARS), M5 Tree, and SVR, demonstrating its usefulness for improving quality control and reducing physical testing. Similarly, Asteris et al. [32] introduced a hybrid learning model (Hybrid Ensembling of Surrogate Machine Learning Models, HENSM) that integrates ANNs, MARS-L/C (linear (L) and cubic (C) variants of the Multivariate Adaptive Regression Splines), Gaussian Process Regression (GPR), and Minimax Probability Machine Regression (MPMR) within a neural network, achieving more accurate and stable predictions with lower risk of overfitting and potential implications for the design of more sustainable concretes. Other studies, such as that by Kandiri et al. [33], combined ANNs with optimization algorithms (Salp Swarm Algorithm (SSA), Genetic Algorithm (GA), Grasshopper Optimization Algorithm (GOA)) to predict the strength of concrete with recycled aggregates, highlighting the robustness of hybrid approaches. Likewise, Bui et al. [34] applied an ANN optimized with the Whale Optimization Algorithm (WOA), outperforming other evolutionary methods (Dragonfly Algorithm (DA) and Ant Colony Optimization (ACO)). Omran et al. [35] compared nine data mining models for concretes with mineral admixtures and found that GPR offered the highest predictive accuracy, especially when combined with ensemble methods. Alshihri et al. [36] also demonstrated the effectiveness of Cascade Correlation Neural Networks (CCNN) in predicting the strength of lightweight concrete at different curing ages, significantly reducing testing costs and time.
Beyond numerical prediction, several studies have employed AI models to classify concretes according to their compressive strength. Alghamdi [37] used Decision Tree methods to accurately categorize high-strength mix designs. Zhao [38], applying the Naïve Bayes model, reliably identified strength categories in ultra-high-performance concretes, evidencing its usefulness for quality control. Models such as Random Forest (RF) [39,40] have proven robust against noisy data, while Decision Trees, SVM, and k-NN have shown variable results depending on the mix design and data distribution [41,42].
In addition to these advances, recent research has increasingly adopted more powerful machine learning frameworks to model the behavior of sustainable concretes. For example, Khan et al. [43] employed Extreme Gradient Boosting (XGBoost) to predict the mechanical performance of waste-incorporated concretes, demonstrating high accuracy and robust feature sensitivity analysis. Zhang et al. [44] introduced a hybrid RF–GWO (Grey Wolf Optimizer)–XGBoost model for geopolymer blends, highlighting the potential of ensemble techniques in sustainability-oriented applications. Other developments include the work of Demirtürk [45], who optimized XGBoost and Light Gradient Boosting Machine (LightGBM) for high-performance concretes, and Fei et al. [46], who applied ensemble ML models, including XGBoost, Random Forest, LightGBM, and Adaptive Boosting (AdaBoost), to predict the compressive strength of recycled powder mortar.
Most previous studies have focused on predicting the compressive strength of conventional or recycled concrete. In contrast, very few studies have developed machine learning classification methods for eco-friendly concretes produced with glass powder as a partial cement substitute. Therefore, in this study, we focus on analyzing different ML algorithms to classify these concretes based on their compressive strength. The experimental dataset used in this study was specifically generated with systematic variations in glass powder content, allowing for a detailed analysis of its influence on strength classes. The goal of this research is to develop and identify the most reliable and generalizable methods for quality control and mix optimization, thereby reducing the need for destructive testing and the associated material and time costs.
3. Results
Table 4 compares the laboratory-measured compressive strength of the eco-friendly cement samples with the classifications produced by the ML models. This representation highlights the models’ capability to categorize the cements according to their compressive performance across different curing times.
The results are evaluated below using performance metrics that quantify the predictive accuracy and reliability of the proposed artificial intelligence models for classifying eco-friendly concrete mixtures by compressive strength.
Model Performance Evaluation
To assess the predictive capability and robustness of the proposed ML models, several performance metrics were employed. All metrics were derived from the confusion matrix presented in Table 5, which summarizes the classification outcomes for the validation dataset. These metrics quantify the models’ ability to accurately classify the compressive strength classes of concrete specimens based on their mix composition and curing time. Standard statistical indicators, including accuracy, precision, recall, and F1-score, were calculated to provide a comprehensive understanding of both the overall and class-specific performance of each algorithm.
Table 6 presents the performance metrics derived from the confusion matrix, used to evaluate the ML models applied in this study. Accuracy measures the proportion of correct predictions over the total number of instances, while precision and recall provide a more detailed, class-specific analysis: precision indicates how reliable the positive predictions are, and recall reflects the model’s ability to identify all actual instances of a given class. The F1-score, as the harmonic mean of precision and recall, provides a balanced measure of overall performance. Finally, specificity complements recall by assessing the model’s ability to avoid false positives. Together, these metrics allow for a comprehensive comparison of the models, considering both their global performance and their capacity to correctly distinguish between different concrete types.
Table 7 complements the information provided in the previous paragraph by presenting the actual values of these metrics for each ML model applied to the concrete dataset.
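As an illustration of how these metrics follow from a multi-class confusion matrix, the sketch below computes them per class in a one-vs-rest fashion. It is a minimal sketch only: the function names and the toy matrix are hypothetical, the numbers are invented for illustration and are not the values reported in Tables 5 to 7, and the code is not the implementation used in this study.

```python
def per_class_metrics(cm, labels):
    """One-vs-rest metrics; cm[i][j] counts samples of true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    results = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]                          # correctly assigned to class k
        fn = sum(cm[k]) - tp                   # class k samples assigned elsewhere
        fp = sum(row[k] for row in cm) - tp    # other classes assigned to class k
        tn = total - tp - fn - fp              # everything else
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        }
    return results


def overall_accuracy(cm):
    """Share of all samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Class names follow the strength classes used in the text.
LABELS = ["Non-structural", "Ordinary structural", "High strength", "Very high-strength"]

# Hypothetical counts, invented purely for illustration.
toy_cm = [
    [8, 2, 0, 0],
    [1, 15, 4, 0],
    [0, 3, 12, 1],
    [0, 0, 2, 2],
]

metrics = per_class_metrics(toy_cm, LABELS)
accuracy = overall_accuracy(toy_cm)
```

In this toy matrix the smallest class has the lowest recall even though the overall accuracy looks acceptable, which is why the class-specific metrics are examined alongside accuracy.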
A detailed examination of the model performance metrics highlights significant differences in predictive behavior across the evaluated algorithms. Naïve Bayes consistently achieved high values across all metrics for all strength categories (precision and recall > 0.85; F1-score ~0.95; overall accuracy 0.95), demonstrating both excellent positive case identification and strong resistance to false positives. This balance indicates a robust capacity to generalize across heterogeneous data, making it the most reliable model for comprehensive classification.
Random Forest exhibited generally strong performance, with moderate to high precision and recall (accuracy 0.76), yet a noticeable decline in recall for the minority classes (Non-structural and Very high-strength) suggests some sensitivity to class imbalance. Its specificity remained high, indicating correct identification of negative cases even when positive instances were misclassified.
Decision Tree achieved perfect metrics (1.0) across all categories, which more plausibly reflects overfitting than true predictive capability. Although such a model reproduces the data it was fitted to flawlessly, its lack of generalization limits its utility for unseen data.
SVM and k-NN displayed highly uneven metrics. SVM attained perfect precision and recall only in the extreme classes (Non-structural and Very high-strength), with poor recall in the intermediate categories (e.g., 0.10 for Ordinary structural), indicating a failure to capture subtle inter-class boundaries. Similarly, k-NN performed acceptably for low-strength concrete but almost completely failed in the High-strength (precision 0.01) and Very high-strength (recall 0.10) classes, reflecting strong sensitivity to local data distribution and feature scaling.
In conclusion, the comparison of machine learning models highlights their respective strengths and limitations in predicting concrete strength classes. Naïve Bayes consistently achieved high accuracy across all classes, demonstrating robust generalization. Random Forest performed well for “Ordinary structural” and “High strength” classes but showed lower recall for minority classes. SVM excelled in “Non-structural” and “Very high-strength” classes but was less effective for intermediate classes, while k-NN performed well for “Ordinary structural” but struggled with higher strength classes due to sensitivity to local patterns. Decision Tree achieved perfect classification, likely reflecting overfitting. Overall, these results illustrate the analytical focus of each model. SVM emphasizes margin-based separation, Naïve Bayes leverages probabilistic feature relationships, Random Forest and Decision Tree use hierarchical splits, and k-NN relies on local similarity, providing clear guidance for selecting the most appropriate method based on the target concrete class and dataset characteristics.
Figure 3 shows a radar chart comparing the global performance of the evaluated machine learning models across five metrics: Precision, Recall, Specificity, F1-Score, and Accuracy.
4. Discussion
The comparative evaluation of five machine learning algorithms, namely Naïve Bayes, Random Forest, Decision Tree, Support Vector Machine (SVM), and k-Nearest Neighbors (k-NN), for classifying the compressive strength of eco-friendly concretes incorporating recycled glass powder has revealed substantial differences in predictive behavior and practical implications for sustainable material design. These findings not only highlight the computational performance of each model but also provide valuable insight into their suitability for guiding the development and structural use of environmentally responsible cementitious composites.
The Naïve Bayes model exhibited the highest overall performance (accuracy ≈ 0.95), maintaining balanced precision and recall across all strength categories. Its strong predictive capability for high-strength and very high-strength concretes aligns with prior work by Zhao et al. [38], who attribute this robustness to the algorithm’s probabilistic structure and its ability to effectively handle datasets with moderate collinearity and clearly separated quantitative features. In the context of glass-powder-modified concretes, this performance suggests that the material’s mechanical behavior remains sufficiently distinct across strength classes to be probabilistically separable, supporting the potential use of Naïve Bayes as an efficient, low-complexity model for early-stage material screening.
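To make the notion of probabilistic separability concrete, the following minimal sketch implements a Gaussian Naïve Bayes classifier from scratch. It assumes continuous features modeled with one normal distribution per class and feature; the Naïve Bayes variant used in this study is not specified here, so this is an assumption. The function names, the two features (glass powder content and curing age), the class labels, and all data values are hypothetical and invented for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log-prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    n_total = len(X)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor avoids division by zero when a feature is constant within a class.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / n_total), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log-posterior under the independence assumption."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical samples: [glass powder content (%), curing age (days)].
X_toy = [[1.0, 7.0], [1.2, 7.0], [1.1, 9.0], [1.0, 28.0], [1.2, 28.0], [1.1, 30.0]]
y_toy = ["class A", "class A", "class A", "class B", "class B", "class B"]

model = fit_gaussian_nb(X_toy, y_toy)
early_age = predict_gaussian_nb(model, [1.1, 8.0])
late_age = predict_gaussian_nb(model, [1.1, 29.0])
```

When the per-class feature distributions barely overlap, as with curing age in this toy set, the class likelihoods differ by many orders of magnitude and the assignment is unambiguous.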
The Random Forest algorithm achieved reasonably strong results (accuracy ≈ 0.76), especially in the ordinary structural and high-strength categories. Its ensemble-based architecture, which aggregates multiple decision trees through bagging, allows it to capture nonlinear interactions between material components and curing parameters. These results corroborate findings by Omran et al. [35] and Asteris et al. [32], who reported similar advantages of tree-based models for predicting concrete mechanical properties. In this study, Random Forest demonstrated high robustness to noise and variability arising from the inclusion of glass powder, confirming its capability for accurate classification and potential real-time monitoring of eco-concretes during production and curing. However, the slight underperformance in the very high-strength category suggests the need for further class balancing or feature engineering when dealing with minority strength ranges.
Conversely, the Decision Tree model achieved perfect performance (accuracy = 1.00), a result indicative of overfitting rather than true generalization. Although this behavior superficially suggests exceptional precision, it compromises reliability when predicting unseen data. Similar overfitting tendencies have been reported by Asteris et al. [32], who noted that decision tree–based models tend to overfit when the number of predictor variables is high or when class distinctions are subtle. Nonetheless, the interpretability and transparent decision rules of this model make it a valuable analytical tool for exploring parameter–response relationships, particularly for visualizing the influence of glass powder content, water-to-binder ratio, or curing age on strength development. In practice, Decision Trees may serve as an explanatory component within hybrid or ensemble frameworks, enhancing interpretability without sacrificing predictive accuracy.
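One common way to probe whether a perfect score reflects memorization is to evaluate the model only on samples withheld from fitting, for example with stratified k-fold cross-validation. The sketch below shows only the fold assignment step, under the assumption that class proportions should be preserved in each fold. The function name and the toy labels are hypothetical, and this procedure is offered as an illustration rather than as the validation protocol used in this study.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds so each fold keeps roughly the class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Hypothetical labels: six samples of one class, three of another.
toy_labels = ["A"] * 6 + ["B"] * 3
folds = stratified_folds(toy_labels, 3)

# Each fold serves once as the held-out set; the remaining folds are used for fitting.
splits = [
    (sorted(i for j, fold in enumerate(folds) if j != k for i in fold), folds[k])
    for k in range(len(folds))
]
```

A large gap between the score on the fitting folds and the score on the held-out fold would support the overfitting interpretation, whereas similar scores would argue against it. In practice the indices would usually be shuffled within each class before dealing them out.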
The SVM model, in contrast, exhibited inconsistent performance (accuracy ≈ 0.42), performing well only in the extreme categories (non-structural and very high-strength). This pattern mirrors findings from previous studies [41], where the margin-based nature of support vector machines is highly effective for well-separated classes but limited in cases of overlapping or complex nonlinear relationships unless kernel parameters are carefully optimized. Given the compositional heterogeneity of eco-friendly concretes, where particle morphology, pozzolanic activity, and curing kinetics interact non-linearly, the limited adaptability of the SVM underscores the necessity of kernel tuning or hybridization for improved predictive stability in sustainable material systems.
Similarly, the k-Nearest Neighbors (k-NN) model demonstrated low overall accuracy (≈0.49) and weak generalization in the high-strength and very high-strength categories. Although it performed reasonably well for ordinary concrete, its dependence on local data density and sensitivity to feature scaling reduced its robustness in heterogeneous datasets. These limitations, also reported by Kandiri et al. [33], highlight that k-NN is more suited to qualitative or semi-quantitative classification tasks, such as preliminary mixture grouping or process monitoring, than to rigorous strength classification.
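The sensitivity of k-NN to feature scaling can be illustrated with a small example in which the nearest neighbor of a mix changes once the features are brought to a common range. This is a minimal sketch under stated assumptions: the two features (cement content in kg/m³ and glass powder content in %), the mix values, and the function names are hypothetical, and min-max scaling is only one of several possible normalizations.

```python
import math


def min_max_scale(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column gets span 1.0 so the division stays defined.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        [(v - lo) / span for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def nearest_index(rows, query_index):
    """Index of the closest other row under the Euclidean distance."""
    query = rows[query_index]
    best, best_distance = None, math.inf
    for i, row in enumerate(rows):
        if i == query_index:
            continue
        distance = math.dist(query, row)
        if distance < best_distance:
            best, best_distance = i, distance
    return best


# Hypothetical mixes: [cement content (kg/m3), glass powder content (%)].
mixes = [
    [400.0, 0.0],
    [405.0, 2.0],
    [420.0, 0.0],
    [450.0, 1.0],
]

raw_neighbor = nearest_index(mixes, 0)                    # distances in raw units
scaled_neighbor = nearest_index(min_max_scale(mixes), 0)  # after rescaling
```

In raw units the difference in glass powder content barely affects the distance, so the first mix is matched with the mix of similar cement content regardless of its glass powder content. After rescaling, both features contribute comparably and the neighbor changes, which shows how strongly the outcome of a proximity-based classifier can depend on preprocessing.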
In summary, the comparative performance trends are consistent with the evidence obtained from the feature importance analysis. Naïve Bayes and Random Forest outperformed SVM, k-NN, and Decision Tree. Naïve Bayes benefited from well-differentiated feature distributions, and Random Forest leveraged its ensemble architecture to capture nonlinear interactions. The feature importance analysis confirmed that cement content, glass powder ratio, and curing age are the dominant variables influencing strength classification, while aggregate ratios and water content play a secondary role. These findings not only explain the superior accuracy of Naïve Bayes and Random Forest but also highlight the practical value of feature-based information for optimizing the design of eco-friendly concrete mixes.
From a broader perspective, these results emphasize that probabilistic and ensemble-learning methods, such as Naïve Bayes and Random Forest, provide the most balanced combination of accuracy, generalization, and interpretability for classifying eco-friendly concretes. Their strong predictive performance confirms that the incorporation of recycled glass powder as a partial cement replacement does not introduce excessive data noise or instability, thereby reinforcing its potential as a viable and sustainable substitute for structural applications. Furthermore, the consistent increase in predicted strength classification with curing age across all models reflects the expected hydration and pozzolanic activity trends of blended cements, validating both the experimental observations and the physical relevance of the models.
In contrast, while the Decision Tree model excels in interpretability, its tendency to overfit limits its predictive generalization, making it more suitable for exploratory or explanatory analyses. Similarly, SVM and k-NN demonstrate effectiveness primarily in extreme strength categories, where class boundaries are more distinct. Collectively, these outcomes are consistent with trends reported in recent literature [31,32,33,34,35,36], which indicate that probabilistic and ensemble-based approaches consistently outperform deterministic and proximity-based models in the classification of materials characterized by high compositional variability.
The discrepancies observed among the algorithms can be partially explained by the intrinsic differences between glass powder concrete and conventional mixes. The incorporation of glass powder as a partial cement replacement results in a more eco-friendly concrete with a distinctive characteristic compared to conventional concretes: its extended pozzolanic reactions prolong the setting reactions [54]. This modification affects the hydration kinetics and microstructure, producing denser matrices but also greater variability in early-age strength. This heterogeneity can influence the predictive behavior of the models, particularly in algorithms sensitive to data distribution, such as SVM and k-NN. In contrast, ensemble and tree-based methods better capture nonlinear dependencies and compositional interactions, which explains their greater accuracy and recall in most strength classes. However, despite these potential sources of variability, the experimental data did not show significant instability in the compressive strength measurements. This can be attributed to the small particle size of the recycled glass powder, which promotes uniform dispersion; the pozzolanic reaction, which provides additional C–S–H gel without generating abrupt strength fluctuations; and the low substitution levels (1–2% by volume), which preserve the homogeneity of the mixture and prevent clustering or segregation. Together, these factors support the reliability and consistency of the dataset used for machine learning classification.
5. Conclusions
The results of this study demonstrate that ML models constitute effective and sustainable tools for classifying the compressive strength of eco-friendly concretes in which a fraction of the cement is replaced with glass powder. This data-driven approach contributes to more efficient resource management, supports the principles of the circular economy, and reduces the need for extensive destructive testing. Based on model performance, the key findings can be summarized as follows:
The Naïve Bayes classifier achieved the highest overall accuracy (≈0.95), demonstrating a balance between precision and recall across all strength categories. Random Forest also performed reliably (accuracy ≈ 0.76), effectively capturing the nonlinear interactions between mixture components and curing parameters. Its consistent performance confirms that partial cement replacement with glass powder does not introduce significant instability into the data, reinforcing its potential as a viable and sustainable substitute for structural concrete applications.
The Decision Tree model achieved perfect accuracy (1.00), indicating possible overfitting; nevertheless, its interpretability makes it particularly valuable for explanatory analysis and understanding the relative influence of mixture parameters such as glass powder content, water-to-binder ratio, and curing age on concrete strength.
The SVM and k-NN methods, with accuracies of 0.42 and 0.50, respectively, demonstrated competitive performance in extreme strength categories but lower overall accuracy in heterogeneous datasets, limiting their applicability for generalized prediction. However, their simplicity and responsiveness to local patterns make them suitable for preliminary mixture screening, process monitoring, or rapid on-site quality control.
Overall, the findings highlight that probabilistic and ensemble-based methods outperform deterministic and proximity-based algorithms when classifying materials with high compositional variability. The choice of model should therefore be guided by the specific objectives of the study, balancing interpretability, accuracy, and generalization depending on the intended application.
In summary, AI-based models, such as Naïve Bayes and Random Forest, can optimize mix design, enable real-time monitoring during production, and contribute to non-destructive testing by providing early strength class predictions. This scalable framework improves decision-making, efficiency, and reliability, and can accelerate the adoption of low-carbon materials and the efficient use of resources, supporting the broader goals of sustainable construction and the circular economy in the cement and concrete industry.
Future research could explore several avenues to strengthen and extend the findings of this study. For example, studying other cement additives used to produce different kinds of eco-friendly concretes (such as ground granulated blast furnace slag, fly ash, or eggshell) could broaden the model’s applicability. Conducting external validation using an independent dataset, processed through the same preprocessing steps but without retraining the model, could also help assess performance across diverse contexts. Moreover, examining and applying data balancing techniques for categories with fewer samples may contribute to improved generalizability and robustness of the model.
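As one example of a data balancing technique for categories with fewer samples, inverse-frequency class weights make errors on rare classes count more during training. The sketch below uses the common heuristic weight = n_samples / (n_classes × class count); the function name and the class counts are hypothetical and do not reflect the composition of the dataset used in this study.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of the class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical class counts, invented for illustration.
toy_labels = (
    ["Ordinary structural"] * 20
    + ["High strength"] * 16
    + ["Non-structural"] * 10
    + ["Very high-strength"] * 4
)

weights = balanced_class_weights(toy_labels)
```

With these counts the rarest class receives a weight five times that of the most frequent one, so a classifier that accepts per-class or per-sample weights is penalized more for misclassifying it. Resampling approaches, such as oversampling the minority classes, are an alternative when the learning algorithm does not support weights.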