Concrete Material Variability and Machine Learning Model Performance: A Comprehensive Review

Bahmani, Hadi; Mostafaei, Hasan; Santos, Paulo; Ferrández, Daniel

doi:10.3390/buildings16030556

¹ Department of Civil Engineering, Shahrekord University, Shahrekord 88186-34141, Iran

² School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, NSW 2007, Australia

³ University of Coimbra, ISISE, ARISE, Department of Civil Engineering, 3030-788 Coimbra, Portugal

⁴ Departamento de Tecnología de la Edificación, Universidad Politécnica de Madrid, Avda Juan de Herrera, 6, 28040 Madrid, Spain

* Author to whom correspondence should be addressed.

Buildings 2026, 16(3), 556; https://doi.org/10.3390/buildings16030556

Submission received: 26 December 2025 / Revised: 19 January 2026 / Accepted: 27 January 2026 / Published: 29 January 2026

(This article belongs to the Collection Advances in Enhancing Properties of Concrete, Mortar, Gypsum, and Plaster Materials)

Abstract

Machine learning (ML) has become an increasingly important tool in concrete engineering, changing how concrete properties are predicted and optimized and enabling more efficient, accurate, and sustainable processes. However, the inherent variability of concrete poses a significant challenge to the generalization and performance of ML models. This review examines how concrete material variability affects the reliability and accuracy of ML predictions. To explain how variability influences mechanical and durability-related behavior, the paper groups its sources into four categories: composition, microstructure, curing conditions, and environmental factors. A broad range of machine learning paradigms, including supervised learning, unsupervised learning, reinforcement learning (RL), and hybrid physics-informed approaches, is examined with respect to robustness against data heterogeneity and distributional shifts. The strengths and weaknesses of these algorithm families are highlighted with regard to forecasting fresh and hardened concrete properties and optimizing mix design. Based on this synthesis, the review identifies key unresolved challenges, including the lack of standardized multi-source datasets, limited transferability of models across experimental settings, and insufficient reporting of preprocessing and normalization practices.

Keywords:

machine learning (ML); concrete; mechanical properties; performance prediction; deep learning

1. Introduction

The incorporation of machine learning (ML) into civil engineering has radically transformed traditional activities in structural analysis, material design, and construction management [1]. By exploiting artificial intelligence, ML methodologies have become essential in an array of applications, such as predictive modeling of material behavior, automated fault detection, and real-time decision support during construction. These data-driven models provide greater insight and efficiency, allowing engineers to streamline designs, anticipate failures before they occur, and improve safety and sustainability in infrastructure development [2,3,4,5,6,7].

However, the effectiveness and trustworthiness of ML models depend on the quality, quantity, and representativeness of their training data. This dependence is especially important in concrete engineering because of the inherent variability of the material [8,9,10,11,12,13,14,15]. Concrete is a chemically reactive and inhomogeneous material whose properties are governed by composition, microstructure, curing methods, and exposure to environmental conditions. This heterogeneity contributes to nonlinear behavior and can disrupt model generalization, leading to overfitting on small datasets or biased predictions when models are applied to new formulations [16,17,18,19]. As a result, although ML has been shown to predict strength, durability, and fresh-concrete behavior, its practical application is still limited by variability [20,21,22,23,24].

Concrete passes through several stages, from production and placement to long-term maintenance. Researchers are increasingly applying ML algorithms at these stages to forecast concrete behavior and identify possible failures, with the aim of reducing experimental cost and promoting more sustainable concrete materials. Figure 1 illustrates the integration points of ML at the different stages, from input to maintenance, according to the characteristics introduced at each. The sources of variability covered in the current study are those introduced in the input and process phases of this figure.

The aim of this review is to summarize current studies on the relationship between concrete material variability and machine learning model performance. Specifically, it aims to:

Investigate the effects of different concrete formulations, characterized by differences in compositional components, microstructural features, curing regimes, and environmental conditions, on model accuracy.
Review the range of data-driven methods used in the analysis of concrete behavior, which include supervised learning, unsupervised learning, reinforcement learning (RL), and hybrid architectures.
Focus on emerging approaches, including deep learning-based defect detection and reinforcement learning, to optimize the construction process.
Identify existing challenges, such as data heterogeneity, scarcity, and bias, and suggest future paths toward predictive models that are reliable and transferable.

Although machine learning shows promise for practical concrete applications, several limitations hinder its general use. Uncertainty arising from variability in mix designs, environmental factors, and real-world heterogeneity can result in overfitting, weaker generalization, and biased results [25]. Data scarcity and inconsistent testing procedures are also major challenges. Addressing these limitations requires interdisciplinary collaboration to improve data acquisition methods, create uniform testing protocols, improve feature engineering, and develop more capable transfer-learning systems.

An emerging body of literature directly compares various ML algorithms applied to concrete datasets to determine their strengths, weaknesses, and best practices. Such comparative studies are valuable because they reveal which modeling methods are most effective for predicting particular concrete properties under different conditions. Figure 2 depicts the annual number of publications on the application of ML in concrete technology from 2015 to 2026 (as of December 2025). This growth reflects increasing scholarly interest and expanding industrial adoption of machine learning approaches in concrete technology and construction.

Despite the growing number of review articles addressing machine learning applications in concrete engineering, most existing studies primarily focus on algorithmic development, comparative model performance, or specific prediction tasks such as compressive strength or durability indicators. While these reviews [5,26,27,28,29] provide valuable insights into the capabilities of different ML techniques, they generally treat experimental data as homogeneous and implicitly assume consistency across datasets. As a result, the influence of concrete material variability, arising from differences in composition, microstructure, curing conditions, and environmental exposure, on the reliability, robustness, and generalizability of ML models remains insufficiently explored. In practice, however, such variability is intrinsic to concrete and represents a major source of uncertainty in data-driven modeling. This gap highlights the need for a dedicated review that explicitly examines how material variability propagates through machine learning workflows and affects prediction outcomes. The present study addresses this need by providing a structured and comprehensive synthesis of existing research, linking sources of concrete variability to ML model performance and identifying methodological challenges and best practices for developing more reliable and transferable predictive models.

This study aims to give a detailed account of how concrete variability affects the performance of machine learning models and of why combining materials science expertise with artificial intelligence research is critical. In doing so, it supports the development of more resilient, flexible, and dependable predictive models, and thereby smarter, safer, and more sustainable infrastructure.

2. Sources of Variability in Concrete

Concrete is inherently variable because a large number of interacting factors affect its microstructure, mechanical behavior, and durability under load over time [30,31,32,33,34,35,36,37,38,39,40]. All these forms of variability must be considered when developing machine learning (ML) models, as they raise problems of data heterogeneity, feature relevance, and model robustness. In innovative technologies such as 3D concrete printing, variability is a major challenge, mainly because of the heightened sensitivity to process parameters and the complex rheological behavior of the materials involved [41,42,43]. As a result, ML has become indispensable as a tool for prediction, control, and process optimization.

The major sources of variability in concrete materials can be classified as follows.

2.1. Composition

Concrete consists of cementitious materials, aggregates, water, and chemical admixtures. Differences in any of these constituents can significantly change the behavior of the material as a whole:

Types of Cement: Hydration kinetics, strength development, and permeability depend on the cement type used, including Portland cement and blended cements with supplementary cementitious materials (e.g., fly ash or silica fume) [44,45,46,47]. For example, high-volume fly-ash mixes generally gain strength more slowly but show better durability [48].
Aggregates: Aggregates affect workability, packing density, and crack propagation through their size, shape, texture, and mineralogical composition. The choice and proportioning of fine and coarse aggregates can lead to different levels of porosity and different mechanical properties.
W/C Ratio: The water-to-cement (W/C) ratio is one of the critical determinants of the mechanical and durability performance of concrete. Lower ratios tend to give a stronger and less permeable material, whereas higher ratios improve workability but increase porosity and reduce strength. Differences in the W/C ratio between mix designs are another source of variation in performance measures, especially compressive strength and resistance to chemical attack [49,50].
Admixtures and Additives: Workability, setting times, and durability are altered by chemical admixtures like superplasticizers, air-entraining agents, and accelerators, adding further variability that has to be modeled [51,52].

2.2. Microstructure

The internal architecture of hardened concrete develops through complex hydration reactions which generate distinct microstructural features:

Porosity and Pore Size Distribution: Variations in pore structure influence permeability, durability, and elastic properties. Higher porosity generally correlates with reduced strength and increased susceptibility to ingress of deleterious agents [53].
Interfacial Transition Zones (ITZ): The weak boundary zone between aggregates and cement paste impacts crack initiation and propagation. Microstructural differences in ITZ thickness and quality can lead to inconsistent mechanical performance [54].
Hydration Products: The spatial distribution and morphology of hydration crystals affect stiffness and toughness, with microstructural heterogeneity resulting from factors like curing temperature and mix proportions [55,56].

2.3. Curing Conditions

Proper curing is vital for achieving desired concrete properties, but environmental factors and practice variations create inconsistencies:

Temperature and Humidity: High temperatures accelerate hydration but can also cause uneven drying and improper setting. Low humidity can cause surface shrinkage and microcracking [57,58].
Duration: Insufficient curing time hinders strength development, whereas excessive curing can adversely affect other properties. Variations in curing procedures therefore introduce uncertainty in the final state of the material [59,60].
Curing Methods: Moist curing, curing covers, or accelerated curing have an impact on hydration and microstructure and, therefore, on long-term performance [61,62].

2.4. Environmental Exposure

Concrete structures are exposed to external conditions that influence their deterioration pathways:

Freeze–Thaw Cycles: Repeated freezing and thawing cause internal damage, especially in porous concrete, affecting strength and increasing permeability.
Thermal Shock Cycles: Sudden and extreme temperature fluctuations, such as those caused by fire exposure or rapid environmental changes, can induce thermal gradients within the concrete mass. These gradients generate internal stresses that promote cracking, spalling, and degradation of mechanical properties.
Chemical Attack: Chloride ingress, sulfate contamination, and carbonation attack concrete’s protective barriers, altering microstructure and accelerating corrosion [63,64].

Variability in concrete composition, including cement type, water-to-binder ratio, and SCM content, frequently leads to dataset shift and confounding, as models trained on narrow mix families fail to generalize to unseen formulations. Microstructural variability, often sparsely measured or indirectly inferred, introduces missing metadata and label noise, limiting the ability of data-driven models to capture causal relationships. Curing-related variability primarily induces temporal dataset shift, which degrades the performance of static supervised models when time-dependent effects are not explicitly encoded. Finally, environmental exposure variability (e.g., carbonation, chloride ingress, freeze–thaw) leads to long-term extrapolation errors, as degradation mechanisms are often underrepresented in training datasets.

This variability in concrete, arising from differences in composition, microstructure, curing regimes, and environmental exposure, is directly reflected in machine learning datasets and contributes to increased prediction uncertainty and modeling challenges. Variability tends to generate nonlinear, heterogeneous, and sometimes conflicting data patterns, which complicates feature selection and reduces the consistency of training samples. For example, changes in the water-to-cement ratio or aggregate gradation can shift the statistical distribution of important variables, so that models trained on one dataset generalize poorly to another. Similarly, microstructural characteristics, including porosity and ITZ properties, are often poorly represented or inconsistently characterized across studies, resulting in biased or sparse datasets. Curing conditions and environmental exposure also involve temporal and contextual dependencies that are hard to represent in most machine learning models. Taken together, these effects lead to overfitting, low transferability, and unreliable predictions, highlighting the need to reconcile material variability with dependable machine learning predictions through sound data curation, feature engineering, and physics-informed modeling.

The sources of variability detailed above (composition, microstructure, curing conditions, and environmental exposure) are interrelated. They not only affect the mechanical and durability performance of concrete but also pose significant challenges for the development and generalization of machine learning models.

Not all sources of concrete variability affect machine learning algorithms equally, and their impact largely depends on each model’s capacity to manage noise, nonlinearity, and distributional shifts. Variability in mix composition (e.g., cement type, water-to-binder ratio, SCM content) is particularly detrimental to linear regression and simple decision tree models, as it introduces nonlinear interactions and multicollinearity that these methods cannot adequately capture. Microstructural variability, including porosity distribution and interfacial transition zone characteristics, poses challenges for models that rely solely on macroscopic input variables, leading to underrepresentation and biased predictions in shallow learning frameworks. Curing-related variability, which is inherently time-dependent, is especially problematic for static supervised models, whereas recurrent architectures such as LSTM networks are better suited to capture temporal effects. Environmental exposure variability, including freeze–thaw cycles, carbonation, and chemical attack, often results in highly nonlinear and long-term degradation patterns that degrade the performance of single-learner models but can be more effectively handled by ensemble methods and physics-informed ML approaches.

Existing studies consistently demonstrate that machine learning models can accurately predict key concrete properties such as compressive strength and durability indicators when trained on well-curated datasets. However, several issues remain unresolved, including the limited transferability of models across different experimental conditions, the lack of standardized multi-source datasets, and the sensitivity of model performance to concrete material variability and environmental exposure. In this work, existing studies are compared based on predictive accuracy, robustness to concrete material variability, generalizability across datasets and experimental conditions, and application context (e.g., mix design optimization, hardened property prediction, and fresh concrete processability).

To clarify these relationships, Table 1 provides an overview of the main sources of variability, their effects on material properties, the specific difficulties they present for machine learning-based prediction and optimization, and the corresponding mitigation strategies.

2.5. Linking ML Input Features to Fundamental Concrete Mechanisms

Although machine learning research usually focuses on predictive accuracy, the choice of input features implicitly reflects underlying physical and chemical processes that were previously established by experimental research on concrete. Mix composition variables, including water-to-binder ratio, cement chemistry, supplementary cementitious material content, and aggregate characteristics, consistently emerge in both conventional experiments and ML feature-importance results as the main drivers of strength and durability. Likewise, curing-related inputs represent hydration kinetics and microstructural development, which experiments have shown to control pore refinement and load-bearing capacity. Microstructural descriptors, whether measured explicitly (e.g., porosity or pore-size distribution) or inferred indirectly from mix proportions and curing history, relate to crack initiation and transport properties, which have a significant impact on mechanical performance. Environmental exposure variables such as temperature, moisture, and chemical aggressiveness reflect degradation processes such as carbonation, chloride ingress, and freeze–thaw damage, which are well documented in the experimental literature. ML models trained on these features are effective at learning nonlinear mappings between these governing mechanisms and measured performance, but the physical interpretation of what is learned remains indirect. Notably, ML feature-importance rankings and traditional experimental studies agree closely on the dominant factors, especially for composition- and curing-driven behavior. Nonetheless, inconsistencies emerge where datasets lack critical descriptors or where multiple processes interact, which highlights the need for physically informed feature selection and hybrid modeling methods. Combining knowledge from experimental concrete science with ML-based analysis is thus critical for creating accurate, interpretable, and transferable models.

3. ML Algorithms for Concrete-Related Predictions

ML techniques have become invaluable in civil engineering for predicting concrete properties, optimizing mix designs, and supporting structural assessment. The complexity, heterogeneity, and nonlinearity of concrete call for flexible ML methods that can accommodate different types of data, handle uncertainty, and deliver accurate predictions even with small or noisy datasets. This section explores the main ML paradigms used in concrete technology, together with their methodologies, strengths, weaknesses, and recent developments.

However, despite increasing interest in ML and AI tools, their application in concrete engineering is subject to significant limitations. ML models are data-dependent and extrapolate less reliably to mix designs, curing regimes, or exposure conditions outside the range covered by the training data. In practice, this weakness is compounded by material heterogeneity, inconsistent testing standards, and incomplete metadata, which can produce biased predictions and misleading performance indicators. In addition, predictive accuracy does not always imply physical understanding, and purely data-driven models can obscure the underlying material processes unless they are supplemented by domain knowledge or physics-constrained models.
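
One simple safeguard against the extrapolation problem described above is to check whether a new mix lies inside the feature ranges seen during training before trusting a prediction. The sketch below is illustrative only: the feature names (`w_b`, `cement_kg_m3`, `age_days`) and all values are hypothetical, and a per-feature range check is an assumption of this example rather than a method taken from the reviewed studies.

```python
def training_ranges(rows):
    """Return {feature: (min, max)} over a list of feature dicts."""
    ranges = {}
    for row in rows:
        for name, value in row.items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges


def out_of_range_features(mix, ranges):
    """List the features of `mix` that fall outside the training ranges."""
    flagged = []
    for name, value in mix.items():
        lo, hi = ranges[name]
        if value < lo or value > hi:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical training mixes (illustrative values, not real data).
train = [
    {"w_b": 0.40, "cement_kg_m3": 350.0, "age_days": 28.0},
    {"w_b": 0.50, "cement_kg_m3": 320.0, "age_days": 7.0},
    {"w_b": 0.55, "cement_kg_m3": 300.0, "age_days": 56.0},
]
ranges = training_ranges(train)

# A new mix with a lower water-to-binder ratio and a later test age.
new_mix = {"w_b": 0.30, "cement_kg_m3": 330.0, "age_days": 90.0}
flags = out_of_range_features(new_mix, ranges)
print(flags)  # features for which the model would be extrapolating
```

A check of this kind only catches extrapolation along single features; a mix can sit inside every individual range and still be an unseen combination, so it is a first filter rather than a guarantee.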

The machine learning models reviewed in the literature are mostly trained on experimental datasets that represent mean mechanical properties, such as the average compressive strength measured on laboratory specimens. ML models do not usually predict the characteristic or design strengths required for structural design, which incorporate statistical confidence levels and safety factors. Rather, these values are obtained by post-processing the mean responses predicted by ML using established statistical procedures and design code provisions. Few studies deal directly with uncertainty quantification or probabilistic modeling, which suggests that combining ML predictions with structural design reliability frameworks remains an open research problem.
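
As an illustration of such post-processing, the sketch below converts a predicted mean strength into a lower-fractile (characteristic) value. It assumes that strength is normally distributed, that the 5% fractile is the target, and that a standard deviation is available from test scatter or model residuals; these are assumptions of this example, and actual design codes prescribe their own rules and margins.

```python
from statistics import NormalDist


def characteristic_strength(mean_mpa, std_mpa, fractile=0.05):
    """Lower-fractile strength under an assumed normal distribution."""
    # Standard normal quantile; roughly -1.645 for the 5% fractile.
    z = NormalDist().inv_cdf(fractile)
    return mean_mpa + z * std_mpa


# Hypothetical ML-predicted mean of 40 MPa with an assumed 5 MPa scatter.
f_ck = characteristic_strength(40.0, 5.0)
print(round(f_ck, 2))
```

With zero scatter the characteristic value equals the mean; the larger the assumed standard deviation, the further the characteristic value falls below the ML-predicted mean.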

3.1. Data Processing, Harmonization, and Normalization of Multi-Source Experimental Datasets

In machine learning applications to concrete engineering, training datasets are frequently compiled from multiple experimental studies conducted by different researchers under varying testing standards, curing regimes, specimen geometries, and environmental conditions. This inherent heterogeneity introduces biases and scale inconsistencies that can significantly affect model convergence, prediction accuracy, and generalizability if not appropriately treated. Consequently, data preprocessing and normalization are critical steps in ensuring the reliability of ML-based predictions [80,81,82].

Most studies address this challenge through a combination of data cleaning, normalization, and feature scaling procedures prior to model training. Common practices include min–max normalization, z-score standardization, and logarithmic transformations to reduce skewness in variables such as compressive strength, permeability indices, or durability indicators. Feature scaling is particularly important for algorithms sensitive to input magnitude, such as neural networks, support vector machines, and gradient-based optimization models, as it prevents dominance of variables with larger numerical ranges and improves training stability.
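
The three transformations named above can be sketched in a few lines of standard-library Python. The strength values are hypothetical, and the use of the population standard deviation in the z-score is a choice made for this example.

```python
import math
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi > lo


def z_score(values):
    """Center on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]  # assumes sigma > 0


def log_transform(values):
    """Natural logarithm, used to reduce right skew in positive data."""
    return [math.log(v) for v in values]


# Hypothetical compressive strengths in MPa.
strength_mpa = [20.0, 30.0, 40.0, 50.0, 60.0]
scaled = min_max(strength_mpa)
standardized = z_score(strength_mpa)
logged = log_transform(strength_mpa)
```

In a modeling workflow, the scaling parameters (minimum, maximum, mean, standard deviation) would normally be computed on the training split only and then reused for validation and test data, so that no information leaks from held-out samples.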

When aggregating datasets from multiple sources, data harmonization strategies are often adopted to ensure consistency across studies. These strategies include unifying measurement units, filtering data based on comparable curing ages or exposure conditions, and excluding incomplete records with missing metadata. In some cases, categorical encoding of experimental conditions (e.g., curing method or exposure environment) is used to explicitly preserve contextual information within the dataset. Outlier detection methods—such as interquartile range filtering or unsupervised anomaly detection—are also employed to remove non-representative data points that may arise from experimental errors or atypical testing protocols.
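
The harmonization steps listed above (unit unification, exclusion of incomplete records, and interquartile range filtering) can be sketched as follows. The record layout, field names, and values are hypothetical, and the 1.5 × IQR fence is the conventional default rather than a threshold reported by the reviewed studies.

```python
from statistics import quantiles

PSI_TO_MPA = 0.00689476  # 1 psi expressed in MPa


def to_mpa(value, unit):
    """Convert a strength value to MPa."""
    if unit == "MPa":
        return value
    if unit == "psi":
        return value * PSI_TO_MPA
    raise ValueError(f"unsupported unit: {unit}")


def harmonize(records):
    """Drop records without a curing age and express strength in MPa."""
    out = []
    for r in records:
        if r["age_days"] is None:
            continue  # incomplete metadata
        out.append({"source": r["source"], "age_days": r["age_days"],
                    "strength_mpa": to_mpa(r["strength"], r["unit"])})
    return out


def iqr_filter(records, key, k=1.5):
    """Keep records whose `key` lies within k * IQR of the quartiles."""
    q1, _, q3 = quantiles([r[key] for r in records], n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in records if lo <= r[key] <= hi]


# Hypothetical records from two sources reporting in different units.
raw = [
    {"source": "A", "age_days": 28, "strength": 30.0, "unit": "MPa"},
    {"source": "A", "age_days": 28, "strength": 32.0, "unit": "MPa"},
    {"source": "A", "age_days": 28, "strength": 34.0, "unit": "MPa"},
    {"source": "B", "age_days": 28, "strength": 5000.0, "unit": "psi"},
    {"source": "B", "age_days": 28, "strength": 36.0, "unit": "MPa"},
    {"source": "B", "age_days": 28, "strength": 38.0, "unit": "MPa"},
    {"source": "B", "age_days": 28, "strength": 40.0, "unit": "MPa"},
    {"source": "B", "age_days": 28, "strength": 95.0, "unit": "MPa"},
    {"source": "B", "age_days": None, "strength": 37.0, "unit": "MPa"},
]
harmonized = harmonize(raw)
clean = iqr_filter(harmonized, "strength_mpa")
```

Removing outliers this way trades information for consistency: a value far from the bulk may be a testing error, but it may also be a legitimate mix from an underrepresented family, so flagged records are worth inspecting before they are discarded.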

Despite these preprocessing efforts, variability stemming from non-standardized experimental procedures and incomplete reporting remains a key limitation in current ML-based concrete studies. This highlights the need for transparent documentation of data sources, preprocessing workflows, and normalization techniques, as well as the development of standardized, open-access databases with unified testing protocols. Such practices are essential for improving reproducibility, enhancing cross-study model transferability, and enabling more robust machine-learning frameworks capable of generalizing across diverse concrete formulations and experimental conditions.

3.2. Supervised Learning

Supervised learning is still the most widely used ML method in concrete studies because of its simple framework: input features are mapped to known outputs using labeled datasets. These models are trained on past data in which input variables, such as mix proportions, curing conditions, environmental conditions, and material properties, are paired with target outputs such as compressive strength, permeability, or failure modes [19,83,84,85,86,87,88,89,90,91,92,93,94,95].

Common algorithms include

Neural Network (NN): Neural networks are computational models inspired by the structure and functioning of the human brain, made of layers of interconnected nodes (neurons) that process and transmit information. NNs with more than one hidden layer (Deep Neural Networks, DNNs) are particularly effective at learning complex, nonlinear relationships from large datasets and are therefore well suited to long-term life prediction and to modeling time-dependent degradation processes. Recurrent Neural Networks (RNNs) are a class of NN that processes sequential data by retaining an internal memory of past inputs, which makes them suitable for modeling time-dependent behavior. Convolutional Neural Networks (CNNs) are a form of DNN designed for grid-like data such as images and have been successfully used to detect cracks in images and classify surface damage [94,95,96]. Table 2 lists representative studies that compare ANN methods for predicting concrete properties.

Table 2. Representative studies comparing ANN methods for concrete property prediction.

Study	Concrete Application	ML Methods Compared	Key Outcome (Best Performance)
Duan et al. [97]	Compressive strength	ANN	ANN model, trained on 146 datasets from 16 studies, effectively predicted compressive strength using 14 input parameters. Results demonstrated ANN’s strong generalization capability across diverse mix designs and recycled aggregate types.
Naderpour et al. [98]	Compressive strength	ANN	ANN model, trained on 139 datasets from 14 studies, successfully predicted compressive strength of RAC using six input parameters. The model demonstrated strong performance across a wide range of recycled aggregate types, confirming its utility for sustainable construction planning.
Loureiro et al. [6]	Compressive strength	ANN	ANN achieved highest accuracy (R² ≈ 0.89) with lowest error, closely followed by Gradient Boosting. RF and SVR performed slightly lower. ANN’s advantage was balanced by GB’s greater interpretability
Al Yamani et al. [99]	Compressive strength	ANN	NN consistently outperformed other models, with RMSE values indicating highly accurate predictions. The model showed strong agreement between predicted and measured compressive strength, with correlation values above 0.8 after 28 days.
Khademi et al. [100]	Compressive strength	ANN, ANFIS, MLR	ANN and ANFIS outperformed MLR in predictive accuracy. MLR was more suitable for preliminary mix design, while ANN and ANFIS are better suited for mix optimization and high-accuracy applications. Including non-dimensional parameters significantly improved model accuracy.
Hammoudi et al. [101]	Compressive strength	ANN, Response Surface Methodology (RSM)	Both ANN and RSM effectively modeled compressive strength. ANN showed higher prediction accuracy across all ages (7, 28, 56 days). Strength decreased with higher RCA replacement. Cement content and slump were significant predictors. ANN outperformed RSM in statistical accuracy (R², RMSE, RPD).
Chen et al. [102]	Multiple properties of concrete	Back-Propagation Neural Network (BPNN)	BPNN accurately modeled both material-to-property and property-to-property relationships. Average relative error remained under 7% for both models. A notable trade-off was observed between strength and permeability, supporting its use in predictive evaluations and test cost reduction.

Regression: Regression is a statistical process that is employed to model and predict continuous values by approximating a relationship between input variables and a target variable [103,104]. It is extensively used in the analysis and prediction of material behavior, including prediction of concrete strength in terms of mix proportions and curing conditions [23].
Decision Trees (DTs): Decision trees split the data into branches according to feature values in order to reach a decision or prediction. They are intuitive and simple to interpret and can be used in classification and regression tasks, including the detection of defect types or structural health evaluation [105,106]. They are, however, prone to overfitting unless their growth is constrained, for example by limiting tree depth or pruning.
Regression Trees: Regression trees are decision tree models that predict continuous numerical values by dividing the data according to feature thresholds. They are preferred for their interpretability and their capability of modeling complex, nonlinear relationships between mix components and target properties. As an example, regression trees can estimate concrete compressive strength from the proportion of cement, the water–cement ratio, and the aggregate size [83,84,85,86,87,88,89].
Support Vector Machines (SVMs): Support vector machines identify the optimal boundary (hyperplane) that separates the classes in the data. They are very useful in classification tasks such as distinguishing defective from non-defective concrete or classifying structural integrity. SVMs can handle high-dimensional data and are resistant to overfitting when they are tuned well [19,90,91,92,93].
Ensemble methods: Ensemble methods combine the predictions of multiple base models with the goal of improving accuracy and stability. Extreme Gradient Boosting (XGBoost) and Random Forest (RF) are two commonly used ensemble algorithms [107,108]. XGBoost constructs a sequence of decision trees by gradient boosting, with each tree correcting the errors of the preceding trees. It has proven to be efficient and highly predictive, which makes it especially useful for capturing nonlinear relationships in concrete property prediction and mix design optimization [107]. Random Forests, on the other hand, aggregate many decision trees that are trained on random samples of the data and features, which helps to reduce overfitting and capture different patterns. Random Forests have been widely used in classification and regression, as well as in defect detection [109,110] and durability prediction [111,112] in engineering.
Long short-term memory (LSTM): LSTM is a type of recurrent neural network (RNN) that can learn long-range dependencies and temporal patterns in sequential data. LSTM addresses the vanishing gradient problem inherent to traditional RNNs through memory cells and gating mechanisms that allow the model to learn complex time-series data. Structural health monitoring and predictive maintenance [113,114] are among the many applications of LSTM networks in which temporal dynamics are essential.
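The variance-reduction idea behind bagged tree ensembles, as described for Random Forests above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in any cited study: the data are hypothetical, each base learner is a one-split regression tree (a stump), and only bootstrap sampling of the data is shown because the sketch has a single input feature, so the feature subsampling that Random Forests also perform does not apply.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical feature values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
            best = (sse, threshold, mean_l, mean_r)
    if best is None:  # degenerate bootstrap sample: predict the mean everywhere
        mean_y = sum(ys) / len(ys)
        return (float("inf"), mean_y, mean_y)
    return best[1:]

def predict_stump(stump, x):
    threshold, mean_l, mean_r = stump
    return mean_l if x <= threshold else mean_r

def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Average the predictions of all base learners."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Hypothetical water-cement ratios and 28-day strengths (MPa), for illustration only.
wc_ratio = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70]
strength = [62.0, 55.0, 49.0, 43.0, 38.0, 33.0, 29.0, 22.0]

ensemble = fit_bagged_stumps(wc_ratio, strength)
low_wc = predict_bagged(ensemble, 0.30)
high_wc = predict_bagged(ensemble, 0.70)
```

Each stump alone gives a two-level step function; averaging many stumps trained on different resamples yields a smoother prediction that is less dependent on any single training point.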

The key issues of supervised learning in concrete engineering include limited generalization ability, a tendency to overfit narrow domains, and an inability to capture the full variability of concrete formulations. Whereas supervised models are often highly accurate within the range of the training data, they often generalize poorly to new and unobserved concrete mixes, especially when available datasets are sparse or not representative of real-world variability. In such cases, overfitting means that the model has learned spurious patterns or noise in the training data instead of the fundamental behavior of the material. To make supervised learning more robust in this domain, researchers should focus on obtaining large and diversified datasets and perform careful feature engineering to extract salient material descriptors; they should also use regularization and ensemble modeling to reduce model variance [115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135]. A diagram of a supervised neural network algorithm as applied in actual research is presented in Figure 3.

Nafees et al. [19] compared four commonly used supervised machine learning algorithms, namely decision trees (DTs), multilayer perceptron neural networks (MLPNNs), support vector machines (SVMs), and Random Forests (RFs), in predicting the compressive and split tensile strength of plastic concrete. They found that single DT models had satisfactory performance (R² = 0.78); nevertheless, performance was significantly enhanced by the ensemble variants (bagging and boosting), which reduced the error ranges and made the models more robust. MLPNN models showed a high ability to model nonlinear relationships, and the best-performing architecture had three hidden layers of 9-3-2 neurons. SVMs produced solid regression results but were more sensitive to the choice of kernel and the distribution of the training data. RF was consistently the best predictor, with the highest predictive ability (R² = 0.93 for compressive strength and R² = 0.86 for tensile strength) and the smallest mean absolute errors. The study highlighted the superior generalization and robustness of ensemble tree-based approaches such as RF compared with single-learner approaches, while noting that neural networks remain powerful but demand more data.
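For readers less familiar with the two metrics quoted in such comparisons, the following sketch shows how the coefficient of determination (R²) and the mean absolute error (MAE) are computed from measured and predicted values. The numbers are hypothetical and are not taken from Nafees et al. [19].

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between measurements and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical measured and predicted compressive strengths (MPa).
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 49.0]

r2 = r2_score(measured, predicted)              # 1 - 10/500 = 0.98
mae = mean_absolute_error(measured, predicted)  # (2 + 2 + 1 + 1) / 4 = 1.5
```

R² is dimensionless and depends on the spread of the measured values, whereas MAE carries the units of the target, so the two should be read together when comparing models across datasets.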

3.3. Unsupervised and Clustering Methods

Unsupervised learning methods analyze data without relying on labels and are therefore useful in exploratory analysis, pattern recognition, and anomaly detection in real-world data. They can bring out latent structures that supervised approaches may miss.

Key methods include

Clustering Algorithms: These algorithms group similar concrete samples according to shared properties and reveal natural groupings, such as concrete classes or performance clusters. K-means divides data into a set number of clusters by minimizing the distance between the points and the center of their cluster. In hierarchical clustering, a tree-based hierarchy of nested clusters is formed by merging or splitting clusters based on similarity. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) identifies clusters using the density of data points and is able to identify clusters of any shape as well as isolate noise data [136]. Gaussian Mixture Models (GMM) model the data as a mixture of Gaussian distributions, where a point can belong to each cluster with a different probability. Unlike K-means, the GMM can model overlapping and irregularly shaped clusters [137,138]. Clustering techniques can be used to optimize mix designs by finding mixes with similar mechanical behavior [115,116,117,118,119,120,121,122].
Dimensionality Reduction Techniques: These techniques reduce high-dimensional data to fewer dimensions to allow analysis and visualization. Principal Component Analysis (PCA) reduces dimensionality by converting the data into a set of orthogonal principal components that capture the largest share of the variance, which helps to isolate the key variables that affect concrete performance, including the quantities of individual mineral admixtures or moisture content. t-Distributed Stochastic Neighbor Embedding (t-SNE) is a nonlinear algorithm that preserves local relations and enables the visualization of complex patterns in the reduced-dimensional space. Uniform Manifold Approximation and Projection (UMAP) is similarly a dimensionality reduction algorithm that preserves local and global data structure and is usually faster and more scalable. These methods make visualization easier and aid in feature selection [123,124,125,126,127,128].
Anomaly Detection Models: These models detect outliers that deviate strongly from normal patterns, including unusual curing conditions, unusual mix ratios, or unusual microstructures. They help to identify such irregularities at an early stage, before they develop into defects, material inconsistencies, or quality non-compliance, resulting in better quality management, improved safety, and a reduced risk of structural failures in concrete use [129,130,131,132,133,134].
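As an illustration of the K-means procedure described in the list above, the sketch below alternates the assignment and update steps (Lloyd's algorithm) on a handful of hypothetical, already normalized mix descriptors. The starting centers are supplied by the caller, which is an assumption made here to keep the example deterministic; practical implementations usually use randomized or k-means++ initialization.

```python
def kmeans(points, centroids, n_iter=20):
    """Lloyd's algorithm with caller-supplied starting centers."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep the center of an empty cluster
        if new_centroids == centroids:
            break  # converged: centers no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical mixes described by two features scaled to [0, 1]
# (e.g., normalized water-binder ratio and normalized strength).
mixes = [(0.20, 0.80), (0.25, 0.85), (0.15, 0.75),
         (0.80, 0.20), (0.85, 0.25), (0.75, 0.15)]

labels, centers = kmeans(mixes, centroids=[mixes[0], mixes[-1]])
```

Because K-means relies on Euclidean distance, features with very different units should be scaled before clustering; otherwise the feature with the largest numerical range dominates the grouping.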

Unsupervised models are especially beneficial when labeled data are either limited or expensive to acquire. They help to identify hidden relationships, enhance the understanding of concrete behavior, and guide data-driven decision making. Figure 4 shows experimental results for concrete mixes that are first reduced by PCA to the two major components (PCA1 vs. PCA2) and then clustered to discover hidden trends among the different mixes.
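The PCA step that precedes clustering in such a workflow can be sketched as follows. This is a minimal illustration, assuming hypothetical data and using power iteration on the sample covariance matrix to recover only the first principal component; library implementations typically use a singular value decomposition instead.

```python
import math

def first_principal_component(rows, n_iter=200):
    """Return the leading principal direction and its explained-variance ratio."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; assumes the start vector is not orthogonal to the leading direction.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Hypothetical two-feature dataset in which the second feature is exactly twice the first,
# so a single component should explain all of the variance.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, explained_ratio = first_principal_component(rows)
```

With real mix data the features would normally be standardized first, and the explained-variance ratio indicates how much information a two-component plot such as PCA1 vs. PCA2 retains.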

3.4. Reinforcement Learning Algorithm

Although the use of reinforcement learning (RL) in concrete technology is comparatively recent, it provides opportunities for adaptive and sequential decision-making in the process of mix design optimization, curing management, and automated construction activities [139,140,141,142,143,144,145,146,147]. Unlike supervised learning, RL involves an agent being trained in a sequence of decisions by interacting with an environment and receiving feedback as rewards or punishment.

Core features include

Algorithmic Approaches: RL is based on a number of fundamental algorithmic approaches. Value-based approaches, like Q-learning and Deep Q-Networks (DQN), estimate the expected reward of actions and use these estimates to make decisions. Policy-gradient approaches, like Proximal Policy Optimization (PPO), in contrast, directly adjust the agent's policy to maximize long-term reward. Such approaches are generally formulated as a Markov Decision Process (MDP), which describes the environment as a sequence of states, actions, and rewards [148,149]. Actor–critic models combine the advantages of policy-based and value-based approaches, using one network to select actions (actor) and another to evaluate them (critic).
Real-World Implementation Challenges:

Sample Efficiency: Most RL algorithms need a large number of interactions with the environment to learn effective policies, which can be expensive and time-prohibitive in a real-world scenario.

Sim-to-Real Transfer: Policies learned in simulation are often difficult to transfer to physical systems because modeling errors lead to performance differences.

Reliability and Safety: Safe exploration and decision-making are essential in operational environments, especially in large-scale construction and safety-critical processes.

Limitations:

Reward Function Design: Designing suitable reward functions is often complicated; poorly designed rewards may result in sub-optimal or unintended behavior, e.g., over-optimization of one measure at the cost of overall project performance.

Convergence Problems: Stability and convergence of RL algorithms are also difficult to achieve, particularly in systems that are stochastic or have high-dimensional state and action spaces. For example, continuous learning in complex construction settings can lead to oscillations or slow improvement of the policies.
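The value-based approach and the MDP formulation described above can be made concrete with a tabular Q-learning sketch. Everything in it is a hypothetical toy: the state is a discretized pump-rate level, the target level and the reward (negative distance to the target) are assumptions chosen for illustration, and the agent explores with purely random actions, which Q-learning permits because it is an off-policy method.

```python
import random

N_LEVELS = 5          # hypothetical discretized pump-rate levels 0..4
TARGET = 3            # hypothetical level assumed to give the best print quality
MOVES = (-1, 0, +1)   # decrease, hold, increase

def step(state, move):
    """Deterministic toy environment: apply the move and penalize distance to the target."""
    next_state = min(max(state + move, 0), N_LEVELS - 1)
    reward = -abs(next_state - TARGET)
    return next_state, reward

def train_q_table(episodes=2000, steps_per_episode=10, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(MOVES) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        state = rng.randrange(N_LEVELS)
        for _ in range(steps_per_episode):
            action = rng.randrange(len(MOVES))  # purely exploratory behavior policy
            next_state, reward = step(state, MOVES[action])
            # Q-learning update: bootstrap from the best action in the next state.
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q

q_table = train_q_table()
# Greedy policy: for each level, the move with the highest learned value.
greedy_moves = [MOVES[max(range(len(MOVES)), key=lambda a: q_table[s][a])]
                for s in range(N_LEVELS)]
```

Even this five-state problem uses 20,000 environment interactions, which illustrates why sample efficiency and sim-to-real transfer are listed among the practical obstacles when each interaction is a physical print or batch.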

Nevertheless, the capacity of RL for continuous learning and adaptation makes it highly appropriate for real-time process control and optimization of concrete production and construction. However, existing applications are mostly limited to the research or simulation stage, and scaling the solutions to full construction projects, with their complexity, safety, and logistics requirements, remains an open challenge. Figure 5 illustrates the operation of an RL algorithm in a 3D concrete printing system.

3.5. Hybrid and Physics-Informed Models

Due to the increasing complexity of construction data and the demand for efficient predictive tools, a great variety of machine learning methods have been used in concrete and construction engineering. Commonly used traditional supervised learning approaches include linear regression, decision trees, support vector machines, Random Forests, and gradient boosting, which are used to forecast concrete properties including compressive strength, durability, and workability. These models are valued for their predictive accuracy and simplicity of implementation but often have to be tuned and validated with care to prevent overfitting.

Unsupervised learning methods, such as k-means clustering and principal component analysis, have been applied to discover hidden trends in material composition, to classify types of concrete, and to reduce the dimensionality of high-dimensional data. Though less widely used, RL is emerging as a powerful optimization tool, e.g., for resource distribution, sequencing, and dynamic control in automated construction systems.

In addition to purely data-driven methods, hybrid models are becoming increasingly popular due to their capacity to combine machine learning with physical concepts and domain expertise. Physics-informed neural networks are neural networks that incorporate governing equations, e.g., creep, shrinkage, or heat transfer equations, within the model architecture or loss function. This makes predictions consistent with known physical laws, thus enhancing generalization and interpretability, especially in data-scarce situations [8,73,150,151,152].

Hybrid methods improve the quality and robustness of machine-learning results in practical settings by incorporating engineering knowledge into the modeling process. The models are especially useful when making predictions outside of the training data because they are constrained by physical realism. Figure 6 shows an example of a hybrid model that couples physics-informed models with neural networks, and their interaction.
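One way to read the phrase "governing equations within the loss function" is as a composite objective: a data-misfit term plus a weighted penalty on the residual of the governing equation at selected collocation points. The sketch below illustrates this for the one-dimensional heat equation. It is a simplified illustration under stated assumptions: the diffusivity, the observations, and the two candidate functions are hypothetical, derivatives are approximated by finite differences rather than the automatic differentiation used in actual physics-informed neural networks, and no training loop is included.

```python
import math

ALPHA = 0.1  # hypothetical thermal diffusivity (nondimensional)

def physics_residual(model, x, t, h=1e-3):
    """Residual of dT/dt - ALPHA * d2T/dx2 = 0, estimated with central differences."""
    dT_dt = (model(x, t + h) - model(x, t - h)) / (2 * h)
    d2T_dx2 = (model(x + h, t) - 2 * model(x, t) + model(x - h, t)) / h ** 2
    return dT_dt - ALPHA * d2T_dx2

def composite_loss(model, observations, collocation_points, weight=1.0):
    """Return (total, data_loss, physics_loss) for a candidate model T(x, t)."""
    data_loss = sum((model(x, t) - T) ** 2 for x, t, T in observations) / len(observations)
    physics_loss = sum(physics_residual(model, x, t) ** 2
                       for x, t in collocation_points) / len(collocation_points)
    return data_loss + weight * physics_loss, data_loss, physics_loss

def exact(x, t):
    """Candidate that satisfies the heat equation."""
    return math.exp(-ALPHA * math.pi ** 2 * t) * math.sin(math.pi * x)

def surrogate(x, t):
    """Candidate that matches the same initial data but violates the heat equation."""
    return (1.0 - t) * math.sin(math.pi * x)

# Hypothetical measurements, available only at t = 0.
observations = [(x, 0.0, exact(x, 0.0)) for x in (0.25, 0.5, 0.75)]
# Collocation points at a later time, where no measurements exist.
collocation = [(k / 10, 0.5) for k in range(1, 10)]

loss_exact = composite_loss(exact, observations, collocation)
loss_surrogate = composite_loss(surrogate, observations, collocation)
```

Both candidates fit the sparse data equally well, so a purely data-driven loss cannot tell them apart; only the physics term separates them, which is the mechanism by which such models remain plausible outside the range of the measurements.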

Overall, machine learning in construction ranges from classical supervised and unsupervised models to highly developed RL and hybrid physics-informed frameworks. Each method has its own benefits for particular issues of concrete variability and construction complexity. Such diversity highlights that modeling strategies need to be chosen carefully to be applied effectively and encourages further comparative studies of their efficiency in various applications. In the next section, the literature is synthesized to make such comparative evaluations and to give a clearer picture of best practices in machine learning for concrete engineering.

The most important differences in the robustness of machine-learning paradigms can mostly be explained by their ability to deal with data heterogeneity, noise, nonlinear interactions, and model uncertainty, which are typical of concrete material datasets. Ensemble techniques, like Random Forests and gradient-boosting models, are more likely to be robust because they combine many weak learners, which decreases the variance and limits the effects of outliers or a biased training set. Deep learning models can also increase robustness by modeling complex, nonlinear, and high-dimensional relationships between mix composition, curing conditions, and performance indicators, provided that the datasets are sufficiently diverse. Hybrid and physics-informed machine learning models provide an additional level of robustness by embedding governing physical principles, such as hydration, transport, or damage mechanisms, into the learning process, thereby constraining predictions to physically admissible regimes and improving generalization beyond the training domain. In contrast, conventional data-driven models that rely solely on statistical learning without explicit physical constraints, particularly single-learner approaches, tend to be more sensitive to experimental variability and distributional shifts, which may reduce their reliability when applied to previously unseen concrete formulations or exposure conditions.

Across the reviewed studies, ensemble-based methods such as Random Forest and gradient-boosting models consistently achieve strong performance on heterogeneous concrete datasets, particularly when data are compiled from multiple mix designs or experimental sources. ANNs perform competitively when large and well-curated datasets are available but show greater sensitivity to noise and inconsistent preprocessing. Physics-informed and hybrid models offer improved robustness under data-scarce or extrapolative conditions, especially for time-dependent behaviors influenced by curing and environmental exposure.

4. Comparative Evaluation of Established ML Methods in Concrete Research

Supervised learning is the most commonly used ML method in the concrete technology domain since it allows reliable prediction of key properties of this material, including strength and durability, based on labeled data [5,153,154]. Unsupervised learning, in contrast, is used for hidden pattern discovery, clustering of mix designs, and outlier detection. Controlling the 3D concrete printing process and optimizing curing conditions are some of the new uses of reinforcement learning, which require complex simulations and large amounts of data. Hybrid and physics-informed models, which combine machine learning with physical laws, achieve higher accuracy and are promising for modeling complex behavior like creep and cracking but are more difficult to develop.

This section first examines the potential and working processes of different ML algorithms in three important areas of concrete technology. It then discusses how these potentials have been exploited in the existing literature, in order to gain better insight into the most effective ML strategies in concrete engineering.

4.1. Mix Design Modeling and Optimization

The modeling and optimization of concrete mix design is regarded as one of the most critical bottlenecks in the design of modern engineering materials, since the mechanical properties, durability, fresh workability, environmental sustainability, and final cost of concrete all depend directly on the composition of the input materials [11,155]. Mix design is a very difficult task because of the broad range of raw materials (cement, aggregates, water, supplementary cementitious materials (SCMs) such as fly ash and ground granulated blast-furnace slag (GGBS), nanoparticles, and chemical admixtures) and the multiphase nature of the resulting system. In this case, conventional empirical approaches and analytical frameworks are heavily constrained in their generalizability and accuracy. Recent progress in concrete mixture proportioning has increasingly emphasized ML methods, owing to their capability to capture nonlinear and multi-variable interactions that statistical and experimental techniques cannot represent well. Unlike prescriptive or trial-based design processes, ML models are trained on experimental data and find latent relationships between mixture components and performance outcomes such as compressive strength, slump, durability, or cost [156].

One of the key benefits of ML-based mix design modeling is that it can be combined with metaheuristic algorithms for multi-objective optimization. Compressive strength, workability, and cost can be modeled together, allowing designers to explore trade-offs across a large design space. The predictive accuracy and generalizability of ML models are greater than those of other prediction models (statistical regression or response surface methods), especially for nonlinear relationships like the effects of SCMs or curing conditions on strength development [157,158,159].
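The trade-off between objectives mentioned above is usually expressed as a Pareto front: the set of candidate mixes for which no other candidate is at least as good in every objective and strictly better in one. The sketch below filters such a front for two objectives. The candidate mixes and their strength and cost values are hypothetical; in an actual workflow they would come from trained ML predictors evaluated inside a metaheuristic search.

```python
def pareto_front(candidates):
    """Return the mixes not dominated in (maximize strength, minimize cost)."""
    def dominates(a, b):
        no_worse = a["strength"] >= b["strength"] and a["cost"] <= b["cost"]
        strictly_better = a["strength"] > b["strength"] or a["cost"] < b["cost"]
        return no_worse and strictly_better
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical candidate mixes: predicted strength (MPa) and unit cost (currency per m3).
candidates = [
    {"name": "M1", "strength": 30.0, "cost": 80.0},
    {"name": "M2", "strength": 40.0, "cost": 95.0},
    {"name": "M3", "strength": 38.0, "cost": 100.0},  # weaker and dearer than M2
    {"name": "M4", "strength": 50.0, "cost": 120.0},
    {"name": "M5", "strength": 45.0, "cost": 125.0},  # weaker and dearer than M4
]

front_names = [c["name"] for c in pareto_front(candidates)]
```

The designer then chooses among the non-dominated mixes according to project priorities; adding a third objective such as workability only requires extending the dominance test.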

Altogether, the shift to ML-driven mix design modeling significantly reduces the need for extensive trial batching, enhances prediction accuracy, and promotes sustainability-related goals. As the availability of experimental data increases, the integration of ANNs, decision tree ensembles, and SVMs into computational optimization models is a promising route to the robust and efficient design of high-performance and eco-friendly concrete mixes. A summary of the recent literature on applying machine learning and optimization techniques in the design of concrete mixtures is shown in Table 3.

4.2. Hardened Concrete Properties Prediction

One of the main areas where ML has shown great potential is the prediction of hardened concrete properties. Being a heterogeneous, multiphase, time-dependent material, hardened concrete exhibits very complex behavior under a wide range of mechanical and environmental stressors. Properties such as compressive, tensile, and flexural strength; creep and shrinkage; and durability indicators, such as chloride ion permeability, sulfate attack resistance, carbonation depth, freeze–thaw resistance, and performance under extreme events like fire, explosion, seismic loading, or fatigue, are nonlinear and multifaceted functions of many factors, including mix composition, microstructure, curing conditions, and environmental exposure. Classical modeling techniques that rely on analytical relationships or semi-empirical equations are usually not adaptable or generalizable to new or changing conditions. In this sense, ML provides an effective tool, as it can capture nonlinear mappings and complex dependencies among variables to make accurate and credible predictions [179,180].

Supervised learning models that are trained on large experimental datasets can successfully predict parameters like compressive strength at various ages, chloride diffusion coefficients, carbonation depth, and weight loss following freeze–thaw cycles. For concrete performance under extreme conditions, deep learning models, such as CNN and LSTM networks, can also be used to model the intricate interaction of thermal, temporal, and mechanical variables and to predict the response of the material to fire or explosion [181].

Unsupervised learning algorithms are especially useful for examining experimental durability data, detecting behavioral clusters, and identifying anomalous mix designs. As an example, concrete mixes with similar degradation behavior under sulfate attack or freeze–thaw cycles can be clustered to identify mixes with similar performance even when the performance results are not labeled. Similarly, dimensionality reduction models such as t-SNE or UMAP can be used to visualize complex relations between microstructural variables and durability performance to improve interpretation [179,182].
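A simple way to flag anomalous results without labels is a robust outlier score. The sketch below uses the modified z-score based on the median absolute deviation (MAD), with the conventional scaling constant 0.6745 and cut-off 3.5; it is offered as one illustrative option, not as the method used in the cited studies, and the strength values are hypothetical.

```python
import statistics

def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread around the median: the score is undefined
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - median) / mad > threshold]

# Hypothetical 28-day strengths (MPa) for nominally identical mixes; one value is suspect.
strengths = [41.0, 43.0, 42.0, 40.0, 44.0, 42.5, 18.0, 41.5]
suspect_indices = mad_outliers(strengths)
```

Using the median and MAD instead of the mean and standard deviation keeps the score from being distorted by the very outliers it is meant to detect. A flagged specimen should be checked against its curing and testing records before it is removed from a training set.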

In more complicated situations, particularly those where the behavior or loading history is time-dependent, RL also offers opportunities for adaptive modeling and decision-making in dynamic settings. In particular, for concrete under cyclic (fatigue) or variable loads (e.g., earthquakes), RL may be suitable because it can handle a dynamic environment with sequential, stochastic inputs. For example, an RL agent can learn from multiple loading paths to approximate residual load capacity and failure likelihood. Likewise, when simulating the coupled thermo-mechanical behavior of concrete exposed to fire, an RL algorithm can be used to tune thermal, transport, and mechanical parameters effectively [183,184].

Physics-informed and hybrid methods combine governing physical equations, e.g., heat or ionic diffusion equations and mechanical damage or plasticity laws, with ML structures, and improve the generalizability of models to previously unexplored physical regimes with minimal reliance on large experimental datasets. As an example, models of carbonation or chloride ingress that combine numerical solutions of Fick's diffusion equations with neural networks can produce predictions that are both accurate and physically sound. Physics-informed models may be useful in applications where the response of materials to highly complex processes, like long-term creep or thermal cracking, needs to be described with greater fidelity than is possible using purely data-driven statistical models of the same process [185].

4.3. Fresh Concrete Behavior and Processability

Simulating and forecasting the characteristics of fresh concrete, especially in new areas like 3D concrete printing, can be viewed as one of the most difficult but most promising applications of ML in the science of construction materials. Properties of fresh concrete, including flowability, plastic viscosity, yield stress, pumpability, buildability, setting time, and print-specific properties such as interlayer bonding strength and anisotropy, are very sensitive to a wide range of interacting variables. These include the water-to-cementitious-material ratio, the type and dosage of admixtures (particularly superplasticizers and accelerators), ambient temperature, shear rate, elapsed time since mixing, and other processing and environmental parameters. The time-dependent, non-Newtonian, and multiphase character of fresh concrete and its sensitivity to field conditions make classical modeling (i.e., analytical or semi-empirical equations) inadequate for characterizing the behavior of fresh concrete in the real world. Here, ML, with its capability to learn complex nonlinear mappings and model high-dimensional interactions, has become an indispensable tool [186].

With rheological input features as well as environmental conditions, supervised learning algorithms, such as ANN models, are able to predict the values of viscosity, flowability, or setting time very well. These algorithms have been used in 3D concrete printing to model the behavior of the post-extrusion layer such as the stability of the layer, the possibility of collapse, and the ability to bond with the previous layer so that engineers can predict fresh concrete performance during the printing process without depending on time-consuming and expensive physical tests [187].

Unsupervised learning has been applied to cluster mixes with similar rheological behavior, classify the variations brought about by admixtures, and identify mix designs with a high risk of print instability [41]. In addition, dimensionality reduction techniques like PCA or t-SNE can reveal the latent dependence between mix composition, concrete age, and printability characteristics (e.g., anisotropy). RL will be crucial for more advanced uses, like real-time control of the 3D printing process. In this case, an RL agent would react continuously to the printing environment, learning to dynamically optimize parameters like pump rate, nozzle speed, admixture dosage, or print temperature towards some goal, like better buildability, more stable layers, or cohesion between layers. By formulating the printing environment as an MDP, RL allows fresh concrete to be controlled dynamically and adaptively, which is not possible under traditional modeling frameworks. DQN and PPO algorithms are being studied to realize real-time monitoring and automatic control of rheological properties during printing, which opens new opportunities in automated concrete construction based on ML-driven approaches [188,189].

Physics-informed and hybrid ML models are very promising for predicting the behavior of fresh concrete when experimental data are limited or unreliable. When predicting viscosity and yield stress in non-Newtonian flowing concrete, for example, a physical constraint based on the Navier–Stokes equations or Bingham-plastic models can be applied to a neural network to obtain physically consistent predictions. In 3D printing applications, physics-informed models may directly incorporate the time-dependent history of concrete properties that results from thixotropy, which leads to a more stable model under varying conditions and makes the predictions more robust.
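As a concrete instance of the Bingham-plastic constraint mentioned above, the model relates shear stress to shear rate as τ = τ₀ + μ·γ̇, with yield stress τ₀ and plastic viscosity μ. The sketch below fits both parameters by ordinary least squares and then reports how far a set of predicted stresses deviates from the fitted law; such deviations could serve as a penalty term in a physics-constrained model. The rheometer readings are hypothetical, and the function names are illustrative assumptions.

```python
def fit_bingham(shear_rates, shear_stresses):
    """Least-squares fit of tau = tau0 + mu * gamma_dot; returns (tau0, mu)."""
    n = len(shear_rates)
    mean_rate = sum(shear_rates) / n
    mean_stress = sum(shear_stresses) / n
    sxy = sum((g - mean_rate) * (s - mean_stress)
              for g, s in zip(shear_rates, shear_stresses))
    sxx = sum((g - mean_rate) ** 2 for g in shear_rates)
    viscosity = sxy / sxx
    yield_stress = mean_stress - viscosity * mean_rate
    return yield_stress, viscosity

def bingham_residuals(shear_rates, predicted_stresses, yield_stress, viscosity):
    """Deviation of predicted stresses from the Bingham law at each shear rate."""
    return [s - (yield_stress + viscosity * g)
            for g, s in zip(shear_rates, predicted_stresses)]

# Hypothetical rheometer readings generated from tau0 = 200 Pa and mu = 30 Pa.s.
rates = [10.0, 20.0, 30.0, 40.0]            # shear rate, 1/s
stresses = [500.0, 800.0, 1100.0, 1400.0]   # shear stress, Pa

tau0, mu = fit_bingham(rates, stresses)
residuals = bingham_residuals(rates, stresses, tau0, mu)
```

Physical admissibility also requires τ₀ ≥ 0 and μ > 0; a fitted or predicted parameter outside these bounds is an immediate sign that the data or the model need review.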

5. General Trends

Overall, these studies suggest that more sophisticated or hybrid modeling approaches, such as deep neural networks and tree-based ensemble methods, often achieve higher predictive accuracy than simpler statistical or single-learner machine-learning models when estimating concrete properties [14]. Specifically, when the target of interest is well studied, as is the case for compressive strength, artificial neural networks, convolutional neural networks, and gradient-boosting regressors are expected to perform better than more basic regression or parametric models. However, the performance differences are sometimes relatively small, and less complex algorithms (e.g., decision trees or support vector machines) can achieve competitive performance with significantly higher interpretability [14]. A second observation is that model accuracy depends strongly on data quality and breadth: models that are trained on large, heterogeneous datasets generalize better, and even advanced models fail when the data are sparse or limited in scope. Notably, machine learning methods have been used effectively for a broad range of concrete parameters besides compressive strength. As an example, Koya et al. showed that different ML models can predict six different mechanical properties with high precision [9], which indicates that data-driven approaches can reflect many aspects of concrete behavior.

Equally, machine learning has proven useful in durability prediction, with an ensemble model explaining about 83% of the variance in chloride penetration, and in mix-design optimization [5,190,191]. Such comparative analyses offer guidance to practitioners in choosing the right ML tools, such as using deep learning or ensemble methods where the highest accuracy is needed and adequate data exist but using simpler models where interpretability or limited data are a concern. The current research indicates that every ML method has its own benefits depending on the context of the problem. The lessons learned from these comparisons help in selecting an appropriate model or combination of models for a particular task. At the same time, as traditional methods are being refined, researchers are also advancing the field with new ML technologies that exploit new algorithms and computational capabilities. The following paragraphs discuss these state-of-the-art innovations and how they address the complicated nature of concrete data. The general trends are as follows:

Ensemble and boosting models consistently outperform single-learner models on heterogeneous concrete datasets.
Model performance is strongly constrained by dataset diversity rather than algorithmic complexity alone.
Curing- and environment-driven variability remains a dominant source of prediction uncertainty.
Physics-informed approaches improve robustness under extrapolative conditions.

5.1. Key Insights and Cross-Study Lessons

The literature contains several common findings on how the variability of concrete materials affects the performance of machine learning models. First, studies consistently find that model accuracy is greatly influenced by the diversity and representativeness of the training datasets. Models that are trained on specific mix designs or laboratory conditions are more likely to generalize poorly when applied to concretes with different mix compositions, curing regimes, or exposure conditions. This weakness is most significant for single-learner and purely data-driven models.
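One way to test the generalization issue described above is to hold out all samples from one data source at a time, so that the model is always evaluated on a laboratory or study it has not seen. The sketch below generates such leave-one-source-out splits. The source labels are hypothetical, and the helper name is an illustrative assumption.

```python
def leave_one_source_out(sources):
    """Yield (held_out_source, train_indices, test_indices) for each distinct source."""
    for held_out in sorted(set(sources)):
        train = [i for i, s in enumerate(sources) if s != held_out]
        test = [i for i, s in enumerate(sources) if s == held_out]
        yield held_out, train, test

# Hypothetical source label for each sample in a compiled dataset.
sources = ["lab_A", "lab_A", "lab_B", "lab_C", "lab_B", "lab_C"]
splits = list(leave_one_source_out(sources))
```

A random train/test split on the same dataset would mix samples from every source into both sets and would therefore tend to overstate how well the model transfers to a new experimental setting.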

Second, ensemble methods and deep learning models tend to be less sensitive to changes in mix composition and experimental conditions, as well as to noise, due to their capacity to learn nonlinear correlations. Their performance advantage, however, is reduced when datasets are small, poorly documented, or inconsistently pre-processed; algorithmic sophistication therefore cannot compensate for poor data quality.

Third, variability in factors such as curing conditions and environmental exposure remains one of the unresolved problems in the research. Long-term degradation mechanisms, time-dependent effects, and multi-physics interactions are usually underrepresented in existing datasets, causing uncertainty in predictions even with the most advanced models. Physics-informed and hybrid machine learning systems are promising in this scenario, as they keep predictions consistent with domain knowledge and enhance extrapolation outside of the training range.

Overall, cross-study results suggest that improving the reliability of ML in concrete engineering requires not only appropriate algorithms but also standardized data preprocessing, transparent reporting of experimental conditions, and problem-specific model selection. These lessons are relevant to researchers and practitioners who want to apply machine learning tools to heterogeneous concrete systems.
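The generalization concern raised above can be made concrete by holding out entire data sources rather than random rows. The following is a minimal sketch, not a method taken from the reviewed studies: the records, the laboratory labels, and the one-feature linear model are all hypothetical stand-ins chosen so that the example runs with the Python standard library alone.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def mae(model, xs, ys):
    """Mean absolute error of the fitted line on (xs, ys)."""
    a, b = model
    return mean(abs(a + b * x - y) for x, y in zip(xs, ys))


def leave_one_source_out(records):
    """records: iterable of (source_id, w_b_ratio, strength_mpa).

    Trains on all sources but one and returns {source_id: MAE on that
    held-out source}, so each score reflects transfer to unseen conditions.
    """
    by_source = defaultdict(list)
    for src, x, y in records:
        by_source[src].append((x, y))
    scores = {}
    for held_out, test_rows in by_source.items():
        train = [row for src, rows in by_source.items()
                 if src != held_out for row in rows]
        model = fit_line([x for x, _ in train], [y for _, y in train])
        scores[held_out] = mae(model,
                               [x for x, _ in test_rows],
                               [y for _, y in test_rows])
    return scores


# Hypothetical data: lab_C follows the same trend but with a constant offset,
# standing in for an undocumented difference in curing regime.
records = [
    ("lab_A", 0.30, 50.0), ("lab_A", 0.40, 40.0), ("lab_A", 0.50, 30.0),
    ("lab_B", 0.35, 45.0), ("lab_B", 0.45, 35.0), ("lab_B", 0.55, 25.0),
    ("lab_C", 0.30, 40.0), ("lab_C", 0.40, 30.0), ("lab_C", 0.50, 20.0),
]
scores = leave_one_source_out(records)
for source, error in sorted(scores.items()):
    print(f"{source}: held-out MAE = {error:.1f} MPa")
```

In this toy setting a random row-wise split would mix lab_C rows into training and hide the offset, whereas the source-wise split exposes it as a large held-out error. With a real dataset the same grouping idea would normally be applied through an established library routine rather than hand-written loops.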

Contrasting ML Paradigms Under Different Variability Regimes

The literature shows that which machine learning model performs best depends strongly on the prevailing source of concrete material variability. ANNs tend to be most effective when variability is dominated by composition, including water-to-binder ratio, cement type, and supplementary cementitious materials, and when large, well-curated datasets are available. ANNs are less robust, however, when trained on multi-source heterogeneous data, because they are sensitive to noise and inconsistent preprocessing. RF is more resistant to heterogeneous and noisy data, especially when the variability stems from aggregate characteristics or mixed experimental settings, owing to the variance reduction achieved through ensembling; its ability to extrapolate outside the training domain, however, is limited. Gradient-boosting models in general, and XGBoost in particular, tend to be the most accurate in complex but controlled variability regimes, particularly where nonlinear interactions between composition and curing parameters dominate, but they are more vulnerable to overfitting on highly imbalanced or sparse data. Conversely, physics-informed and hybrid ML models are more robust to curing-related and environmental variability, such as time-dependent hydration conditions, thermal exposure, and long-term degradation processes, because physical constraints make them less sensitive to insufficient data and improve generalizability. In summary, the reviewed studies indicate that ANN and XGBoost are most effective for composition-dominated variability with sufficient data, RF for heterogeneous multi-source data, and physics-informed models when variability is governed by time-dependent or environmental factors.

Common failure modes reported across studies include reduced accuracy when models are applied outside the training domain, particularly during extrapolation across different mix families, curing regimes, and SCM types.
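A simple guard against this failure mode is to check, before predicting, whether a query mix lies inside the range covered by the training data. The sketch below assumes hypothetical feature names (`w_b`, `curing_temp_c`, `scm_fraction`) and illustrative values; it is a per-feature range check only, so it cannot detect unusual combinations of features that each fall within their individual ranges.

```python
def feature_ranges(training_rows):
    """Per-feature (min, max) over the training set; rows are dicts."""
    keys = training_rows[0].keys()
    return {k: (min(r[k] for r in training_rows),
                max(r[k] for r in training_rows)) for k in keys}


def out_of_domain(query, ranges, tolerance=0.0):
    """Names of features where the query falls outside the training range.

    `tolerance` widens each range by that fraction of its width.
    """
    flagged = []
    for name, (lo, hi) in ranges.items():
        margin = tolerance * (hi - lo)
        if not (lo - margin <= query[name] <= hi + margin):
            flagged.append(name)
    return flagged


# Hypothetical training set: ambient-cured mixes with up to 30% SCM.
training_rows = [
    {"w_b": 0.30, "curing_temp_c": 20.0, "scm_fraction": 0.00},
    {"w_b": 0.45, "curing_temp_c": 23.0, "scm_fraction": 0.15},
    {"w_b": 0.55, "curing_temp_c": 21.0, "scm_fraction": 0.30},
]
ranges = feature_ranges(training_rows)

# Hypothetical query: heat-cured mix with 50% SCM replacement.
query = {"w_b": 0.40, "curing_temp_c": 60.0, "scm_fraction": 0.50}
flagged = out_of_domain(query, ranges)
print("extrapolating in:", flagged)
```

A prediction for a flagged query can still be produced, but it should be reported as an extrapolation across curing regime or SCM content rather than as an in-domain estimate.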

6. Conclusions

This review has examined the complex relationship between the variability of concrete materials and the performance, reliability, and generalization of machine learning (ML) models used across concrete technology. The inherent heterogeneity of concrete, arising from differences in composition, microstructure, curing conditions, and exposure, produces complex nonlinear interactions that directly affect the accuracy and robustness of data-driven predictive models. This variability must be identified and explicitly addressed if ML is to be used successfully and responsibly in concrete engineering.

The review has summarized and classified a wide range of ML methods, including supervised learning, unsupervised learning, reinforcement learning, and hybrid physics-informed methods, and assessed their relevance to three major application areas: (i) design and optimization of concrete mixes, (ii) prediction of concrete properties, and (iii) adaptive process control in emerging technologies such as 3D concrete printing. Although supervised learning schemes, in particular artificial neural networks, ensemble tree-based models, and gradient-boosting algorithms, remain dominant because of their high predictive power, the analysis shows that their performance depends heavily on data quality, data diversity, and preprocessing.

One of the primary findings of the review is that data heterogeneity and the absence of standardization are the main limitations of current ML research in concrete. Training data are often assembled from multiple experimental sources that use different testing regimes and provide incomplete metadata, which can lead to biased predictions and reduced model transferability. It is therefore important to address these issues through data preprocessing, normalization, and harmonization. In addition, unsupervised learning and anomaly detection techniques are useful for analyzing complex datasets, extracting underlying patterns, and improving data consistency before model training.
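As a small illustration of what harmonization and anomaly screening can look like before training, the sketch below converts strength values reported in mixed units to MPa and flags suspect entries with a modified z-score based on the median and the median absolute deviation (MAD). The raw values are invented for illustration, and the 3.5 cutoff is a commonly quoted rule of thumb rather than a recommendation from the reviewed studies.

```python
from statistics import median

PSI_TO_MPA = 0.00689476  # 1 psi = 6894.76 Pa


def to_mpa(value, unit):
    """Convert a strength value to MPa; only MPa and psi are handled here."""
    unit = unit.lower()
    if unit == "mpa":
        return value
    if unit == "psi":
        return value * PSI_TO_MPA
    raise ValueError(f"unsupported unit: {unit}")


def robust_z_scores(values):
    """Modified z-scores: 0.6745 * (x - median) / MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case (over half the values identical): flag nothing.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


# Hypothetical multi-source records: (reported strength, reported unit).
raw = [(35.0, "MPa"), (5000.0, "psi"), (38.0, "MPa"),
       (36.5, "MPa"), (5400.0, "psi"), (3.6, "MPa")]

strengths = [to_mpa(value, unit) for value, unit in raw]
z_scores = robust_z_scores(strengths)

THRESHOLD = 3.5  # assumed cutoff for |modified z|
flagged = [s for s, z in zip(strengths, z_scores) if abs(z) > THRESHOLD]
print("flagged for review:", flagged)
```

Flagged values are candidates for manual review against the source publication, not automatic deletion, since an unusual strength may reflect a genuinely different mix rather than a transcription error.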

The review also indicates that hybrid and physics-informed ML models have good potential to mitigate the weaknesses of purely data-driven models, especially in data-sparse or extrapolative situations. By incorporating physical constraints and mechanistic knowledge into the learning architecture, such models become more interpretable, generalize better outside the training domain, and yield physically consistent predictions. Reinforcement learning is still in its infancy in concrete engineering but shows considerable promise for real-time optimization and adaptive control in automated construction and advanced manufacturing processes.
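To illustrate what "physically consistent" can mean in practice, the sketch below compares an unconstrained straight-line fit with a fit whose functional form is fixed by Abrams' law, the classical empirical relation $f_c = A / B^{w/c}$. Abrams' law is used here only as an assumed stand-in for a physical constraint; it is not a model proposed in the reviewed studies, and the four data points are invented.

```python
import math
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def fit_abrams(wc, strength):
    """Fit f_c = A / B**(w/c) via least squares on ln f_c = ln A - (w/c) ln B."""
    intercept, slope = fit_line(wc, [math.log(f) for f in strength])
    return math.exp(intercept), math.exp(-slope)


def abrams(wc, A, B):
    """Strength predicted by the fitted Abrams-type curve."""
    return A / B ** wc


# Hypothetical training data covering w/c = 0.35 to 0.50 only.
wc_train = [0.35, 0.40, 0.45, 0.50]
fc_train = [55.0, 46.0, 39.0, 33.0]  # MPa

a, b = fit_line(wc_train, fc_train)   # unconstrained linear model
A, B = fit_abrams(wc_train, fc_train)  # constrained functional form

wc_query = 0.80  # far outside the training range
linear_pred = a + b * wc_query
abrams_pred = abrams(wc_query, A, B)
print(f"linear: {linear_pred:.1f} MPa, Abrams-type: {abrams_pred:.1f} MPa")
```

With these numbers the straight line predicts a negative strength at w/c = 0.80, whereas the constrained form stays positive and decreases monotonically. A hybrid model would typically go further, for example by letting a data-driven component learn the residual around such a baseline.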

Despite significant progress in the field, several issues persist: the lack of high-quality, standardized datasets; incomplete reporting of experimental conditions; the risk of overfitting when data coverage is limited; and the difficulty of benchmarking ML models across studies. To advance the field, future studies should focus on creating open-access, high-quality databases, adopting transparent data preprocessing pipelines, and strengthening collaboration among materials scientists, structural engineers, and data scientists. Comparative benchmarking, uncertainty quantification, and model interpretability should also receive more attention to support real-world application.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AI	Artificial Intelligence
ANN	Artificial Neural Network
BNN	Bayesian Neural Network
CNN	Convolutional Neural Network
DBSCAN	Density-Based Spatial Clustering of Applications with Noise
DNN	Deep Neural Network
DT	Decision Trees
GMM	Gaussian Mixture Model
XGBoost	Extreme Gradient Boosting
GAN	Generative Adversarial Network
GGBS	Ground Granulated Blast-Furnace Slag
ITZ	Interfacial Transition Zones
KNN	K-Nearest Neighbors
LSTM	Long Short-Term Memory
MDP	Markov Decision Process
ML	Machine Learning
MLPNN	Multilayer Perceptron Neural Network
NN	Neural Network
PCA	Principal Component Analysis
PPO	Proximal Policy Optimization
RF	Random Forests
RL	Reinforcement Learning
RNN	Recurrent Neural Network
SVM	Support Vector Machine
t-SNE	t-distributed Stochastic Neighbor Embedding
UMAP	Uniform Manifold Approximation and Projection

Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition; IEEE: New York, NY, USA, 2009; pp. 248–255. [Google Scholar]
Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
Golding, V.P.; Gharineiat, Z.; Munawar, H.S.; Ullah, F. Crack detection in concrete structures using deep learning. Sustainability 2022, 14, 8117. [Google Scholar] [CrossRef]
Yu, Y.; Samali, B.; Rashidi, M.; Mohammadi, M.; Nguyen, T.N.; Zhang, G. Vision-based concrete crack detection using a hybrid framework considering noise effect. J. Build. Eng. 2022, 61, 105246. [Google Scholar] [CrossRef]
Su, C.; Wang, W. Concrete cracks detection using convolutional neural network based on transfer learning. Math. Probl. Eng. 2020, 2020, 7240129. [Google Scholar] [CrossRef]
Yang, Q.; Shi, W.; Chen, J.; Lin, W. Deep convolution neural network-based transfer learning method for civil infrastructure crack detection. Autom. Constr. 2020, 116, 103199. [Google Scholar] [CrossRef]
Islam, M.M.; Hossain, M.B.; Akhtar, M.N.; Moni, M.A.; Hasan, K.F. CNN based on transfer learning models using data augmentation and transformation for detection of concrete crack. Algorithms 2022, 15, 287. [Google Scholar] [CrossRef]
Ali, R.; Chuah, J.H.; Talip, M.S.A.; Mokhtar, N.; Shoaib, M.A. Structural crack detection using deep convolutional neural networks. Autom. Constr. 2022, 133, 103989. [Google Scholar] [CrossRef]
Li, S.; Zhao, X. Image-based concrete crack detection using convolutional neural network and exhaustive search technique. Adv. Civ. Eng. 2019, 2019, 6520620. [Google Scholar] [CrossRef]
Cohn, R.; Holm, E. Unsupervised machine learning via transfer learning and k-means clustering to classify materials image data. Integr. Mater. Manuf. Innov. 2021, 10, 231–244. [Google Scholar] [CrossRef]
Gairola, S.; Shah, R.; Narayanan, P.J. Unsupervised image style embeddings for retrieval and recognition tasks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 3281–3289. [Google Scholar]
Ji, X.; Henriques, J.F.; Vedaldi, A. Invariant information clustering for unsupervised image classification and segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9865–9874. [Google Scholar]
Tuia, D.; Camps-Valls, G. Semisupervised remote sensing image classification with cluster kernels. IEEE Geosci. Remote Sens. Lett. 2009, 6, 224–228. [Google Scholar] [CrossRef]
Clancy, T.C.; Khawar, A.; Newman, T.R. Robust signal classification using unsupervised learning. IEEE Trans. Wirel. Commun. 2011, 10, 1289–1299. [Google Scholar] [CrossRef]
Noh, Y.; Koo, D.; Kang, Y.-M.; Park, D.; Lee, D. Automatic crack detection on concrete images using segmentation via fuzzy C-means clustering. In 2017 International Conference on Applied System Innovation (ICASI); IEEE: New York, NY, USA, 2017; pp. 877–880. [Google Scholar]
Schmarje, L.; Santarossa, M.; Schröder, S.-M.; Koch, R. A survey on semi-, self-and unsupervised learning for image classification. IEEE Access 2021, 9, 82146–82168. [Google Scholar] [CrossRef]
Deng, D. DBSCAN clustering algorithm based on density. In 2020 7th International Forum on Electrical Engineering and Automation (IFEEA); IEEE: New York, NY, USA, 2020; pp. 949–953. [Google Scholar]
Wang, J.; Jiang, J. Unsupervised deep clustering via adaptive GMM modeling and optimization. Neurocomputing 2021, 433, 199–211. [Google Scholar] [CrossRef]
Vidya Sagar, R. Verification of the applicability of the Gaussian mixture modelling for damage identification in reinforced concrete structures using acoustic emission testing. J. Civ. Struct. Health Monit. 2018, 8, 395–415. [Google Scholar] [CrossRef]
Ahn, K.U.; Park, C.S. Application of deep Q-networks for model-free optimal control balancing between different HVAC systems. Sci. Technol. Built Environ. 2020, 26, 61–74. [Google Scholar] [CrossRef]
Adam, B.; Smith, I.F. Reinforcement learning for structural control. J. Comput. Civ. Eng. 2008, 22, 133–139. [Google Scholar] [CrossRef]
An, Y.; Xia, T.; You, R.; Lai, D.; Liu, J.; Chen, C. A reinforcement learning approach for control of window behavior to reduce indoor PM2.5 concentrations in naturally ventilated buildings. Build. Environ. 2021, 200, 107978. [Google Scholar] [CrossRef]
Andriotis, C.P.; Papakonstantinou, K.G. Managing engineering systems with large state and action spaces through deep reinforcement learning. Reliab. Eng. Syst. Saf. 2019, 191, 106483. [Google Scholar] [CrossRef]
Andriotis, C.P.; Papakonstantinou, K.G. Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints. Reliab. Eng. Syst. Saf. 2021, 212, 107551. [Google Scholar] [CrossRef]
Apolinarska, A.A.; Pacher, M.; Li, H.; Cote, N.; Pastrana, R.; Gramazio, F.; Kohler, M. Robotic assembly of timber joints using reinforcement learning. Autom. Constr. 2021, 125, 103569. [Google Scholar] [CrossRef]
Arora, S.; Doshi, P. A survey of inverse reinforcement learning: Challenges, methods and progress. Artif. Intell. 2021, 297, 103500. [Google Scholar] [CrossRef]
Azuatalam, D.; Lee, W.-L.; De Nijs, F.; Liebman, A. Reinforcement learning for whole-building HVAC control and demand response. Energy AI 2020, 2, 100020. [Google Scholar] [CrossRef]
Petro, Y.; Ojiako, U.; Williams, T.; Marshall, A. Organizational ambidexterity: A critical review and development of a project-focused definition. J. Manag. Eng. 2019, 35, 03119001. [Google Scholar] [CrossRef]
Whitehead, S.D.; Lin, L.-J. Reinforcement learning of non-Markov decision processes. Artif. Intell. 1995, 73, 271–306. [Google Scholar] [CrossRef]
Zhu, Q.; Leibowicz, B.D. A Markov decision process approach for cost-benefit analysis of infrastructure resilience upgrades. Risk Anal. 2022, 42, 1585–1602. [Google Scholar] [CrossRef]
Tipu, R.K.; Shah, O.A.; Vats, S.; Purohit, S. Enhancing Concrete Properties Through the Integration of Recycled Coarse Aggregate: A Machine Learning Approach for Sustainable Construction. In 2024 4th International Conference on Innovative Practices in Technology and Management (ICIPTM); IEEE: New York, NY, USA, 2024; pp. 1–5. [Google Scholar]
Mater, Y.; Kamel, M.; Karam, A.; Bakhoum, E. ANN-Python prediction model for the compressive strength of green concrete. Constr. Innov. 2023, 23, 340–359. [Google Scholar] [CrossRef]
Wang, X.Q.; Chen, P.; Chow, C.L.; Lau, D. Artificial-intelligence-led revolution of construction materials: From molecules to Industry 4.0. Matter 2023, 6, 1831–1859. [Google Scholar] [CrossRef]
Chaabene, W.B.; Flah, M.; Nehdi, M.L. Machine learning prediction of mechanical properties of concrete: Critical review. Constr. Build. Mater. 2020, 260, 119889. [Google Scholar] [CrossRef]
Dinesh, A.; Prasad, B.R. Predictive models in machine learning for strength and life cycle assessment of concrete structures. Autom. Constr. 2024, 162, 105412. [Google Scholar] [CrossRef]
DeRousseau, M.; Kasprzyk, J.; Srubar, W.V., III. Computational design optimization of concrete mixtures: A review. Cem. Concr. Res. 2018, 109, 42–53. [Google Scholar] [CrossRef]
Chen, F.; Xu, W.; Wen, Q.; Zhang, G.; Xu, L.; Fan, D.; Yu, R. Advancing concrete mix proportion through hybrid intelligence: A multi-objective optimization approach. Materials 2023, 16, 6448. [Google Scholar] [CrossRef]
Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
Rahman, M.A.; Zhang, T.; Lu, Y. PINN-CHK: Physics-informed neural network for high-fidelity prediction of early-age cement hydration kinetics. Neural Comput. Appl. 2024, 36, 13665–13687. [Google Scholar] [CrossRef]
Varghese, S.; Anand, R.; Paliwal, G. Physics-Informed Neural Network for Concrete Manufacturing Process Optimization. arXiv 2024, arXiv:2408.14502. [Google Scholar] [CrossRef]
Golafshani, E.M.; Behnood, A. Estimating the optimal mix design of silica fume concrete using biogeography-based programming. Cem. Concr. Compos. 2019, 96, 95–105. [Google Scholar] [CrossRef]
Zhang, J.; Huang, Y.; Ma, G.; Nener, B. Mixture optimization for environmental, economical and mechanical objectives in silica fume concrete: A novel framework based on machine learning and a new meta-heuristic algorithm. Resour. Conserv. Recycl. 2021, 167, 105395. [Google Scholar] [CrossRef]
Naseri, H.; Jahanbakhsh, H.; Hosseini, P.; Nejad, F.M. Designing sustainable concrete mixture by developing a new machine learning technique. J. Clean. Prod. 2020, 258, 120578. [Google Scholar] [CrossRef]
Golafshani, E.M.; Arashpour, M.; Kashani, A. Green mix design of rubbercrete using machine learning-based ensemble model and constrained multi-objective optimization. J. Clean. Prod. 2021, 327, 129518. [Google Scholar] [CrossRef]
Dabbaghi, F.; Tanhadoust, A.; Nehdi, M.L.; Nasrollahpour, S.; Dehestani, M.; Yousefpour, H. Life cycle assessment multi-objective optimization and deep belief network model for sustainable lightweight aggregate concrete. J. Clean. Prod. 2021, 318, 128554. [Google Scholar] [CrossRef]
Motlagh, S.A.T.; Naghizadehrokni, M. An extended multi-model regression approach for compressive strength prediction and optimization of a concrete mixture. Constr. Build. Mater. 2022, 327, 126828. [Google Scholar] [CrossRef]
Shamsabadi, E.A.; Salehpour, M.; Zandifaez, P.; Dias-da-Costa, D. Data-driven multicollinearity-aware multi-objective optimisation of green concrete mixes. J. Clean. Prod. 2023, 390, 136103. [Google Scholar] [CrossRef]
Li, Y.; Shen, J.; Lin, H.; Li, Y. Optimization design for alkali-activated slag-fly ash geopolymer concrete based on artificial intelligence considering compressive strength, cost, and carbon emission. J. Build. Eng. 2023, 75, 106929. [Google Scholar] [CrossRef]
Huang, Y.; Huo, Z.; Ma, G.; Zhang, L.; Wang, F.; Zhang, J. Multi-objective optimization of fly ash-slag based geopolymer considering strength, cost and CO₂ emission: A new framework based on tree-based ensemble models and NSGA-II. J. Build. Eng. 2023, 68, 106070. [Google Scholar] [CrossRef]
Wang, S.; Xia, P.; Wang, Z.; Meng, T.; Gong, F. Intelligent mix design of recycled brick aggregate concrete based on swarm intelligence. J. Build. Eng. 2023, 71, 106508. [Google Scholar] [CrossRef]
Sun, C.; Wang, K.; Liu, Q.; Wang, P.; Pan, F. Machine-learning-based comprehensive properties prediction and mixture design optimization of ultra-high-performance concrete. Sustainability 2023, 15, 15338. [Google Scholar] [CrossRef]
Dong, W.; Huang, Y.; Cui, A.; Ma, G. Mix design optimization for fly ash-based geopolymer with mechanical, environmental, and economic objectives using soft computing technology. J. Build. Eng. 2023, 72, 106577. [Google Scholar] [CrossRef]
Chen, H.; Cao, Y.; Liu, Y.; Qin, Y.; Xia, L. Enhancing the durability of concrete in severely cold regions: Mix proportion optimization based on machine learning. Constr. Build. Mater. 2023, 371, 130644. [Google Scholar] [CrossRef]
Liu, K.; Zheng, J.; Dong, S.; Xie, W.; Zhang, X. Mixture optimization of mechanical, economical, and environmental objectives for sustainable recycled aggregate concrete based on machine learning and metaheuristic algorithms. J. Build. Eng. 2023, 63, 105570. [Google Scholar] [CrossRef]
Hafez, H.; Teirelbar, A.; Tošić, N.; Ikumi, T.; de la Fuente, A. Data-driven optimization tool for the functional, economic, and environmental properties of blended cement concrete using supplementary cementitious materials. J. Build. Eng. 2023, 67, 106022. [Google Scholar] [CrossRef]
Zheng, W.; Shui, Z.; Xu, Z.; Gao, X.; Zhang, S. Multi-objective optimization of concrete mix design based on machine learning. J. Build. Eng. 2023, 76, 107396. [Google Scholar] [CrossRef]
Golafshani, E.M.; Kim, T.; Behnood, A.; Ngo, T.; Kashani, A. Sustainable mix design of recycled aggregate concrete using artificial intelligence. J. Clean. Prod. 2024, 442, 140994. [Google Scholar] [CrossRef]
Shahrokhishahraki, M.; Malekpour, M.; Mirvalad, S.; Faraone, G. Machine learning predictions for optimal cement content in sustainable concrete constructions. J. Build. Eng. 2024, 82, 108160. [Google Scholar] [CrossRef]
Yuan, Z.; Zheng, W.; Qiao, H. Machine learning based optimization for mix design of manufactured sand concrete. Constr. Build. Mater. 2025, 467, 140256. [Google Scholar] [CrossRef]
Taffese, W.Z.; Hilloulin, B.; Zaccardi, Y.V.; Marani, A.; Nehdi, M.L.; Hanif, M.U.; Kamath, M.; Nunes, S.; von Greve-Dierfeld, S.; Kanellopoulos, A. Machine learning in concrete durability: Challenges and pathways identified by RILEM TC 315-DCS towards enhanced predictive models. Mater. Struct. 2025, 58, 145. [Google Scholar] [CrossRef]
Luo, D.; Wang, K.; Wang, D.; Sharma, A.; Li, W.; Choi, I.H. Artificial intelligence in the design, optimization, and performance prediction of concrete materials: A comprehensive review. npj Mater. Sustain. 2025, 3, 14. [Google Scholar] [CrossRef]
Hilloulin, B.; Umunnakwe, R. Machine learning-aided prediction of shrinkage in modern concrete: Focus on mix proportions and SCMs. J. Build. Eng. 2024, 98, 111410. [Google Scholar] [CrossRef]
Li, W.; Li, H.; Liu, C.; Min, K. Concrete Creep Prediction Based on Improved Machine Learning and Game Theory: Modeling and Analysis Methods. Buildings 2024, 14, 3627. [Google Scholar] [CrossRef]
Lunardi, L.R.; Cornélio, P.G.; Prado, L.P.; Nogueira, C.G.; Felix, E.F. Hybrid Machine Learning Model for Predicting the Fatigue Life of Plain Concrete Under Cyclic Compression. Buildings 2025, 15, 1618. [Google Scholar] [CrossRef]
Zhang, M.; Kang, R. Machine learning methods for predicting the durability of concrete materials: A review. Adv. Cem. Res. 2025, 37, 502–517. [Google Scholar] [CrossRef]
Shishegaran, A.; Varaee, H.; Rabczuk, T.; Shishegaran, G. High correlated variables creator machine: Prediction of the compressive strength of concrete. Comput. Struct. 2021, 247, 106479. [Google Scholar] [CrossRef]
Schossler, R.T.; Ullah, S.; Alajlan, Z.; Yu, X. Improving Decision-Making in 3D Concrete Printing Through Shap-Guided Machine Learning: Predictive Models and Feature Importance for Yield Stress and Viscosity. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4536032 (accessed on 26 January 2025).
Meyer, M.; Langer, A.; Mehltretter, M.; Beyer, D.; Coenen, M.; Schack, T.; Haist, M.; Heipke, C. Image-based Deep Learning for the time-dependent prediction of fresh concrete properties. arXiv 2024, arXiv:2402.06611. [Google Scholar] [CrossRef]
Xiao, S.; Li, J.; Wang, Z.; Chen, Y.; Tofighi, S. Advancing additive manufacturing through machine learning techniques: A state-of-the-art review. Future Internet 2024, 16, 419. [Google Scholar] [CrossRef]
Mattera, G.; Caggiano, A.; Nele, L. Optimal data-driven control of manufacturing processes using reinforcement learning: An application to wire arc additive manufacturing. J. Intell. Manuf. 2025, 36, 1291–1310. [Google Scholar] [CrossRef]
Morcous, G.; Lounis, Z. Prediction of onset of corrosion in concrete bridge decks using neural networks and case-based reasoning. Comput.-Aided Civ. Infrastruct. Eng. 2005, 20, 108–117. [Google Scholar] [CrossRef]
Rosso, M.M.; Asso, R.; Aloisio, A.; Di Benedetto, M.; Cucuzza, R.; Greco, R. Corrosion effects on the capacity and ductility of concrete half-joint bridges. Constr. Build. Mater. 2022, 360, 129555. [Google Scholar] [CrossRef]

Figure 1. Role of ML across stages of concrete production and long-term maintenance based on relevant features.

Figure 2. Annual publication trend on ML applications in concrete technology over the period from 2015 to 2026 (December 2025).

Figure 3. Approach of a supervised NN model to estimate the compressive strength of concrete based on input data.

Figure 4. Using unsupervised ML algorithms for cluster extraction.

Figure 5. Approach of an RL algorithm in a 3D printing concrete system.

Figure 6. A physics-informed neural network model in the field of concrete.

Table 1. Sources of variability in concrete and their implications for ML model performance.

Source of Variability	Examples	Impact on Concrete Properties	Implications for ML Models	Suggested Strategies
Composition	Cement type, water-to-cement ratio, SCMs, admixtures [65,66,67]	Controls strength, permeability, workability	Shifts in statistical distributions reduce generalization; models may overfit specific mix ranges	Expand dataset diversity; feature normalization; transfer learning
Microstructure	Porosity, ITZ quality, hydration products [68,69,70]	Governs stiffness, toughness, durability	Microstructural data often sparse → underrepresented features	Physics-informed ML; imaging-based ML; hybrid models
Curing Conditions	Temperature, humidity, curing duration, curing method [71,72,73]	Affects strength development, shrinkage, cracking	Time-dependent behaviors often ignored → inaccurate predictions	Use time-series models (RNN, LSTM); explicit encoding of curing regimes
Environmental Exposure	Freeze–thaw, carbonation, chloride/sulfate attack [24,74,75,76,77,78,79]	Long-term durability and degradation	Highly nonlinear degradation patterns difficult to capture	Ensemble learning; anomaly detection; coupling with mechanistic models
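
The "feature normalization" strategy listed for composition-related variability can be illustrated with a minimal sketch. It assumes that each record carries a label for its originating dataset and that a feature is standardized within each source, so that between-source offsets do not dominate the pooled distribution. The function name `standardize_by_source`, the field names `source` and `w_c`, and all values are hypothetical and are not taken from any of the reviewed studies.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_source(records, feature, source_key="source"):
    """Return copies of the records with `feature` z-scored within each source."""
    # Collect the feature values of every source (e.g., every laboratory).
    groups = defaultdict(list)
    for rec in records:
        groups[rec[source_key]].append(rec[feature])

    # Per-source mean and population standard deviation; a zero spread
    # falls back to 1.0 so that the division below stays defined.
    stats = {}
    for src, values in groups.items():
        sigma = pstdev(values)
        stats[src] = (mean(values), sigma if sigma > 0 else 1.0)

    # Add the normalized value under a new key and leave the input untouched.
    out = []
    for rec in records:
        mu, sigma = stats[rec[source_key]]
        new_rec = dict(rec)
        new_rec[feature + "_z"] = (rec[feature] - mu) / sigma
        out.append(new_rec)
    return out


# Hypothetical water-to-cement ratios reported by two laboratories.
records = [
    {"source": "lab_A", "w_c": 0.40},
    {"source": "lab_A", "w_c": 0.50},
    {"source": "lab_A", "w_c": 0.60},
    {"source": "lab_B", "w_c": 0.30},
    {"source": "lab_B", "w_c": 0.35},
    {"source": "lab_B", "w_c": 0.40},
]
normalized = standardize_by_source(records, "w_c")
```

Whether per-source scaling is appropriate depends on the origin of the offset: it is reasonable when the offset reflects measurement or reporting conventions, but it may discard genuine material differences between sources, in which case a single global scaling would be the more conservative choice.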

Table 3. Summary of machine learning and optimization studies for concrete mixture design optimization, including material types, dataset characteristics, input variables, output targets, ML models, optimization methods, and design objectives.

Ref	Concrete	Inputs	ML Model(s)	Optimization	Objectives
[160]	Silica fume concrete	Cement, SF, W/B, CA, FA, SP, age	Biogeography-based programming (BBP)	Constrained biogeography-based optimization (CBBO)	Uniaxial compressive strength (UCS) ≥ req, Cost ↓
[161]	Silica fume concrete (SFC)	Cement, SF, W/B, FA, CA, SP, age, Maximum size of coarse aggregate	NN	multi-objective beetle antennae search (MOBAS)	UCS ↑, Cost ↓, CO₂ ↓
[162]	Sustainable concrete	Cement, SCM, W/B, CA, FA, SP, Age	ANN, SVM, regression	Genetic algorithm (GA), water cycle algorithm (WCA), soccer league competition	UCS ↑, Cost ↓, CO₂ ↓, Energy ↓
[163]	Rubbercrete	Cement (MC), W/B, CA, FA, SP, SF, waste coarse rubber (WCR), waste fine rubber (WFR), Age	M5P + MGEP (ensemble)	Grey wolf optimization (GWO)	UCS ↑, Cost ↓, CO₂ ↓, WR use ↑
[164]	Lightweight aggregate concrete	Fine LECA, Coarse LECA, SP, W/B, cement, SF, Powder stone	DBN	GA + LCA	Strength ↑, Cost ↓, Env. footprint ↓
[165]	Conventional concretes	Cement, agg., W/B, SP	ANN, RF, DT, Polynomial Regression	NSGA-II	UCS ↑, Cost–CS balance
[166]	SCM concretes (5 SCMs)	Cement, FA, slag, SF, WMP, WGP, agg., SP, W/B	Multiple linear regression (MLR), K nearest neighbors (KNN), SVM, Gaussian process (GP), RF, ANN, GBM, XGBM (best)	Multicollinearity-aware MOO (MA-MOO)	UCS ≈ target, Cost ↓, Env. ↓
[167]	Fly ash–slag geopolymer	Slag, FA, NaOH, Na₂SiO₃, SP, CA, FA, W/B	Gaussian Process Regression (GPR), RF, GB, BPNN	PSO	UCS ↑, Cost ↓, CO₂ ↓
[168]	Fly ash–slag geopolymer	FA, slag, Na-silicate, curing	RF, extremely randomized tree (ERT), GBR, XGBR	Non-Dominated Sorting Genetic Algorithm 2 (NSGA-II)	UCS ↑, Cost ↓, CO₂ ↓
[169]	Recycled brick aggregate (RBA) concrete	Cement, W/B, RBA, crushed tile ratio (CT), crushed brick ratio (CB), and natural aggregate (NA) ratio	NN, SVM, RF, Extreme learning machine (ELM), Generalized regression neural network (GRNN), XGB, GWO-BP (best)	MOO (swarm)	UCS ↑, Cost ↓, CO₂ ↓
[170]	UHPC	SF, FA, slag, FA, CA, W/B, steel fiber, SP	XGBoost (best), RF, GBR, LR, NN, DT	AHP	UCS ↑, Flexural ↑, Workability ↑, Shrinkage ↓, Cost ↓, CO₂ ↓
[171]	Fly ash-based geopolymer	Fly ash chem. composition, mix proportions, curing conditions	NN	NSGA-II (MODO)	UCS ↑, Cost ↓, CO₂ ↓
[172]	Cold-region durability	Cement, FA, CA, SP, W/B	RF	NSGA-II	Durability ↑, Cost ↓
[173]	Recycled aggregate concrete (RAC)	Cement, sand, W/B, CA, FA, Strength grade of cement, RCA, curing, admixtures	NN, GPR, RF, Classification and regression tree (CART), gradient boosting decision trees (GBDT), XGB (best)	CMOPSO	UCS ↑, Cost ↓, CO₂ ↓, Energy ↓
[174]	Blended-cement concrete (Opt-bcc)	OPC + 5 SCMs + func. reqs	Pre-bcc ML	GA (Opt-bcc)	Strength, Workability, Cost ↓, CO₂ ↓
[175]	Conventional concrete (industrial DB)	Cement, FA, slag, sand, CA, admixtures, W/B	Gradient Boosting (best)	NSGA-III, C-TAEA	UCS ↑, Binder efficiency ↑, Cost ↓
[176]	Recycled aggregate concrete	Cement, FA, Slag, SF, RA, RWA, SP, TA	Elastic Net regression, KNN, NN, SVM, DT, RF, XGBoost, Light Gradient Boosting (LGBoost), Category Boosting (CatBoost), and Stacking methods	MOWCA, Monte Carlo + SHAP	UCS ↑, Cost ↓, CO₂ ↓
[177]	Conventional concrete	28- and 90-day UCS, slump, size, CA, W	Elastic Net (best), ANN, RF, DT	Regression-based cement prediction	Cement ↓ (~10%), CO₂ ↓ (~10%), UCS maintained
[178]	Manufactured sand concrete (MSC)	Cement, FA, M-sand, CA, SP, W/B	NN, RF, SVR, XGBoost (best)	NSGA-II	UCS ↑, Durability ↑, Cost ↓

↓ means decrease; ↑ means increase.
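
Most studies in Table 3 pair an ML surrogate with a multi-objective optimizer that searches for trade-offs among strength, cost, and CO₂. The notion of a trade-off can be made concrete with a minimal sketch of Pareto (non-dominated) filtering. This is only the dominance test on which algorithms such as NSGA-II build, not an implementation of NSGA-II or of any method in the cited works. The candidate mixes, their names, and all numerical values are hypothetical; in practice the objective values would come from a trained predictive model.

```python
def dominates(a, b):
    """True if mix `a` is no worse than `b` on every objective and better on one.

    Objectives follow the table legend: UCS is maximized, cost and CO2 are minimized.
    """
    no_worse = (a["ucs"] >= b["ucs"]
                and a["cost"] <= b["cost"]
                and a["co2"] <= b["co2"])
    strictly_better = (a["ucs"] > b["ucs"]
                       or a["cost"] < b["cost"]
                       or a["co2"] < b["co2"])
    return no_worse and strictly_better


def pareto_front(mixes):
    """Keep only the mixes that no other mix dominates."""
    return [m for m in mixes
            if not any(dominates(other, m) for other in mixes if other is not m)]


# Hypothetical candidates: UCS in MPa, cost per m^3, CO2 in kg per m^3.
candidates = [
    {"name": "M1", "ucs": 40.0, "cost": 90.0, "co2": 300.0},
    {"name": "M2", "ucs": 50.0, "cost": 110.0, "co2": 340.0},
    {"name": "M3", "ucs": 38.0, "cost": 95.0, "co2": 320.0},  # worse than M1 on all three
    {"name": "M4", "ucs": 45.0, "cost": 100.0, "co2": 280.0},
]
front_names = [m["name"] for m in pareto_front(candidates)]
```

Here M3 is discarded because M1 is stronger, cheaper, and less carbon-intensive, whereas M1, M2, and M4 remain because each is best on at least one objective relative to the others. Choosing a single mix from this set requires an additional preference or weighting step.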

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Bahmani, H.; Mostafaei, H.; Santos, P.; Ferrández, D. Concrete Material Variability and Machine Learning Model Performance: A Comprehensive Review. Buildings 2026, 16, 556. https://doi.org/10.3390/buildings16030556

