1. Introduction
The hydration process in mass concrete is a complex phenomenon influenced by numerous interrelated factors. Among these, the heat generated during cement hydration plays a critical role in the early age behaviour of concrete. Excessive temperature rise can lead to thermal gradients and internal restraints, which in turn contribute to the development of thermal cracking, a significant durability concern in large-scale concrete structures. Early theoretical investigations into the heat of hydration date back to the 1930s, notably during the construction of the first dams. Since then, decades of research have focused on mitigating thermal cracking by optimising cement types, concrete mix designs, curing methods, and the use of chemical and mineral admixtures [1,2,3,4,5,6,7,8,9,10,11,12,13,14].
Contemporary research continues to highlight the importance of modelling heat and mass transfer in concrete, with an increasing number of studies over the past three years confirming the persistent relevance of this subject [15,16,17,18]. Foundational models for heat and moisture transport in porous materials were first proposed by Luikov, Harmathy, and De Vries [15,19]. Today, a wide range of mathematical and numerical models is available, varying in terms of physical assumptions and computational complexity. Advanced models for concrete specifically incorporate its porous, multi-phase character and the interplay of physicochemical phenomena [12,20,21]. However, many engineering applications, especially those involving early age massive concrete structural members, still rely on simplified formulations that neglect thermodiffusion cross-effects [22,23,24,25]. While more comprehensive formulations derived from the laws of irreversible thermodynamics do exist [16,20], their practical application is limited by complexity and computational demands. Consequently, simplified 1D or 2D analyses are often used, omitting the spatial variability of temperature and moisture fields, especially in irregularly shaped elements such as bridge abutments, nuclear containment walls, or lock chambers.
Heat and moisture transport in concrete generally follow Fourier’s law and Fick’s second law, respectively. Yet, the hydration phase presents unique modelling challenges. Initially a heterogeneous mixture of solids and liquids, concrete undergoes significant structural transformation as hydration progresses, complicating the modelling of its thermal response. Two main modelling strategies address this: the multi-phase and phenomenological approaches [12].
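For orientation, the simplified, uncoupled formulation referred to above can be written as follows. This is a generic textbook sketch rather than the specific model of any cited work, and all symbols are assumptions introduced here: T is temperature, W is moisture content, ρ is density, c_p is specific heat, λ is thermal conductivity, D_W is a moisture diffusion coefficient, q_v is the volumetric rate of hydration heat release, α is a surface heat transfer coefficient, and T_s and T_a are the surface and ambient temperatures.

```latex
% Heat balance with a hydration heat source (Fourier conduction)
\rho\, c_p \frac{\partial T}{\partial t}
  = \nabla \cdot \left( \lambda \nabla T \right) + q_v(t, T)

% Moisture balance (Fick-type diffusion, thermodiffusion cross-effects neglected)
\frac{\partial W}{\partial t}
  = \nabla \cdot \left( D_W \nabla W \right)

% Convective boundary condition on an exposed surface with outward normal n
-\lambda \frac{\partial T}{\partial n} = \alpha \left( T_s - T_a \right)
```

The dependence of q_v on T is what couples the hydration kinetics to the evolving temperature field and makes the problem nonlinear even in this reduced form.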
The multi-phase approach seeks to represent the material’s heterogeneous nature through separate constitutive equations for the solid, liquid, and gaseous phases. These are then averaged to describe the behaviour of the concrete as a whole. Although this method can offer detailed insights, it involves numerous input parameters and is computationally intensive. Research by Gawin et al. [20] and others [26,27] has advanced multi-phase modelling significantly. Nevertheless, due to computational limitations, even these sophisticated models often reduce the analysis to one-dimensional cases, limiting the evaluation of spatial heat and humidity distributions in massive elements.
In contrast, the phenomenological approach treats concrete as a continuous medium and provides a macroscopic description of heat and mass transport phenomena. This strategy is more popular in engineering applications and is supported by numerous studies. A detailed review of these models and their capabilities is presented in [15]. While this method is less computationally demanding, it still requires a deep understanding of the evolving thermal properties of concrete and access to advanced modelling tools to deliver reliable predictions.
As mentioned before, the challenges posed by early age thermal effects in concrete became evident during the 19th and 20th centuries, particularly with the construction of massive concrete dams. Landmark projects such as the Hoover Dam (1936) and the Grand Coulee Dam (1942) in the United States exposed the critical issue of excessive heat generated by cement hydration [1,2]. Today, similar problems persist in modern infrastructure, particularly in large-scale structural elements such as foundation blocks, bridge abutment walls, reactor containments, water tanks, and retaining walls [17,28,29]. These elements, characterised by thick cross-sections, are collectively referred to as mass concrete due to their susceptibility to significant temperature rises during hydration.
The temperature within these elements can reach up to 100 °C as a result of the heat generated, giving rise to two major concerns. First, high internal temperatures must be limited, typically to 65–70 °C, to avoid delayed ettringite formation (DEF), which poses long-term risks to concrete durability. Second, non-uniform temperature distribution creates thermal gradients, as the core of the structure heats up and cools more slowly than the exterior. This mismatch in expansion and contraction causes internal restraint and tensile stresses, resulting in thermal cracking. In the case of external restraints, which are typical in building structures, tensile stresses due to this restraint arise during the cooling phase, as the maximum temperature reached in the element gradually decreases to ambient temperature. The formation of thermal cracks not only weakens the mechanical performance of the structure but also compromises watertightness, which is especially critical in hydraulic and containment structures. To mitigate this risk, construction guidelines often impose maximum allowable temperature differences, both between the core and surface (commonly 15–20 °C) and between the peak and stabilised temperatures, to ensure concrete strains remain within acceptable limits [2,3]. Therefore, accurately predicting temperature rise in concrete during this critical early period is essential for achieving durable, crack-resistant structures. Numerous research efforts and real-world case studies have emphasised the importance of this issue and the limitations of existing modelling tools.
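The two criteria described above can be expressed as a simple check on paired core and surface temperature histories. The following is a minimal sketch, not a procedure taken from the cited guidelines: the default limits are the upper ends of the ranges quoted in the text, the function and class names are hypothetical, and the readings are invented for illustration.

```python
from dataclasses import dataclass

# Assumed limits: upper ends of the ranges quoted in the text.
# Project specifications may prescribe different values.
T_MAX_LIMIT_C = 70.0     # peak temperature limit related to DEF risk
DELTA_T_LIMIT_C = 20.0   # core-to-surface temperature difference limit


@dataclass
class ThermalCheck:
    peak_ok: bool
    difference_ok: bool

    @property
    def passed(self) -> bool:
        return self.peak_ok and self.difference_ok


def check_thermal_limits(t_core_c, t_surface_c,
                         t_max_limit_c=T_MAX_LIMIT_C,
                         delta_t_limit_c=DELTA_T_LIMIT_C):
    """Compare paired core/surface temperature histories (deg C) with both limits."""
    if len(t_core_c) != len(t_surface_c) or not t_core_c:
        raise ValueError("need two non-empty histories of equal length")
    peak = max(max(t_core_c), max(t_surface_c))
    max_diff = max(core - surf for core, surf in zip(t_core_c, t_surface_c))
    return ThermalCheck(peak_ok=peak <= t_max_limit_c,
                        difference_ok=max_diff <= delta_t_limit_c)


# Hypothetical readings, for illustration only (not data from the study).
core = [20.0, 38.0, 55.0, 62.0, 58.0, 49.0]
surface = [20.0, 30.0, 38.0, 40.0, 37.0, 33.0]
result = check_thermal_limits(core, surface)
print(result.peak_ok, result.difference_ok, result.passed)
```

In this invented example the peak temperature stays below the limit, but the core-to-surface difference reaches 22 °C, so the second criterion is not met.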
There is a large number of interdependent factors that influence temperature development during both the hydration process and the transfer of heat to the environment. These factors are complex, often nonlinear, and highly sensitive to variations in both material characteristics and external conditions. For clarity, they can be grouped into three main categories (Figure 1).
The first group encompasses the properties of the concrete mix, including the type and quantity of cement, the use of mineral and chemical admixtures, the water-to-cement ratio, and particularly the type of aggregate. These factors primarily affect the amount of heat generated during hydration, but they also influence heat transfer characteristics—particularly through the aggregate, which plays a key role in determining the thermal conductivity of the concrete.
The second group includes casting and curing conditions, such as the initial temperature of the mix at placement, ambient temperature during construction, the casting sequence, duration and segmentation, the thermal properties of the formwork, and the type of curing or protection applied to exposed surfaces. Environmental factors like solar radiation and wind speed also fall into this category, as they significantly affect heat exchange between the concrete and its surroundings.
The third group involves geometric and boundary conditions, including the dimensions and thickness of the structural element, the nature of contact with the subgrade or adjacent materials, and the available surface area for heat dissipation. These aspects strongly influence how heat is retained or lost over time. All of these factors interact dynamically throughout the hydration and hardening process. Moreover, the kinetics of cement hydration, which themselves depend on the evolving temperature field within the concrete, add a layer of complexity. As a result, the accurate prediction of thermal behaviour in mass concrete requires not only detailed thermal modelling but also calibration of model parameters using experimental data and validation through real-time temperature monitoring.
Given the complexity of interacting factors influencing heat transfer in mass concrete, as well as the limitations of current analytical and numerical methods (many of which are either oversimplified or computationally prohibitive), there is a growing demand for novel, efficient, and reliable predictive approaches. Traditional simulations often require the use of advanced computational software, substantial processing power, and detailed description and calibration of material parameters, which significantly limit their practicality in real-time engineering applications [15,22,30]. In this context, the use of machine learning (ML) offers a promising alternative [31,32,33,34]. By learning directly from data, ML models can capture complex nonlinear relationships among multiple variables governing temperature rise, temperature distribution, and thermal gradients. This is particularly valuable given that temperature development in mass concrete is inherently nonlinear, nonstationary, and affected by numerous interacting factors. While the problem can, in principle, be solved using the three-dimensional Fourier heat conduction equation with appropriate boundary conditions, such solutions require advanced numerical modelling and are computationally demanding. On the other hand, simplified physical estimations are feasible only under idealised adiabatic conditions, which neglect environmental heat exchange and therefore fail to represent real structures. These limitations highlight the need for alternative approaches that can provide sufficiently accurate predictions while accounting for diverse technological and material factors. ML methods may address this gap by offering a systematic framework for predicting early age thermal behaviour in mass concrete structures. Once an appropriate model is trained and validated on an expanded database, it could serve as a practical tool for engineers to estimate temperature development across a broad range of input parameters. Moreover, beyond prediction, ML can also help quantify the relative importance of technological and material factors, thereby supporting better-informed design decisions and more effective thermal control strategies.
In recent years, artificial intelligence (AI) has introduced new perspectives and transformative approaches in the study of cementitious materials. Innovative numerical procedures and modelling techniques have been proposed, enabling the analysis of temperature and stress fields in mass concrete based on existing finite element simulations combined with the development of data-driven methods [35]. For example, during the construction of a large-span cable-stayed bridge in China, Support Vector Regression (SVR) was successfully used to predict the heat of hydration in pile caps based on monitoring data. The model achieved high accuracy in forecasting temperature 2–3 days in advance and outperformed traditional back-propagation (BP) neural networks [36]. Similarly, deep learning (DL) techniques, such as Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN), and Bidirectional Deep RNNs (BD-RNN), have shown promise in predicting temperature rise in concrete with complex mix compositions. Enhanced with segmentation methods, optimisation algorithms, and cross-validation, these models achieved reliable results even when trained on relatively small datasets [37]. In addition to thermal prediction, machine learning models are also used to address other complex engineering problems, offering improved accuracy and efficiency [34,36,37,38,39,40,41,42,43,44].
Regression models, a class of machine learning algorithms designed to predict continuous numerical values from input data, may be particularly useful here. Unlike classification models, which assign data points to predefined categories, regression models learn the underlying relationship between input features and a continuous target variable. In the context of building materials engineering, such models can be especially valuable for forecasting key parameters in early age concrete behaviour, for example, predicting the maximum temperature inside mass concrete elements, estimating the rate of heat release from cement hydration, or modelling the development of temperature gradients during curing. Commonly used regression algorithms include linear regression, which assumes a straightforward linear relationship between inputs and outputs; decision trees and ensemble methods like Random Forest, which can capture complex, nonlinear patterns; Support Vector Regression (SVR), which is well-suited for high-dimensional data; and neural networks, which are capable of modelling intricate nonlinear dependencies due to their layered architecture [33,36]. Additionally, ensemble-based approaches such as gradient boosting combine multiple weak learners to create a more accurate and robust predictive model. Once trained on relevant datasets, these models can deliver rapid and precise predictions, reducing reliance on traditional, time-consuming physical simulations. Their ability to account for multiple interacting variables and nonlinear relationships makes them a powerful tool in modelling the thermal and durability performance of mass concrete structures.
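The idea of combining weak learners can be illustrated with a minimal sketch of gradient boosting for squared error, in which each round fits a one-split regression tree (a stump) to the current residuals. This is a didactic simplification and not XGBoost itself, which adds regularisation and second-order information. The function names are hypothetical, and the thickness and temperature values are invented for illustration.

```python
from statistics import fmean


def fit_stump(x, residuals):
    """Find the single threshold split on x that minimises squared error."""
    best = None
    levels = sorted(set(x))
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = fmean(left), fmean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2], best[3]


def fit_boosting(x, y, n_rounds=50, learning_rate=0.3):
    """Start from the mean of y, then add shrunken stumps fitted to residuals."""
    base = fmean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [pi + learning_rate * (left_mean if xi <= threshold else right_mean)
                 for xi, pi in zip(x, preds)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, xi):
    total = model["base"]
    for threshold, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if xi <= threshold else right_mean)
    return total


def mse(model, x, y):
    return fmean([(yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)])


# Hypothetical nonlinear relation: wall thickness (m) vs. peak temperature (deg C).
thickness = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
t_max = [38.0, 45.0, 51.0, 55.0, 58.0, 60.0, 61.0, 61.5]
one_round = fit_boosting(thickness, t_max, n_rounds=1)
fifty_rounds = fit_boosting(thickness, t_max, n_rounds=50)
print(mse(one_round, thickness, t_max), mse(fifty_rounds, thickness, t_max))
```

The training error shrinks as rounds are added, which is also why such models need cross-validation to guard against overfitting.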
Despite promising results, ML applications in the context of heat transfer prediction in mass concrete are still not widespread. Therefore, this paper proposes a machine learning-based framework for predicting temperature rise and thermal gradients induced by cement hydration and heat transfer to the environment. The proposed ML models are designed to predict the maximum temperature and temperature gradients in mass concrete. The thermal cracking risk, the most relevant aspect from a practical perspective, is not directly predicted; however, it can be evaluated based on the predicted peak temperature, the subsequent cooling to ambient conditions, and the magnitude of the thermal gradient. In practice, given the limiting tensile strain capacity of concrete, it is generally assumed that the temperature difference between the core and the surface should not exceed approximately 15–20 °C. Thermal gradients also provide the basis for stress calculations. Therefore, temperature prediction consistently serves as the foundation for assessing the potential risk of thermal cracking.
Three regression models (linear regression, decision tree, and XGBoost) were trained and evaluated on simulated datasets that included concrete mix parameters and environmental conditions. The method is demonstrated through a case study of a massive reinforced concrete wall, a structural element commonly used in bridge abutments, nuclear enclosures, lock walls, tanks, and retaining structures. These components are particularly vulnerable to early age cracking due to hydration heat accumulation, which may reduce their serviceability and durability.
The remainder of this paper is structured as follows. In the Data and Methods section, we describe the input data, the generation of simulation datasets, and the applied machine learning algorithms. This section also outlines the data preprocessing steps and training strategy. The Results section presents the models’ predictive performance and compares it with traditional simulation approaches. Finally, conclusions are drawn regarding the effectiveness, limitations, and potential of ML-based modelling for early age thermal behaviour in mass concrete.
3. Results
3.1. Model Performance Using Leave-One-Out Cross-Validation
To evaluate the accuracy of the predictive models, Leave-One-Out Cross-Validation (LOOCV) was used, as it provides reliable and stable results, particularly when working with a limited dataset. The evaluation outcomes are summarised in Table 5, presented separately for each target variable: Tmax and ΔT core_surface.
Analysis of the results indicates that the XGBoost model delivered the best overall predictive performance, achieving the highest coefficient of determination (R²) values: 0.997 for Tmax and 0.998 for ΔT core_surface. Additionally, it recorded the lowest values of MAE, MSE, and MAPE, confirming its high level of accuracy and robustness. The linear regression model also performed well, with R² values of 0.983 for Tmax and 0.975 for ΔT core_surface, although its absolute errors were slightly higher compared to XGBoost.
The decision tree model, while still showing relatively high R² values (above 0.92 for both target variables), exhibited substantially larger prediction errors. This suggests a lower generalisation capability in this particular application, possibly due to overfitting or limited complexity in capturing the underlying relationships within the data.
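The evaluation procedure can be sketched as follows: each sample is predicted by a model fitted on all remaining samples, and the pooled predictions are scored with MAE, MSE, MAPE, and the coefficient of determination. This is a minimal illustration using a one-feature least-squares line instead of the study's models. The function names are hypothetical, and the data are invented and do not come from the study's dataset.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single feature."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def loocv_predictions(x, y):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        preds.append(a + b * x[i])
    return preds


def regression_metrics(y_true, y_pred):
    """MAE, MSE, MAPE (in percent) and R2 computed on pooled predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = fmean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": ss_res / n,
        "MAPE": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical data: wall thickness (m) vs. core-to-surface difference (deg C).
thickness = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
delta_t = [6.1, 10.8, 16.2, 20.9, 26.1, 30.7]
preds = loocv_predictions(thickness, delta_t)
metrics = regression_metrics(delta_t, preds)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because every sample is held out exactly once, LOOCV needs as many model fits as there are samples, which is affordable only for small datasets such as the one considered here.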
3.2. Evaluation Using 5-Fold Cross-Validation
To further evaluate the generalisability of the models, five-fold cross-validation was conducted. For each model, the mean coefficient of determination (R²) and its standard deviation were calculated, enabling an assessment of the models’ predictive stability with respect to different training and test data splits.
As shown in Table 6, the XGBoost model achieved the highest mean R² value of 0.9665, along with a moderate standard deviation of 0.0358, indicating both high predictive accuracy and good generalisation capability. The linear regression model yielded a slightly lower mean R² of 0.9661, but with the lowest standard deviation (0.0105), suggesting excellent consistency and stability across cross-validation iterations.
In contrast, the decision tree model recorded the lowest mean R² value (0.8985) and the highest standard deviation (0.0569), reflecting greater variability in results and reduced prediction stability compared to the other models.
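The mean and standard deviation reported above summarise one coefficient of determination per fold. A minimal sketch of that computation is given below, again with a one-feature least-squares line as a stand-in model. The fold assignment scheme, the function names, and the synthetic data are assumptions made for illustration.

```python
import random
from statistics import fmean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single feature."""
    mx, my = fmean(x), fmean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b


def r2_score(y_true, y_pred):
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_r2(x, y, k=5, seed=0):
    """Return one R2 score per fold of a shuffled k-fold split."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds of near-equal size
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = [a + b * x[i] for i in test_idx]
        scores.append(r2_score([y[i] for i in test_idx], y_pred))
    return scores


# Synthetic data for illustration only: a noisy linear relation.
rng = random.Random(42)
x = [0.5 + 0.1 * i for i in range(30)]
y = [1.0 + 10.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
scores = kfold_r2(x, y, k=5)
print(round(fmean(scores), 4), round(stdev(scores), 4))
```

A low standard deviation across folds indicates that the score depends little on which samples happen to be held out, which is the sense in which the linear regression model is described as stable.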
3.3. Correlation Between Input and Target Variables
To gain deeper insight into the influence of individual input features on the target variables, a linear correlation analysis was conducted, with the results presented in Figure 6. The correlation matrix displays the Pearson correlation coefficients for each feature-target variable pair, specifically for Tmax and ΔT core_surface.
The analysis revealed that wall thickness had the strongest correlation with ΔT core_surface, with a Pearson coefficient of 0.91, indicating a very strong positive relationship. This variable also showed a moderate correlation with Tmax, at 0.44. For Tmax, environmental conditions, specifically the initial concrete temperature and ambient temperature, also demonstrated a strong positive correlation, each with a coefficient of 0.78, highlighting their significant influence on predicting the maximum internal temperature of concrete.
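The coefficients in such a matrix follow directly from the definition of the Pearson correlation. The sketch below computes one coefficient per feature and target pair. The feature names and all values are hypothetical and are not taken from the study's dataset, so the printed coefficients will not match those quoted in the text.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_table(features, targets):
    """Pearson r for every feature/target pair, as a nested dictionary."""
    return {f_name: {t_name: pearson_r(f_vals, t_vals)
                     for t_name, t_vals in targets.items()}
            for f_name, f_vals in features.items()}


# Hypothetical values, for illustration only.
features = {
    "wall_thickness_m": [0.5, 1.0, 1.5, 2.0, 2.5, 3.0],
    "slag_content_pct": [0.0, 50.0, 30.0, 70.0, 0.0, 50.0],
}
targets = {
    "Tmax": [41.0, 43.5, 50.2, 47.8, 62.3, 55.1],
    "dT_core_surface": [6.0, 9.5, 15.8, 18.9, 27.2, 29.4],
}
table = correlation_table(features, targets)
for name, row in table.items():
    print(name, {key: round(value, 2) for key, value in row.items()})
```

Note that the Pearson coefficient captures only linear association, so a low value does not by itself rule out a nonlinear dependence.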
In contrast, slag content exhibited a negative correlation with both Tmax (−0.41) and ΔT core_surface (−0.32), indicating that higher slag content tends to reduce the thermal effects associated with cement hydration. Other variables, such as concrete class and total binder content, showed low correlation coefficients (approximately 0.11–0.13), suggesting a limited impact on the target variables within the scope of the analysed dataset. It should be clarified that the concrete class is directly related to the amount of cement binder. Based on practical experience, experimental research, and CIRIA C766 guidelines, it is known that higher cement content leads to increased self-heating, with approximately a 10 °C rise under adiabatic conditions per 100 kg of cement per cubic metre of concrete. This effect is less pronounced in our dataset because concrete class and binder content vary only slightly, a range that nevertheless reflects actual engineering applications. At the same time, larger variations were considered in wall thickness and external conditions, which had a greater influence on the results. Therefore, for the analysed dataset, the feature importance results showing minimal contribution of concrete class and binder content, particularly for the temperature difference between the core and the surface of the element (ΔT core_surface), appear reasonable.
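As a worked illustration of the rule of thumb quoted above, the adiabatic temperature rise can be estimated from the cement content. The cement content C = 350 kg/m³ is a hypothetical value chosen only for the arithmetic, and reading "per 100 kg of cement" as per 100 kg per cubic metre of concrete is an assumption made here.

```latex
\Delta T_{\mathrm{ad}} \approx 10\,^{\circ}\mathrm{C} \times \frac{C}{100\ \mathrm{kg/m^{3}}}
\qquad\Rightarrow\qquad
\Delta T_{\mathrm{ad}} \approx 10\,^{\circ}\mathrm{C} \times \frac{350}{100} = 35\,^{\circ}\mathrm{C}
```

A change of 20 kg/m³ in cement content would, by the same estimate, shift the adiabatic rise by only about 2 °C, which is consistent with the weak influence of binder content when its variation in the dataset is small.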
3.4. Feature Importance Analysis
Next, the impact of individual input variables on the prediction quality was assessed using the Permutation Importance analysis method. The results for each of the three regression models are presented separately in Figures 7–12.
The permutation importance analysis conducted across all three models (Linear Regression, Decision Tree, and XGBoost) for the two target variables (Tmax and ΔT core_surface) reveals consistent patterns in feature relevance. Wall thickness consistently stands out as the most influential predictor for ΔT core_surface in all models, showing particularly high importance values in both the Decision Tree and XGBoost models. For Tmax, feature importance varies depending on the model. Linear Regression highlights wall thickness, slag content, and initial and ambient temperature as key predictors.
In contrast, the Decision Tree model identifies ambient temperature as the most dominant factor, while XGBoost assigns the greatest importance to environmental conditions overall. Across all models and target variables, certain features such as total binder, concrete class, and environmental conditions consistently show low importance, suggesting limited predictive value in the studied context.
Overall, these findings highlight the critical role of thermal and geometric parameters, particularly wall thickness, in determining both maximum temperature and thermal gradients within concrete structures. However, it is important to emphasise that although the results are generally consistent with existing knowledge about the influence of technological and material factors on concrete hardening temperatures, they are based on the specific dataset used in this analysis.
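The principle of permutation importance can be sketched as follows: the values of one feature are shuffled across samples, breaking its link to the target, and the resulting drop in the model score is recorded. In this minimal illustration the fitted model is replaced by a hypothetical fixed function in which binder content is deliberately given zero weight, so its importance should come out as zero. All names, coefficients, and data are assumptions and do not reproduce the study's models.

```python
import random
from statistics import fmean


def r2_score(y_true, y_pred):
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in R2 when one feature column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            permuted = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(fmean(drops))
    return importances


def predict(row):
    """Hypothetical stand-in for a fitted model (coefficients are invented)."""
    thickness_m, ambient_c, binder_kg_m3 = row
    return 2.0 + 9.0 * thickness_m + 0.3 * ambient_c + 0.0 * binder_kg_m3


# Synthetic samples: [wall thickness (m), ambient temperature (deg C), binder (kg/m3)].
rng = random.Random(1)
X = [[rng.uniform(0.5, 3.0), rng.uniform(5.0, 30.0), rng.uniform(300.0, 400.0)]
     for _ in range(40)]
y = [predict(row) + rng.gauss(0.0, 0.5) for row in X]
importances = permutation_importance(predict, X, y)
for name, value in zip(["wall thickness", "ambient temperature", "binder"], importances):
    print(name, round(value, 3))
```

Since the measure depends on how widely each feature varies in the data, a feature with a narrow range can receive a low importance even when it matters physically, which is the caveat raised above for binder content.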
3.5. Prediction vs. Actual and Residuals Analysis
To evaluate prediction accuracy and model calibration, scatter plots comparing predicted versus actual values, along with residual histograms derived from Leave-One-Out cross-validation (LOOCV) results, were analysed. This approach enables the identification of potential systematic errors and provides insight into the distribution of prediction errors relative to the observed data. For the linear regression model (Figures 13–16), a high level of agreement was observed between the actual and predicted values, as illustrated by the scatter plots for both target variables. Predictions for ΔT core_surface show very good calibration relative to the ideal line (y = x), while for Tmax, a slight dispersion is visible at higher temperature values. The histograms of residuals indicate that, for ΔT core_surface, the error distribution is relatively symmetric and centred around zero, confirming the absence of significant bias. In contrast, for Tmax, a predominance of positive residuals is evident, suggesting a tendency of the model to overestimate the maximum temperature.
In the case of the Decision Tree model (Figures 17–20), prediction accuracy was noticeably lower compared to the other models. The scatter plots reveal greater deviations from the ideal line (y = x), confirming the presence of significant prediction errors, particularly at higher actual values.
The residual histogram for Tmax displays a bimodal distribution, indicating instability in the model’s predictions—frequently alternating between overestimation and underestimation. This bimodal shape is a consequence of the discrete nature of decision trees, which generate predictions in distinct value ranges, leading to clustered residuals. For ΔT core_surface, the residuals show a more uniform distribution; however, the deviation from a normal distribution suggests that the model struggles to fully capture the underlying data patterns.
The XGBoost model (Figures 21–24) demonstrated the highest predictive accuracy among all the algorithms evaluated. The scatter plots for both target variables show a near-perfect alignment with the y = x line, confirming excellent calibration of the model. The residual histograms exhibit a symmetrical, narrow distribution centred around zero for both variables, indicating high prediction stability and precision. In the case of ΔT core_surface, the errors are particularly small and evenly distributed, further confirming the model’s strong ability to capture the underlying relationships in the data.
3.6. Residual Diagnostics
The residual analysis was complemented with statistical tests to confirm the visual findings. Normality of the residual distribution was assessed using the Shapiro–Wilk and D’Agostino–Pearson tests, while potential systematic error (bias) was evaluated with a one-sample t-test (testing whether the residual mean equals zero). The results are summarised in Table 7.
For the linear regression model, the residuals did not significantly deviate from normality, and their mean was close to zero, indicating no detectable bias. For the Decision Tree model, the tests revealed a deviation from normality, although the residual mean was not significantly different from zero. The bimodal shape of the residual histogram reflects the discrete nature of tree-based predictions. In contrast, for the XGBoost model, the residuals followed a distribution consistent with normality and were centred around zero, confirming that the ensemble method effectively smooths the irregularities observed in a single tree.
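The bias test can be illustrated with a minimal sketch of the one-sample t statistic for the hypothesis that the residual mean is zero. The residual values are invented, the function name is hypothetical, and the decision uses a tabulated two-sided 5% critical value for 9 degrees of freedom instead of a p-value. In practice, a statistics library would normally be used, for example `scipy.stats.ttest_1samp` for this test and `scipy.stats.shapiro` and `scipy.stats.normaltest` for the two normality tests; whether the study used these functions is not stated in the text.

```python
from math import sqrt
from statistics import fmean, stdev


def one_sample_t(residuals, mu0=0.0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two residuals")
    standard_error = stdev(residuals) / sqrt(n)
    return (fmean(residuals) - mu0) / standard_error, n - 1


# Hypothetical LOOCV residuals (deg C), for illustration only.
residuals = [0.4, -0.3, 0.1, -0.5, 0.6, -0.2, 0.3, -0.4, 0.2, -0.1]
t_stat, dof = one_sample_t(residuals)

# Two-sided 5 % critical value of Student's t for 9 degrees of freedom.
T_CRIT_95_DOF9 = 2.262
biased = abs(t_stat) > T_CRIT_95_DOF9
print(round(t_stat, 3), dof, biased)
```

Here the residual mean is close to zero relative to its standard error, so the hypothesis of no systematic bias is not rejected.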
4. Conclusions
This study developed and compared three machine learning regression models to predict the thermal response of massive concrete, focusing on two key parameters: the maximum temperature (Tmax) and the temperature difference between the concrete core and surface (ΔT core_surface). The models were trained on data incorporating the cross-section dimensions of the analysed wall, concrete mix properties, and environmental conditions.
The results demonstrated that the XGBoost model achieved the highest predictive accuracy, with coefficient of determination (R²) values of 0.997 for Tmax and 0.998 for ΔT core_surface, outperforming both the linear regression and Decision Tree models. Residual analysis and scatter plots confirmed the high stability and excellent calibration of the XGBoost predictions. The linear regression model also delivered satisfactory performance, particularly for Tmax prediction, with a low mean absolute percentage error (MAPE) of 3.01%. However, for ΔT core_surface, the error increased significantly (MAPE = 8.32%), suggesting the model’s limited capacity to capture nonlinear relationships within the data.
The Decision Tree model showed the lowest predictive performance among the three. Its residual distributions revealed signs of instability and greater error dispersion, while its R² values and error metrics (MAE, MAPE) were notably worse than those of the other models.
These findings indicate that properly configured machine learning algorithms, especially gradient boosting methods like XGBoost, can serve as powerful tools for analysing thermal behaviour and heat transfer in massive concrete structures. Hence, ML models may provide a systematic framework for predicting early age thermal behaviour in mass concrete structures. Once an appropriate ML model is selected and tested, and the database is expanded, it could form the basis of a practical tool for users to predict temperature development across a broad range of input parameters. This study primarily explores the potential of such an approach, as, to the authors’ knowledge, no comprehensive attempts of this kind currently exist. This methodology may also allow engineers to assess the relative importance of technological and material factors influencing temperature development in a specific concrete structure, supporting better-informed design decisions and thermal control strategies.