Article

A Comparative Study on Modeling Methods for Deformation Prediction of Concrete Dams

1 The School of Aeronautical Engineering, Changsha University of Science & Technology, Changsha 410114, China
2 Hunan Geospatial Information Engineering and Technology Research Center, The Third Surveying and Mapping Institute of Hunan Province, Changsha 410119, China
* Author to whom correspondence should be addressed.
Modelling 2025, 6(4), 154; https://doi.org/10.3390/modelling6040154
Submission received: 28 September 2025 / Revised: 19 November 2025 / Accepted: 26 November 2025 / Published: 28 November 2025
(This article belongs to the Section Modelling in Engineering Structures)

Abstract

A series of machine learning models have been proposed over the past decades, but it remains undetermined which is optimal for a specific application. Establishing mathematical prediction models for dam deformation and structural health monitoring based on environmental factors is crucial to dam safety assessment. This paper takes Zhexi Dam, a concrete gravity dam in China, as an example to conduct a comparative study of the performance of deformation prediction models. The physical factors that cause dam deformation include the air temperature, reservoir water temperature, reservoir water level, and dam aging. The correlations between environmental factors and dam deformation are evaluated by the maximum information coefficient (MIC) and the Pearson, Kendall, and Spearman correlation coefficients. The monitoring data reveal that the deformation is highly correlated with the environmental factors. The most representative monitoring points are selected from hundreds of monitoring points for modeling. For comparison, seven modeling methods, i.e., multiple linear regression (MLR), gradient boosting decision tree (GBDT), random forest (RF), support vector machine (SVM), long short-term memory network (LSTM), a weighted average model (WAM) of the above five algorithms, and a Transformer-based neural network, are introduced to establish dam deformation prediction models. The experimental results indicate that both the weighted average model and the Transformer-based neural network achieve consistently high accuracy, showing strong agreement with the monitoring data. However, in scenarios involving small sample sizes, the SVM model demonstrates relatively superior predictive performance compared to the other models.

Graphical Abstract

1. Introduction

Dams play a crucial role in the socioeconomic development of a country by providing water storage for human consumption, the irrigation of vast agricultural fields, flood control, water resource management, and hydroelectricity generation. As the number of dams increases, their working and operating conditions become increasingly complex over time. Concrete dams not only have to bear various dynamic and static loads but also suffer the effects of sudden disasters and naturally harsh environments [1]. The performance of these structures under operational and environmental loads may decrease over time because of age-related deterioration, floods, and other factors. As dams age and deteriorate, their structural health conditions become more serious, and many harbor latent hazards, leading to a proliferation of deficient and hazardous structures. If concrete dams are not well managed and maintained, failure may lead to economic losses and loss of life. The likelihood of dam failure has increased, posing an imminent threat to the lives and property of downstream communities, and the collapse of a dam can result in catastrophic damage. In the past century, dam-break events of varying magnitudes have occurred in almost all countries, and a series of dam failures have occurred worldwide since the 1960s. Examples include the Canyon Lake dam failure, which claimed 236 lives [2], and the Austin dam failure, which claimed 78 lives [3]. In 1975, owing to the impact of a typhoon, the Banqiao and Shimantan dams in Henan Province in central China collapsed in succession, leaving millions of people homeless and resulting in a documented death toll exceeding 26,000 [4,5]. These disasters and their consequences can be minimized if appropriate dam safety monitoring and surveillance strategies are implemented.
Over the past two decades, machine learning has undergone a revolutionary development from theoretical exploration to large-scale application. At the beginning of the 21st century, traditional algorithms such as support vector machines (SVMs), random forests (RFs), and gradient boosting decision trees (GBDTs) dominated the field of machine learning. However, these methods exhibited significant limitations, particularly their reliance on manual feature extraction. In 2006, Geoffrey Hinton introduced deep belief networks, marking the dawn of a new era in deep learning [6]. In 2012, AlexNet’s outstanding performance in the ImageNet competition validated the exceptional feature learning capabilities of deep neural networks [7], further driving the synergistic development of GPU computing power, Internet data, and machine learning algorithms. The proposal of the Transformer architecture in 2017 revolutionized the paradigm of natural language processing [8]. Building upon this foundation, a series of models such as BERT [9] and GPT [10] have propelled pretrained large models into the mainstream. Application scenarios have expanded beyond computer vision and speech recognition to encompass areas such as medical diagnosis, autonomous driving, financial risk control, and time series forecasting. Although the various machine learning models originated from different backgrounds, they can all be targeted at the same application field. How do different machine learning models perform in the same practical application? This is a question that scholars have long explored. This paper conducts a comparative study using the prediction of horizontal and vertical displacements of a gravity dam as an example.
Our findings indicate that the prediction model based on the Transformer architecture demonstrates certain advantages and could serve as a new deformation prediction model for concrete dams.
Dams are constantly subjected to environmental, hydraulic, and geomechanical factors that cause horizontal and vertical displacements of their structures. When these displacements exceed critical limits, disastrous consequences may occur. To mitigate such risks or detect anomalies that could indicate potential failure, early warning systems must be implemented, and it is necessary to monitor the structural response of dams under various operating conditions [11,12,13]. Monitoring the deformation of large dams is of vital importance for avoiding catastrophic loss of infrastructure and life [14]. Among all measured parameters, displacement is the most intuitive indicator of the structural condition of concrete dams. The deformation of a dam is typically influenced by multiple causal factors, such as changes in the upstream reservoir water level, variations in air or dam body temperature, and time-dependent effects. To compare and improve the predictive and interpretive capabilities of the models, various modeling techniques should be utilized to develop mathematical monitoring models on the basis of these causal factors. The traditional multiple linear regression (MLR) method considers the dam response to be a linear explicit function of the causal factors. However, causal mechanisms are usually nonlinear and dynamic in practice [15]. Machine learning algorithms are capable of characterizing the relationship between mechanical parameters and structural response because of their strong nonlinear mapping ability [16]. Machine learning models such as SVMs, RFs, long short-term memory networks (LSTMs), artificial neural networks (ANNs), extreme learning machines (ELMs), and regression trees (RTs) have been used in structural health monitoring [17,18,19,20,21,22]. With optimal kernel functions, the prediction performance of machine learning models is generally better than that of the MLR model.
However, the disadvantage of machine learning models is that they are usually considered black box models; their application in the field of dam safety monitoring focuses mainly on prediction, and their causal interpretation ability is usually ignored [23].
The primary objective of this paper is to compare the performance of dam deformation prediction models, thereby providing a reference for model selection in dam deformation modeling. The main contributions of this paper are as follows: (1) The correlations between the causal environmental factors and the dam deformation time series are revealed. The maximum information coefficient (MIC) [24] and the Pearson, Kendall, and Spearman correlation coefficients [25] are introduced to evaluate the correlation between each environmental factor and dam deformation. (2) On the basis of the above correlations, seven machine learning modeling techniques, i.e., MLR, SVM, RF, GBDT, LSTM, the weighted average of the above five models (WAM), and a Transformer-based network, are employed to establish dam deformation time series prediction models. The field monitoring data validate the feasibility and reliability of the prediction models for dam deformation monitoring. (3) The methods and experimental results provide guidance and a reference for the construction of dam deformation prediction models, and the Transformer-based model is suggested as a new dam deformation prediction model. The rest of this paper is organized as follows: Section 2 examines the causal factors and their correlations with dam deformation, and Section 3 introduces seven methods for modeling dam deformation. Section 4 presents a series of experiments and results. Concluding remarks are given in Section 5.

2. Environmental Factors and Correlations

2.1. Environmental Factors

On the basis of machine learning algorithms, this paper aims to establish an optimal accurate model for concrete dam displacement monitoring, which is different from the HST [26], HEST, or HETT [15] models. The general framework of the proposed model for dam displacement is

$$\hat{y} = \hat{y}_T + \hat{y}_h + \hat{y}_p \tag{1}$$

where $\hat{y}$ is the displacement of the dam, $\hat{y}_T$ is the temperature component, $\hat{y}_h$ is the hydraulic component caused by water pressure, and $\hat{y}_p$ is the time component. The temperature component is

$$\hat{y}_T = b_0 + \sum_{i=1}^{7} b_i T_i \tag{2}$$

where $b_0$ is the constant term, $b_i$ is the regression coefficient, and $T_i$ is the temperature factor. Since the deformation sequence of a dam exhibits hysteresis characteristics under the influence of air temperature, six atmospheric temperature factors and one average reservoir water temperature factor (Twater in Table 1) need to be considered. The nonequidistant piecewise average values of the previous air temperature at the dam site are utilized as factors for temperature deformation. On the basis of the displacement observation date, the six atmospheric temperature factors are derived from the mean temperatures over the periods 1~10 days, 11~20 days, 21~35 days, 36~50 days, 51~70 days, and 71~90 days prior to the current observation date, corresponding to the environmental factors T1–10, T11–20, T21–35, T36–50, T51–70, and T71–90 in Table 1, respectively. The length of the previous air temperature period is determined according to the dam type and dam body thickness. The hydraulic component is

$$\hat{y}_h = c_0 + \sum_{j=1}^{4} c_j H^j \tag{3}$$

where $c_0$ is a constant term and $c_j$ is the regression coefficient. $H^j$ represents the $j$-th power of the water depth $H$ of the reservoir at the dam site, corresponding to environmental factors H1–H4 in Table 1. The time component is

$$\hat{y}_p = d_0 + \sum_{k=1}^{8} d_k f_k(t_1) \tag{4}$$

where $d_0$ is a constant term and $d_k$ is the regression coefficient. $f_k(t_1)$ are the eight time-effect factors, i.e., $\ln(t_1+1)$, $1-e^{-t_1}$, $t_1/(t_1+1)$, $t_1$, $t_1^2$, $t_1^{0.5}$, $t_1^{-0.5}$, and $1/(1+e^{-t_1})$, corresponding to environmental factors θ1–θ8 in Table 1, respectively. $t_1$ is one percent of the number of days from the initial date to the observation date.
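As an illustration, the three factor groups can be assembled in Python. This is a hypothetical sketch of the factor construction following the definitions above, not the authors' code; the function names and array layout are our own, and the sign of the exponent in the seventh time-effect factor is inferred from the factor list.

```python
import numpy as np

def time_effect_factors(t1):
    """The eight time-effect factors theta_1..theta_8, where t1 is one percent
    of the number of days from the initial date to the observation date."""
    return np.array([
        np.log(t1 + 1),           # ln(t1 + 1)
        1 - np.exp(-t1),          # 1 - exp(-t1)
        t1 / (t1 + 1),            # t1 / (t1 + 1)
        t1,                       # t1
        t1 ** 2,                  # t1^2
        t1 ** 0.5,                # t1^0.5
        t1 ** -0.5,               # t1^-0.5 (sign inferred)
        1 / (1 + np.exp(-t1)),    # logistic time effect
    ])

def hydraulic_factors(H):
    """Powers H^1..H^4 of the reservoir water depth H at the dam site."""
    return np.array([H ** j for j in range(1, 5)])

def temperature_factors(daily_temps):
    """Nonequidistant piecewise means of the previous air temperature;
    daily_temps[0] is the day immediately before the observation date."""
    windows = [(1, 10), (11, 20), (21, 35), (36, 50), (51, 70), (71, 90)]
    return np.array([daily_temps[a - 1:b].mean() for a, b in windows])
```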

2.2. Correlations

There is a complicated nonlinear functional relationship between dam deformation and environmental factors. To analyze the correlation between each causal environmental factor and dam deformation, the Pearson, Kendall, and Spearman correlation coefficients [25] and the MIC [24] are introduced to calculate the correlation index between them. The correlation coefficient indices listed in Table 1 are calculated from all five years of observation datasets from Zhexi Dam, Hunan Province, China.
Table 1 shows that the selected environmental factors are strongly correlated with the horizontal displacement, vertical displacement, and crack width of the dam. The factor T1–10 has the highest correlation with horizontal displacement, whereas T21–35 has the strongest correlation with vertical displacement. The crack widths are attributed primarily to time-dependent factors, with a weak correlation with other contributing factors.
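As an illustration of the correlation indices on synthetic data (not the Zhexi series), the three linear/rank coefficients can be computed with SciPy; the MIC requires an additional package (e.g., minepy) and is omitted here. The temperature/displacement series below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import pearsonr, kendalltau, spearmanr

rng = np.random.default_rng(0)
temperature = rng.normal(20, 8, 60)                        # hypothetical T1-10 series (60 cycles)
displacement = -0.4 * temperature + rng.normal(0, 1, 60)   # synthetic deformation response

r_p, _ = pearsonr(temperature, displacement)    # linear correlation
tau, _ = kendalltau(temperature, displacement)  # rank concordance
rho, _ = spearmanr(temperature, displacement)   # monotonic rank correlation
print(f"Pearson {r_p:.2f}, Kendall {tau:.2f}, Spearman {rho:.2f}")
```

A strongly negative value across all three indices, as here, indicates that the displacement decreases monotonically with rising temperature.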

3. Models and Methods

Machine learning aims to find a universally suitable mapping of inputs and outputs in a given sample space so that, when parameters outside the sample space are input into the model, accurate predictions can also be obtained.

3.1. Transformer-Based Model

Transformer is a sequence transduction architecture proposed by the Google team in 2017 [27]. It is constructed entirely on the attention mechanism, discarding the traditional recurrent neural network (RNN) and convolutional neural network (CNN) structures, and achieves a highly parallelized encoder–decoder processing scheme. The Transformer consists of a multi-head self-attention mechanism and a position-wise feedforward network. Each layer includes residual connections and layer normalization to enhance training stability. The Transformer has achieved significant performance improvements in machine translation tasks, demonstrating excellent generalization on both large and limited data, and has become the foundation of large model architectures, driving significant progress in natural language processing and computer vision. The Transformer model is well suited to time series prediction with dam deformation data because of its ability to capture temporal and long-term dependencies in the data. The network structure of a Transformer-based model for dam deformation time series prediction is shown in Figure 1.
The ‘positionEmbeddingLayer’ encodes the position of each element within the input sequence. By incorporating position embedding, the model can learn to differentiate between time steps and capture the time dependencies within the data. The ‘selfAttentionLayer’ allows the model to weigh the importance of different elements within the input sequence, capturing the dependencies between all elements and learning the relationships among them. Self-attention mechanisms are effective at capturing long-term dependencies in time series data because they can establish connections between distant time steps, understanding patterns that may have a delayed impact on future outcomes. This is especially relevant to the monitoring of dam deformation, because the impact of changes in reservoir water level and temperature on dam deformation usually lags by a period that can reach as long as three months. The ‘indexing1dLayer’ extracts data at the specified index of its input, which allows the network to perform regression on the output of the ‘selfAttentionLayer’.
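The position-embedding and self-attention operations described above can be illustrated with a minimal numpy sketch of scaled dot-product attention over a toy sequence. The sinusoidal encoding, dimensions, and random weights here are illustrative assumptions, not the layers or configuration of Figure 1.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (T, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])    # (T, T) pairwise relevance between time steps
    A = softmax(scores, axis=1)               # each row is a weighting over all time steps
    return A @ V, A

T, d = 12, 8                                  # 12 time steps, 8 features (toy sizes)
rng = np.random.default_rng(1)
X = rng.normal(size=(T, d))
# Toy sinusoidal position encoding so the model can tell time steps apart
pos = np.sin(np.arange(T)[:, None] / 10.0 ** (np.arange(d)[None, :] / d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, A = self_attention(X + pos, Wq, Wk, Wv)
```

The attention matrix `A` is what lets a prediction at one time step draw on measurements many steps earlier, which is the property exploited for lagged temperature and water-level effects.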
The Transformer model is configured with a single-channel input and processes sequences of up to 128 tokens in length. The multi-head self-attention mechanism consists of 4 parallel attention heads, each with key and query dimensionalities of 128, enabling simultaneous focus on different segments of the input sequence. For training purposes, the model is set to run for a maximum of 3000 epochs using a mini-batch size of 32 and is optimized using the stochastic gradient descent with momentum (SGDM) algorithm with a learning rate of 0.01. To improve generalization, the training data is randomly shuffled at the beginning of each epoch, and gradient clipping is applied with a threshold of 1 to ensure stable and efficient convergence. Training is configured to execute in an ‘auto’ mode, automatically utilizing a GPU when available for hardware acceleration, or falling back to CPU otherwise.

3.2. Multiple Linear Regression

Linear regression models assume that the relationships between the data are linear. Multiple linear regression (MLR) modeling does not rely on prior knowledge, and the regression results are only an approximate fit of the true relationship between the variables [28]. The MLR model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \xi \tag{5}$$

where $\xi$ is the model residual, with $E(\xi) = 0$ and $D(\xi) = \sigma^2$; $y$ represents the horizontal or vertical displacement or crack width of the dam body; $x_i$ represents the environmental factor; and $\beta$ represents the model coefficient. The matrix form of Equation (5) is

$$Y = X\beta + \xi \tag{6}$$

where $Y = (y_1, y_2, \ldots, y_n)^T$, $X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$, and $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^T$. The least squares estimate of $\beta$ is then

$$\hat{\beta} = (X^T X)^{-1} X^T Y \tag{7}$$

The multiple linear regression prediction model is

$$Y_o = X_o \hat{\beta} \tag{8}$$

If an observation of the environmental factor matrix $X_o$ is given, the predicted dam deformation $Y_o$ is obtained.
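The least squares estimate and prediction above can be verified with a short numpy sketch on synthetic data; the coefficients and sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 3
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, p))])  # design matrix with intercept column
beta_true = np.array([1.5, -0.8, 0.3, 2.0])                # hypothetical coefficients
Y = X @ beta_true + rng.normal(0, 0.01, n)                 # nearly noise-free observations

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)               # beta_hat = (X^T X)^(-1) X^T Y
Y_pred = X @ beta_hat                                      # prediction Y_o = X_o beta_hat
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse, which is numerically preferable to computing $(X^T X)^{-1}$ directly.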

3.3. Support Vector Regression

Support vector regression is a common prediction technique that maps low-dimensional features into a high-dimensional feature space through a kernel function. The SVM follows the principle of structural risk minimization, and its structure and algorithm are relatively concise [29], making it suitable for predictive modeling problems with small samples. For large datasets, the SVM algorithm is expensive in terms of time, CPU, and RAM consumption because of the process of calculating the weights. Assuming that the dam deformation $f(x)$ is a function of the environmental factor vector $x$, the support vector regression model of the dam deformation is

$$f(x) = \sum_{i=1}^{s} w_i K(x, x_i) + b \tag{9}$$

where $x$ is the environmental factor vector, $s$ is the number of support vectors, $w_i$ is the coefficient term, $b$ is the constant term to be solved, and $K(x, x_i)$ is the kernel function. The Gaussian radial basis kernel function is applied:

$$K(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right) \tag{10}$$

Models (9) and (10) can be solved for the unknown parameters under the least squares criterion, which is applied by many conventional statistical models.
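A minimal SVR sketch with scikit-learn, assuming the RBF hyperparameters reported later in Section 4.3 (gamma = 0.2, C = 10, epsilon = 0.1); the data and response function are synthetic, not the Zhexi series.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(60, 4))                  # synthetic normalized environmental factors
y = np.sin(2 * np.pi * X[:, 0]) + 0.5 * X[:, 1]      # synthetic deformation response

# RBF kernel; gamma, C and epsilon follow the values reported in Section 4.3
model = SVR(kernel="rbf", gamma=0.2, C=10, epsilon=0.1)
model.fit(X[::2], y[::2])                            # alternate cycles for training
rmse = np.sqrt(np.mean((model.predict(X[1::2]) - y[1::2]) ** 2))
```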

3.4. Random Forest

Random forest is an ensemble statistical learning method that draws bootstrap resamples from the sample set and then combines the tree predictors via majority voting, so that each tree is grown from a new bootstrap training set [30]. RF is widely applied in many fields because of its good tolerance of noisy data, and by the law of large numbers it does not overfit. RF regression demonstrates high simulation robustness due to its ability to integrate many regression trees. It combines tree predictors, with each tree depending on the values of a random vector sampled independently. The generalization error of the RF converges toward a limit as the number of regression trees increases. The RF prediction model can be expressed as

$$d(x) = f(x_i, M_{try}, N_{tree}), \quad i = 1, 2, \ldots, k \tag{11}$$

where $d(x)$ is the dam displacement; $f$ is the functional relationship of the RF regression algorithm; $x_i$ is the $i$-th environmental factor; $k$ is the total number of factors related to dam displacement; $M_{try}$ is the number of random feature factors; and $N_{tree}$ is the number of regression trees in the RF. The construction of the RF regression trees mainly adopts the CART decision tree algorithm [31,32]. Let $f(x)$ be the decision tree of the CART regression model:

$$f(x) = \sum_{i=1}^{M} C_i I(x \in R_i) \tag{12}$$

where $M$ is the total number of leaf nodes; $C_i$ is the output value of the $i$-th leaf node; $R_i$ is the sample set of the $i$-th leaf node; and $x$ is a sample in the dataset. When $x \in R_i$, the value of $I$ is 1; otherwise, $I$ is 0.
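A minimal RF regression sketch with scikit-learn on synthetic data; the tree count and depth follow the values reported in Section 4.3, while the data, response, and random seed are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0, 1, size=(60, 19))                     # 19 normalized factors, as in Table 1
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 60)   # synthetic displacement response

# n_estimators and max_depth follow the values reported in Section 4.3
rf = RandomForestRegressor(n_estimators=400, max_depth=3, random_state=0)
rf.fit(X[::2], y[::2])
pred = rf.predict(X[1::2])
rmse = np.sqrt(np.mean((pred - y[1::2]) ** 2))
```

Each of the 400 trees is grown on a bootstrap resample, and the forest prediction averages the per-tree CART outputs of Equation (12).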

3.5. Gradient Boosting Decision Tree

GBDT is an iterative algorithm that uses decision trees to arrive at a final solution by aggregating the conclusions of multiple trees. The GBDT algorithm employs random sampling for sample selection and a majority-voting mechanism during decision making. The core idea of GBDT is that each tree learns the residual of the sum of all previous trees’ predictions along the negative gradient direction. By adding the residual to the forecast value, a precise value can be obtained gradually. GBDT consists of an upper network, which predicts the target value directly, and a lower network, which predicts the residual between the output of the upper network and the target value. In this way, the residual of the upper network is reduced in the gradient direction to improve the fitting accuracy of the model to real observations [33].
The GBDT model is

$$F_m(x) = F_{m-1}(x) + \nu \cdot \gamma_m h_m(x), \quad 0 < \nu \le 1 \tag{13}$$

where $m$ is the number of iterations, $\nu$ is the learning rate, and $h_m(x)$ is a base learner tree fitted to the pseudoresiduals, trained on the dataset $\{(x_i, r_{im})\}_{i=1}^{n}$. $r_{im}$ is the pseudoresidual:

$$r_{im} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x) = F_{m-1}(x)} \tag{14}$$

The multiplier $\gamma_m$ is computed by solving the following one-dimensional optimization problem:

$$\gamma_m = \arg\min_{\gamma} \sum_{i=1}^{n} L\left(y_i, F_{m-1}(x_i) + \nu \cdot \gamma\, h_m(x_i)\right) \tag{15}$$

where $L(y, F(x)) = \frac{1}{2}\left(y - F(x)\right)^2$ is a differentiable loss function.
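A minimal GBDT sketch with scikit-learn on synthetic data; the tree count, learning rate, and depth follow the values reported in Section 4.3, while the data and response are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(60, 5))                       # synthetic normalized factors
y = 2 * X[:, 0] ** 2 + X[:, 1] + rng.normal(0, 0.05, 60)  # synthetic deformation response

# The default squared-error loss matches L(y, F) = (y - F)^2 / 2 up to scale;
# n_estimators, learning_rate and max_depth follow the values in Section 4.3
gbdt = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1,
                                 max_depth=4, random_state=0)
gbdt.fit(X[::2], y[::2])
rmse = np.sqrt(np.mean((gbdt.predict(X[1::2]) - y[1::2]) ** 2))
```

Each boosting stage fits a depth-4 tree to the pseudoresiduals of Equation (14) and adds it with shrinkage $\nu = 0.1$, as in Equation (13).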

3.6. Long Short-Term Memory Network

The LSTM architecture was proposed to solve the well-known vanishing gradient problem [34] of recurrent neural networks. Currently, the most widely used configuration for LSTM networks is that proposed by Graves and Schmidhuber [35], generally known as vanilla LSTM. It consists of a set of recurrently connected memory blocks, each containing one or more self-connected memory cells and three multiplicative units: the input, output, and forget gates. In our experiments, the vanilla LSTM network comprises a set of input and output units and a number of hidden units ranging from 64 to 512; in all cases, training was conducted with the mean squared error as the loss function and the Adam optimizer. The memory cells remember values over arbitrary time intervals, and the three gates regulate the flow of information through the cell. Although several other variations of the LSTM architecture have been proposed over the years, a large-scale study by Greff et al. [36] showed that none of the eight investigated LSTM variants significantly improved on the performance of the vanilla LSTM [37].
The output $O_{n+1}$ of the LSTM regression network at epoch $n+1$ is

$$O_{n+1} = W_O [h_n, x_{n+1}] + b_O \tag{16}$$

where $W_O$ is the weight, $b_O$ is the threshold, $x_{n+1}$ is the input at epoch $n+1$, and the hidden unit $h_n$ is

$$h_n = O_n \odot \tanh(c_n) \tag{17}$$

where $O_n$ is the output of the LSTM regression network at epoch $n$ and $c_n$ is the cell state at epoch $n$:

$$c_n = f_n \odot c_{n-1} + i_n \odot \tanh\left(W_c [h_{n-1}, x_n] + b_c\right) \tag{18}$$

where $f_n$ is the output of the forget gate at epoch $n$, $c_{n-1}$ is the cell state at epoch $n-1$, $i_n$ is the output of the input gate at epoch $n$, $W_c$ is the weight, and $b_c$ is the threshold. $f_n$ and $i_n$ are given by

$$f_n = \mathrm{sigmoid}\left(W_f [h_{n-1}, x_n] + b_f\right) \tag{19}$$

$$i_n = \mathrm{sigmoid}\left(W_i [h_{n-1}, x_n] + b_i\right) \tag{20}$$

where $W_f$ and $W_i$ are weight matrices and $b_f$ and $b_i$ are thresholds.
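The gate equations above can be sketched as a single numpy cell step. This is an illustrative toy implementation of the standard vanilla LSTM recurrence; the dimensions, initialization, and concatenated-weight layout are assumptions, not the authors' configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_n, h_prev, c_prev, W, b):
    """One step of the vanilla LSTM cell, following the gate equations above.
    Each weight matrix acts on the concatenation [h_{n-1}, x_n]."""
    z = np.concatenate([h_prev, x_n])
    f_n = sigmoid(W["f"] @ z + b["f"])                        # forget gate
    i_n = sigmoid(W["i"] @ z + b["i"])                        # input gate
    c_n = f_n * c_prev + i_n * np.tanh(W["c"] @ z + b["c"])   # cell state update
    o_n = sigmoid(W["o"] @ z + b["o"])                        # output gate
    h_n = o_n * np.tanh(c_n)                                  # hidden state
    return h_n, c_n

d_in, d_h = 19, 8                     # 19 factors, 8 hidden units (illustrative sizes)
rng = np.random.default_rng(6)
W = {k: rng.normal(0, 0.1, (d_h, d_h + d_in)) for k in "fico"}
b = {k: np.zeros(d_h) for k in "fico"}
h, c = np.zeros(d_h), np.zeros(d_h)
for x_n in rng.normal(size=(5, d_in)):  # run five observation epochs
    h, c = lstm_step(x_n, h, c, W, b)
```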

3.7. Weighted Average Model (WAM)

The weighted average model solution is calculated by

$$WAM = \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i \cdot p_i \tag{21}$$

where $n$ is the number of models, $p_i$ is the prediction value of each model, and $w_i$ is the weight of each model's prediction:

$$w_i = \delta_i^{-2} \tag{22}$$

where $\delta_i$ is the root mean squared error (RMSE) of the prediction from each model.
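A minimal sketch of the inverse-squared-RMSE weighting, using hypothetical predictions from three models (the numbers are illustrative only, not Zhexi results):

```python
import numpy as np

def rmse(pred, obs):
    return np.sqrt(np.mean((pred - obs) ** 2))

obs = np.array([1.0, 2.0, 3.0, 4.0])           # observed deformations (toy values)
preds = {                                       # hypothetical per-model predictions
    "MLR":  np.array([1.4, 2.5, 2.6, 4.6]),
    "SVM":  np.array([1.1, 2.1, 2.9, 4.1]),
    "LSTM": np.array([0.8, 2.3, 3.2, 3.9]),
}

w = {m: rmse(p, obs) ** -2 for m, p in preds.items()}    # w_i = delta_i^(-2)
wam = sum(w[m] * preds[m] for m in preds) / sum(w.values())
```

Because the weights scale as $\delta_i^{-2}$, the most accurate model dominates the combination, which is why the WAM rarely does worse than its best member.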

3.8. Prediction Accuracy Index

All the modeling algorithms are programmed in Python 3.11 64-bit and run on a desktop personal computer with an Intel i7 2.90 GHz CPU and 32 GB of RAM. The prediction accuracy of each method is evaluated by the correlation coefficient and RMSE δ .
$$\delta = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2} \tag{23}$$
where f ( x i ) are the model predictions and where y i are the observations of dam deformation.

4. Experiments and Results

4.1. Zhexi Dam

Zhexi Dam is located in the middle reaches of the Zishui River, Anhua County, Hunan Province, China. The hub of the Zhexi hydropower station is composed of a concrete overflow dam in the riverbed, an inclined ship lift on the left bank, and a power plant on the right bank. The project started in July 1958 and began generating electricity in January 1962. The total reservoir capacity is 3.57 billion m3. The length of the dam crest is 330 m, the elevation of the dam crest is 174.25 m, and the maximum dam height is 104.25 m. The dam consists of a single pier head dam in the overflow section and a wide-joint gravity dam in the nonoverflow section, and the total width of the overflow dam is 146 m. The nonoverflow section on the left bank consists of five gravity dam monoliths, each 15 m wide. On the right bank, the nonoverflow section comprises seven dam monoliths, all with a width of 15 m. The total concrete volume of the dam amounts to 658,100 m3. Zhexi Dam is shown in Figure 2. The distributions of the horizontal and vertical displacement monitoring points are shown in Figure 3.

4.2. Environmental Factors Data

The lowest temperatures are recorded between January and February, while the average temperature remains above 20 °C from May to September. The highest temperatures occur in July and August, with extreme maxima reaching approximately 40 °C. The water level of a reservoir is closely associated with precipitation patterns. In the Zhexi Basin, annual rainfall is concentrated primarily between April and August, with May and June being the peak plum-rain season and experiencing the highest rainfall. Annual rainfall averages approximately 1560 mm, with a maximum daily rainfall of 235 mm. The air temperature and upstream reservoir water level from 2016 to 2021 are shown in Figure 4.

4.3. Data Preprocessing

Before conducting dam deformation modeling, it is necessary to address the issue of environmental factors in the dataset that vary greatly in magnitude and have disparate positive and negative values, which may impact model parameters. Thus, the environmental factor data are all normalized to [0, 1]. The normalization formula is:
$$x_{01} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{24}$$

where $x_{01}$ is the normalized variable and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the variable $x$ to be normalized, respectively. The process of data normalization not only enhances the comparability of the various environmental factors but also expedites the modeling procedure and increases the prediction precision. After data normalization, the steps for deformation modeling of dams via machine learning are as follows:
(1) Forming the training datasets: The observational data cover a five-year period, with monthly observations collected over 60 cycles, offering a consistent and comprehensive view over time. The normalized datasets are alternately divided into two parts: training datasets and prediction datasets. Half of the total datasets are selected as training datasets, and the other half are used as prediction datasets for accuracy validation. The datasets include input and output variables: the dam deformations are used as the output variables, and the corresponding environmental factors are used as the model input variables.
(2) Training the learning machine: Initialize the parameters of the learning machine, perform the training iterations until the termination criterion is met, and obtain the regression model parameters.
(3) Dam deformation prediction: The environmental factor data are input into the trained machine learning regression model, the corresponding dam deformation is calculated, and the prediction accuracy and correlation coefficients are computed.
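The normalization and alternate train/prediction split described above can be sketched as follows, with synthetic data standing in for the Zhexi monitoring series:

```python
import numpy as np

def min_max_normalize(x):
    """Scale a variable to [0, 1]: x01 = (x - xmin) / (xmax - xmin)."""
    return (x - x.min()) / (x.max() - x.min())

rng = np.random.default_rng(7)
factors = rng.normal(20, 8, size=(60, 19))   # 60 monthly cycles x 19 environmental factors
deform = rng.normal(0, 3, 60)                # synthetic deformation series

X = np.apply_along_axis(min_max_normalize, 0, factors)   # normalize each factor column
# Alternate division: even cycles for training, odd cycles for prediction/validation
X_train, y_train = X[::2], deform[::2]
X_test, y_test = X[1::2], deform[1::2]
```

Alternating cycles, rather than splitting by contiguous years, keeps both halves representative of the full seasonal range.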
The hyperparameters of the machine learning models directly affect the prediction accuracy. In our experiments, the optimal value is determined by increasing or decreasing one hyperparameter at a time and comparing the prediction accuracy and efficiency. For the Transformer-based network, the maximum position is set to 128, the number of heads is 4, the number of key channels is 128, the learning rate is set to 0.01, and the maximum number of training epochs is 3000. For RF regression, the number of regression trees is set to 400, the maximum number of features is set to 19, the maximum depth is set to 3, the minimum number of samples required to split a node is 2, and the minimum number of samples per leaf is 1. For the SVM, the RBF is chosen as the kernel function, the gamma of the kernel function is set to 0.2, the regularization parameter C is set to 10, and the epsilon-tube and the tolerance of the termination criterion are both set to 0.1. For GBDT, the number of trees is set to 200, the learning rate is 0.1, the maximum number of features is not limited, and the maximum depth is 4.

4.4. Horizontal Displacement

The maximum horizontal displacement upstream reaches 7.86 mm, whereas the maximum downstream horizontal displacement reaches 17.49 mm, resulting in an annual variation of up to 19.06 mm. Notably, the general overflow dam section and elevator shaft dam section exhibit significant amplitude variations, whereas the left bank gravity dam section and right bank inlet dam section demonstrate relatively smaller amplitude variations. These observations align closely with the height and structural characteristics of the respective dam sections.
The law of horizontal displacement variation of the dam crest is as follows: when the reservoir water level rises, the dam moves downstream, and when it falls, the dam springs back upstream. Similarly, with a lag of approximately one month, when the temperature increases, the dam shifts upstream, and when it decreases, the dam shifts downstream. The amplitude of the water level displacement ranges from 5.96 to 9.52 mm, accounting for 32.9% to 79.5% of the total horizontal displacement, whereas the temperature displacement component ranges from 1.33 to 9.27 mm, accounting for 14.3% to 49.8% of the total horizontal displacement. The hydraulic displacement components in sections such as the buttress dams and inlet dams are greater than the temperature displacement components, whereas the elevator shafts and right guide walls have larger temperature displacement components than hydraulic pressure components. Larger variations in the reservoir water level corresponded with larger hydraulic displacement components.
The horizontal displacement of the dam crest exhibits downstream aging displacement, with magnitudes ranging from 0.57 to 10.57 mm, accounting for 5.8% to 38.7% of the total horizontal displacement. Notably, the largest aging displacement component is observed in section #5 buttress dam at a magnitude of 10.57 mm and accounts for 38.7% of the total horizontal displacement, with an average speed of 0.32 mm/year; however, this rate has significantly slowed in recent years and is less than 0.22 mm/year after 1995. Most measuring points exhibit convergent displacement trends.
Notably, an increase in the lateral displacement of the dam body often coincides with an increase in water leakage from cracks within the dam structure, thereby indicating some degree of damage to its structural integrity. The results of the seven prediction models for horizontal displacement prediction are listed in Table 2 and Table 3.
The results presented in Table 2 and Table 3 indicate that, among the five individual models, the SVM model achieves superior predictive accuracy, whereas the MLR model exhibits the largest prediction RMSE. The prediction RMSE of the LSTM model also displays slight instability, with fluctuations evident at certain points. The prediction results of the WAM confirm its superior overall performance, characterized by the minimum prediction RMSE and the maximum correlation coefficient, and the Transformer-based network likewise performs better than the five individual machine learning methods. The horizontal displacements and their errors at points A06 and A16 predicted by the seven models are shown in Figure 5 and Figure 6.
From Figure 5 and Figure 6, the horizontal displacements have significant annual periodic characteristics, but the annual displacement amplitude is slightly different, ranging from 1 mm to 8 mm. Most of the prediction errors of horizontal displacement monitoring are between −2.0 mm and 2.0 mm, and the positive and negative values of the errors are relatively symmetrical.
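The two accuracy metrics reported in the tables, RMSE and the correlation coefficient, can be computed directly from the monitored and predicted series. The following minimal Python sketch uses hypothetical array inputs rather than the study's evaluation code.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean squared error (mm) between monitored and predicted displacement."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between monitored and predicted series."""
    return float(np.corrcoef(observed, predicted)[0, 1])
```

A perfect prediction yields an RMSE of 0 and a correlation coefficient of 1; the WAM column of Table 2, for example, reports the smallest RMSE at most monitoring points.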

4.5. Vertical Displacement

The vertical displacement of the dam crest varies within a specific range. On the upstream side, the maximum elevation value reaches 10.00 mm, whereas the maximum subsidence value is 4.00 mm; the annual variation amplitude has a maximum of 10.78 mm and an average of 8.96 mm. On the downstream side, the maximum elevation value is 9.72 mm, with a corresponding maximum subsidence value of 4.70 mm; the annual variation shows a peak of 11.06 mm and an average of 8.57 mm. The annual variation is relatively large in the middle of the dam and small toward both sides, and its peak is larger on the downstream side than on the upstream side.
The vertical displacement of the dam crest is influenced primarily by temperature: as the temperature increases, the dam crest rises, and as the temperature decreases, it subsides, with a lag of approximately 1–2 months in the response to temperature changes. The temperature-induced displacement ranges from 3.41 to 10.93 mm, accounting for 54% to 96% of the total variation, and this effect is more pronounced on the downstream side than on the upstream side. The aging component contributes significantly at most measuring points along the vertical displacement of the dam crest, except those on the upstream side of the right guide wall. All the measuring points along the upstream side of pier #4 exhibit subsidence trends, with a maximum value of 4.30 mm (accounting for 32% of the total variation) and an average subsidence rate of 0.12 mm/year; these values are lower on the downstream side than on the upstream side. The maximum subsidence rate occurred in the initial years following the completion of dam cavity reinforcement, indicating that backfilling with concrete increased the vertical load and consequently the subsidence rate. The results of the seven prediction models for vertical displacement prediction are listed in Table 4 and Table 5.
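The 1–2 month thermal lag noted above can be estimated by scanning candidate lags for the one that maximizes the temperature–displacement correlation. The sketch below uses a synthetic daily series with hypothetical names, not the Zhexi monitoring data.

```python
import numpy as np

def best_lag(temperature, displacement, max_lag=90):
    """Return the lag (in days) at which the absolute Pearson correlation
    between temperature and the lagged displacement series is largest."""
    temperature = np.asarray(temperature, dtype=float)
    displacement = np.asarray(displacement, dtype=float)
    best, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        t = temperature[: len(temperature) - lag] if lag else temperature
        d = displacement[lag:]
        r = np.corrcoef(t, d)[0, 1]
        if abs(r) > abs(best_r):
            best, best_r = lag, r
    return best, best_r

# Synthetic daily series: displacement tracks temperature with a 30-day delay.
days = np.arange(720)
temp = 15 + 10 * np.sin(2 * np.pi * days / 365)
disp = 2 + 0.3 * (15 + 10 * np.sin(2 * np.pi * (days - 30) / 365))
lag, r = best_lag(temp, disp)   # recovers the built-in 30-day lag
```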
The results in Table 4 and Table 5 are similar to those in Table 2 and Table 3: the SVM model has relatively better predictive accuracy than the other individual models, while the prediction results of the WAM and the Transformer-based network indicate superior performance. The vertical displacements and their errors at points 8PierD and 1PierD predicted by the seven models are shown in Figure 7 and Figure 8.
From Figure 7 and Figure 8, the vertical displacements have significant annual periodic characteristics. The annual vertical displacement amplitude is between −10 and 4 mm. Most of the prediction errors of the vertical displacement are between −2.0 and 2.0 mm, and the positive and negative values of the errors are relatively symmetrical.

5. Discussion and Analysis

This section presents a comprehensive discussion of the experimental results, focusing on the interpretability of the models, the technical insights derived from the comparative study, the importance of various environmental factors, and the engineering implications of the developed prediction models.

5.1. Interpretability of Models for Engineering Applications

In engineering applications, particularly in the monitoring of critical infrastructure such as dams, the interpretability of a predictive model is as crucial as its accuracy. Not only precise predictions but also clear, understandable reasoning behind those predictions is required to support informed decision making.
Among the seven models evaluated, the multiple linear regression model provides the highest level of interpretability. Its parameters offer direct and quantifiable insights into the contribution of each environmental factor to dam deformation. For example, the regression coefficients represent the expected change in displacement per unit change in variables such as temperature or water level, which aligns well with established physical principles. However, its assumption of linearity restricts its capacity to capture the complex, nonlinear interactions inherent in dam behavior.
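As a minimal illustration of this interpretability, the least-squares coefficients of a linear model directly quantify each factor's contribution. The factor names and coefficient values below are hypothetical synthetic data, not the fitted Zhexi model.

```python
import numpy as np

# Hypothetical standardized factors: water level H, temperature T, aging theta.
rng = np.random.default_rng(0)
n = 200
H = rng.normal(size=n)
T = rng.normal(size=n)
theta = np.linspace(0, 1, n)
y = 1.5 * H - 0.8 * T + 0.4 * theta + rng.normal(scale=0.05, size=n)

X = np.column_stack([np.ones(n), H, T, theta])   # intercept + factors
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef[1] ~ 1.5 mm per unit of H, coef[2] ~ -0.8, coef[3] ~ 0.4: each
# coefficient reads directly as a per-factor displacement sensitivity.
```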
In contrast, the support vector machine, random forest, and gradient boosting decision tree models are typically regarded as ‘grey-box’ models. Although their internal mechanisms are more complex than those of MLR, techniques such as feature importance analysis can be employed to extract meaningful insights. Their superior predictive performance, particularly SVM’s robustness with small sample sizes, enhances their value, provided that their predictions can be partially interpreted.
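For the tree-based 'grey-box' models, impurity-based feature importance is one such technique. The following sketch uses synthetic factors and a hypothetical setup to show how a random forest ranks its inputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical factors: water level H, air temperature T, aging term theta.
rng = np.random.default_rng(4)
n = 300
X = np.column_stack([rng.normal(size=n),        # H (largest coefficient)
                     rng.normal(size=n),        # T
                     np.linspace(0, 1, n)])     # theta (smallest contribution)
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.4 * X[:, 2] + rng.normal(scale=0.05, size=n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importance = rf.feature_importances_            # non-negative, sums to 1
# The water-level column dominates, mirroring its larger true coefficient.
```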
The long short-term memory and Transformer-based models represent the ‘black-box’ end of the modeling spectrum. Their strength lies in the ability to automatically learn complex temporal dependencies without requiring explicit feature engineering. The Transformer, through its self-attention mechanism, can theoretically assign importance weights to different time steps, providing a potential avenue for interpretation by highlighting which historical periods most influence a given prediction. However, translating these learned attention weights into physically meaningful and actionable insights for dam engineers remains a significant challenge.
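The self-attention weights mentioned here are the row-normalized softmax of scaled query–key products. A compact NumPy sketch on toy data (not the trained Transformer) makes the mechanism concrete.

```python
import numpy as np

def attention_weights(q, k):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d))."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

# Toy sequence of 5 time steps with 4 features each; self-attention uses the
# sequence as both queries and keys.
rng = np.random.default_rng(1)
seq = rng.normal(size=(5, 4))
W = attention_weights(seq, seq)
# Each row sums to 1; W[-1] is the importance profile of the final time step
# over all historical steps, which is the quantity one would inspect.
```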
The weighted average model inherits the interpretability characteristics of its constituent models. Although it enhances predictive performance, its ensemble nature further complicates direct causal interpretation. In dam safety monitoring, a hybrid approach is therefore often advisable: a highly accurate model such as the Transformer or WAM can serve as the primary early-warning system, while simpler, interpretable models such as MLR, or models enhanced with post hoc explanation techniques, can be employed for diagnostic analysis when anomalies are detected.

5.2. Technical Discussion and Critical Interpretation of Results

The experimental results reveal several key technical insights. The consistently strong performance of the SVM model under small-sample conditions, as noted in the Abstract and confirmed in Table 2, Table 3, Table 4 and Table 5, can be attributed to its principle of structural risk minimization, which effectively mitigates overfitting—a common challenge when data is limited. This characteristic makes SVM a particularly reliable choice during the initial phases of monitoring or for dams with sparse sensor data.
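The small-sample robustness of SVM can be reproduced in miniature with an RBF-kernel support vector regressor. This sketch uses 30 synthetic observations and hypothetical hyperparameters, not the study's tuned configuration.

```python
import numpy as np
from sklearn.svm import SVR

# Small-sample setting: only 30 observations of a smooth displacement response.
rng = np.random.default_rng(2)
X = np.sort(rng.uniform(0, 6, size=(30, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.05, size=30)

# The epsilon-insensitive loss and margin-based formulation (structural risk
# minimization) help the model resist overfitting on few samples.
model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X, y)
pred = model.predict(X)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))   # close to the noise level
```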
The superior performance of the WAM demonstrates the effectiveness of ensemble learning. By integrating the diverse strengths of five different algorithms, the WAM mitigates the individual weaknesses and variances of each model, resulting in more robust and stable predictions. This indicates that no single algorithm is universally optimal; instead, a carefully weighted combination can achieve near-optimal performance.
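One common ensemble scheme, shown here purely as an assumed illustration since the study's actual weights are not restated, is to weight each model inversely to its validation RMSE.

```python
import numpy as np

def weighted_average(preds, val_rmses):
    """Combine model predictions with weights inversely proportional to each
    model's validation RMSE (a hypothetical scheme), normalized to sum to 1."""
    preds = np.asarray(preds, dtype=float)        # shape: (n_models, n_samples)
    w = 1.0 / np.asarray(val_rmses, dtype=float)
    w /= w.sum()
    return w @ preds

# Three hypothetical models: the most accurate one receives weight 0.5 and
# the two weaker ones 0.25 each, so its predictions dominate the ensemble.
combined = weighted_average(
    preds=[[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]],
    val_rmses=[0.5, 1.0, 1.0],
)   # combined == [1.0, 2.0] for these inputs
```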
The Transformer-based model’s high accuracy, comparable to that of the WAM, validates its applicability to dam deformation time series prediction. Its primary advantage lies in the self-attention mechanism, which efficiently captures long-range dependencies. This capability is critically important for modeling dam deformation, where effects such as reservoir filling or seasonal temperature changes may exhibit delays spanning several months, as discussed in Section 3.1. Unlike LSTM, which processes data sequentially and may struggle with very long-term dependencies, the Transformer processes the entire sequence simultaneously, making it potentially more effective at capturing these delayed physical phenomena. The occasional instability of the LSTM model, as evidenced by fluctuating RMSE values at monitoring points such as A07 and A17, may stem from its sensitivity to hyperparameter tuning or its inherent challenges in preserving gradient information over very long sequences.

5.3. Interpretation of Machine-Learned Relationships and Feature Importance

To move beyond black-box predictions and understand the ‘why’ behind model outputs, feature importance analysis is essential. Based on the high correlation coefficients presented in Table 1, reasonable inferences can be made regarding feature dominance.
Horizontal Displacement: The factors T1–10 (short-term air temperature) and T11–20 exhibit the highest correlations with MIC values of 0.9000 and 0.7402, respectively. This indicates that recent temperature fluctuations are a dominant driver of horizontal movements, likely due to the daily and short-term thermal expansion and contraction of the dam concrete and its foundation.
Vertical Displacement: The medium-term air temperature factors dominate, with T11–20 and T21–35 both reaching MIC = 0.9541 and T36–50 attaining the highest value of 1.0000. This indicates that vertical displacement is primarily influenced by cumulative thermal effects over weeks to months, likely due to deeper heat penetration into the dam structure.
Crack Width: The time-effect factors (θ1 to θ8) exhibit an almost perfect correlation (MIC ≈ 1.0) with crack width. This is a significant finding, as it clearly indicates that crack development and widening are primarily driven by irreversible, time-dependent processes such as concrete creep, shrinkage, and material aging. The hydraulic factors (H1–H4) show a weak but consistent negative correlation, suggesting that higher water levels may exert a closing pressure on certain cracks.
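The Pearson, Kendall, and Spearman coefficients of Table 1 can be reproduced with standard SciPy routines (MIC requires a dedicated estimator such as the minepy package and is not shown). The sketch below uses a synthetic aging factor and crack-width series, not the monitoring data.

```python
import numpy as np
from scipy import stats

# Synthetic aging factor theta and crack width with a strong monotonic link,
# mimicking the near-perfect theta correlations reported in Table 1.
rng = np.random.default_rng(3)
theta = np.linspace(0, 8, 120)
crack = -0.5 * theta + rng.normal(scale=0.1, size=120)

pearson = stats.pearsonr(theta, crack)[0]     # linear association
kendall = stats.kendalltau(theta, crack)[0]   # rank concordance
spearman = stats.spearmanr(theta, crack)[0]   # monotonic association
# All three are strongly negative, mirroring the theta rows of Table 1.
```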

5.4. Engineering Interpretation

The prediction results demonstrate a strong alignment with the established physical behavior of concrete gravity dams.
Seasonal Thermal Expansion: The models successfully capture the annual periodic characteristics of both horizontal and vertical displacements (Figure 5, Figure 6, Figure 7 and Figure 8). Correlation analysis confirms that temperature is a primary driving factor. The predicted upstream movement of the dam crest in response to rising temperatures (and vice versa) is consistent with the physical mechanism of thermal expansion.
Reservoir Load Response: The observed behavior in which the dam moves downstream as reservoir water levels rise (and upstream during drawdown) is accurately captured by the models. This reflects the direct mechanical response of the dam structure to variations in hydrostatic pressure from the reservoir.
Aging and Time-Dependent Effects: The significant role of time-effect factors, particularly in relation to crack width and aging displacement, accurately reflects the long-term material behavior of concrete. The predicted convergent displacement trends at most monitoring points provide a positive indication of the dam’s structural stability.
These models can be directly integrated into a modern dam health monitoring system. They can serve as baseline models, anomaly detection tools, and forecasting tools. Baseline models can provide expected deformation values under given environmental conditions. The WAM or Transformer model would be excellent candidates for this due to their high accuracy. Anomaly detection tools are essential for initiating subsequent inspections, as significant and persistent deviations between actual measurements and model predictions may indicate potential structural issues. As forecasting tools, by inputting forecasted weather and planned reservoir levels, the models can predict future deformations, aiding in operational planning and risk management.
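The anomaly-detection role can be implemented, in its simplest form, as a control band on the prediction residuals. The threshold rule below is a hypothetical sketch, not the monitoring system's actual logic.

```python
import numpy as np

def flag_anomalies(measured, predicted, k=3.0):
    """Flag epochs whose residual leaves a +/- k*sigma control band, with
    sigma estimated from the residuals themselves (a simple scheme; a
    production system would use a calibration window and persistence rules)."""
    residual = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    sigma = residual.std()
    return np.abs(residual) > k * sigma

# Five well-predicted epochs and one gross deviation (5.0 mm vs 1.0 mm).
measured = np.array([1.0, 1.1, 0.9, 1.0, 5.0, 1.05])
predicted = np.full(6, 1.0)
flags = flag_anomalies(measured, predicted, k=2.0)   # only the 5.0 epoch flagged
```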
The choice of model can be tiered: the SVM for data-scarce scenarios, the Transformer for high-precision modeling of long-term dependencies, and the MLR for situations requiring maximum interpretability in routine diagnostics.

6. Conclusions

This paper conducted a comparative study among several dam deformation prediction models. Accurate prediction of dam deformation is crucial for maintaining safe dam operation and protecting human life. On the basis of the correlation between physical factors and dam deformation, a prediction model with high accuracy can be constructed, providing a reference for dam safety monitoring and analysis. Based on nineteen physical factors, seven models are utilized for dam deformation prediction, and their stability and prediction accuracy are validated by a series of experiments. The following conclusions can be drawn:
(1)
The MIC, Pearson, Kendall, and Spearman correlation indices reveal that dam deformation is closely related to physical factors such as air temperature, reservoir water temperature, reservoir water level, and dam aging, so an accurate prediction model for dam deformation can be established. The analysis indicates that Zhexi Dam is operating under safe conditions and that its periodic deformation corresponds to environmental factors within an allowable range. Most of the models' prediction errors fall within ±2.0 mm.
(2)
Different models exhibited varying performance in the same practical application. Among the seven models evaluated, the MLR model yielded a relatively high prediction RMSE. In scenarios with limited samples, the SVM model demonstrated superior prediction accuracy. Our results indicate that the weighted average ensemble model achieves higher prediction accuracy than any individual constituent model, albeit at the cost of requiring multiple models and substantial computational resources. The Transformer model, first introduced in 2017, was incorporated into the analysis. The prediction model based on a standalone Transformer architecture demonstrates considerable accuracy and distinct advantages, showing promise as a novel deformation prediction approach for concrete dams.

Author Contributions

Conceptualization, X.D. and X.Z.; methodology, X.D.; software, X.D.; validation, Z.T., X.D., and X.Z.; formal analysis, X.D.; investigation, X.Z.; resources, Z.T.; data curation, X.D.; writing—original draft preparation, X.D.; writing—review and editing, X.D. and X.Z.; visualization, X.Z.; supervision, Z.T.; project administration, Z.T.; funding acquisition, X.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Topic of Hunan Geospatial Information Engineering and Technology Research Center (Grant No. HNGIET2025001) and the Natural Science Foundation of Hunan Province, China (No. 2024JJ8335).

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflicts of interest. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

The following abbreviations are used in this manuscript:
MIC: maximum information coefficient
MLR: multiple linear regression
GBDT: gradient boosting decision tree
RF: random forest
SVM: support vector machine
LSTM: long short-term memory
WAM: weighted average model
RMSE: root mean squared error

References

  1. Gu, H.; Yang, M.; Gu, C.S.; Huang, X.F. A factor mining model with optimized random forest for concrete dam deformation monitoring. Water Sci. Eng. 2021, 14, 330–336. [Google Scholar] [CrossRef]
  2. Graham, W.J. A Procedure for Estimating Loss of Life Caused by Dam Failure—DSO-99-06; Bureau of Reclamation: Washington, DC, USA, 1999. [Google Scholar] [CrossRef]
  3. Rich, T.P. Lessons in social responsibility from the Austin Dam failure. Int. J. Eng. Educ. 2006, 22, 1287–1296. [Google Scholar] [CrossRef]
  4. Yang, L.; Liu, M.; Smith, J.A.; Tian, F. Typhoon Nina and the August 1975 flood over central China. J. Hydrometeorol. 2017, 18, 451–472. [Google Scholar] [CrossRef]
  5. Vazquez-Ontiveros, J.R.; Martinez-Felix, C.A.; Vazquez-Becerra, G.E.; Gaxiola-Camacho, J.R.; Melgarejo-Morales, A.; Padilla-Velazco, J. Monitoring of local deformations and reservoir water level for a gravity type dam based on GPS observations. Adv. Space Res. 2022, 69, 319–330. [Google Scholar] [CrossRef]
  6. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
  7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105. [Google Scholar]
  8. Kovács, L.; Csépányi-Fürjes, L.; Tewabe, W. Transformer Models in Natural Language Processing. In The 17th International Conference Interdisciplinarity in Engineering; Inter-ENG 2023; Lecture Notes in Networks and Systems; Moldovan, L., Gligor, A., Eds.; Springer: Cham, Switzerland, 2024; Volume 929. [Google Scholar] [CrossRef]
  9. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1, pp. 4171–4186. [Google Scholar]
  10. Radford, A.; Narasimhan, K. Improving Language Understanding by Generative Pre-Training; Computer Science, Linguistics; MIT Press: Cambridge, MA, USA, 2018; p. 12. [Google Scholar]
  11. Bukenya, P.; Moyo, P.; Beushausen, H.; Oosthuizen, C. Health monitoring of concrete dams: A literature review. J. Civ. Struct. Health Monit. 2014, 4, 235–244. [Google Scholar] [CrossRef]
  12. Konakoglu, B.; Cakir, L.; Yilmaz, V. Monitoring the deformation of a concrete dam: A case study on the Deriner Dam, Artvin, Turkey. Geomat. Natl. Hazards Risk. 2020, 11, 160–177. [Google Scholar] [CrossRef]
  13. Xi, R.; Zhou, X.; Jiang, W.; Chen, Q. Simultaneous estimation of dam displacements and reservoir level variation from GPS measurements. Measurement 2018, 122, 247–256. [Google Scholar] [CrossRef]
  14. Ruiz-Armenteros, A.M.; Lazecky, M.; Hlaváčová, I.; Bakoň, M.; Delgado, J.M.; Sousa, J.J.; Lamas-Fernández, F.; Marchamalo, M.; Caro-Cuenca, M.; Papco, J.; et al. Deformation monitoring of dam infrastructures via spaceborne MT-InSAR. The case of La Viñuela (Málaga, southern Spain). Procedia Comput. Sci. 2018, 138, 346–353. [Google Scholar] [CrossRef]
  15. Wang, S.; Sui, X.; Liu, Y.; Gu, H.; Xu, B.; Xia, Q. Prediction and interpretation of the deformation behaviour of high arch dams based on a measured temperature field. J. Civ. Struct. Health Monit. 2023, 13, 661–675. [Google Scholar] [CrossRef]
  16. Kang, F.; Wu, Y.R.; Ma, J.T.; Li, J.J. Structural identification of super high arch dams using Gaussian process regression with improved salp swarm algorithm. Eng. Struct. 2023, 286, 116150. [Google Scholar] [CrossRef]
  17. Belmokre, A.; Mihoubi, M.K.; Santillán, D. Analysis of dam behavior by statistical models: Application of the random forest approach. KSCE J. Civ Eng. 2019, 23, 4800–4811. [Google Scholar] [CrossRef]
  18. Mata, J. Interpretation of concrete dam behaviour with artificial neural network and multiple linear regression models. Eng. Struct. 2011, 33, 903–910. [Google Scholar] [CrossRef]
  19. Ranković, V.; Grujović, N.; Divac, D.; Milivojević, N. Development of support vector regression identification model for prediction of dam structural behaviour. Struct. Saf. 2014, 48, 33–39. [Google Scholar] [CrossRef]
  20. Salazar, F.; Toledo, M.Á.; Oñate, E.; Suárez, B. Interpretation of dam deformation and leakage with boosted regression trees. Eng. Struct. 2016, 119, 230–251. [Google Scholar] [CrossRef]
  21. Liu, W.J.; Pan, J.W.; Ren, Y.S.; Wu, Z.G.; Wang, J.T. Coupling prediction model for long-term displacements of arch dams based on long short-term memory network. Struct. Control Health Monit. 2020, 27, e2548. [Google Scholar] [CrossRef]
  22. Huang, B.; Kang, F.; Li, J.; Wang, F. Displacement prediction model for high arch dams using long short-term memory based encoder-decoder with dual-stage attention considering measured dam temperature. Eng. Struct. 2023, 280, 115686. [Google Scholar] [CrossRef]
  23. Hu, J.; Ma, F.H. Comparison of hierarchical clustering-based deformation prediction models for high arch dams during the initial operation period. J. Civil Struct. Health Monit. 2021, 11, 897–914. [Google Scholar] [CrossRef]
  24. Reshef, D.N.; Reshef, Y.A.; Finucane, H.K.; Grossman, S.R.; McVean, G.; Turnbaugh, P.J.; Lander, E.S.; Mitzenmacher, M.; Sabeti, P.C. Detecting Novel Associations in Large Data Sets. Science 2011, 334, 1518–1524. [Google Scholar] [CrossRef]
  25. Fan, R.; Meng, D.Z.; Xu, D.X. Survey of the research process for statistical correlation analysis. Math. Model. Its Appl. 2014, 3, 1–12. (In Chinese) [Google Scholar] [CrossRef]
  26. Wang, S.W.; Xu, Y.L.; Gu, C.S.; Bao, T.F.; Xia, Q.; Hu, K. Hysteretic effect considered monitoring model for interpreting abnormal deformation behavior of arch dams: A case study. Struct. Control Health Monit. 2019, 26, e2417. [Google Scholar] [CrossRef]
  27. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  28. Li, Y.; Bao, T.; Shu, X.; Chen, Z.; Gao, Z.; Zhang, K. A Hybrid Model Integrating Principal Component Analysis, Fuzzy C-Means, and Gaussian Process Regression for Dam Deformation Prediction. Arab. J. Sci. Eng. 2021, 46, 4293–4306. [Google Scholar] [CrossRef]
  29. Deng, X.S.; Tang, G.; Wang, Q.Y.; Luo, L.X.; Long, S.C. A Method for Forest Vegetation Height Modeling Based on Aerial Digital Orthophoto Map and Digital Surface Model. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4404307. [Google Scholar] [CrossRef]
  30. Fang, K.N.; Wu, J.B.; Zhu, J.P.; Xie, B.C. A review of technologies on random forests. Stat. Inf. Forum. 2011, 26, 32–38. [Google Scholar] [CrossRef]
  31. Zhou, Z.H. Machine Learning; Tsinghua University Press: Beijing, China, 2016; pp. 171–183. [Google Scholar]
  32. Dong, S.S.; Huang, Z.X. A brief theoretical overview of random forests. J. Integr. Technol. 2013, 2, 1–7. Available online: https://jcjs.siat.ac.cn/en/article/id/201301001 (accessed on 15 August 2025). (In Chinese).
  33. Wu, W.M.; Wang, J.X.; Huang, Y.S.; Zhao, H.Y.; Wang, X.T. A novel way to determine transient heat flux based on GBDT machine learning algorithm. Int. J. Heat Mass Transf. 2021, 179, 121746. [Google Scholar] [CrossRef]
  34. Hochreiter, S.; Bengio, Y.; Frasconi, P.; Schmidhuber, J. Gradient flow in recurrent nets: The difficulty of learning long-term dependencies. In A Field Guide to Dynamical Recurrent Neural Networks; IEEE Press: Piscataway, NJ, USA, 2001; pp. 237–243. [Google Scholar]
  35. Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005, 18, 602–610. [Google Scholar] [CrossRef]
  36. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A Search Space Odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 2222–2232. [Google Scholar] [CrossRef]
  37. Pagano, D. A predictive maintenance model using long short-term memory neural networks and Bayesian inference. Decis. Anal. J. 2023, 6, 100174. [Google Scholar] [CrossRef]
Figure 1. The network structure of the Transformer-based model.
Figure 2. Zhexi Dam and some representative monitoring points.
Figure 3. Distribution of horizontal and vertical displacement monitoring points.
Figure 4. The air temperature and upstream water level (from 2016 to 2021).
Figure 5. The predicted horizontal displacement of A06 (a,c); The horizontal displacement prediction error of A06 (b,d).
Figure 6. The predicted horizontal displacement of A16 (a,c); The horizontal displacement prediction error of A16 (b,d).
Figure 7. The predicted vertical displacement of 8PierD (a,c); The vertical displacement prediction error of 8PierD (b,d).
Figure 8. The predicted vertical displacement of 1PierD (a,c); The vertical displacement prediction error of 1PierD (b,d).
Table 1. Correlation indices between environmental factors and dam displacement.
Environment Factors | Horizontal Displacement (MIC / Pearson / Kendall / Spearman) | Vertical Displacement (MIC / Pearson / Kendall / Spearman) | Crack Width (MIC / Pearson / Kendall / Spearman)
T1–10 | 0.9000 / −0.8698 / −0.7127 / −0.8874 | 0.8816 / −0.9395 / −0.7920 / −0.9371 | 0.8167 / 0.0067 / −0.0002 / 0.0100
T11–20 | 0.7402 / −0.7888 / −0.5972 / −0.8113 | 0.9541 / −0.9523 / −0.8305 / −0.9581 | 0.7837 / −0.0579 / −0.0376 / −0.0419
T21–35 | 0.7324 / −0.7312 / −0.5384 / −0.7521 | 0.9541 / −0.9561 / −0.8271 / −0.9592 | 0.8196 / −0.1396 / −0.0917 / −0.1093
T36–50 | 0.5327 / −0.5876 / −0.4111 / −0.6089 | 1.0000 / −0.8940 / −0.6988 / −0.8940 | 0.8186 / −0.2209 / −0.1369 / −0.1718
T51–70 | 0.3369 / −0.3640 / −0.2469 / −0.3949 | 0.6944 / −0.7763 / −0.5677 / −0.7813 | 0.7567 / −0.2862 / −0.1758 / −0.2258
T71–90 | 0.2741 / −0.0977 / −0.0714 / −0.1189 | 0.5124 / −0.5537 / −0.3739 / −0.5562 | 0.7140 / −0.3355 / −0.2033 / −0.2663
Twater | 0.4868 / −0.6453 / −0.4636 / −0.6698 | 0.8851 / −0.9186 / −0.7761 / −0.9344 | 0.8094 / −0.3442 / −0.2096 / −0.3002
H1 | 0.2414 / 0.1066 / 0.0572 / 0.0961 | 0.3001 / −0.3253 / −0.1919 / −0.2877 | 0.7407 / −0.3552 / −0.2056 / −0.2990
H2 | 0.2414 / 0.1080 / 0.0572 / 0.0961 | 0.3001 / −0.3206 / −0.1919 / −0.2877 | 0.7407 / −0.3527 / −0.2056 / −0.2990
H3 | 0.2414 / 0.1091 / 0.0572 / 0.0961 | 0.3001 / −0.3154 / −0.1919 / −0.2877 | 0.7407 / −0.3499 / −0.2056 / −0.2990
H4 | 0.2414 / 0.1100 / 0.0572 / 0.0961 | 0.3001 / −0.3099 / −0.1919 / −0.2877 | 0.7407 / −0.3467 / −0.2056 / −0.2990
θ1 | 0.3638 / −0.1265 / −0.0708 / −0.0960 | 0.3679 / −0.1109 / −0.0515 / −0.0793 | 0.9999 / −0.9907 / −0.9499 / −0.9936
θ2 | 0.3638 / −0.1681 / −0.0708 / −0.0960 | 0.3679 / −0.2209 / −0.0515 / −0.0793 | 0.9999 / −0.7351 / −0.9499 / −0.9936
θ3 | 0.3638 / −0.1591 / −0.0708 / −0.0960 | 0.3679 / −0.1716 / −0.0515 / −0.0793 | 0.9999 / −0.9854 / −0.9499 / −0.9936
θ4 | 0.3638 / −0.1004 / −0.0708 / −0.0960 | 0.3679 / −0.0833 / −0.0515 / −0.0793 | 0.9999 / −0.9856 / −0.9499 / −0.9936
θ5 | 0.3638 / −0.0734 / −0.0708 / −0.0960 | 0.3679 / −0.0813 / −0.0515 / −0.0793 | 0.9999 / −0.9698 / −0.9499 / −0.9936
θ6 | 0.3638 / −0.1195 / −0.0708 / −0.0960 | 0.3679 / −0.1012 / −0.0515 / −0.0793 | 0.9999 / −0.9896 / −0.9499 / −0.9936
θ7 | 0.3638 / 0.1933 / 0.0708 / 0.0960 | 0.3679 / 0.1960 / 0.0515 / 0.0793 | 0.9999 / 0.9886 / 0.9499 / 0.9936
θ8 | 0.3638 / −0.1339 / −0.0708 / −0.0960 | 0.3679 / −0.1890 / −0.0515 / −0.0793 | 0.9999 / −0.7352 / −0.9499 / −0.9936
Table 2. Horizontal displacement prediction RMSEs of the seven models (mm).
Point Name | MLR | RF | GBDT | SVM | LSTM | WAM | Transformer
A04 | 1.364 | 1.060 | 1.200 | 0.969 | 0.994 | 0.960 | 0.845
A05 | 1.320 | 1.052 | 1.249 | 1.183 | 1.071 | 0.976 | 1.039
A06 | 1.518 | 0.824 | 0.770 | 0.815 | 0.961 | 0.695 | 0.871
A07 | 1.148 | 1.184 | 1.135 | 1.115 | 1.765 | 1.034 | 1.252
A08 | 1.228 | 1.180 | 1.221 | 1.097 | 1.486 | 1.053 | 1.213
A09 | 0.981 | 1.110 | 1.106 | 0.864 | 1.434 | 0.855 | 1.134
A10 | 2.156 | 1.968 | 1.762 | 1.721 | 2.128 | 1.537 | 1.536
A11 | 1.999 | 1.738 | 1.664 | 1.392 | 1.698 | 1.293 | 1.205
A12 | 1.628 | 1.551 | 1.571 | 1.149 | 2.175 | 1.085 | 1.089
A13 | 1.317 | 1.532 | 1.890 | 1.065 | 1.913 | 0.997 | 1.095
A14 | 1.351 | 1.127 | 1.351 | 0.942 | 1.654 | 0.848 | 1.134
A15 | 0.936 | 1.159 | 1.594 | 0.907 | 1.310 | 0.873 | 1.001
A16 | 1.486 | 1.111 | 1.171 | 0.910 | 1.091 | 0.899 | 0.788
A17 | 1.440 | 1.370 | 1.286 | 1.169 | 2.121 | 1.183 | 1.000
A18 | 1.075 | 1.191 | 1.269 | 1.052 | 1.291 | 0.992 | 0.991
A19 | 1.102 | 0.724 | 0.855 | 0.775 | 0.770 | 0.614 | 0.960
A20 | 1.534 | 0.609 | 0.647 | 0.672 | 0.754 | 0.568 | 0.801
Table 3. Horizontal displacement prediction correlation coefficients of the seven models.
Point Name | MLR | RF | GBDT | SVM | LSTM | WAM | Transformer
A04 | 0.574 | 0.574 | 0.467 | 0.654 | 0.612 | 0.663 | 0.765
A05 | 0.650 | 0.693 | 0.520 | 0.598 | 0.626 | 0.730 | 0.604
A06 | 0.547 | 0.796 | 0.760 | 0.728 | 0.566 | 0.839 | 0.767
A07 | 0.835 | 0.825 | 0.835 | 0.838 | 0.512 | 0.867 | 0.829
A08 | 0.786 | 0.775 | 0.772 | 0.808 | 0.598 | 0.826 | 0.771
A09 | 0.893 | 0.859 | 0.865 | 0.912 | 0.713 | 0.918 | 0.845
A10 | 0.830 | 0.804 | 0.852 | 0.857 | 0.762 | 0.890 | 0.887
A11 | 0.827 | 0.807 | 0.828 | 0.878 | 0.812 | 0.897 | 0.911
A12 | 0.885 | 0.825 | 0.844 | 0.908 | 0.578 | 0.916 | 0.924
A13 | 0.916 | 0.834 | 0.803 | 0.919 | 0.703 | 0.931 | 0.917
A14 | 0.896 | 0.913 | 0.855 | 0.934 | 0.753 | 0.949 | 0.899
A15 | 0.913 | 0.858 | 0.772 | 0.911 | 0.791 | 0.918 | 0.890
A16 | 0.852 | 0.835 | 0.820 | 0.902 | 0.834 | 0.905 | 0.923
A17 | 0.790 | 0.791 | 0.824 | 0.853 | 0.281 | 0.858 | 0.881
A18 | 0.866 | 0.772 | 0.758 | 0.822 | 0.706 | 0.847 | 0.859
A19 | 0.714 | 0.697 | 0.670 | 0.764 | 0.578 | 0.800 | 0.397
A20 | 0.253 | 0.682 | 0.663 | 0.563 | 0.359 | 0.722 | 0.230
Table 4. Vertical displacement prediction RMSE of the seven models (mm).
Point Name | MLR | RF | GBDT | SVM | LSTM | WAM | Transformer
LG3D | 0.762 | 0.642 | 0.676 | 0.559 | 0.681 | 0.528 | 0.642
LG2D | 0.969 | 0.767 | 1.023 | 0.623 | 0.749 | 0.553 | 0.519
LG1D | 0.748 | 0.707 | 0.791 | 0.667 | 0.707 | 0.609 | 0.608
LgwD | 1.018 | 0.803 | 0.944 | 0.691 | 0.808 | 0.639 | 0.593
8PierD | 0.960 | 0.827 | 0.846 | 0.795 | 0.902 | 0.665 | 0.645
7PierD | 0.974 | 0.900 | 0.948 | 0.801 | 0.968 | 0.711 | 0.727
6PierD | 0.934 | 0.878 | 0.878 | 0.801 | 0.996 | 0.724 | 0.656
5PierD | 1.066 | 0.942 | 0.922 | 0.830 | 1.000 | 0.732 | 0.731
4PierD | 1.018 | 0.919 | 0.987 | 0.799 | 1.027 | 0.749 | 0.664
3PierD | 0.999 | 0.882 | 0.910 | 0.795 | 0.955 | 0.732 | 0.691
2PierD | 0.975 | 0.955 | 0.978 | 0.831 | 0.996 | 0.770 | 0.675
1PierD | 1.006 | 0.886 | 0.879 | 0.807 | 0.970 | 0.717 | 0.652
RgwD | 0.744 | 0.874 | 0.974 | 0.701 | 0.811 | 0.637 | 0.695
ElesD | 0.738 | 0.918 | 0.915 | 0.727 | 0.773 | 0.669 | 0.695
6WtinD | 0.859 | 0.943 | 0.936 | 0.815 | 0.887 | 0.755 | 0.807
5WtinD | 0.828 | 0.996 | 1.020 | 0.870 | 0.861 | 0.758 | 0.717
4WtinD | 0.827 | 0.938 | 0.898 | 0.799 | 0.836 | 0.705 | 0.638
Table 5. Vertical displacement prediction correlation coefficients of the seven models.
Point Name | MLR | RF | GBDT | SVM | LSTM | WAM | Transformer
LG3D | 0.877 | 0.933 | 0.915 | 0.946 | 0.904 | 0.959 | 0.929
LG2D | 0.850 | 0.925 | 0.833 | 0.945 | 0.910 | 0.966 | 0.968
LG1D | 0.894 | 0.925 | 0.893 | 0.925 | 0.909 | 0.946 | 0.930
LgwD | 0.913 | 0.960 | 0.927 | 0.964 | 0.945 | 0.974 | 0.975
8PierD | 0.950 | 0.975 | 0.968 | 0.975 | 0.957 | 0.983 | 0.980
7PierD | 0.949 | 0.972 | 0.958 | 0.973 | 0.950 | 0.981 | 0.975
6PierD | 0.954 | 0.973 | 0.967 | 0.974 | 0.949 | 0.979 | 0.980
5PierD | 0.940 | 0.968 | 0.964 | 0.972 | 0.948 | 0.979 | 0.975
4PierD | 0.947 | 0.968 | 0.962 | 0.975 | 0.946 | 0.978 | 0.979
3PierD | 0.947 | 0.971 | 0.962 | 0.974 | 0.952 | 0.978 | 0.977
2PierD | 0.953 | 0.967 | 0.960 | 0.971 | 0.951 | 0.976 | 0.977
1PierD | 0.947 | 0.969 | 0.966 | 0.968 | 0.950 | 0.977 | 0.984
RgwD | 0.959 | 0.949 | 0.930 | 0.966 | 0.951 | 0.972 | 0.964
ElesD | 0.941 | 0.916 | 0.920 | 0.945 | 0.936 | 0.955 | 0.945
6WtinD | 0.939 | 0.928 | 0.927 | 0.946 | 0.935 | 0.954 | 0.950
5WtinD | 0.937 | 0.913 | 0.905 | 0.932 | 0.933 | 0.950 | 0.951
4WtinD | 0.935 | 0.916 | 0.922 | 0.939 | 0.933 | 0.953 | 0.960
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Deng, X.; Zhu, X.; Tang, Z. A Comparative Study on Modeling Methods for Deformation Prediction of Concrete Dams. Modelling 2025, 6, 154. https://doi.org/10.3390/modelling6040154

