Prediction of Dam Deformation Using SSA-LSTM Model Based on Empirical Mode Decomposition Method and Wavelet Threshold Noise Reduction

Abstract: The deformation monitoring data of concrete dams contain high-frequency components that are strongly nonlinear, which reduces the accuracy of dam deformation prediction. To address this problem, this paper proposes a concrete dam deformation monitoring model based on empirical mode decomposition (EMD) combined with wavelet threshold noise reduction and a long short-term memory (LSTM) network optimized by the sparrow search algorithm (SSA). The model uses EMD combined with wavelet thresholding to decompose and denoise the measured deformation data. On this basis, the SSA-optimized LSTM model is used to mine the nonlinear functional relationship between the reconstructed monitoring data and the various influencing factors. An engineering example is analyzed and compared with the prediction results of the LSTM and PSO-SVM models. The results show that the mean absolute error (MAE) and root mean square error (RMSE) of the proposed model are 0.05345 and 0.06358, and its multiple correlation coefficient R² of 0.9533 is closer to 1, a better fit than the other two models. The method can effectively mine the relationships in the measured deformation data and reduce the influence of high-frequency components on dam prediction accuracy.


Introduction
China is a major country in water conservancy construction, with the number and scale of its existing dams among the world's leading. Ensuring the safe operation of dams is of great significance for protecting public life and property and maintaining regional stability, as a dam failure causes huge losses. For example, the "9.8" dam failure in Xiangfen County, Shanxi in 2008 caused 277 deaths, 4 missing persons, 33 injuries, and direct economic losses of RMB 96.19 million, while in 2018 a dam in Nakuru County, Kenya collapsed, killing 48 people. Deformation is an important monitoring quantity that reflects the comprehensive safety state of a dam. A high-precision deformation monitoring model can represent the evolution of the structural properties of a dam, quantitatively interpret the roles of the main influencing factors, predict the operating behavior of the dam, and evaluate dam performance accordingly [1-3].
Dam deformation monitoring models can be divided into statistical, deterministic, and hybrid models depending on the modeling approach [4,5]. As statistical models are easy to implement, they are widely used; with the development of computer technology, methods such as gray theory [5,6], neural network models [7-9], support vector machine (SVM) models [10-12], and random forest theory [13,14] have been widely applied in dam deformation monitoring models [15,16], greatly improving calculation speed and prediction accuracy. However, these models have certain shortcomings. The random forest model involves uncertainty in the empirical selection of parameters and runs relatively slowly, which can lead to poor classification on small datasets [17], while neural network models are prone to overfitting and local optima. SVM models are suitable for small-sample, nonlinear problems, but their prediction performance is strongly influenced by the selection of kernel parameters [18]. To make up for these shortcomings, researchers have proposed deep learning methods (such as CNN [19], RNN [20], and DBN [21]) and applied them to dam deformation monitoring. By learning the intrinsic laws and representation levels of monitoring data, such models can improve the prediction accuracy of complex nonlinear problems to a certain extent. Among them, the recurrent neural network (RNN) is one of the more commonly used deep learning methods; it is a feedback-type neural network designed around the recursive nature of sequence data. However, because of the vanishing gradient problem, an RNN cannot retain information over long time series; it has only short-term memory and cannot support long-term memory, which motivated the proposal of the long short-term memory (LSTM) network [22-24].
Long short-term memory (LSTM) networks are a derivative of RNNs that address the gradient decay problem. The LSTM network combines short-term and long-term memory through gating, which alleviates gradient decay to a certain extent. Ou Bin et al. [25] proposed a concrete dam deformation prediction model based on LSTM and verified that it achieves good prediction accuracy and iteration speed in practical engineering. Wang et al. [26] proposed a prediction model combining ALO-LSTM and a feature attention mechanism based on an existing earth dam seepage pressure prediction model, and quantitatively analyzed the degree of influence of each factor on the seepage pressure effect quantity. Yang et al. [27] proposed an attention-based LSTM concrete dam deformation prediction method with the Adam optimization algorithm to improve the learning accuracy and speed of LSTM, and verified the feasibility of the model in practical engineering. Liu et al. [28] proposed coupled long-term displacement prediction models for arch dams based on LSTM networks: principal component analysis (PCA) and moving average (MA) methods were used to reduce the dimensionality of the input variables and were combined with LSTM to realize two coupled prediction models, LSTM-PCA and LSTM-MA. Affected by the measurement accuracy of the instruments and the inherent noise of the monitoring system's information acquisition module, the monitoring data inevitably contain certain random errors that cannot be explained by environmental factors. To eliminate the interference of unfavorable factors such as inherent noise and random errors as much as possible, wavelet analysis [29,30], singular value decomposition [31], variational mode decomposition (VMD) [32,33], and empirical mode decomposition (EMD) [34,35] have been applied to the noise reduction of dam deformation monitoring data.
In view of the adverse effect of noise components in the measured deformation data on modeling accuracy, this paper proposes a signal denoising method combining empirical mode decomposition (EMD) and the wavelet threshold method to decompose and reconstruct the data, so that the non-stationary monitoring data are stabilized and the influence of noise in the measured deformation is reduced. In addition, to avoid local optima in deformation prediction, the sparrow search algorithm (SSA) is used to optimize the parameters of the long short-term memory (LSTM) network, and a concrete dam deformation prediction model based on SSA-optimized LSTM is constructed. The model is applied to an engineering case, and its prediction accuracy and calculation speed are analyzed and compared.

Selection of Statistical Models for Dam Deformation Prediction
Dam deformation is the vector sum of the plastic and elastic displacements of the concrete dam and bedrock under load. For the factor selection of the water pressure component δ_H: the water pressure load p_c changes nonlinearly and has a curvilinear relationship with the water depth H, from which it can be deduced that δ_H is linearly related to H, H², H³, and H⁴:

$\delta_H = \sum_{i=1}^{4} a_i H^i$

where a_i is a structural coefficient and H is the water depth in front of the dam.
According to analysis of dam deformation monitoring data, temperature is one of the main factors affecting an arch dam. After years of normal operation of a concrete dam, the hydration heat of the poured concrete has fully dissipated and the temperature inside the dam has reached a quasi-stable temperature field; at this time, the dam temperature is influenced only by the boundary temperature. Assuming that water temperature and air temperature vary harmonically and that deformation is linearly related to concrete temperature, multi-period harmonics are chosen as the temperature factor:

$\delta_T = \sum_{i=1}^{m_3} \left( b_{1i} \sin\frac{2\pi i t}{365} + b_{2i} \cos\frac{2\pi i t}{365} \right)$

where m_3 is 1 or 2, i is the annual cycle number, t is the cumulative number of monitoring days, and b_{1i}, b_{2i} are coefficients. The time-dependent component δ_θ generally follows a fixed mathematical law when the arch dam is in normal operation. For concrete arch dams, the main factor influencing the time-dependent displacement can be considered the viscous flow of the dam concrete and the foundation rock. A linear combination of θ and ln θ describes well the time-dependent displacement caused by the rheological properties of the arch dam materials:

$\delta_\theta = c_1 \theta + c_2 \ln\theta$

In summary, the time-varying forecast model for arch dam deformation can be expressed as:

$\delta = \delta_H + \delta_T + \delta_\theta$
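For illustration, the water pressure, temperature, and time-effect factors above can be assembled into a design matrix. The sketch below is a minimal numpy version; the differenced forms relative to a reference epoch (H0, t0, theta0) follow the factor set used later in the model analysis, and the function name is our own:

```python
import numpy as np

def hst_factors(H, t, theta, H0, t0, theta0):
    """Assemble the ten HST influence factors described above.

    H, t, theta : arrays of water depth, day count within the year,
                  and cumulative monitoring days
    H0, t0, theta0 : values at the chosen reference epoch
    Returns an (n_samples, 10) design matrix.
    """
    w = 2 * np.pi * t / 365.0
    w0 = 2 * np.pi * t0 / 365.0
    cols = [
        # water pressure component: powers of (H - H0)
        (H - H0), (H - H0) ** 2, (H - H0) ** 3, (H - H0) ** 4,
        # temperature component: annual and semi-annual harmonics
        np.sin(w) - np.sin(w0), np.cos(w) - np.cos(w0),
        np.sin(2 * w) - np.sin(2 * w0), np.cos(2 * w) - np.cos(2 * w0),
        # time-dependent component: theta and ln(theta)
        (theta - theta0), (np.log(theta) - np.log(theta0)),
    ]
    return np.column_stack(cols)
```

Each column is then normalized before being fed to the prediction model, as described in the model construction section.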

Empirical Mode Decomposition
The empirical mode decomposition (EMD) method is a noise reduction method for nonlinear and non-stationary data [36]. It does not require manual parameter setting: the original data can be quickly decomposed into modal components ordered from high to low frequency, namely intrinsic mode functions (IMFs) and a residual (Res) [37]. The specific decomposition and reconstruction steps are as follows. For a given data sequence x(t), identify the extreme points in the data, and fit cubic spline interpolation functions through the maxima and minima to form the upper and lower envelopes. Calculate the mean of the upper and lower envelopes, m_1(t), and subtract it from x(t) to obtain a new time series h_1(t):

$h_1(t) = x(t) - m_1(t)$

In practice it is difficult to obtain the exact upper and lower envelopes, so the spline interpolation must be refitted as new extreme points appear during the process. The screening (sifting) cycle is therefore repeated until a stopping criterion is reached: take h_1(t) as the new x(t) and repeat the above operation k times to obtain h_{1k}(t):

$h_{1k}(t) = h_{1(k-1)}(t) - m_{1k}(t)$

where h_{1(k−1)}(t) is the screening result after k − 1 iterations. At this point h_{1k}(t) is taken as the first IMF component, denoted c_1(t):

$c_1(t) = h_{1k}(t)$

Subtract c_1(t) from the given data x(t) to obtain the residual r_1(t):

$r_1(t) = x(t) - c_1(t)$

Repeat the above steps until the residual is smaller than a preset error or becomes monotonic, at which point the EMD decomposition ends:

$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$

So that the frequency and amplitude of the decomposed components have practical significance and faithfully reflect the volatility characteristics of the original sequence, the screening stops when the standard deviation (SD) of adjacent screening results falls below 0.3. The SD criterion is:

$SD = \sum_{t=0}^{T} \frac{\left| h_{1(k-1)}(t) - h_{1k}(t) \right|^{2}}{h_{1(k-1)}^{2}(t)}$
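The sifting loop described above can be sketched as follows. This is a minimal illustration that extracts only the first IMF using cubic-spline envelopes, with a fixed number of sifts instead of the SD criterion and without the boundary corrections a production EMD implementation would need:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_first_imf(x, n_sifts=10):
    """One round of EMD sifting: extract the first (highest-frequency) IMF.

    Illustrative sketch only; real EMD implementations add boundary
    handling and the SD-based stopping criterion described above.
    """
    t = np.arange(len(x))
    h = x.copy()
    for _ in range(n_sifts):
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 4 or len(minima) < 4:
            break                                   # too few extrema to fit splines
        upper = CubicSpline(maxima, h[maxima])(t)   # upper envelope
        lower = CubicSpline(minima, h[minima])(t)   # lower envelope
        h = h - (upper + lower) / 2.0               # subtract envelope mean m(t)
    imf = h
    residual = x - imf                              # r_1(t) = x(t) - c_1(t)
    return imf, residual
```

A full decomposition repeats this on the residual until it becomes monotonic, yielding the ordered IMF set used below.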

Wavelet Threshold Noise Reduction
The basic principle of wavelet threshold denoising is as follows: after the signal is transformed by a wavelet, corresponding wavelet coefficients are generated; an appropriate threshold is then selected, the wavelet coefficients larger than the threshold are kept, and those smaller than the threshold are removed [38]. The basic steps are as follows: (1) Select a wavelet and a number of decomposition layers N, and decompose the signal.
(2) Threshold the decomposition coefficients of each layer. (3) Reconstruct the signal from the thresholded wavelet coefficients. The selection of key parameters affects the noise reduction result; appropriate parameters achieve a good noise reduction effect. The main influencing factors are as follows. Decomposition level: when wavelet decomposition is performed on the original data, the higher the number of decomposition layers, the better the noise reduction, but the more likely the signal is to be distorted. The best result here is achieved with 3 layers.
Basic wavelet function: for real signals, the choice of basic wavelet usually considers factors such as support length, symmetry, vanishing moments, smoothness, and similarity. For one-dimensional signals such as audio, Daubechies (db) wavelets and Symlet wavelets are usually selected.
Threshold: threshold selection is particularly important for wavelet threshold noise reduction. Common selection methods include the unbiased risk estimation (SURE) threshold, the minimax threshold, the fixed (universal) threshold, and the heuristic threshold. This paper uses the fixed threshold:

$\lambda = \sigma \sqrt{2 \ln N}$

where σ is the noise standard deviation and N is the signal length. At present, the commonly used threshold functions are mainly the hard and soft threshold functions, each with its own advantages and disadvantages. Although the hard threshold function preserves the edge characteristics of the signal well, it causes a certain degree of distortion during signal processing, whereas the signal denoised with the soft threshold function is much smoother. Therefore, the soft threshold function is used in this paper.
Soft threshold function:

$\hat{w}_{thr} = \begin{cases} \operatorname{sgn}(w_{thr})\left(|w_{thr}| - \lambda\right), & |w_{thr}| \geq \lambda \\ 0, & |w_{thr}| < \lambda \end{cases}$

where w_{thr} is the wavelet coefficient after the wavelet transform; λ is the threshold value; and ŵ_{thr} is the wavelet coefficient after threshold denoising.
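A minimal numerical sketch of the fixed threshold and the soft threshold function; estimating σ from the median absolute deviation of the finest-scale detail coefficients is a common convention assumed here, not stated in the text:

```python
import numpy as np

def universal_threshold(detail_coeffs):
    """Fixed (universal) threshold lambda = sigma * sqrt(2 ln N), with the
    noise level sigma estimated from the median absolute deviation of the
    finest-scale detail coefficients (an assumed, common convention)."""
    sigma = np.median(np.abs(detail_coeffs)) / 0.6745
    return sigma * np.sqrt(2.0 * np.log(len(detail_coeffs)))

def soft_threshold(w, lam):
    """Soft threshold function: zero coefficients below lam in magnitude
    and shrink the rest toward zero by lam."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)
```

In the combined method below, these operations would be applied to the high-frequency IMF components rather than to the raw signal.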
Considering that the EMD method and the wavelet threshold method each have their own advantages and disadvantages in data preprocessing, and that the two methods complement each other, wavelet thresholding can be applied to the high-frequency components produced by EMD, which effectively reduces signal distortion. A noise reduction method based on EMD combined with wavelet thresholding is therefore constructed on the above principles.

Sparrow Search Algorithm
Xue [39] proposed a new intelligent optimization algorithm, the sparrow search algorithm (SSA), in 2020. The algorithm is mainly inspired by the foraging behavior of sparrows and features strong optimum-seeking ability and fast convergence. SSA avoids falling into local optima by constructing a corresponding fitness function, under which the roles and positions of individual sparrows change dynamically. In the simulation, virtual sparrows are used for food hunting, and the population X of n sparrows can be expressed as:

$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}$

where d is the dimension of the search space, n is the number of sparrows, and x is an individual sparrow.
The fitness values of all sparrows can be expressed as:

$F_X = \begin{bmatrix} f([x_{1,1}, x_{1,2}, \cdots, x_{1,d}]) \\ f([x_{2,1}, x_{2,2}, \cdots, x_{2,d}]) \\ \vdots \\ f([x_{n,1}, x_{n,2}, \cdots, x_{n,d}]) \end{bmatrix}$

where F_X is the fitness matrix and f is the fitness value. Sparrows with better fitness values (discoverers) preferentially guide the foraging direction and range of the group, since discoverers have a larger foraging range.
During each iteration of the sparrow search algorithm, the position update of the discoverer is described as follows:

$x_{i,j}^{t+1} = \begin{cases} x_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST \\ x_{i,j}^{t} + Q \cdot L, & R_2 \geq ST \end{cases}$

where t is the current iteration; iter_max is the maximum number of iterations; x^t_{i,j} is the value of the j-th dimension of the i-th sparrow at iteration t; α ∈ (0, 1] is a random number; R_2 ∈ [0, 1] is the alarm value; ST ∈ [0.5, 1] is the safety threshold; Q is a random number obeying the normal distribution; and L is a 1 × d matrix of ones.
When R_2 < ST, there is no danger nearby and the discoverer can continue to search for food. When R_2 ≥ ST, danger is present and the discoverer must quickly move to cover to ensure safety.
However, for a sparrow population, once a discoverer finds a quality food source, the joiners will recognize it and fly near it to compete for food during foraging; at the same time, some joiners always watch the discoverers, ready to fight for food. The position update of the joiner is therefore:

$x_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{x_{worst}^{t} - x_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\ x_{P}^{t+1} + \left| x_{i,j}^{t} - x_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases}$

where x_P is the optimal position occupied by the discoverer (producer); x_worst is the current worst position; and A is a 1 × d matrix whose elements are randomly assigned 1 or −1, with A⁺ = Aᵀ(AAᵀ)⁻¹. When i > n/2, the i-th joiner has failed to obtain food and needs to fly elsewhere to forage again.
Finally, assuming that 10-20% of the sparrows are aware of danger, with their initial positions generated randomly, the position update rule is:

$x_{i,j}^{t+1} = \begin{cases} x_{best}^{t} + \beta \cdot \left| x_{i,j}^{t} - x_{best}^{t} \right|, & f_i > f_g \\ x_{i,j}^{t} + k \cdot \left( \dfrac{\left| x_{i,j}^{t} - x_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon} \right), & f_i = f_g \end{cases}$

where β is the step control parameter, a normally distributed random number with mean 0 and variance 1; f_i is the current sparrow's fitness value; f_g and f_w are the current global best and worst fitness values; x_best is the global best position; k ∈ [−1, 1] is a random number indicating the direction of movement; and ε is a small constant that avoids division by zero. When f_i > f_g, the sparrow is at the edge of the population and more exposed to danger; when f_i = f_g, the sparrow is aware of the danger and moves closer to other sparrows for safety.
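The discoverer and joiner update rules above can be combined into a compact optimizer sketch. The version below simplifies some branches (boundary handling and the separate danger-aware scout group are omitted) and its parameter names are our own, so it illustrates the mechanics rather than reproducing the full algorithm:

```python
import numpy as np

def sparrow_search(f, dim, n=30, iters=100, pd_frac=0.2, st=0.8, seed=0):
    """Simplified sketch of the sparrow search algorithm (SSA), minimising f."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5.0, 5.0, (n, dim))          # initial positions
    fit = np.array([f(x) for x in X])
    best, best_f = X[fit.argmin()].copy(), fit.min()
    n_prod = max(1, int(pd_frac * n))             # number of discoverers
    for _ in range(iters):
        order = np.argsort(fit)                   # best sparrows first
        X, fit = X[order], fit[order]
        worst = X[-1].copy()
        for i in range(n_prod):                   # discoverer update
            if rng.random() < st:                 # alarm value R2 < ST: explore
                X[i] = X[i] * np.exp(-(i + 1) / (rng.random() * iters + 1e-9))
            else:                                 # danger: Gaussian jump
                X[i] = X[i] + rng.normal(size=dim)
        for i in range(n_prod, n):                # joiner update
            if i > n // 2:                        # failed to grab food
                X[i] = rng.normal() * np.exp((worst - X[i]) / (i + 1) ** 2)
            else:                                 # forage near the best producer
                X[i] = X[0] + np.abs(X[i] - X[0]) * rng.choice([-1.0, 1.0], dim)
        fit = np.array([f(x) for x in X])
        if fit.min() < best_f:                    # track the global best
            best, best_f = X[fit.argmin()].copy(), fit.min()
    return best, best_f
```

In the model below the objective f would be the RMSE of an LSTM trained with the candidate hyperparameters.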

LSTM Neural Networks
LSTM is a special form of the traditional recurrent neural network (RNN), proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1997 to address the vanishing gradient problem of RNNs. Compared with the RNN, which has only one hidden state h, LSTM adds a cell state c to the original RNN structure. To preserve short-term input over the long term, LSTM stores and updates the cell state through "gates": the input gate, forget gate, and output gate. Long-term retention of information is realized through the control of these three gates, as shown in Figures 1 and 2.
The forget gate controls the proportion of the cell state from the previous moment that is retained at the current moment. It is obtained through the sigmoid activation function:

$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$

where W_f is the weight matrix of the forget gate; b_f is its bias term; σ is the sigmoid activation function; x_t is the current input; and h_{t−1} is the hidden output at the previous moment.
The input gate controls the proportion of the network input saved to the cell state at the current moment:

$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$

$\tilde{c}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right)$

where W_i is the weight matrix of the input gate sigmoid layer; b_i is its bias term; W_c is the weight matrix of the input gate tanh layer; and b_c is its bias term. The expression for the updated cell state is:

$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$

The output gate extracts valid information from the current cell state for use in the new hidden layer. The mathematical expression of the output gate is:

$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$

where W_o is the weight matrix of the output gate and b_o is its bias term. The final output of the LSTM is:

$h_t = o_t \odot \tanh(c_t)$
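The gate equations above amount to the following single-step cell update; the stacked weight layout is an assumed packing convention chosen here for compactness, not something the paper specifies:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step implementing the gate equations above.

    W: (4*hidden, input+hidden) stacked weights in the (assumed) order
       [forget; input; candidate; output]; b: (4*hidden,) stacked biases.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b       # all four gates at once
    f = sigmoid(z[:hidden])                       # forget gate f_t
    i = sigmoid(z[hidden:2 * hidden])             # input gate i_t
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell state
    o = sigmoid(z[3 * hidden:])                   # output gate o_t
    c = f * c_prev + i * c_tilde                  # cell state update
    h = o * np.tanh(c)                            # hidden output h_t
    return h, c
```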

Construction of the Model
Affected by the measurement accuracy of the instruments and the inherent noise of the monitoring system's information acquisition module, the monitoring data inevitably contain certain random errors that cannot be explained by environmental factors, which degrades the prediction accuracy of a dam deformation monitoring model. It is therefore necessary to preprocess the data before prediction. In this paper, the combined noise reduction approach based on EMD with wavelet thresholding is used to decompose and reconstruct the original data, and the sparrow search algorithm (SSA) is used to optimize the LSTM model; on this basis, a concrete dam deformation monitoring model combining EMD with wavelet threshold noise reduction and SSA-optimized LSTM is constructed. The specific process is shown in Figure 3:
Step 1: The monitored raw data are decomposed by EMD, and the resulting IMF components are ordered from high to low frequency. Wavelet threshold noise reduction is applied to the high-frequency IMF components, which are then reconstructed together with the low-frequency IMF components to obtain the denoised data;
Step 2: Initialize and normalize the denoised data. Determine the following parameters: the length of the LSTM time window, the number of hidden layer cells, the sparrow population size, and the number of iterations. Then initialize the safety threshold and sparrow positions;
Step 3: Use the root mean square error between the LSTM predictions and the sample data to determine the fitness value of each sparrow;
Step 4: Update the sparrow positions, obtain new fitness values, and search for the optimal position of the population and the global optimal value;
Step 5: Iterate and check whether the maximum number of iterations has been reached. If so, stop the iteration, take the optimal individual solution, and determine the optimal parameters of the LSTM; if not, repeat the loop;
Step 6: Substitute the obtained LSTM parameters into the training network to make predictions.
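The normalization in Step 2 can be sketched as a reversible min-max transform (a standard choice; the paper does not specify the exact scaling, so this is an assumption):

```python
import numpy as np

def minmax_normalize(x):
    """Min-max normalisation to [0, 1]; returns the scaled data plus the
    parameters needed to invert the transform for the final predictions."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), lo, hi

def minmax_restore(y, lo, hi):
    """Map normalised values back to the original physical units."""
    return y * (hi - lo) + lo
```

The restore step matters in Step 6, where predictions must be mapped back to millimetres of displacement before computing errors against the measured data.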

Factsheet
A hydropower station located in southwestern Yunnan Province, China, is considered. The dam is a concrete double-curvature arch dam with a crest elevation of 1245 m, a maximum dam height of 292 m, and a crest length of 992.74 m; the crown cantilever is 13 m wide at the top and 69.49 m wide at the base. Considering the low frequency of manual observation, measurement points were selected from the automatic measurement points for comparative analysis. Among them, measurement point C4-A22-PL-05 is highly reliable, with little missing or jumping data and few instrument failures, and can provide longer and more accurate deformation monitoring data. This point on the dam body is selected as the observation point, and the corresponding upstream and downstream water levels and temperatures are shown in Figures 4-6 below.


Data Noise Reduction Based on EMD Combined with Wavelet Threshold
Wavelet threshold noise reduction alone, although it produces smoother results, still blurs the extreme points, while the traditional EMD method removes the high-frequency components entirely, and reconstructing from only the low-frequency components distorts the signal to a certain degree. Therefore, this paper proposes a noise reduction method based on EMD combined with wavelet thresholding. First, the original monitoring data of the dam are decomposed by EMD; the high-frequency IMF components obtained from the decomposition are denoised by wavelet thresholding and then reconstructed together with the low-frequency components of the EMD decomposition. This method effectively avoids the signal distortion caused, to some extent, by the traditional EMD method. Different methods are used to denoise the data; the analysis results and the IMF components are shown in Figure 7 below.
It can be seen from Figures 7 and 8 that after the dam deformation signal is processed by the EMD-plus-wavelet-threshold noise reduction method, it is decomposed into five components, arranged from top to bottom by frequency. With EMD combined with wavelet threshold noise reduction, the burrs near the extreme points are significantly reduced, the data at the peaks and valleys are smoother, and the signal amplitude at the peaks and valleys is well preserved. Compared with wavelet threshold noise reduction alone, the denoising effect is very good and the basic shape of the original dam deformation data is well maintained.

Model Analysis
There are many factors affecting the deformation of an arch dam, such as the time effect, the water pressure, and the temperature; the choice of model inputs is therefore important. The model input consists of 10 vectors. The time-effect factors are a combination of linear θ and ln θ, taken as (θ − θ_0) and (ln θ − ln θ_0). The temperature factors are taken as $\sin\frac{2\pi t}{365} - \sin\frac{2\pi t_0}{365}$, $\cos\frac{2\pi t}{365} - \cos\frac{2\pi t_0}{365}$, $\sin\frac{4\pi t}{365} - \sin\frac{4\pi t_0}{365}$, and $\cos\frac{4\pi t}{365} - \cos\frac{4\pi t_0}{365}$. Since the dam is a concrete hyperbolic arch dam, the water pressure factors are taken as (H − H_0), (H − H_0)², (H − H_0)³, and (H − H_0)⁴. After selecting the influence factors and the effect quantity, the influence factors and the effect quantity are normalized to eliminate the influence of dimensional and magnitude differences between factors. The input and output layers of the LSTM model in this paper have 1 and 9 nodes, respectively; the two LSTM hidden layers have 100 and 50 units, respectively; and the maximum number of iterations is set to 50. The training and test sets are then selected: the training set allows the model to fully learn the deformation law of the dam, and the trained prediction model is used to predict the deformation of the prediction set.
In this paper, the monitoring data of measurement point C4-A22-PL-05 from 2 February 2014 to 16 June 2015 were used for the parameter search optimization of SSA-LSTM. The SSA algorithm optimizes two LSTM network parameters, the number of hidden neurons and the learning rate, taking the root mean square error between the monitored data and the predicted data as the fitness function. The sparrow population was set to 10, the number of iterations to 50, and the search dimension to 4; the number of neurons m was searched in (1, 100) and the learning rate in (0.0001, 0.01). After SSA optimization, the numbers of hidden neurons of the two layers and the learning rate were (69, 25, 0.00745). The dam deformation data of measuring point C4-A22-PL-05 from 17 June 2015 to 13 April 2016 are used as the training set, and the monitoring data from 13 April 2016 to November 2016 are used as the test set. To verify the validity of the model, LSTM and PSO-SVM models were also constructed for comparison. The prediction results of the three models are shown in Table 1, and the fitness curve is shown in Figure 9.
The prediction curves and model residuals are shown in Figures 10 and 11 below. The correlation results in Table 1 show that the multiple correlation coefficient R² of the SSA-LSTM model is 0.9533 at this measurement point, that of the LSTM model is 0.9036, and that of the PSO-SVM model is 0.885. Thus, for the selected measurement point C4-A22-PL-05, SSA-LSTM > LSTM > PSO-SVM, with the SSA-LSTM model achieving the best fit to the measured data.
Root mean square error (RMSE): the SSA-LSTM model has a value of 0.06358 at this measuring point, the LSTM model 0.0913, and the PSO-SVM model 0.1293. Thus, for measuring point C4-A22-PL-05, SSA-LSTM < LSTM < PSO-SVM, and the RMSE of the SSA-LSTM model is the smallest: it is reduced by 2.7% compared with the LSTM model and by 6.5% compared with the PSO-SVM model.
Mean absolute error (MAE): the value of the SSA-LSTM model at this measuring point is 0.05345, that of the LSTM model is 0.07611, and that of the PSO-SVM model is 0.09564. It can be seen that for the selected C4-A22-PL-05 measuring point, SSA-LSTM < LSTM < PSO-SVM, and the mean absolute error of the SSA-LSTM model is the smallest: it is reduced by 1.9% compared with the LSTM model and by 4.2% compared with the PSO-SVM model.
As can be seen from Figure 10, the prediction models constructed by the three algorithms, SSA-LSTM, LSTM, and PSO-SVM, are generally consistent with the actual displacement change process. In comparison, the prediction results of the SSA-LSTM algorithm are closer to the measured deformation than those of the LSTM and PSO-SVM prediction models. It can be seen from Figure 11 that the residuals of the SSA-LSTM model show no obvious pattern and their range of variation is significantly smaller, whereas the residuals of the other models increase over time. Figures 10 and 11 together show that the SSA-LSTM model fits better than the other two models and its residuals fluctuate less, indicating that the model can more accurately represent the complex nonlinear function relationship between the influencing factors and the dam deformation.
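The three evaluation metrics compared above (MAE, RMSE, and R2) can be reproduced directly from any prediction series. The helper below is a generic sketch; the function and variable names are ours, not from the paper:

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (MAE, RMSE, R2) for a measured series and its prediction."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    resid = y_true - y_pred
    mae = float(np.mean(np.abs(resid)))               # mean absolute error
    rmse = float(np.sqrt(np.mean(resid ** 2)))        # root mean square error
    ss_res = float(np.sum(resid ** 2))                # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot                        # coefficient of determination
    return mae, rmse, r2

mae, rmse, r2 = evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 4.0])
# mae = 0.25, rmse = 0.5, r2 = 0.8
```

A model whose R2 is closer to 1 and whose MAE/RMSE are smaller, as reported for SSA-LSTM in Table 1, fits the measured deformation more closely.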

1. This paper proposes a noise reduction method based on EMD combined with wavelet thresholding. The EMD method is used to decompose the original monitoring data of the dam, and wavelet threshold noise reduction is applied to the decomposed high-frequency IMF components. The denoised high-frequency IMF components are then combined with the low-frequency IMF components obtained by decomposition for reconstruction. A prediction model is constructed from the denoised data, which improves the prediction accuracy of the SSA-LSTM model.

2. This paper uses the sparrow search algorithm to optimize the long short-term memory network (LSTM), exploiting the good stability, convergence speed, scalability, and robustness of the sparrow search algorithm for network training and parameter optimization of the LSTM. The global optimal position and fitness values are updated iteratively, and the number of hidden layer nodes and the learning rate of the LSTM model are optimized so as to effectively mine the complex functional relationship between dam deformation and its influencing factors.

3. The model is compared with the LSTM and PSO-SVM deformation prediction models through an engineering example. Compared with the other two models, the multiple correlation coefficient R2 of the SSA-LSTM model is 0.9533, which is closer to 1 and indicates better fitting accuracy, and its mean absolute error and root mean square error are 0.05345 and 0.06358, smaller than those of the other two models. It can be seen that the prediction accuracy and convergence speed of the SSA-LSTM model are significantly improved, which provides a new method for high-precision prediction of dam deformation and is more suitable for practical engineering.
EMD combined with wavelet threshold noise reduction is used to decompose and reconstruct the original data, and the sparrow search algorithm (SSA) is used to optimize the long short-term memory network (LSTM); on this basis, a concrete dam deformation monitoring model based on EMD combined with wavelet threshold noise reduction and SSA-optimized LSTM was constructed. The specific process is shown in Figure 3 below:

Figure 3. Concrete arch dam prediction process based on EMD combined with wavelet threshold noise reduction coupled with SSA-LSTM.


Figure 8. IMF components after noise reduction based on EMD combined with wavelet thresholds. (a) IMF1 components after noise reduction based on EMD combined with wavelet thresholds; (b) IMF2 components after noise reduction based on EMD combined with wavelet thresholds; (c) IMF3 components after noise reduction based on EMD combined with wavelet thresholds; (d) IMF4 components after noise reduction based on EMD combined with wavelet thresholds; (e) IMF5 components after noise reduction based on EMD combined with wavelet thresholds.
The deformation δ of the dam body can be decomposed into the water pressure component δH, the temperature component δT, and the time-dependent component δθ: δ = δH + δT + δθ.
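For reference, a common form of each component in this hydrostatic-seasonal-time type of statistical model is sketched below; the exact factor set and polynomial orders used in the paper may differ:

```latex
\delta = \delta_H + \delta_T + \delta_\theta, \qquad
\delta_H = \sum_{i=1}^{m} a_i H^i, \qquad
\delta_T = \sum_{j=1}^{2}\left(b_{1j}\sin\frac{2\pi j t}{365}
          + b_{2j}\cos\frac{2\pi j t}{365}\right), \qquad
\delta_\theta = c_1\theta + c_2\ln\theta
```

Here $H$ is the upstream water depth, $t$ is the cumulative number of days, $\theta$ is a scaled time variable, and $a_i$, $b_{1j}$, $b_{2j}$, $c_1$, $c_2$ are fitted coefficients; these factor terms are the inputs from which the LSTM learns the nonlinear deformation relationship.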

Table 1. Prediction results of each prediction model.