1. Introduction
Dams are among the most important infrastructure in water conservancy and hydropower projects and play an active role in flood control, irrigation, shipping, and power generation. However, while dams bring great benefits, they also pose safety risks: a dam failure can have serious social and economic consequences downstream, causing massive loss of life and property [1,2]. To reduce the damage caused by dam failure, dam safety monitoring has been carried out in various countries. Uplift pressure is one of the key quantities in the seepage monitoring of concrete dams and plays an important role in reflecting the stability and durability of the dam [3,4]. Therefore, combining historical uplift pressure-monitoring data with intelligent algorithms to establish a practical and effective concrete dam safety monitoring model can improve the accuracy of dam hazard forecasts.
A dam safety monitoring model is a mathematical model that describes how the monitored effect quantities of a dam change. Many studies have been conducted on dam safety monitoring models; however, most of them focus on displacement monitoring, and far fewer theoretical results are devoted to uplift pressure-monitoring models. In addition, the monitoring models currently used in engineering practice suffer from poor adaptability and from prediction accuracy that is insufficient for intelligent monitoring requirements. Take the traditional statistical model as an example: although its calculation is simple, it has difficulty capturing the nonlinear relationship between the effect quantity and complex influencing factors, which results in poor extrapolation and low forecast accuracy [5]. In recent years, the study of dam safety monitoring models has been enriched by the development of artificial intelligence theory and the wide application of intelligent algorithms in data analysis and mining [6].
Support vector machines (SVMs) perform well on high-dimensional nonlinear problems [7], so they have been introduced into dam safety monitoring research. Rankovic V et al. used an SVM for deformation prediction in the safety monitoring of concrete dams, and the engineering examples showed that the SVM achieved high prediction accuracy [8]. SVM performance depends on its hyperparameters, so intelligent optimization algorithms, such as particle swarm optimization (PSO) and artificial bee colony optimization (ABC), are used to search for the best values. Huaizhi Su et al. developed a PSO-SVM model for dam deformation prediction, and the results showed that parameter optimization by PSO can improve model accuracy and shorten the iteration time [9]. Junrong Zhang et al. established an ABC-SVM model to predict landslide displacements by optimizing the SVM model with the ABC algorithm; the results showed that the SVM model performs very well in short-term prediction, but in long-term prediction its accuracy decreases as the prediction horizon grows [10].
Neural networks have been introduced into dam safety monitoring research because of their powerful nonlinear representation of multiple features. Hai-Feng Liu et al. applied the backpropagation (BP) neural network to a dam safety monitoring model, and the results showed high prediction accuracy and stable prediction performance [11]. Neural networks have proven to be excellent at handling large-scale data samples; however, gradient vanishing and gradient explosion can occur as the size of the data samples increases. In addition, because of its structure, the traditional neural network cannot learn from data with time series characteristics, so the resulting model is not sufficiently adaptable or accurate. The recurrent neural network (RNN) can process time series data but is prone to gradient vanishing and gradient explosion. The long short-term memory (LSTM) network not only takes full account of the temporal correlation in the data but also alleviates the gradient vanishing problem of the RNN to a certain extent. The LSTM model has therefore been used for deformation safety monitoring of concrete dams [12] and tailings dams [13], and the engineering applications showed that it has higher prediction accuracy and is closer to the measured data. However, the LSTM model has many parameters to train, and overfitting occurs when the amount of data is insufficient [14]. The gated recurrent unit (GRU) model mitigates this shortcoming by merging the forget gate and input gate of the LSTM into a single update gate, thereby reducing the number of parameters [15]. The GRU model has also shown better performance than the LSTM model in engineering applications [16].
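The parameter saving can be made concrete with a short calculation. The sketch below counts trainable parameters for a single LSTM layer and a single GRU layer in their textbook formulations, with one bias vector per gate. The function names and layer sizes are hypothetical and are not taken from this study, and some library implementations add extra bias terms, so the exact counts may differ in practice.

```python
def gate_block_params(input_dim: int, hidden_dim: int) -> int:
    """Parameters of one gate-like block: input weights, recurrent weights, bias."""
    return hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim


def lstm_params(input_dim: int, hidden_dim: int) -> int:
    # LSTM blocks: forget gate, input gate, output gate, candidate cell state.
    return 4 * gate_block_params(input_dim, hidden_dim)


def gru_params(input_dim: int, hidden_dim: int) -> int:
    # GRU blocks: update gate, reset gate, candidate hidden state.
    return 3 * gate_block_params(input_dim, hidden_dim)


if __name__ == "__main__":
    n_inputs, n_hidden = 9, 64  # hypothetical sizes, not taken from the paper
    print("LSTM:", lstm_params(n_inputs, n_hidden))
    print("GRU: ", gru_params(n_inputs, n_hidden))
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, because it uses three weight blocks instead of four.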
In summary, there is a wealth of theoretical research on monitoring models based on intelligent algorithms, but the vast majority of dam safety monitoring research concerns displacement prediction, with much less devoted to uplift pressure, so prediction models for uplift pressure urgently need further study. In addition, the adaptability and accuracy of the intelligent algorithm models studied so far still have shortcomings, such as overfitting and underfitting, which keep them far from real intelligent application scenarios. In this study, a CNN-GRU dynamic prediction model was developed for uplift pressure-monitoring data with large-scale samples and time-series features. Considering the inherent generalization limitations of a single model, this study combines the feature extraction ability of the convolutional neural network (CNN) with the long-term memory structure of the GRU to automatically extract hidden features and long-term temporal dependencies from historical dam monitoring data, which enhances the stability of model performance. In addition, using the GRU model instead of the LSTM model reduces the risk of overfitting caused by insufficient data volume, which would degrade prediction accuracy [17]. Finally, the performance of the CNN-GRU uplift pressure model is verified by engineering examples.
3. Results
3.1. Denoising of Uplift Pressure-Monitoring Data
Uplift pressure-monitoring data contain noise caused by aging electronic components, sensor distortion, signal channel disturbance, human factors, and other sudden abnormal factors. Noise mixed into the real data reduces the accuracy and stability of the prediction model. Therefore, the original data must be denoised to ensure their validity. Commonly used denoising methods include the Kalman filter, Wiener filter, wavelet transform, and empirical mode decomposition. However, the Wiener and Kalman filters are not effective enough for nonstationary signals, the wavelet transform requires the basis function to be chosen in advance, and empirical mode decomposition is prone to spurious components and mode mixing. VMD overcomes the mode mixing and spurious component problems of these traditional methods and has been widely used for signal noise reduction [21,22,23,24].
Therefore, VMD-SE was used to denoise the uplift pressure-monitoring data, taking the UP17 measurement point as an example. The nonlinear, nonstationary historical uplift pressure series was first decomposed by variational mode decomposition (VMD) into six intrinsic mode functions (IMFs) with gentle frequency changes and relative stability. The VMD decomposition of the uplift pressure data at the UP17 measurement point is presented in Figure 6. Then, the noisy sequences were identified by calculating the sample entropy (SE) of each IMF [25,26]. In this paper, the SE threshold was set to 0.5, i.e., sequences with SE values greater than 0.5 are regarded as noisy. The SE values of the IMF components are presented in Table 3; IMF5 is a noisy sequence. Finally, the remaining IMF components were reconstructed to form the denoised data samples. A comparison of the denoised data samples with the original uplift pressure samples is shown in Figure 7.
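The SE screening and reconstruction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding dimension m = 2 and the tolerance of 0.2 standard deviations are common defaults assumed here, the function names are hypothetical, and the three synthetic components merely stand in for IMFs that a real VMD routine would produce. Only the 0.5 threshold comes from the text.

```python
import math
import random


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy with embedding dimension m and tolerance r_factor * std."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    tol = r_factor * std

    def count_matches(length):
        # The same n - m template start points are used for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: treat the series as maximally irregular
    return -math.log(a / b)


def denoise(imfs, threshold=0.5):
    """Drop components whose SE exceeds the threshold and sum the rest."""
    kept = [imf for imf in imfs if sample_entropy(imf) <= threshold]
    reconstructed = [sum(values) for values in zip(*kept)]
    return reconstructed, kept


if __name__ == "__main__":
    rng = random.Random(0)
    n = 300
    # Hypothetical stand-ins for IMFs, not real monitoring data.
    trend = [0.01 * i for i in range(n)]
    seasonal = [math.sin(2 * math.pi * i / 150) for i in range(n)]
    noise = [rng.gauss(0.0, 0.05) for _ in range(n)]
    denoised, kept = denoise([trend, seasonal, noise])
    print("components kept:", len(kept), "of 3")
```

The pairwise comparison is quadratic in the series length, which is acceptable for a sketch but would likely need a vectorized implementation for long monitoring records.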
3.2. Impact Factor and Input Data Set Analysis
In the case of a stable dam, the change in the uplift pressure of a concrete dam is mainly affected by the upstream and downstream water levels, rainfall, temperature, and time effects [27]. The impact factors selected for this study include the following:
Water pressure component (HU, (HU)^2, HU(2–3), HU(4–7), HU(8–15), HU(16–30), HU(31–60), HD), where HU denotes the upstream water level on the current monitoring date, HU(q–r) denotes the average upstream water level from q to r days before the current monitoring date, and HD is the downstream water level on the current monitoring date; temperature component (T0–1, T2–7, T8–15, T16–30, T31–60, T61–120), where Tq–r denotes the average temperature from q to r days before the current monitoring date; rainfall component (R0–1, R2–3, R4–7, R8–15, R16–30, R31–60), where Rq–r denotes the cumulative rainfall from q to r days before the current monitoring date; time effects (θ, ln θ), where θ = t/100 and t is the cumulative number of monitoring days between the observation date and the reference date.
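The construction of these candidate factors can be sketched in a few lines. The example assumes one observation per day stored in plain lists, reads the second water pressure term as the square of HU, and uses hypothetical function and key names; none of these details are confirmed by the source.

```python
import math

WATER_LAGS = [(2, 3), (4, 7), (8, 15), (16, 30), (31, 60)]
TEMP_LAGS = [(0, 1), (2, 7), (8, 15), (16, 30), (31, 60), (61, 120)]
RAIN_LAGS = [(0, 1), (2, 3), (4, 7), (8, 15), (16, 30), (31, 60)]


def lag_window(series, idx, q, r):
    """Values observed from q to r days before position idx (both inclusive)."""
    if idx - r < 0:
        raise ValueError("not enough history for this lag window")
    return series[idx - r: idx - q + 1]


def lag_mean(series, idx, q, r):
    window = lag_window(series, idx, q, r)
    return sum(window) / len(window)


def lag_sum(series, idx, q, r):
    return sum(lag_window(series, idx, q, r))


def build_factors(h_up, h_down, temp, rain, idx, t_days):
    """Candidate factors for the observation at position idx (daily data assumed)."""
    factors = {"HU": h_up[idx], "HU^2": h_up[idx] ** 2, "HD": h_down[idx]}
    for q, r in WATER_LAGS:
        factors[f"HU({q}-{r})"] = lag_mean(h_up, idx, q, r)   # average level
    for q, r in TEMP_LAGS:
        factors[f"T{q}-{r}"] = lag_mean(temp, idx, q, r)      # average temperature
    for q, r in RAIN_LAGS:
        factors[f"R{q}-{r}"] = lag_sum(rain, idx, q, r)       # cumulative rainfall
    theta = t_days / 100.0                                    # requires t_days > 0
    factors["theta"] = theta
    factors["ln_theta"] = math.log(theta)
    return factors
```

Because the longest temperature window reaches back 120 days, the first usable observation in such a scheme is the one with at least 120 days of history before it.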
The selection of input factors strongly influences the prediction accuracy of the model. Input factors with low correlation not only increase the complexity of model iteration but also reduce the accuracy of the forecast. Therefore, it is essential to select the factors with substantial impact. Since the Pearson correlation coefficient is strongly biased when dealing with nonlinear relationships, the maximal information coefficient (MIC) is introduced in this paper to screen the influence factors and determine the final set of input factors [28]. The results of the MIC calculations are presented in Table 4. It can be seen that the time-effect factors had a strong correlation with the uplift pressure, so they were retained; the water pressure component had a certain degree of correlation with the uplift pressure, and (HU(8–15), HU(16–30), HU(31–60), HD) were selected as input factors; the temperature component had large MIC values, and (T16–30, T31–60, T61–120) were selected as input factors; the MIC values between the rainfall components and the uplift pressure were small, indicating a limited effect on the trend of uplift pressure, so the rainfall components were excluded. The final input factors are: water pressure component (HU(8–15), HU(16–30), HU(31–60), HD), temperature component (T16–30, T31–60, T61–120), and time effects (θ, ln θ).
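The limitation of the Pearson coefficient for nonlinear relationships can be seen on a toy example. The sketch below compares it with the normalized mutual information on one hand-picked grid, which is the quantity that MIC maximizes over many grids. It is not a MIC implementation: a real MIC routine searches over grid resolutions and partitions under a sample-size-dependent limit, and the function names and data here are purely illustrative.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins of (nearly) equal size by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def grid_information(x, y, nx, ny):
    """Mutual information on an nx-by-ny grid, normalized by log2(min(nx, ny))."""
    bx, by = equal_frequency_bins(x, nx), equal_frequency_bins(y, ny)
    n = len(x)
    joint, px, py = {}, {}, {}
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        px[a] = px.get(a, 0) + 1
        py[b] = py.get(b, 0) + 1
    mi = 0.0
    for (a, b), count in joint.items():
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi / math.log2(min(nx, ny))


if __name__ == "__main__":
    x = [-4, -3, -2, -1, 1, 2, 3, 4]   # toy data, symmetric around zero
    y = [v * v for v in x]             # y is fully determined by x
    print("Pearson:", pearson(x, y))
    print("Grid information (4 x 2):", grid_information(x, y, 4, 2))
```

For this toy data the Pearson coefficient is zero even though y is a deterministic function of x, while the grid-based score reaches its maximum of 1.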
In this study, model reliability was validated using the uplift pressure-monitoring data and the above environmental quantities from 4 July 2010 to 26 July 2016 as the input dataset. The samples from 4 July 2010 to 11 May 2016 were used as the training set, and the samples from 12 May 2016 to 26 July 2016 were used as the test set to evaluate model performance.
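A chronological split of this kind can be sketched as below. The dates are those stated above; the record layout, the helper names, and the assumption of one observation per day are hypothetical, since the monitoring frequency is not given here.

```python
from datetime import date, timedelta

TRAIN_START = date(2010, 7, 4)
TRAIN_END = date(2016, 5, 11)
TEST_START = date(2016, 5, 12)
TEST_END = date(2016, 7, 26)


def chronological_split(records):
    """Split (date, value) records into training and test sets by date."""
    train = [rec for rec in records if TRAIN_START <= rec[0] <= TRAIN_END]
    test = [rec for rec in records if TEST_START <= rec[0] <= TEST_END]
    return train, test


def daily_records(start, end):
    """Hypothetical placeholder records, one per day, with a dummy value."""
    n_days = (end - start).days + 1
    return [(start + timedelta(days=i), 0.0) for i in range(n_days)]


if __name__ == "__main__":
    records = daily_records(TRAIN_START, TEST_END)
    train, test = chronological_split(records)
    print(len(train), "training samples,", len(test), "test samples")
```

Splitting by date rather than at random keeps every test sample later than every training sample, so the test set measures extrapolation in time.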
3.3. Model Prediction
The denoised dataset was fed into the model, which was trained on the training set and then evaluated on the test set. The iterative loss curves of the model during training and prediction are shown in Figure 8, and the prediction results for the test set are shown in Figure 9.
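For orientation, the data flow through a CNN-GRU predictor can be sketched as an untrained forward pass in plain Python. This is an illustration under stated assumptions, not the architecture used in this study: it uses a single convolution layer, a single GRU layer, and a linear output, and all sizes, names, and random weights are hypothetical. Training (loss computation and weight updates) is omitted. GRU conventions differ between references in which of z and 1 - z multiplies the previous state; the one used here is noted in the code.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(mat, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def conv1d_relu(window, kernels, biases):
    """Valid 1-D convolution over time followed by ReLU.

    window: list of time steps, each a list of input factors.
    kernels[f][k][c]: weight of filter f at kernel offset k for factor c.
    """
    k_size = len(kernels[0])
    out = []
    for t in range(len(window) - k_size + 1):
        step = []
        for kern, bias in zip(kernels, biases):
            acc = bias
            for k in range(k_size):
                acc += sum(w * v for w, v in zip(kern[k], window[t + k]))
            step.append(max(0.0, acc))
        out.append(step)
    return out


def gru_step(x, h, p):
    """One GRU update with update gate z and reset gate r."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c) for a, b, c in
            zip(matvec(p["Wh"], x), matvec(p["Uh"], rh), p["bh"])]
    # Convention assumed here: z weights the new candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def cnn_gru_forward(window, params):
    """Convolution features -> GRU over time -> linear output (one prediction)."""
    feats = conv1d_relu(window, params["kernels"], params["conv_bias"])
    h = [0.0] * len(params["gru"]["bz"])
    for x in feats:
        h = gru_step(x, h, params["gru"])
    return sum(w * hi for w, hi in zip(params["w_out"], h)) + params["b_out"]


def random_params(n_factors, n_filters, k_size, n_hidden, seed=0):
    """Small random weights for demonstration only (no training is performed)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    gru = {}
    for gate in "zrh":
        gru["W" + gate] = mat(n_hidden, n_filters)
        gru["U" + gate] = mat(n_hidden, n_hidden)
        gru["b" + gate] = [0.0] * n_hidden
    return {
        "kernels": [mat(k_size, n_factors) for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "gru": gru,
        "w_out": [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


if __name__ == "__main__":
    params = random_params(n_factors=9, n_filters=4, k_size=3, n_hidden=5)
    window = [[0.1 * t] * 9 for t in range(10)]  # 10 time steps, 9 factors each
    print(cnn_gru_forward(window, params))
```

In practice such a model would be built and trained with a deep learning framework; the sketch only shows how a window of input factors is reduced to convolutional features, passed through the recurrent state, and mapped to one predicted value.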