Multi-Scale Response Analysis and Displacement Prediction of Landslides Using Deep Learning with JTFA: A Case Study in the Three Gorges Reservoir, China

: Reservoir water and rainfall, leading to ﬂuctuations groundwater levels, are the main triggering factors that induce landslides in the Three Gorges Reservoir area. This study investigates the response mechanism of landslide deformation under reservoir water and rainfall variations through long-time on-site observations. To address the non-stationary characteristics of the time-series records, joint time-frequency analysis (JTFA) is ﬁrst introduced into our landslide prediction model. This model employs optimal variational mode decomposition (VMD) to obtain speciﬁc signal components with clear physical meaning, such as trend component and periodic components. Then, multi-scale response analysis between the displacement and external factors three wavelet methods was conducted. The analysis results show a 1 year primary cycle of the time series associated with the landslide evolution. The reservoir water level and rainfall show anti-phase ﬂuctuations. The periodic displacement correlates signiﬁcantly with rainfall, lagging by about two months. The reservoir water is anti-phase with the landslide displacement, preceding it by approximately three months ( − 51 ± 8 ◦ phase difference). For landslide displacement prediction, the gated recurrent units (GRU) neural network model is integrated into the deep learning forecasting architecture. The model takes into account the correlation and hysteresis effect of input variables. Through six experiments, we investigate the effect of data volume on model predictions to determine the optimal model. The results demonstrate that our proposed model ensures high performance in landslide prediction. Moreover, a comparison with six other intelligent algorithms shows the advantages of our model in terms of time-effectiveness and long-sequence forecasting.


Introduction
Since the Three Gorges Reservoir (TGR) was put into operation in June 2003, the stability of slopes along the TGR bank has changed dramatically in response to the periodic fluctuation of the rainfall intensity and the reservoir storage [1][2][3], resulting in thousands of reactivated landslides.The geomechanical and hydrologic characteristics of these landslides are continually affected by the consistent water levels and rainfall changes, leading to catastrophic events [4].For example, the July 2003 Qianjiangping landslide developed during the initial reservoir impoundment, causing 24 fatalities and significant economic losses [5].Since then, research on long-term monitoring and early warning of these landslides has steadily expanded, leading to a deeper understanding of the inherent patterns and the response mechanism of reservoir landslides.
Studies have revealed that the evolution of the reservoir landslide involves a complex, non-stationary and nonlinear dynamic process [6,7].Consequently, the long-term observations collected on the landslide's behavior exhibit non-stationary time-varying characteristics with specific trends and/or seasonality.Although the deformation evolution and response analysis of reservoir landslides in TGR have been extensively studied and discussed in the time domain [8][9][10], their characteristics in the frequency domain are relatively unexplored, concealing valuable spectral information.Furthermore, non-stationary time series show time-frequency characteristics [6,11,12], making time-frequency representation of the signal essential for identifying the dominant fluctuation modes/spectra of different signals and understanding how those modes vary over time.
The empirical mode decomposition (EMD), developed by Huang et al. [13], is wellknown for its time-frequency analysis, which is suitable for processing non-stationary time series.This method decomposes the data into different band-limited intrinsic mode functions (IMFs), but is sensitive to noise and sampling [14].The variational modal decomposition (VMD), a more recent signal decomposition technique, enables fully intrinsic and adaptive decomposition [14].In addition, Isham et al. also pointed out the importance of determining the number of VMD modes to obtain specific signal components with explicitly physical meaning [15].To achieve efficient decomposition, the Shannon entropy, reflecting the uniformity of probability distribution, can be used as a fitness function to derive optimal parameters through an optimization method, such as the harmony search (HS) algorithm [16] and the wolf optimization algorithm (GWO) [17] employed in this study.GWO is easily implemented and exhibits advantages in strong convergence with few parameters.
Moreover, it is crucial to explore the response mechanisms of landslide deformation [18] to unveil the changing patterns of specific signal components at different scales.In 1998, Torrence et al. proposed wavelet analysis (WA) to study non-stationary time series in signal processing [19].Presently, wavelet decomposition and denoising are the most popular WA methods in time series analysis widely used in Geophysics [19], Geography [20] and Meteorology [21].However, WA methods, such as continuous wavelet transform (CWT), cross-wavelet transform (XWT) and wavelet coherency (WTC), are more suitable for analyzing non-stationary time series at different scales [7,[22][23][24].Unfortunately, they are rarely used to analyze long-term non-stationary observations associated with landslide behavior.As a result, this study will extend WA methods and optimized VMD into a deep learning framework to conduct a joint time-frequency analysis (JTFA) before establishing a landslide forecasting model.
So far, various approaches have been developed for landslide hazards research, ranging from physics-based to statistics-based models [9,25,26].Over the past decades, the integration of machine learning (ML) methods into landslide research has led to significant advancements [2,8,9].In recent years, deep-learning approaches, like recurrent neural networks (RNN) and their variants, designed to address the gradient explosion problem, have shown impressive success in predicting landslide displacements based on GNSS time series data [4,10,27,28].Among these variants, the gated recurrent unit (GRU) proposed by Chung et al. [29] stands out, surpassing other RNN variants, such as the LSTM, in terms of training time, parameter update and generalization ability.Its outstanding performance extends to diverse fields, including economics [30], flood control [31] and disaster reduction [10], among others.
This paper presents a deep learning neural network model for predicting landslide displacement, incorporating a joint time-frequency analysis (JTFA) module.The JTFA employs optimal variational mode decomposition (VMD) to obtain specific signal components with clear physical meaning (e.g., trend component and periodic component).Subsequently, multi-scale response analysis between the displacement and external factors is carried out by wavelet transform (CWT), cross-wavelet transform (XWT) and wavelet coherency (WTC).The enhanced GRU model based on deep learning architecture is employed to forecast landslide displacement (Figure 1e-g).Additionally, the model considers the correlation and hysteresis effect of input variables, such as impact factors, during the modeling process.
Remote Sens. 2023, 15, x FOR PEER REVIEW 3 of 22 carried out by wavelet transform (CWT), cross-wavelet transform (XWT) and wavelet coherency (WTC).The enhanced GRU model based on deep learning architecture is employed to forecast landslide displacement (Figure 1e-g).Additionally, the model considers the correlation and hysteresis effect of input variables, such as impact factors, during the modeling process.

Methodology
As can be seen in Figure 1, the forecasting model consists of seven parts: Figure 1  First, the on-site measured landslide displacements and the pre-selected impact factor sequences are decomposed into specific IMF components by an optimal VMD (Figure 1a,b).The number and parameters of the VMD modes with explicit physical meanings are determined by employing the Shannon entropy as a fitness function and the GWO optimization method.Then, a multi-scale response analysis between the displacement and external factors is carried out by three wavelet analysis methods to investigate landslide deformation response mechanisms and to reveal the changing patterns of specific signal components at different scales (Figure 1c).Finally, the enhanced GRU and DES model based on deep learning architecture is used to predict the landslide displacement (Figure 1e-g).

Joint Time-Frequency Analysis (JTFA)
The collected long-term observations associated with the landslide's behavior are non-stationary time series with specific trends or/and seasonality.The landslide displacement is a time-series function controlled by geological conditions, reflecting long-term deformation trends.While subject to external influences, the landslide displacement behaves as a periodic variation with near-white noise associated with random factors.Therefore, the landslide displacement time series can be expressed as follows: where Yt is the landslide displacement at time t; while Tt, Pt and Rt are the trend, periodic and random terms, respectively.While studies on the deformation characteristics of landslides have been extensively explored in the time domain, the significance of time-frequency analysis of the signal

Methodology
As can be seen in Figure 1, the forecasting model consists of seven parts: Figure 1  First, the on-site measured landslide displacements and the pre-selected impact factor sequences are decomposed into specific IMF components by an optimal VMD (Figure 1a,b).The number and parameters of the VMD modes with explicit physical meanings are determined by employing the Shannon entropy as a fitness function and the GWO optimization method.Then, a multi-scale response analysis between the displacement and external factors is carried out by three wavelet analysis methods to investigate landslide deformation response mechanisms and to reveal the changing patterns of specific signal components at different scales (Figure 1c).Finally, the enhanced GRU and DES model based on deep learning architecture is used to predict the landslide displacement (Figure 1e-g).

Joint Time-Frequency Analysis (JTFA)
The collected long-term observations associated with the landslide's behavior are nonstationary time series with specific trends or/and seasonality.The landslide displacement is a time-series function controlled by geological conditions, reflecting long-term deformation trends.While subject to external influences, the landslide displacement behaves as a periodic variation with near-white noise associated with random factors.Therefore, the landslide displacement time series can be expressed as follows: where Y t is the landslide displacement at time t; while T t , P t and R t are the trend, periodic and random terms, respectively.
While studies on the deformation characteristics of landslides have been extensively explored in the time domain, the significance of time-frequency analysis of the signal should not be overlooked.Time-frequency analysis reveals the time-frequency features, and it provides access to the implicit spectral information of displacement Y t with different spectra, enabling a time-frequency perspective of information mining.Having a time-frequency representation is indispensable to investigate the dominant fluctuation spectra of deformation signals and understand how those spectra vary over time.This involves one-dimensional time series converting into a diffuse two-dimensional time-frequency image, allowing for a comprehensive examination of the signal's temporal and frequency properties.

Grey Wolf Optimized Variational Mode Decomposition (GWO-VMD)
The VMD enables fully intrinsic and adaptive signal decomposition [14], by which the input displacement signal could be divided into several intrinsic mode functions (IMFs) with specific characteristics determined in advance.To ensure that (1) the central frequency bandwidth of each IMF is limited, (2) the sum of the estimated bandwidths is minimized, and (3) the sum of each IMF equals the original signal f so that the constrained variational equation can be expressed as follows: where u k is the kth decomposed IMF with a center frequency w k ; and δ(t) and * indicate the Dirac function and the convolution operator, respectively.
Then, the Lagrange multipliers λ and decomposition parameter ε are introduced to transform the constrained problem into an unconstrained variational one, yielding the augmented Lagrange expression: Here, the alternating direction multiplier method (ADMM) is used to search the saddle points of the unconstrained model to obtain the optimal IMFs and the center frequency of the constrained model.This process is defined as a function f VMD in the Algorithm 1.
The VMD is used to decompose complex digital signals, and the empirical decomposition parameter ε is usually K > 3.This value will not be physically meaningful when applied to the landslide displacement decomposition.Thus, obtaining suitable VMD modes is crucial to getting specific signal components with explicitly physical meaning.At this point, the Shannon entropy (Equation ( 4)) is introduced to obtain the optimal IMFs with sparse features: the larger the value is, the better the uniformity of the probability distribution is [32].
As for the landslide's periodic displacement, the probability distribution is relatively uniform and sparse.Thus, the value of ε is optimized using the grey wolf optimizer (GWO) with H(u k ) as the fitness function [26] to obtain the periodic displacement with a uniform probability distribution and strong sparsity along with the trend and random terms.
The landslide displacement Y t is used as input.The population size P of the grey wolf, the upper bound X up and lower bound X low of the individual grey wolves are initialized.Then, the algorithm iterates and updates N times to obtain the optimal value of ε (see the Algorithm 1).In the Algorithm 1, both A and C are coefficient vectors with A = 2a • r 1 − a and C = 2 • r 2 .r 1 and r 2 are random values between 0 and 1.The convergence factor, a = 2 − 2n/N, decreases linearly from 2 to 0 during the iteration.

Wavelet Analysis (WA)
Torrence and Compo [22] proposed WA to study non-stationary time series in signal processing.The most popular WA methods in time series analysis include wavelet decomposition and denoising.Additionally, there are other WA methods, such as the continuous wavelet transform (CWT), the cross-wavelet transform (XWT) and the wavelet coherency (WTC), which excel at non-stationary time series analysis at various scales.However, despite their effectiveness, these WA methods are still rarely used in non-stationary time series analysis associated with the behavior of landslides.
For a discrete time-series x n (n = 1, . .., N) with equal time spacing δt, the CWT is used for mapping the changing properties of non-stationary signals.CWT is defined as the convolution of x n with a scaled and translated version of the wavelet basis function: where * indicates the complex conjugate, s represents the wavelet scale, and n is the localized time index.The non-orthogonal Morlet wavelets are chosen as the basis function: , where η and w 0 are non-dimensional parameters of time and frequency, respectively.When w 0 = 6, the wavelet scale s equals the Fourier period.The wavelet basis functions ψ 0 are then normalized to have unit energy, and the CWT is conducted in the Fourier space to improve efficiency.In Equation (12), W n (s) is a complex value, and thus, |W n (s)| 2 denotes the wavelet power spectrum that clearly discriminates the periodic fluctuation and intensity in the time series.As the input data is assumed to be cyclic, edge effects appear in the power spectrum when dealing with finite-length time series.Hence, the cone of influence (COI) is defined, representing the corresponding edge effects within the region of the power spectrum.In addition, a Fourier power spectrum is constructed by Monte Carlo to represent the red noise background spectrum and later performs a χ 2 test with a specific confidence level, e.g., 95%, to calculate each scale and construct the confidence contour with a 95% confidence level.
The XWT of two time-series x n and y n is defined as W XY = W X W Y *, which measures the similarity between two waveforms.The arg(W XY ) is the relative local phase between x n and y n in the time-frequency space.The mean and confidence interval must be estimated to analyze the phase difference.The phase relation is quantified using the periodic mean of phases over a region with a significance level of 5% outside the COI.The periodic mean of a set of phase angles is defined as follows: Because the phase angles are not independent, it is difficult to calculate the confidence interval of the mean angle.Thus, the solution is obtained by calculating the scatter around the mean using the circular standard deviation er = −2 ln The power spectrum of XWT only shows the high common power part, while the WTC measures the coherence of the XWT in the time-frequency space.The WTC of two-time series is defined as follows: where S is the smooth operator, and S(W) = S scale (S time (W n (S))).S scale and S time indicate smoothing along the wavelet scale axis and over time, separately.

Deep Learning Forecasting Model 2.2.1. Gated Recurrent Unit (GRU)
The gated recurrent unit (GRU) is a variant of recurrent neural networks (RNN), which outperforms other RNN variants, e.g., the LSTM [29], in terms of training time, and has fewer parameters.The GRU enables each recurrent unit to capture dependencies adaptively at different time scales.As a refinement of the general RNN structure, GRU is more straightforward, owing to the absence of a separate storage unit (see Figure 2) regulating the internal information flow via a gated unit.Moreover, it is more efficient in training time, parameter update and generalization ability than other RNNs [29].
sent the red noise background spectrum and later performs a χ 2 test with a specific confidence level, e.g., 95%, to calculate each scale and construct the confidence contour with a 95% confidence level.
The XWT of two time-series xn and yn is defined as W XY = W X W Y *, which measures the similarity between two waveforms.The arg(W XY ) is the relative local phase between xn and yn in the time-frequency space.The mean and confidence interval must be estimated to analyze the phase difference.The phase relation is quantified using the periodic mean of phases over a region with a significance level of 5% outside the COI.The periodic mean of a set of phase angles is defined as follows: ( ) ( ) ( ) arg , with = cos and sin Because the phase angles are not independent, it is difficult to calculate the confidence interval of the mean angle.Thus, the solution is obtained by calculating the scatter around the mean using the circular standard deviation ( ) The power spectrum of XWT only shows the high common power part, while the WTC measures the coherence of the XWT in the time-frequency space.The WTC of twotime series is defined as follows: where S is the smooth operator, and S(W) = S scale (S time (W n (S))).S scale and S time indicate smoothing along the wavelet scale axis and over time, separately.

Gated Recurrent Unit (GRU)
The gated recurrent unit (GRU) is a variant of recurrent neural networks (RNN), which outperforms other RNN variants, e.g., the LSTM [29], in terms of training time, and has fewer parameters.The GRU enables each recurrent unit to capture dependencies adaptively at different time scales.As a refinement of the general RNN structure, GRU is more straightforward, owing to the absence of a separate storage unit (see Figure 2) regulating the internal information flow via a gated unit.Moreover, it is more efficient in training time, parameter update and generalization ability than other RNNs [29].The update gate zt is responsible for determining how much of the past information is to be neglected, while the reset gate rt is responsible for deciding how much past The update gate z t is responsible for determining how much of the past information is to be neglected, while the reset gate r t is responsible for deciding how much past knowledge needs to be passed along into the future.Assume that the current input at t is x t , then one forward calculation of GRU is as follows: where σ indicates the sigmoid function, and W and b are the weight and bias parameters, respectively.
The hidden state at the present moment h t ht of the network is determined by the update gate, the candidate state of the previous time step h t−1 and the current moment h t .The output of the loop ŷt can be calculated by During the training process, the mean-square error (MSE) is used to measure the errors, and the loss function is defined as T, which is optimized by the adaptive moment estimation (Adam) [33] to obtain the minimization loss, and update W and b iteratively.

Double Exponential Smoothing (DES)
Double exponential smoothing (DES) is a weighted moving average method suitable for predicting time series with certain trends.This method operates by adopting a weighted combination of the past observations and recent observations, with relatively higher weights assigned to the more recent data points.The slope component is updated through exponential smoothing [34], making it well-suited for time-sensitive landslide displacement prediction.Thus, DES is employed to predict the trend component of the landslide displacement.
Given a displacement time series with a specific trend x t , t = 1, . .., N. s t represents the exponential smoothing value, b t represents the optimal trend estimate.The model's output ŷt+m is an estimate of x t+m at t + m (where m ≥ 1) and can be written as: where The ζ and ξ are smoothing factors for the data and the trend, respectively, that range from 0 to 1 and are usually set as ζ = 0.98 and ξ = 0.99.

Evaluation Indicators
Two evaluation indicators, namely the coefficient of determination (R 2 ) and the root mean square error (RMSE), are introduced to evaluate the performance of the deep learning architecture forecasting mode.These indicators are widely used in deep learning regression tasks and are defined as follows: where y t and ŷt are the observations and the predicted value at time t, respectively, and y represents the mean of the N observed measures.

Pre-Processing of the Collected Data Sequence
Baishuihe landslide is a typical loose mound landslide in the TGR, about 56 km west of the Three Gorges Dam in Zigui County, Hubei Province (see Figure 3a).The landslide slope is about 30 • with an average thickness of about 30 m, and the volume is about 1260 × 10 4 m 3 .The sliding surface is a contact zone between residual slope deposit and bedrock that is about 0.9-3.1 m thick.Its bedrock lithology is a medium-thick sandstone layer with a thin mudstone layer, yielding 15 • ∠ 36 • , with joints and fissures developed in the rock layer.The slide material consists mainly of residual Quaternary slope deposits of gravelly soil with a gravel content of 20% to 40% [35].
layer.The slide material consists mainly of residual Quaternary slope deposits of grave soil with a gravel content of 20% to 40% [35].
As shown in Figure 3b, 11 GNSS monitoring stations (with a plane accuracy of 5 ppm) were deployed on the Baishuihe landslide; six stations, namely the ZG93, ZG1 XD-01, XD-02, XD-03 and XD-04, are located in the warning zone.The observed displa ments and the corresponding rainfall and reservoir levels are illustrated in Figure 4. T TGR is in the wet season from May to September when the reservoir begins to play regulatory role of releasing flood water in moderation per annum.At this time of year clear pattern of a step increase in the cumulative displacement appears, indicating a co bined effect of the changing reservoir water levels and rainfall on the landslide evolutio However, monitoring stations in the non-warning area give a relatively small displa ment variation with an average annual amount of around 35 mm.In the warning zon the ZG93 station is located on profile III.As shown in Figure 3c, the mid-slip zone ru the entire slope length; the leading edge of the landslide has been submerged in water years, making the stability of the slope within this area vulnerable to the reservoir water lev  As shown in Figure 3b, 11 GNSS monitoring stations (with a plane accuracy of 5 ± 1 ppm) were deployed on the Baishuihe landslide; six stations, namely the ZG93, ZG118, XD-01, XD-02, XD-03 and XD-04, are located in the warning zone.The observed displacements and the corresponding rainfall and reservoir levels are illustrated in Figure 4.The TGR is in the wet season from May to September when the reservoir begins to play a regulatory role of releasing flood water in moderation per annum.At this time of year, a clear pattern of a step increase in the cumulative displacement appears, indicating a combined effect of the changing reservoir water levels and rainfall on the landslide evolution.However, monitoring stations in the non-warning area give a relatively small displacement variation with an average annual amount of around 35 mm.In the warning zone, the ZG93 station is located on profile III.As shown in Figure 3c, the mid-slip zone runs the entire slope length; the leading edge of the landslide has been submerged in water for years, making the stability of the slope within this area vulnerable to the reservoir water level.In the subsequent analysis, the site monitoring data collected by ZG93 is selecte the data source of displacement.Other synchronous on-site data include the monthly ervoir water level data and the local rainfall data in the TGR between July 2003 and cember 2012.As shown in Figure 4, the time-series displacement of the landslide sh specific trends, seasonality and noise-induced fluctuations, indicating the need to ob IFMs based on VMD.Setting K = 3 in VMD allow us to acquire three IMFs in an increa order of frequency.The influence factors, e.g., the reservoir level and rainfall, mainly tribute to the periodic terms, showing cyclical fluctuations.At this point, the Shan entropy is introduced as the fitness function to obtain the optimal IMFs by searchin the optimum ε using the GWO.The population size of the GWO is set to 20, the num of iterations to 100, and the upper-lower bounds to [0.01,100].The optimal ε of ZG93 determined to be 0.55; it is then utilized in the VMD to obtain the trend, the periodic the random components of the displacement (Figure 5).

Pre-Selection of the Impact Factors
Surface water infiltration is an essential factor affecting slope stability, especially con ing the matrix suction of the unsaturated zone.When the slope experiences rainfall infiltra the bulk density increases, causing the matrix suction to decrease, which in turn raise sliding forces and reduces the shear strength [36].Moreover, rainfall infiltration into frac enhances the split effect and raises the groundwater table, resulting in an increase in por ter pressure [37], thus affecting the stability of the landslide.As shown in Figure 6, the e of rainfall on landslide displacement has an evident hysteresis.For example, the wet se in the TGR area typically begins in May, but the response of landslide deformation doe In the subsequent analysis, the site monitoring data collected by ZG93 is selected as the data source of displacement.Other synchronous on-site data include the monthly reservoir water level data and the local rainfall data in the TGR between July 2003 and December 2012.As shown in Figure 4, the time-series displacement of the landslide shows specific trends, seasonality and noise-induced fluctuations, indicating the need to obtain IFMs based on VMD.Setting K = 3 in VMD allow us to acquire three IMFs in an increasing order of frequency.The influence factors, e.g., the reservoir level and rainfall, mainly contribute to the periodic terms, showing cyclical fluctuations.At this point, the Shannon entropy is introduced as the fitness function to obtain the optimal IMFs by searching for the optimum ε using the GWO.The population size of the GWO is set to 20, the number of iterations to 100, and the upper-lower bounds to [0.01,100].The optimal ε of ZG93 was determined to be 0.55; it is then utilized in the VMD to obtain the trend, the periodic and the random components of the displacement (Figure 5).In the subsequent analysis, the site monitoring data collected by ZG93 is selected as the data source of displacement.Other synchronous on-site data include the monthly reservoir water level data and the local rainfall data in the TGR between July 2003 and December 2012.As shown in Figure 4, the time-series displacement of the landslide shows specific trends, seasonality and noise-induced fluctuations, indicating the need to obtain IFMs based on VMD.Setting K = 3 in VMD allow us to acquire three IMFs in an increasing order of frequency.The influence factors, e.g., the reservoir level and rainfall, mainly contribute to the periodic terms, showing cyclical fluctuations.At this point, the Shannon entropy is introduced as the fitness function to obtain the optimal IMFs by searching for the optimum ε using the GWO.The population size of the GWO is set to 20, the number of iterations to 100, and the upper-lower bounds to [0.01,100].The optimal ε of ZG93 was determined to be 0.55; it is then utilized in the VMD to obtain the trend, the periodic and the random components of the displacement (Figure 5).

Pre-Selection of the Impact Factors
Surface water infiltration is an essential factor affecting slope stability, especially concerning the matrix suction of the unsaturated zone.When the slope experiences rainfall infiltration, the bulk density increases, causing the matrix suction to decrease, which in turn raises the sliding forces and reduces the shear strength [36].Moreover, rainfall infiltration into fractures enhances the split effect and raises the groundwater table, resulting in an increase in porewater pressure [37], thus affecting the stability of the landslide.As shown in Figure 6, the effect of rainfall on landslide displacement has an evident hysteresis.For example, the wet season in the TGR area typically begins in May, but the response of landslide deformation does not

Pre-Selection of the Impact Factors
Surface water infiltration is an essential factor affecting slope stability, especially concerning the matrix suction of the unsaturated zone.When the slope experiences rainfall infiltration, the bulk density increases, causing the matrix suction to decrease, which in turn raises the sliding forces and reduces the shear strength [36].Moreover, rainfall infiltration into fractures enhances the split effect and raises the groundwater table, resulting in an increase in porewater pressure [37], thus affecting the stability of the landslide.As shown in Figure 6, the effect of rainfall on landslide displacement has an evident hysteresis.For example, the wet season in the TGR area typically begins in May, but the response of landslide deformation does not manifest until July or August.This delayed response highlights the complex interactions between rainfall, infiltration and subsequent landslide movements.
manifest until July or August.This delayed response highlights the complex interactions between rainfall, infiltration and subsequent landslide movements.
Studies have shown that the periodic deformation highly correlates with the cumulative rainfall of the current month and the preceding two months [2].In this study, grey correlation analysis gives a correlation degree of 0.79 and 0.78, further confirming their close relationship with the displacement time series.Thus, the monthly rainfall V2 and the two-month cumulative rainfall V3 are selected to characterize the rainfall effects on landslide displacement, thereby avoiding transitional redundancy.The periodical variation of reservoir level alters the distribution of the seepage field and the stress state of the rock mass, directly influencing the stability of the landslide.The rapid decline in reservoir water results in a higher hydraulic gradient inside and outside the slope [38].The seepage force along the slope greatly affects the landslide's stability, especially during the late reservoir water decline when rainfall concentration occurs.The landslide deformation and failure process are further accelerated [39].As shown in Figure 7, the changes in the periodical deformation are consistent with the reservoir level fluctuation.
To characterize the reservoir level effects on landslide displacement, the monthly mean water level V1, the monthly variation V4, the monthly change rate V5 and the bimonthly variation V6 of the reservoir water level are selected.These variables exhibit a grey correlation degree of 0.78, 0.81, 0.8 and 0.76, respectively, signifying their significant relationship with the displacement of the landslide.Studies have shown that the periodic deformation highly correlates with the cumulative rainfall of the current month and the preceding two months [2].In this study, grey correlation analysis gives a correlation degree of 0.79 and 0.78, further confirming their close relationship with the displacement time series.Thus, the monthly rainfall V 2 and the two-month cumulative rainfall V 3 are selected to characterize the rainfall effects on landslide displacement, thereby avoiding transitional redundancy.
The periodical variation of reservoir level alters the distribution of the seepage field and the stress state of the rock mass, directly influencing the stability of the landslide.The rapid decline in reservoir water results in a higher hydraulic gradient inside and outside the slope [38].The seepage force along the slope greatly affects the landslide's stability, especially during the late reservoir water decline when rainfall concentration occurs.The landslide deformation and failure process are further accelerated [39].As shown in Figure 7, the changes in the periodical deformation are consistent with the reservoir level fluctuation.
manifest until July or August.This delayed response highlights the complex interactions between rainfall, infiltration and subsequent landslide movements.
Studies have shown that the periodic deformation highly correlates with the cumulative rainfall of the current month and the preceding two months [2].In this study, grey correlation analysis gives a correlation degree of 0.79 and 0.78, further confirming their close relationship with the displacement time series.Thus, the monthly rainfall V2 and the two-month cumulative rainfall V3 are selected to characterize the rainfall effects on landslide displacement, thereby avoiding transitional redundancy.The periodical variation of reservoir level alters the distribution of the seepage field and the stress state of the rock mass, directly influencing the stability of the landslide.The rapid decline in reservoir water results in a higher hydraulic gradient inside and outside the slope [38].The seepage force along the slope greatly affects the landslide's stability, especially during the late reservoir water decline when rainfall concentration occurs.The landslide deformation and failure process are further accelerated [39].As shown in Figure 7, the changes in the periodical deformation are consistent with the reservoir level fluctuation.
To characterize the reservoir level effects on landslide displacement, the monthly mean water level V1, the monthly variation V4, the monthly change rate V5 and the bimonthly variation V6 of the reservoir water level are selected.These variables exhibit a grey correlation degree of 0.78, 0.81, 0.8 and 0.76, respectively, signifying their significant relationship with the displacement of the landslide.To characterize the reservoir level effects on landslide displacement, the monthly mean water level V 1 , the monthly variation V 4 , the monthly change rate V 5 and the bimonthly variation V 6 of the reservoir water level are selected.These variables exhibit a grey correlation degree of 0.78, 0.81, 0.8 and 0.76, respectively, signifying their significant relationship with the displacement of the landslide.
During the decomposition of the influence factors, we set K = 2 because they mainly show cyclical fluctuations.The other VMD settings are the same as those used for the landslide displacement.The Shannon entropy of the periodic term of the factors serves as the fitness function to obtain the optimal ε.The six obtained values of ε are 0.23, 0.99, 0.13, 0.98, 0.11 and 0.95, respectively.The results are given in Figure 8, where subscript p indicates the periodic term and subscript s indicates the random term.
Remote Sens. 2023, 15, x FOR PEER REVIEW 1 landslide displacement.The Shannon entropy of the periodic term of the factors ser the fitness function to obtain the optimal ε.The six obtained values of ε are 0.23, 0.99 0.98, 0.11 and 0.95, respectively.The results are given in Figure 8, where subscript cates the periodic term and subscript s indicates the random term.

Multi-Scale Response Analysis
Investigating the deformation characteristics and the response analysis at dif scales is crucial to revealing the landslide mechanism.To achieve this, the JTFA is duced to conduct a multi-scale response analysis between the displacement and ex factors by three WA methods to investigate the deformation response mechanism reveal the changing patterns of specific signal components at different scales.Th methods employed in this study include the continuous wavelet transform (CWT cross-wavelet transform (XWT) and the wavelet coherency (WTC).

Analysis of CWT
The CWT extends time series analysis into a time-frequency domain to intu map the changing properties of non-stationary signals.Through CWT, we have obs an apparent yearly periodicity (1 year cycle) in the time series data associated wi Baishuihe landslide.This periodicity is evident in the landslide displacement, rain tensity and reservoir levels fluctuation, all coinciding with the primary cycle.
As critical external factors influencing landslides displacement, the variation rainfall and reservoir water level (R.w.l) also exhibits specific patterns.With a subtr monsoon climate, the precipitation in the TGR shows seasonal features characteriz a total rainfall exceeding 1000 mm per year.The rainfall is concentrated mainly in su and winter [40], which aligns with the 1 year cycle (12 months) of monsoon-relate terns obtained by the CWT analysis (see Figure 9a).
The storage of the TGR can be categorized into three stages:

Multi-Scale Response Analysis
Investigating the deformation characteristics and the response analysis at different scales is crucial to revealing the landslide mechanism.To achieve this, the JTFA is introduced to conduct a multi-scale response analysis between the displacement and external factors by three WA methods to investigate the deformation response mechanisms and reveal the changing patterns of specific signal components at different scales.The WA methods employed in this study include the continuous wavelet transform (CWT), the cross-wavelet transform (XWT) and the wavelet coherency (WTC).

Analysis of CWT
The CWT extends time series analysis into a time-frequency domain to intuitively map the changing properties of non-stationary signals.Through CWT, we have observed an apparent yearly periodicity (1 year cycle) in the time series data associated with the Baishuihe landslide.This periodicity is evident in the landslide displacement, rainfall intensity and reservoir levels fluctuation, all coinciding with the primary cycle.
As critical external factors influencing landslides displacement, the variation of the rainfall and reservoir water level (R.w.l) also exhibits specific patterns.With a subtropical monsoon climate, the precipitation in the TGR shows seasonal features characterized by a total rainfall exceeding 1000 mm per year.The rainfall is concentrated mainly in summer and winter [40], which aligns with the 1 year cycle (12 months) of monsoon-related patterns obtained by the CWT analysis (see Figure 9a).Generally, the displacement dominated by geological conditions is approximately otonic over time, indicating the long-term trend.The displacement affected by externa gering factors (e.g., rainfall and R.w.l variation) can be expressed as a proximate periodic tion, leading to different deformation features.Thus, the periodical displacement is the mal option to illustrate the multi-scale response relationship with the impact factors.
Figure 9c gives the CWT power spectrum of the displacement at ZG93, showin year cyclic period (12 months), indicating a combined effect of changing reservoir w levels and rainfall on the landslide displacement.In addition, a 2 years cyclic perio months) is observed during extreme changes in the landslide displacement.This phe enon seems related to the deformation responding to the overall rise of the R.w.l, but fu research is still needed to confirm the underlying causes.Thus, cross-analysis betwee displacement and the influence factors are performed via XWT and WTC later to further the effect of rainfall and R.w.l variation on the kinematics of the Baishuihe landslide.

Analysis of XWT and WTC
XWT is a measure of similarity between two waveforms, showing the presen high common power part.On the other hand, the WTC, which combines wavelet t form and coherence analysis, measures the coherence of the XWT in the time-frequ space.It reproduces the consistency and correlation of the sequence in different pe and scales through time-frequency analysis of the time series data [41].Thus, emplo Generally, the displacement dominated by geological conditions is approximately monotonic over time, indicating the long-term trend.The displacement affected by external triggering factors (e.g., rainfall and R.w.l variation) can be expressed as a proximate periodic function, leading to different deformation features.Thus, the periodical displacement is the optimal option to illustrate the multi-scale response relationship with the impact factors.
Figure 9c gives the CWT power spectrum of the displacement at ZG93, showing a 1 year cyclic period (12 months), indicating a combined effect of changing reservoir water levels and rainfall on the landslide displacement.In addition, a 2 years cyclic period (24 months) is observed during extreme changes in the landslide displacement.This phenomenon seems related to the deformation responding to the overall rise of the R.w.l, but further research is still needed to confirm the underlying causes.Thus, cross-analysis between the displacement and the influence factors are performed via XWT and WTC later to further study the effect of rainfall and R.w.l variation on the kinematics of the Baishuihe landslide.

Analysis of XWT and WTC
XWT is a measure of similarity between two waveforms, showing the presence of high common power part.On the other hand, the WTC, which combines wavelet trans-form and coherence analysis, measures the coherence of the XWT in the time-frequency space.It reproduces the consistency and correlation of the sequence in different periods and scales through time-frequency analysis of the time series data [41].Thus, employing XWT and WTC with dual time series will contribute to further elucidating the characteristics of landslide displacements and impact factors in terms of local coherence and detailed variation.
To explore the underlying correlations and causes in multiple time periods and scales (e.g., monthly or yearly time scales), we construct, the XWT power and the WTC coherence spectrum of the periodic displacement and the six pre-selected impact factors.The red noise standard spectrum is used to verify the reliability of the results, with a 95% confidence level obtained through the standard red noise test enclosed by the heavy solid black line.The thin solid black line enclosed area represents the cone of influence (COI) of the wavelet analysis, where edge effects are significant; thus, the response relationships will not be analyzed outside the COI.
As can be seen from Figure 10, there is a high common power area shared by the R.w.l and the rainfall time series on a time scale of 1 year (12 months) throughout the study period (2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012).The two time series are anti-phase relationship within the whole high common power area, with a mean phase difference of about −178 ± 3 • , confirming a fluctuation pattern of the R.w.l opposite to the precipitation conditions and thereby guaranteeing the safe operation of the TGR and preventing flooding disasters.XWT and WTC with dual time series will contribute to further elucidating the characteristics of landslide displacements and impact factors in terms of local coherence and detailed variation.
To explore the underlying correlations and causes in multiple time periods and scales (e.g., monthly or yearly time scales), we construct, the XWT power and the WTC coherence spectrum of the periodic displacement and the six pre-selected impact factors.The red noise standard spectrum is used to verify the reliability of the results, with a 95% confidence level obtained through the standard red noise test enclosed by the heavy solid black line.The thin solid black line enclosed area represents the cone of influence (COI) of the wavelet analysis, where edge effects are significant; thus, the response relationships will not be analyzed outside the COI.
As can be seen from Figure 10, there is a high common power area shared by the R.w.l and the rainfall time series on a time scale of 1 year (12 months) throughout the study period (2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012).The two time series are anti-phase relationship within the whole high common power area, with a mean phase difference of about −178 ± 3°, confirming a fluctuation pattern of the R.w.l opposite to the precipitation conditions and thereby guaranteeing the safe operation of the TGR and preventing flooding disasters.As shown in Figure 11, the periodic displacement and the associated impact factors have a significant power area throughout the study period.Rainfall shows a natural cyclic fluctuation pattern.Figure 11a,b shows that the periodic displacement shares a continuous significant power sector with the rainfall at a time scale of 1 year (12 months) and is highly coherent.The two time series are in-phase throughout the study period, with a phase difference of about 34 ± 12°, indicating about a two-month lag behind the displacement than the rainfall.In addition, a 2 years cyclic period (24 months) coincides with the extreme change in the landslide displacement between 2006 and 2009, showing an antiphase with the R.w.l.This phenomenon seems related to the deformation response to the rapid rise of the R.w.l; since it is the first time the water level R.w.l has reached its maximum, further research is still needed to confirm the underlying causes of this behavior.As shown in Figure 11, the periodic displacement and the associated impact factors have a significant power area throughout the study period.Rainfall shows a natural cyclic fluctuation pattern.Figure 11a,b shows that the periodic displacement shares a continuous significant power sector with the rainfall at a time scale of 1 year (12 months) and is highly coherent.The two time series are in-phase throughout the study period, with a phase difference of about 34 ± 12 • , indicating about a two-month lag behind the displacement than the rainfall.In addition, a 2 years cyclic period (24 months) coincides with the extreme change in the landslide displacement between 2006 and 2009, showing an anti-phase with the R.w.l.This phenomenon seems related to the deformation response to the rapid rise of the R.w.l; since it is the first time the water level R.w.l has reached its maximum, further research is still needed to confirm the underlying causes of this behavior.As previously stated, the fluctuation of the R.w.l shows a cyclical pattern opposite to the precipitation conditions.As shown in Figure 11c-f, the periodic displacement and the associated four R.w.l factors also have a significant power sector with a 1 year (12 months) cyclic period.However, there is a notable difference with the monthly mean water level V1 during the rapid impounding of the TGR (2006-2009), as it shows an inverse phase with the displacement.These four factors all contain a significant power sector with a 2 years (24 months) cyclic period, showing anti-phase relations, which may share a similar cause as that seen in the power spectrum of displacement and rainfall.This finding suggests As previously stated, the fluctuation of the R.w.l shows a cyclical pattern opposite to the precipitation conditions.As shown in Figure 11c-f, the periodic displacement and the associated four R.w.l factors also have a significant power sector with a 1 year (12 months) cyclic period.However, there is a notable difference with the monthly mean water level V 1 during the rapid impounding of the TGR (2006-2009), as it shows an inverse phase with the displacement.These four factors all contain a significant power sector with a 2 years (24 months) cyclic period, showing anti-phase relations, which may share a similar cause as that seen in the power spectrum of displacement and rainfall.This finding suggests that the rise of the reservoir level is closely related to the extraordinary change in landslide displacement, providing valuable insights into the influence of the reservoir water level on landslide behavior during specific periods.
As shown in Figure 11c, there is a phase difference of −51 ± 8 • between the landslide displacement and the monthly mean level of the reservoir water, indicating the displacement change occurs ahead of the change in reservoir water level by about three months.This phenomenon is likely due to the fact that rising reservoir levels contribute to landslide stability, and the displacement increase often occurs before the reservoir level rises or after it falls.In-phase relations are observed between the displacement and the monthly variation V4, the monthly change rate V5 and the bi-monthly variation V6 of the R.w.l, with a mean phase difference of −13 ± 9 • , indicating a consistent variation pattern.All pre-selected impact factors show strong coherence over a 1 year cyclic period (12 months), except during the period from 2006 to 2009.
The analysis above confirms the preselected six impact factors have a strong correlation with the variation trends of landslide displacements.The periodic variation of the rainfall intensity and the reservoir storage leads to regularly varying underground water levels and the seepage field distribution, which adversely affects the Baishuihe landslide's stability.These regular variations alter the equilibrium of the slope and cause periodic changes in landslide displacements time series.The JFTA helps illustrate the periodically changing patterns and the response between the displacement and external factors at multi-scales to reveal the landslide mechanism behind the landslide behavior.By employing JTFA, we gain valuable insights into how the landslide responds to the dynamic interplay of various factors, providing a deeper understanding of the landslide's complex behavior and contributing to the advancement of landslide research and prediction.Six training sets with different data volumes are obtained by dividing the dataset based on natural years (see Table 1).Using these datasets, six training models are constructed, and the optimal data volume for the prediction model is determined by comparing the model performance.Throughout this process, a sequential prediction strategy is implemented, considering the timeliness of monitoring GNSS displacement.

Data Packet Time Date
In the experiment, the deep learning architecture of the forecasting model in this paper is built using the Deep Learning Toolbox of Matlab2020.We empirically set the learning rate to 0.01 and the training number to 250.However, the number of hidden neurons is a crucial factor that can affect the prediction precision, and therefore, it should be determined through carefully designed comparison experiments.The results of the seven comparison experiments are shown in Table 2.According to Table 2, the model performance is proportional to its number of neurons.As the number of neurons increases, the RMSE of the model first decreases and then increases, with the optimal RMSE achieved when the number of neurons is 100.However, the computation time keeps growing as the number of neurons increases.Consequently, the number of hidden neurons is set to 100 in the following experiments.The DES is employed to predict the trend component of the landslide displacement by using a weighted combination of past observations with recent observations given relatively higher weights than the older ones.Thus, despite the data volume difference between the designed six training datasets, the predicted trend component displacement is almost identical.Here, the prediction of Model 6 is taken as the final trend result.The DES results and the absolute error are shown in Figure 12a, with R 2 = 0.998 and RMSE = 2.741 mm.better.However, it does not apply to random components; for random component fo casts, an appropriate amount of datasets works best.

Cumulative Displacement Prediction
The predicted cumulative displacements can be derived by combining the trend, t periodic and the random predicted components.The optimal prediction of each comp nent is illustrated in Figure 12.According to Figure 12, the absolute forecast error of t trend component remains small (<20 mm) and does not vary much, despite the defo mation being around 2 m.In contrast, the absolute forecast error of the periodic comp nent follows a certain regularity and shows a tendency to decrease, and that of the rando component shows randomness.
The evaluation indicators, as listed in Table 4, assess the performance of the six mo els.According to Table 4 The GRU could adaptively capture dependencies at different time scales, especially suitable for handling non-linear problems, and is thus used to forecast landslide displacements for periodic and random terms.The performance of the GRU in predicting the periodic component of the landslide displacement is shown in Table 3.According to Table 3, the GRU performance increases as the training data volume increases; thus, the optimal result is achieved with Model 6.However, the model performance of the random component displacement first increases and then decreases; the optimal results come from Model 3. The above results indicate that better periodic component prediction needs more trained data; of the six designed models, the larger the amount of the training data, the better.However, it does not apply to random components; for random component forecasts, an appropriate amount of datasets works best.

Cumulative Displacement Prediction
The predicted cumulative displacements can be derived by combining the trend, the periodic and the random predicted components.The optimal prediction of each component is illustrated in Figure 12.According to Figure 12, the absolute forecast error of the trend component remains small (<20 mm) and does not vary much, despite the deformation being around 2 m.In contrast, the absolute forecast error of the periodic component follows a certain regularity and shows a tendency to decrease, and that of the random component shows randomness.
The evaluation indicators, as listed in Table 4, assess the performance of the six models.According to Table 4, both indicators experience an increase and then a decrease as the volume of the training dataset increases.A maximum RMSE of 26.043 mm and a minimum R 2 of 0.904 comes from Model 2 with a data volume of two natural years.The turning point found at Model 4 with four years of data volume shows the best results among the six models, with R 2 of 0.952 and RMSE of 18.582 mm.According to Section 4.2.1, Model 6 is the best model for predicting the periodic component, while Model 3 gives the best performance for predicting the random part.Consequently, based on this, an optimal combination prediction model for landslide displacement is constructed.The evaluation indicators of the optimal model, shown in Table 4, outperform the other six models in terms of both evaluation indicators, with an RMSE of 12.301 mm and R 2 of 0.979.Compared with Model 4, the best forecast model among the six, the RMSE decreases by 6.281 mm and R 2 increases by 0.027 for the optimal model.

Comparative Experiments and Analyses
In Section 3.2, we conducted a pre-selection of the impact factors, and later, through the multi-scale response analysis in Section 3.3, we gained valuable insights into how the landslide responds to the dynamic interplay of the selected factors.The analysis revealed an apparent 1 year primary cycle of the time series associated with the landslide evolution.Of particular interest were the delayed response results, which highlight the complex interactions between the reservoir water level (R.w.l) and rainfall with the subsequent landslide movements.This provides a deeper understanding of the landslide's complex behavior and guides the selection of the landslide prediction model.
GRU and other recurrent neural networks (RNNs) excel at handling the non-linear dynamic characteristics present in complex time series.The GRU model, in particular, has the ability to capture dependencies at different time scales, making it well-suited for time series landslide displacement prediction.Considering the outstanding performance of GRU in time-series forecasting, we conducted comparative experiments to analyze the performance differences of the optimal model screened in Section 4.2.2, both when considering impact factors and when not considering them.
Figure 13 displays the predicted cumulative displacements and absolute errors.Notably, when the impact factors are ignored, the model exhibits more significant deviations, indicating a decline in prediction performance.The calculated RMSE and R 2 are 20.218mm and 0.942, respectively.However, when considering the impact factors, the model produces an improved performance, yielding an RMSE of 12.301 mm and R 2 of 0.979.This results in a reduction in the RMSE by 7.917 mm and an increase in R2 by 0.037, respectively.
Remote Sens. 2023, 15, x FOR PEER REVIEW 18 of 22 an apparent 1 year primary cycle of the time series associated with the landslide evolution.
Of particular interest were the delayed response results, which highlight the complex interactions between the reservoir water level (R.w.l) and rainfall with the subsequent landslide movements.This provides a deeper understanding of the landslide's complex behavior and guides the selection of the landslide prediction model.GRU and other recurrent neural networks (RNNs) excel at handling the non-linear dynamic characteristics present in complex time series.The GRU model, in particular, has the ability to capture dependencies at different time scales, making it well-suited for time series landslide displacement prediction.Considering the outstanding performance of GRU in time-series forecasting, we conducted comparative experiments to analyze the performance differences of the optimal model screened in Section 4.2.2, both when considering impact factors and when not considering them.
Figure 13 displays the predicted cumulative displacements and absolute errors.Notably, when the impact factors are ignored, the model exhibits more significant deviations, indicating a decline in prediction performance.The calculated RMSE and R 2 are 20.218mm and 0.942, respectively.However, when considering the impact factors, the model produces an improved performance, yielding an RMSE of 12.301 mm and R 2 of 0.979.This results in a reduction in the RMSE by 7.917 mm and an increase in R2 by 0.037, respectively.As a result, the impact factors selected in Section 3.2 have been proven to be indispensable for multi-scale response analysis and the prediction model training process.Despite the exceptional performance of GRU in time-series forecasting, it is evident that the influence factors of landslides displacement have a significant impact on the predictions.Hence, the analysis and selection of the impact factors during the modeling process are crucial, underscoring the comprehensiveness of the proposed model.
At this stage, the 17 reserved sets of data from August 2011 to December 2012 are utilized for model validation to further verify the model's performance.Figure 14 displays the cumulative landslide displacement prediction (17 sets) and the absolute errors of the proposed model, resulting in a calculated RMSE of 9.715 mm and R 2 of 0.967.The results demonstrate that the optimal combination prediction model exhibits a reliable capacity for predicting landslide displacement.As a result, the impact factors selected in Section 3.2 have been proven to be indispensable for multi-scale response analysis and the prediction model training process.Despite the exceptional performance of GRU in time-series forecasting, it is evident that the influence factors of landslides displacement have a significant impact on the predictions.Hence, the analysis and selection of the impact factors during the modeling process are crucial, underscoring the comprehensiveness of the proposed model.
At this stage, the 17 reserved sets of data from August 2011 to December 2012 are utilized for model validation to further verify the model's performance.Figure 14 displays the cumulative landslide displacement prediction (17 sets) and the absolute errors of the proposed model, resulting in a calculated RMSE of 9.715 mm and R 2 of 0.967.The results demonstrate that the optimal combination prediction model exhibits a reliable capacity for predicting landslide displacement.To further illustrate the superiority of the proposed method compared to existing state-of-the-art methods, such as the GWO-MIC-SVR [42], the V/S-LSTM [43], the Chaotic DWT-ELM [44] and the Multi-Chaotic ELM [45], we conduct a comparative analysis of the forecasts generated by these models.The performance of the forecast models is shown in Table 5.
According to Table 5, the proposed model achieves a competitive high-accuracy result in terms of RMSE, ranking in the top two, with an RMSE of 9.715 mm, closely following the V/S-LSTM with an RMSE of 8.950 mm.However, what makes the proposed model stands out is its ability to makes 17 sets of consecutive forecasts, while V/S-LSTM only forecasts 8 sets.
Considering that forecast errors can accumulate over time, the proposed model's RMSE of 0.765 mm for 17 sets of consecutive forecasts demonstrates its capability for precise and reliable long-term predictions.Thus, from a comprehensive perspective, the proposed model exhibits both time-effectiveness and long-sequence forecasting advantages over the other six intelligent algorithms.

Conclusions
This paper presents a novel deep learning architecture specifically designed for predicting reservoir landslide displacement.The evolution of reservoir landslides involves highly complicated and nonlinear dynamics, characterized by specific time-frequency features.To address the complexities, the joint time-frequency analysis (JTFA) module is designed.This module integrates the GWO-optimized VMD and WA methods, facilitating the extraction of essential signal components with clear physical implications.Additionally, the module conducts multi-scale response analysis, thereby revealing deformation variation patterns in the underlying mechanisms governing the landslide's response behavior.For the actual displacement prediction, The GRU is integrated, which possesses To further illustrate the superiority of the proposed method compared to existing state-of-the-art methods, such as the GWO-MIC-SVR [42], the V/S-LSTM [43], the Chaotic DWT-ELM [44] and the Multi-Chaotic ELM [45], we conduct a comparative analysis of the forecasts generated by these models.The performance of the forecast models is shown in Table 5.According to Table 5, the proposed model achieves a competitive high-accuracy result in terms of RMSE, ranking in the top two, with an RMSE of 9.715 mm, closely following the V/S-LSTM with an RMSE of 8.950 mm.However, what makes the proposed model stands out is its ability to makes 17 sets of consecutive forecasts, while V/S-LSTM only forecasts 8 sets.
Considering that forecast errors can accumulate over time, the proposed model's RMSE of 0.765 mm for 17 sets of consecutive forecasts demonstrates its capability for precise and reliable long-term predictions.Thus, from a comprehensive perspective, the proposed model exhibits both time-effectiveness and long-sequence forecasting advantages over the other six intelligent algorithms.

Conclusions
This paper presents a novel deep learning architecture specifically designed for predicting reservoir landslide displacement.The evolution of reservoir landslides involves highly complicated and nonlinear dynamics, characterized by specific time-frequency features.To address the complexities, the joint time-frequency analysis (JTFA) module is designed.This module integrates the GWO-optimized VMD and WA methods, facilitating the extraction of essential signal components with clear physical implications.Additionally, the module conducts multi-scale response analysis, thereby revealing deformation variation patterns in the underlying mechanisms governing the landslide's response behavior.For the actual displacement prediction, The GRU is integrated, which possesses the capability of adaptively capturing dependencies at multiple time scales.Moreover, during the modeling process, the correlation and hysteresis effect of the impact factors are also considered, further enhancing the accuracy and reliability of the predictions.The model experiments show that as the amount of training data increases, the periodic component prediction improves significantly.For the random component forecast, an appropriate amount of datasets yields the best results, while the trend component remains almost identical regardless of the data size.This insight led to the construction of an optimal combination prediction model, surpassing the performance of the other six designed models in cumulative landslide displacement predictions.This model achieved impressive results, with a minimum RMSE of 12.301 mm and a maximum R2 of 0.979.Moreover, the proposed architecture's superiority in timeeffectiveness and its ability to accurately predict long-sequence landslide displacement have been firmly established through comparative experiments and analyses, in which we evaluated its performance against six other state-of-the-art intelligent methods.The favorable outcomes and impressive forecasting capabilities of our proposed architecture solidify its position as an efficient and accurate tool for landslide displacement prediction.

Figure 1 .
Figure 1.Architecture of the forecasting model.
(a) signal decomposition of landslide displacement, Figure 1 (b) signal decomposition of the impacting factors, Figure 1 (c) multi-scale response analysis of landslide deformation by JTFA, Figure 1 (d) data packet preparation for the deep-learning neural network model, Figure 1 (e) trend displacement prediction by DES, Figure 1 (f) periodic and random displacement prediction by GRU, and Figure 1 (g) cumulative displacements prediction by the optimal model.

Figure 1 .
Figure 1.Architecture of the forecasting model.
(a) signal decomposition of landslide displacement, Figure 1 (b) signal decomposition of the impacting factors, Figure 1 (c) multi-scale response analysis of landslide deformation by JTFA, Figure 1 (d) data packet preparation for the deep-learning neural network model, Figure 1 (e) trend displacement prediction by DES, Figure 1 (f) periodic and random displacement prediction by GRU, and Figure 1 (g) cumulative displacements prediction by the optimal model.

Figure 2 .
Figure 2. The network structure of GRU.

Figure 2 .
Figure 2. The network structure of GRU.

Figure 4 .
Figure 4. Monitoring data in time series of the Baishuihe landslide.

Figure 4 .
Figure 4. Monitoring data in time series of the Baishuihe landslide.

Figure 4 .
Figure 4. Monitoring data in time series of the Baishuihe landslide.

Figure 7 .
Figure 7. Relations between the periodical deformation and the reservoir level fluctuation.During the decomposition of the influence factors, we set K = 2 because they mainly show cyclical fluctuations.The other VMD settings are the same as those used for the

Figure 7 .
Figure 7. Relations between the periodical deformation and the reservoir level fluctuation.During the decomposition of the influence factors, we set K = 2 because they mainly show cyclical fluctuations.The other VMD settings are the same as those used for the

Figure 7 .
Figure 7. Relations between the periodical deformation and the reservoir level fluctuation.

Figure 8 .
Figure 8. Decomposition results of the six influence factors (V1-V6), with the subscript p indi the periodic term and subscript s indicates the random term.
Stage I (June 200 gust 2006) with an average storage level of 135 m; Stage II (August 2006-August with the highest storage level raised to 156 m; Stage III starting from November 200 the highest storage level regulated to 175 m.Since then, the reservoir water level ha

Figure 8 .
Figure 8. Decomposition results of the six influence factors (V 1 -V 6 ), with the subscript p indicates the periodic term and subscript s indicates the random term.
tuated between 145 m and 175 m, exhibiting a contrary periodic fluctuation to the pr itation intensity due to artificial flood control.As shown in Figure9b, a clear 1 year (12 months) of R.w.l has been observed since 2006.

Figure 9 .
Figure 9.The CWT power spectrum of the rainfall (a), the reservoir water level (R.w.l)(b) an displacement at ZG93 (c).Regions enclosed by the heavy solid black line show a greater tha confidence level of the red noise standard spectrum.The cross-hatched regions on either end cate the "cone of influence", where edge effects become important.Color changes mean dif magnitudes of power.

Figure 9 .
Figure 9.The CWT power spectrum of the rainfall (a), the reservoir water level (R.w.l)(b) and the displacement at ZG93 (c).Regions enclosed by the heavy solid black line show a greater than 95% confidence level of the red noise standard spectrum.The cross-hatched regions on either end indicate the "cone of influence", where edge effects become important.Color changes mean different magnitudes of power.The storage of the TGR can be categorized into three stages: Stage I (June 2003-August 2006) with an average storage level of 135 m; Stage II (August 2006-August 2008) with the highest storage level raised to 156 m; Stage III starting from November 2008 with the highest storage level regulated to 175 m.Since then, the reservoir water level has fluctuated between 145 m and 175 m, exhibiting a contrary periodic fluctuation to the precipitation intensity due to artificial flood control.As shown in Figure 9b, a clear 1 year cycle (12 months) of R.w.l has been observed since 2006.Generally, the displacement dominated by geological conditions is approximately monotonic over time, indicating the long-term trend.The displacement affected by external triggering factors (e.g., rainfall and R.w.l variation) can be expressed as a proximate periodic function, leading to different deformation features.Thus, the periodical displacement is the optimal option to illustrate the multi-scale response relationship with the impact factors.Figure9cgives the CWT power spectrum of the displacement at ZG93, showing a 1 year cyclic period (12 months), indicating a combined effect of changing reservoir water levels and rainfall on the landslide displacement.In addition, a 2 years cyclic period (24 months) is observed during extreme changes in the landslide displacement.This phenomenon seems related to the deformation responding to the overall rise of the R.w.l, but further research is still needed to confirm the underlying causes.Thus, cross-analysis between the displacement and the influence factors are performed via XWT and WTC later to further study the effect of rainfall and R.w.l variation on the kinematics of the Baishuihe landslide.

Figure 10 .
Figure 10.The XWT power and WTC coherence spectrum of R.w.l and rainfall.The arrow direction reflects the phase correlation between the two time series.The arrow points from left to right indicate the in-phase relation, and the opposite suggests the anti-phase.In XWT, the color change indicates the magnitude of power.In WTC, the color differences suggest different magnitudes of coherence.

Figure 10 .
Figure 10.The XWT power and WTC coherence spectrum of R.w.l and rainfall.The arrow direction reflects the phase correlation between the two time series.The arrow points from left to right indicate the in-phase relation, and the opposite suggests the anti-phase.In XWT, the color change indicates the magnitude of power.In WTC, the color differences suggest different magnitudes of coherence.

Figure 11 .
Figure 11.The XWT power and WTC coherence spectrum of the periodic term and six impact factors.(a,b) indicate rainfall-type factors V2 and V3.The remainder indicates reservoir-type factors, where (c-f) indicate monthly mean level V1, monthly variation V4, single-month rate of change V5 and bi-monthly variation V6.

Figure 11 .
Figure 11.The XWT power and WTC coherence spectrum of the periodic term and six impact factors.(a,b) indicate rainfall-type factors V 2 and V 3 .The remainder indicates reservoir-type factors, where (c-f) indicate monthly mean level V 1 , monthly variation V 4 , single-month rate of change V 5 and bi-monthly variation V 6 .

4 . 1 .
Training Dataset and Parameter Setting In the experiments, monthly GNSS-measured displacements at ZG93 were acquired from July 2003 to March 2013.Additionally, the corresponding monthly reservoir water level and precipitation data are used in the subsequent analysis and modeling.For the training process, datasets collected from July 2003 to June 2009, comprising 72 sets of data, are utilized as model inputs, while 25 sets of data collected from July 2009 to July 2010 are used as forecast datasets, and the reserved 17 sets of data from August 2011 to December 2012 are used for model validation.

Figure 12 .
Figure 12.Predictions of each displacement component.

Figure 12 .
Figure 12.Predictions of each displacement component.

Figure 13 .
Figure 13.Predictions of the cumulative landslide displacement.The red marker indicates the absolute error without considering influencing factors, while the blue circle indicates the absolute error of our proposed method.

Figure 13 .
Figure 13.Predictions of the cumulative landslide displacement.The red marker indicates the absolute error without considering influencing factors, while the blue circle indicates the absolute error of our proposed method.

Figure 14 .
Figure 14.Predictions of the cumulative landslide displacement.The blue circles stands for the absolute errors of the proposed model.

Figure 14 .
Figure 14.Predictions of the cumulative landslide displacement.The blue circles stands for the absolute errors of the proposed model.

Table 1 .
The designed training dataset.

Table 2 .
The performance of the GRU using different numbers of neurons.

Table 3 .
The performance of the GRU in predicting periodic and random displacement.

Table 4 .
Predictions of the cumulative landslide displacement.

Table 5 .
Comparison of model prediction results.

Table 5 .
Comparison of model prediction results.