The Application of KNN-Optimized Hybrid Models in Landslide Displacement Prediction

Jiang, Hongwei; Wu, Jiayi; Zhou, Hao; Liu, Mengjie; Li, Shihao; Wu, Yuexu; Guo, Yongfan

doi:10.3390/eng6080169


by Hongwei Jiang ¹,²,³, Jiayi Wu ¹, Hao Zhou ¹, Mengjie Liu ¹, Shihao Li ¹, Yuexu Wu ⁴ and Yongfan Guo ⁵,*

¹ School of Urban Construction, Changzhou University, Changzhou 213164, China

² School of Energy Science and Engineering, Henan Polytechnic University, Jiaozuo 454003, China

³ Badong National Observation and Research Station of Geohazards, China University of Geosciences, Wuhan 430074, China

⁴ Wushan Geological Environment Monitoring Station, Chongqing 404700, China

⁵ Department of Civil Engineering, McMaster University, Hamilton, ON L8S 4L8, Canada

* Author to whom correspondence should be addressed.

Eng 2025, 6(8), 169; https://doi.org/10.3390/eng6080169

Submission received: 30 May 2025 / Revised: 11 July 2025 / Accepted: 21 July 2025 / Published: 23 July 2025

(This article belongs to the Section Chemical, Civil and Environmental Engineering)


Abstract

Early warning systems depend heavily on the accuracy of landslide displacement forecasts. This study focuses on the Bazimen landslide located in the Three Gorges Reservoir region and proposes a hybrid prediction approach combining support vector regression (SVR) and long short-term memory (LSTM) networks. These models are optimized via the K-Nearest Neighbor (KNN) algorithm. Initially, cumulative displacement data were separated into trend and cyclic elements using a smoothing approach. SVR and LSTM were then used to predict the components, and KNN was introduced to optimize input factors and classify the results, improving accuracy. The final KNN-optimized SVR-LSTM model effectively integrates static and dynamic features, addressing limitations of traditional models. The results show that LSTM performs better than SVR, with an RMSE and MAPE of 24.73 mm and 1.87% at monitoring point ZG111, compared to 30.71 mm and 2.15% for SVR. The sequential hybrid model based on KNN-optimized SVR and LSTM achieved the best performance, with an RMSE and MAPE of 23.11 mm and 1.68%, respectively. This integrated model, which combines multiple algorithms, offers improved prediction of landslide displacement and practical value for disaster forecasting in the Three Gorges area.

Keywords: Bazimen landslide; landslide displacement prediction; integrated models; disaster forecasting

1. Introduction

Landslides cause significant casualties and property losses worldwide [1], so a method that can effectively predict landslide displacement is necessary. However, the environments in which landslides occur are complex and diverse, which makes displacement prediction a challenging but essential task for disaster reduction. Selecting an appropriate displacement prediction model is therefore paramount [2]. Over the past 20 years, many researchers have studied the application of intelligent algorithms to landslide displacement prediction [3,4].

Various methods have been developed for landslide displacement prediction. Du et al. [5] applied a backpropagation neural network to predict accumulation-layer landslides. Liu et al. [6] found that deep learning algorithms, such as LSTM and GRU, achieved promising results across three types of landslides. Recently, developing integrated models with better adaptability and generalization has become a major focus of research [7].

Input data organization plays a key role in prediction modeling. Li et al. [8] utilized the Hodrick–Prescott filtering technique to decompose cumulative displacement into long-term trends and periodic variations, then applied polynomial regression and least squares support vector machines to predict them. Jia et al. [9] also used time series models for two-component decomposition. A particle swarm optimization-enhanced support vector machine was also used to predict displacement based on rainfall and reservoir levels.

Input factors are typically screened using mutual information, Pearson correlation, or gray relational analysis [10], while RMSE and MAPE are commonly used to assess prediction accuracy.

At present, extensive research has been conducted on both static prediction models, such as SVR, and dynamic models like LSTM. Li et al. [11] introduced a time-varying forecasting framework for slope movement, which employs SSA combined with a multi-layered LSTM network, aiming to improve the representation of slope behavior over time and overcome the limitations present in traditional stationary models. Lin et al. [12] explored the correlation between rainfall, reservoir level variation, and the periodic behavior of landslides and introduced a model that integrates time series analysis with a bidirectional LSTM (Double-BiLSTM) structure. Zhang et al. [13] constructed a composite dynamic prediction system using the empirical mode decomposition soft screening stopping criterion (SSSC-EMD) together with a deep bidirectional LSTM (DBi-LSTM), allowing for the more accurate representation of “stepped” deformation features on slopes. Jiang et al. [14] comprehensively utilized the advantages of several algorithms and performed integrated model research based on linear weights. These studies collectively indicate that advances in landslide displacement prediction largely depend on innovations in decomposition strategies and the fine-tuning of model hyperparameters.

In recent years, several advanced hybrid AI models have been introduced to enhance landslide displacement prediction. Zhang et al. [15] proposed a modular framework combining CEEMD, ACO, and SVR to extract multi-scale features. Wen et al. [16] developed a multi-strategy model integrating SSA, PSO, GSA, and SVR for trend prediction optimization. Ma et al. [17] presented an ensemble method based on probabilistic weighting to fuse outputs from different models. These intelligent frameworks represent a significant advancement over traditional SVR- and LSTM-based methods and provide a broader direction for landslide displacement modeling.

Although various static, dynamic, and decomposition-based prediction models have been explored, most existing studies have focused on improving prediction performance using single-model frameworks or simple model stacking. For instance, SVR is effective in capturing linear trends, while LSTM handles temporal dependencies well. However, few studies have attempted to combine these complementary models into structured hybrid models, particularly using classification algorithms to optimize their integration. In this context, the combination of SVR and LSTM models optimized through K-Nearest Neighbor (KNN) classification remains underexplored. Therefore, there is a need to develop a comprehensive prediction model that couples static and dynamic components, guided by a classifier to enhance prediction adaptability and performance.

A new method built around a sequential hybrid model is therefore needed to predict landslide displacement. In this study, monitoring records of ground movement on the Bazimen slope were preprocessed, and 84 time steps of data were selected for analysis. The cumulative displacement was additively separated into two components, a trend term and a periodic term [18]; the trend term was predicted with a univariate method and the periodic term with a multivariate method. The total displacement forecast was obtained by summing the predicted trend and periodic terms, using SVR for static prediction and LSTM for dynamic modeling. Finally, KNN was used to construct a classifier that identifies the most suitable predictive scheme, and its output was used to refine the static–dynamic hybrid model so that it combines the forecasting advantages of both models.

2. Methodology

2.1. Separation of Displacement Time Series into Long-Term and Cyclic Patterns

Accurately separating trend and periodic components forms a fundamental step in building dependable prediction models [19]. The original displacement information of the landslide is separated into trend and periodic signals for further analysis [20]. The trend component of landslides is mainly influenced by internal geological conditions like tectonic activity and weathering, and is also connected to the deformation evolution phase. Analyzing external factors that impact landslide prediction and progression is of great importance [12]. Short-term landslide movements, as seen in the Three Gorges area, reflect periodic behavior primarily attributed to rainfall variability and reservoir water level changes [21]. Displacement resulting from random disturbances is minimal and hard to detect, and thus, was excluded from the analysis in this research. In this study, the cumulative displacement sequence is represented by the following form:

X (t) = μ (t) + σ (t)

(1)

Here, t denotes the time index; X(t) refers to the total displacement at time t; μ(t) corresponds to the trend component; and σ(t) indicates the periodic component.

2.2. Moving Average Methods

The moving average method used in this paper smooths short-term fluctuations in the time series so that the trend can be extracted, which in turn allows the remaining periodic series and its influencing factors to be identified more clearly. The simple moving average is formulated as follows:

{\bar{X}}_{t} = \frac{X_{t} + X_{t - 1} + \dots + X_{t - n + 1}}{n} (t = n, n + 1, \dots, T)

(2)

Here, \bar{X}_{t} denotes the trend component of displacement at time t; X_{t} represents the total measured displacement at time step t; and n refers to the averaging window size. In this study, a moving average period of n = 12 is used, corresponding to the annual water level regulation cycle in the Three Gorges Reservoir region.
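As a minimal sketch of Equations (1) and (2), the following Python code computes a trailing 12-step moving average as the trend term and takes the periodic term as the remainder. The function names and the synthetic displacement values are illustrative assumptions, not the authors' implementation or data.

```python
def moving_average_trend(x, n=12):
    """Trailing simple moving average (Equation (2)), defined for t >= n."""
    if n < 1 or n > len(x):
        raise ValueError("window n must satisfy 1 <= n <= len(x)")
    window_sum = sum(x[:n])
    trend = [window_sum / n]
    for t in range(n, len(x)):
        # Slide the window: add the newest value, drop the oldest one.
        window_sum += x[t] - x[t - n]
        trend.append(window_sum / n)
    return trend


def decompose(x, n=12):
    """Split displacement into trend and periodic terms (Equation (1))."""
    trend = moving_average_trend(x, n)
    # The first n - 1 observations have no full window, so they are dropped.
    aligned = x[n - 1:]
    periodic = [xi - ti for xi, ti in zip(aligned, trend)]
    return trend, periodic


# Hypothetical monthly cumulative displacement in mm (illustrative only).
displacement = [float(10 * t + (5 if t % 12 < 6 else -5)) for t in range(36)]
trend, periodic = decompose(displacement, n=12)
```

Because the window is trailing, each trend value uses only current and past observations, which avoids looking ahead in time when the series is later split into fitting and testing periods.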

2.3. Support Vector Regression

Support vector regression (SVR) is a supervised learning method that applies the support vector machine framework to regression problems [22]. It is based on the principle of structural risk minimization [23] and is well suited to problems involving small samples, high-dimensional nonlinearity, and local minima [24]. In SVR models, the sample data D = \{(x_{i}, y_{i})\} (x_{i}, y_{i} \in R, i = 1, 2, \dots, k), where k is the number of samples, are divided into a fitting sample and a predicting sample [25]. Training samples are projected into a high-dimensional feature space, within which the best-fitting linear regression function is established [26]. The support vector regression function is defined as follows:

f (x) = 〈W \cdot Φ (x)〉 + b

(3)

In this expression, W represents the model’s weight vector, Φ(x) is a nonlinear mapping from the input space to the high-dimensional feature space, and b is the bias term. To account for the fitting error, slack variables ζ_{i} \geq 0 and ζ_{i}^{*} \geq 0 are introduced, and the SVR optimization problem can be written as follows:

R_{\min} = \frac{1}{2} {‖W‖}^{2} + C \sum_{i = 1}^{j} (ζ_{i} + ζ_{i}^{*})

(4)

Here, C denotes the regularization coefficient, and ε is the tolerance of the ε-insensitive loss function used in SVR, which enters the problem through the constraints on the slack variables.
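For reference, the objective in Equation (4) is usually minimized subject to the textbook ε-insensitive constraints written below with the symbols of Equations (3) and (4). That the authors used exactly this form is an assumption; it is shown only to make explicit where ε appears.

```latex
\begin{aligned}
y_{i} - f(x_{i}) &\leq ε + ζ_{i}, \\
f(x_{i}) - y_{i} &\leq ε + ζ_{i}^{*}, \\
ζ_{i},\ ζ_{i}^{*} &\geq 0 .
\end{aligned}
```

Residuals smaller than ε incur no penalty, while larger residuals are penalized linearly through the slack variables, weighted by C.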

Standard kernels include radial basis function (RBF) kernels, linear kernels, and polynomial kernels [20]. The RBF is widely used with its broad convergence domain. In this study, we selected RBF kernels for the SVR model.
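For reference, the RBF kernel is commonly written in the form below, where γ > 0 is the kernel coefficient (the Gamma parameter tuned in Section 4.4). This is the standard parameterization and is assumed, not confirmed, to match the authors' software settings.

```latex
K(x_{i}, x_{j}) = \exp\left(-γ \, ‖x_{i} - x_{j}‖^{2}\right), \qquad γ > 0
```

A larger γ makes the kernel more local, so each support vector influences a smaller neighborhood of the input space.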

In the current study, the grid search, due to its simplicity and reproducibility, was employed for SVR parameter tuning. However, recent advancements in adopting the metaheuristic algorithms have demonstrated superior efficiency in uncertain and complex geoengineering spaces [27,28,29].

According to the taxonomy of global optimization algorithms, SVR tuning approaches can be classified into deterministic, stochastic, and metaheuristic strategies [30]. Among these, metaheuristic multi-objective optimization frameworks have gained attention for their adaptability and ability to balance predictive accuracy and generalization performance [31].

In geoengineering tasks, metaheuristics not only support hyperparameter selection but also contribute to feature selection, ensemble model calibration, and spatial geodata processing, which are particularly valuable for landslide displacement prediction and other spatially distributed problems [29,31].

Therefore, while the grid search is suitable for initial modeling, future studies are encouraged to integrate metaheuristic multi-objective optimization techniques with SVR to enhance robustness, accuracy, and computational efficiency in geotechnical prediction models.

2.4. Long Short-Term Memory Neural Networks

As illustrated in Figure 1a, recurrent neural networks (RNNs) exhibit strong modeling capabilities and are widely used in domains like natural language processing (NLP) [32]. Each RNN structure includes input units denoted by the sequence \{x_{0}, x_{1}, \dots, x_{t}, x_{t + 1}, \dots\}, with the corresponding outputs represented as \{y_{0}, y_{1}, \dots, y_{t}, y_{t + 1}, \dots\}. The network also includes hidden units that produce intermediate states expressed as \{s_{0}, s_{1}, \dots, s_{t}, s_{t + 1}, \dots\}.

s_{t} = f (U x_{t} + W s_{t - 1})

(5)

The symbol s_t denotes the internal state of the hidden layer corresponding to time step t. Its value is determined using the current input and the hidden state from the preceding time step. The function f commonly refers to a nonlinear activation function.

In practical applications, traditional RNNs struggle to model long-term dependencies due to issues like gradient vanishing or explosion. LSTM networks, a specialized form of RNNs, are designed to effectively capture long-range dependencies (see Figure 1b). In LSTM, each recurrent unit is substituted with a memory block structure. This memory structure contains three types of gate mechanisms. The input gate regulates the method by which new information enters the memory cell, whereas the output gate manages how the stored information is passed on to subsequent units or contributes to the final output.

In the LSTM structure, the forget gate determines which parts of the previous cell state C_{t - 1} are retained or removed by evaluating the past hidden output h_{t - 1} and the current input x_{t}. This gate outputs values ranging from 0 to 1 for each element in the former cell state. A value of 1 implies full preservation, whereas 0 indicates total removal. The output state is determined by the joint effect of the three gating mechanisms. A major difference between LSTM and conventional RNNs lies in the way the hidden state vector h_t is computed. Within the LSTM framework, h_t is formulated as follows:

\left\{\begin{array}{l} i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i}) \\ f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f}) \\ {\tilde{C}}_{t} = \tanh (W_{c} \cdot [h_{t - 1}, x_{t}] + b_{c}) \\ C_{t} = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t} \\ o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o}) \\ h_{t} = o_{t} * \tanh (C_{t}) \end{array}\right.

(6)

In this context, i_{t}, f_{t}, and o_{t} correspond to the outputs of the input, forget, and output gates, respectively; {\tilde{C}}_{t} is the candidate cell state, and C_{t} is the cell state of the memory module. W_{i}, W_{f}, W_{o}, and W_{c} are the weight matrices of the three gates and of the candidate cell state, and their related biases are denoted by b_{i}, b_{f}, b_{o}, and b_{c}. The symbol σ indicates the sigmoid function used for activation. The above set of formulas is evaluated iteratively from time t = 1 to T.
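To make the gate computations concrete, here is a minimal sketch of one forward step of a single-unit LSTM cell with scalar input and state. It uses the standard cell-state update C_t = f_t * C_{t-1} + i_t * C̃_t; the `params` layout and the function names are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a single-unit LSTM cell.

    params maps each name ("i", "f", "c", "o") to a tuple (w_h, w_x, b):
    the weights applied to h_{t-1} and x_t, plus the bias.
    """
    def pre(gate):
        w_h, w_x, b = params[gate]
        return w_h * h_prev + w_x * x_t + b

    i_t = sigmoid(pre("i"))             # input gate
    f_t = sigmoid(pre("f"))             # forget gate
    c_tilde = math.tanh(pre("c"))       # candidate cell state
    o_t = sigmoid(pre("o"))             # output gate
    c_t = f_t * c_prev + i_t * c_tilde  # standard cell-state update
    h_t = o_t * math.tanh(c_t)          # hidden state
    return h_t, c_t


# With all weights and biases at zero, every gate outputs 0.5.
zero_params = {gate: (0.0, 0.0, 0.0) for gate in ("i", "f", "c", "o")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

In the zero-weight example the forget gate keeps half of the previous cell state and the candidate contributes nothing, which illustrates how the gate values between 0 and 1 scale the retained memory.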

2.5. Reliability Evaluation of the Model

The optimal hyperparameters were determined by applying the full dataset during the model fitting stage. Subsequently, these tuned parameters, along with the complete training set, were used to build the final models [34]. To evaluate the model’s predictive ability, two indicators were employed: the root mean square error (RMSE) [35] and mean absolute percentage error (MAPE) [36].

RMSE = \sqrt{\frac{1}{N} \sum_{t = 1}^{N} {(x_{org} - x_{prd})}^{2}}

(7)

MAPE = \frac{1}{N} \sum_{t = 1}^{N} \left| \frac{x_{org} - x_{prd}}{x_{org}} \right| \times 100\%

(8)

Here, x_{org} denotes the observed (measured) value, x_{prd} represents the forecasted value, and N indicates the total number of predictions. A visual summary of the process is provided in Figure 2.
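The two indicators can be sketched in a few lines of Python. The MAPE here takes the error relative to the observed value, which is the common convention and an assumption about the authors' exact definition; the function names and sample values are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (mm here)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent.

    Assumes no observed value is zero, since each error is divided by it.
    """
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


# Hypothetical observed and predicted displacements in mm.
obs = [100.0, 200.0]
prd = [110.0, 190.0]
print(rmse(obs, prd), mape(obs, prd))
```

RMSE penalizes large individual errors more heavily, whereas MAPE expresses the average error relative to the displacement magnitude, so the two are usefully reported together.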

3. Study Site

3.1. Bazimen Landslide

The Bazimen landslide occurs along the eastern margin of the Xiangxi River and is situated inside Zigui County (Figure 3). In this section, the Xiangxi River flows predominantly from north to south and is located near the Yangtze River. The lower boundary of the landslide was submerged by the Three Gorges Reservoir, at an elevation ranging from 55 to 135 m [37]. The main landslide mass lies on the Xiangxi River’s right bank, following a north–south alignment along the bank slope. The body of the landslide is distributed in a dustpan shape at the base of the valley slope. The elevation range of the landslide mass spans from 139 m to 280 m, with the terrain gradually descending from west to east and exhibiting a terraced pattern. Additionally, the landslide features two secondary benches: a lower platform located at 139–165 m and an upper platform between 220 and 230 m.

3.2. Landslide Inventory

The Bazimen landslide was first equipped with a monitoring system in June 2003, comprising four GPS deformation monitoring points (ZG111, ZG110, ZG112, and ZG109) (Figure 4).

The Bazimen landslide was reactivated after water storage in the Gezhouba Reservoir in 1982. Four NW-SE trending cracks developed at a landslide elevation of 80–125 m. Deformation cracks occurred again in the Bazimen landslide in 1983. From May to June 2003, the water level of the Three Gorges Reservoir rose from 80 m to 135 m, which marked the initial stage of landslide deformation. ZG111 showed the largest displacement at the top of the landslide. From July to August 2003, cracks occurred in the landslide. Before the rainy season in May 2004, the reservoir area displacement curve showed a steady growth trend, implying that the rising water level from the first impoundment continued to impact the reservoir area. From May to July 2004, multiple fissures were observed within the landslide body. By June 2005, these structural cracks showed signs of intensified deformation. Between June and July 2007, the displacement velocity of the landslide continued to increase. Since then, the water level in the Three Gorges Reservoir area has remained within the range of 145 to 175 m, with a maximum drop of 30 m from high to low. The detailed annual variation process is summarized in Table 1.

The Bazimen landslide exhibits deformation traits typical of a forward-propagating slide, indicating that the downward movement originated in the upper region and continued as the displacement intensified [38]. Based on Figure 5, the ZG110 and ZG111 GPS stations were installed along the primary I–I′ profile across the landslide zone.

4. Prediction Process

4.1. Point Selection and Data Processing

This study uses the ZG111 point as the GPS-based deformation monitoring location, with 84 time-step records from January 2004 to December 2010 serving as the basis for analysis.

Initially, the landslide displacement data were decomposed using the moving average technique. According to existing studies on landslide displacement prediction, datasets are typically split using ratios like 7:1.5:1.5 or 6:2:2 [16,18]. After decomposition, the first 72 time steps are allocated as fitting data, of which 60 steps are used for model training and 12 for validation and hyperparameter tuning; the remaining 12 time steps are reserved for testing model performance. Figure 6 illustrates the detailed decomposition of the landslide displacement increment.
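The chronological split described above can be sketched as follows. The function name is hypothetical, and the series is a stand-in index of the 84 monthly time steps, not the monitoring data.

```python
def chronological_split(series, n_train=60, n_val=12, n_test=12):
    """Split a time series in order, without shuffling."""
    if n_train + n_val + n_test != len(series):
        raise ValueError("split sizes must sum to the series length")
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test


# Stand-in for 84 monthly steps (January 2004 to December 2010).
months = list(range(84))
train, val, test = chronological_split(months)
```

Keeping the order intact means the validation and test periods always follow the training period in time, which mirrors how the model would be used for forecasting.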

4.2. Factor Selection

Choosing appropriate influencing and state-related factors is essential for accurate landslide forecasting [16]. Figure 7 illustrates how the initial landslide displacement correlates with the selected influencing variables. This study identifies multiple candidate variables, including rainfall fluctuations and reservoir level changes, which are denoted as f₁ to f₁₂ [39]. The gray correlation analysis method (GCAM) was used to compute the correlation coefficient between each candidate and the displacement, with the resolution coefficient set to 0.5, the value typically adopted in previous work. A higher gray correlation coefficient indicates a stronger correlation between two variables, and a coefficient greater than 0.6 is generally taken to indicate a strong correlation. Following this method, the candidate input factors were selected based on the principle proposed by Jiang et al. [14]. All available input factors are summarized in Table 2.
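A minimal sketch of gray relational analysis with a resolution coefficient of 0.5 is given below. Min-max normalization of each series and the use of global minimum and maximum differences are common choices assumed here; the function names and toy series are hypothetical and do not reproduce the authors' factors f₁ to f₁₂.

```python
def min_max(seq):
    """Rescale a series to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(v - lo) / (hi - lo) for v in seq]


def gray_relational_grades(reference, candidates, rho=0.5):
    """Return the gray relational grade of each candidate series."""
    ref = min_max(reference)
    diffs = {
        name: [abs(r - c) for r, c in zip(ref, min_max(seq))]
        for name, seq in candidates.items()
    }
    all_diffs = [d for row in diffs.values() for d in row]
    d_min, d_max = min(all_diffs), max(all_diffs)
    if d_max == 0.0:
        # Every candidate coincides with the reference after scaling.
        return {name: 1.0 for name in diffs}
    grades = {}
    for name, row in diffs.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Toy example: one factor tracks the reference, the other opposes it.
grades = gray_relational_grades(
    [1.0, 2.0, 3.0, 4.0],
    {"same": [2.0, 4.0, 6.0, 8.0], "reversed": [4.0, 3.0, 2.0, 1.0]},
)
selected = [name for name, g in grades.items() if g > 0.6]
```

In this toy case only the factor that moves with the reference exceeds the 0.6 threshold, which is the screening behavior the section describes.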

Meanwhile, the candidate factors were also examined for multicollinearity. Removing variables with excessive collinearity enhances the model’s predictive performance. Generally, a variable is considered to exhibit multicollinearity if its tolerance value does not exceed 0.1 or if its variance inflation factor (VIF) exceeds 5 [40]. The relationship between the displacement increments and the candidate factors at ZG111 shows that these factors are closely interrelated (Figure 8).

Based on the analysis, three factors were selected: f₇, the fluctuation range of the reservoir water level in the current month; f₉, the number of days on which the reservoir level declined in the same month; and f₃, the maximum daily rainfall within the month. This completed the selection of input variables (Table 3).

4.3. Normalization and Inverse Normalization

For machine learning algorithms, normalization, as a fundamental task in data preprocessing, can help identify optimal parameters more easily. Data normalization involves proportionally rescaling values to fall within a defined numerical range [41]. To facilitate the comparison and weighting of indicators with different units or magnitudes and to accelerate network convergence during training, the data were normalized by removing unit constraints and converting them into dimensionless numerical values.

It was assumed that the entire fitted dataset represents known information, while the prediction dataset is treated as unknown. To avoid information leakage, normalization and denormalization boundaries should be computed exclusively from the training (fitted) set, and not the full dataset.

x_{scale} = 2 \times \frac{x_{origin} - x_{\min}}{x_{\max} - x_{\min}} - 1

(9)

where x_{scale} is the normalized value, x_{origin} is the original value, x_{\max} is the maximum value of the samples, and x_{\min} is the minimum value of the samples.
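The scaling in Equation (9) and its inverse can be sketched as follows, with the bounds taken from the fitting data only, as the text requires. The class name and sample values are illustrative assumptions.

```python
class MinMaxScaler11:
    """Scale values to [-1, 1] using bounds from the fitting data only."""

    def fit(self, values):
        self.x_min = min(values)
        self.x_max = max(values)
        if self.x_max == self.x_min:
            raise ValueError("cannot scale a constant series")
        return self

    def transform(self, values):
        span = self.x_max - self.x_min
        return [2.0 * (v - self.x_min) / span - 1.0 for v in values]

    def inverse_transform(self, scaled):
        span = self.x_max - self.x_min
        return [(s + 1.0) / 2.0 * span + self.x_min for s in scaled]


# Bounds come from the fitting set; unseen data may fall outside [-1, 1].
scaler = MinMaxScaler11().fit([0.0, 5.0, 10.0])
fit_scaled = scaler.transform([0.0, 5.0, 10.0])
new_scaled = scaler.transform([15.0])
restored = scaler.inverse_transform(new_scaled)
```

Since the test period is treated as unknown, a new value larger than the fitting maximum maps above 1; the inverse transform still recovers it exactly.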

However, in order to further enhance the model’s adaptability in handling dynamic databases, it is recognized that the optimization of the standardization of new data is necessary. Currently, the model relies on static preprocessing methods for data handling, and automated standardization approaches are not included. Solutions for standardizing dynamic databases include automated standardization techniques, such as the RWMCE clustering ensemble method proposed by Ni et al. [42], which automates the processing of dynamic datasets. By incorporating reliability weighting during the ensemble process, the standardization of different data sources is optimized, and data processing efficiency is improved. Additionally, real-time standardization techniques based on stream data processing, such as the back analysis and rheological parameter optimization method used by Zeng et al. [43], can standardize dynamic databases for specific regions in real-time. These methods ensure the consistency and reliability of dynamic databases when updating with new data. Although this approach has not been implemented in the current study, the incorporation of automated standardization technologies based on deep learning or stream data processing will be considered in future research to support the real-time update and handling of dynamic data.

4.4. Parameters of SVRs and LSTMs

SVR models were developed in a Python-based environment (version 3.7). Their key parameters were optimized via the grid search (GS) technique [15], employing RBF as the kernel. The kernel coefficient Gamma of RBF and the penalty parameter C were determined using GS. During grid searching (GS), Gamma was tuned within the interval [0.075, 1.075] using a step size of 0.075, while the parameter C was explored over the range [1, 75] with a step size of 1.
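The grid search over C and Gamma can be sketched generically as below. The scoring function is a hypothetical stand-in for the validation RMSE of an SVR fitted with each parameter pair; it is not the authors' code. Note that with a step of 0.075 the last Gamma grid point not exceeding the stated upper bound of 1.075 is 1.05, which is what this sketch uses.

```python
from itertools import product


def grid_search(score, c_values, gamma_values):
    """Return the (C, gamma) pair with the lowest score, and that score."""
    best_params, best_score = None, float("inf")
    for c, gamma in product(c_values, gamma_values):
        s = score(c, gamma)
        if s < best_score:
            best_params, best_score = (c, gamma), s
    return best_params, best_score


c_grid = list(range(1, 76))                               # C = 1, 2, ..., 75
gamma_grid = [round(0.075 * k, 3) for k in range(1, 15)]  # 0.075, ..., 1.05


def toy_score(c, gamma):
    # Hypothetical stand-in for validation RMSE; replace with a real
    # fit-and-evaluate routine on the training and validation sets.
    return (c - 20) ** 2 + (gamma - 0.3) ** 2


best_params, best_score = grid_search(toy_score, c_grid, gamma_grid)
```

The search evaluates 75 × 14 = 1050 combinations, which is small enough to run exhaustively and keeps the tuning fully reproducible.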

The LSTM model structure, comprising the batch size, neuron count, and layer number, was optimized progressively via the grid search (GS) technique. During GS execution, the batch size and neuron number were both searched within the interval [1, 20], incrementing by values of one. The patience of early stopping was set to 30, indicating that training would terminate if no further improvement in performance was observed over 30 consecutive iterations. Table 4 presents the final hyperparameter settings for both the LSTM and SVR models.
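The early-stopping rule with a patience of 30 can be illustrated with a short sketch that operates on a recorded sequence of validation losses. Treating each iteration as one training epoch is an assumption, and the function name and loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=30):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs have passed
    without improving on the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # The loss sequence ended before the patience was exhausted.
    return len(val_losses) - 1, best_epoch


# Loss improves for three epochs, then plateaus at a worse value.
losses = [5.0, 4.0, 3.0] + [3.5] * 40
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=30)
```

Here the best loss occurs at epoch 2 and training halts at epoch 32, after 30 epochs without improvement; in practice the weights from the best epoch would be the ones retained.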

5. Results

5.1. SVR Models and LSTM Models

Figure 9 and Table 5 illustrate the comparison between predicted and measured displacements at point ZG111 using the SVR and LSTM models. For the ZG111 point, the SVR model yielded RMSE and MAPE values of 30.71 mm and 2.15%, respectively, whereas the LSTM model showed improved accuracy with corresponding figures of 24.73 mm and 1.87%.

For the trend component at point ZG111, the RMSE values obtained by the SVR and LSTM models were 2.30 mm and 3.52 mm, respectively. For the periodic component, the corresponding RMSE values were 28.92 mm and 23.61 mm, as shown in Table 6.

Compared to the last six months, the SVR and LSTM models performed poorly in the first six months. Notably, LSTM exhibited superior trend forecasting accuracy over SVR during this early stage. LSTM models performed significantly better at point a and period b than SVR models in total displacement. However, for these 12 data points, the LSTM model did not consistently outperform the SVR model.

5.2. Sequential Hybrid Model Optimized by K-Nearest Neighbor

In Section 4.4, four models based on the SVR and LSTM algorithms were established. In this section, the SVR and LSTM algorithms are applied to the entire dataset to obtain the total predicted displacements of ZG111 from January 2004 to December 2010.

The rationale for optimizing the hybrid SVR–LSTM model using the K-Nearest Neighbor algorithm was built upon a binary classification approach: the output was labeled as 1 when the SVR prediction was closer to the actual value, and the output was labeled as 0 otherwise [44]. The total predicted displacements from the LSTM and SVR models, along with their differences, were selected as candidate input features. These three variables (f₁₃, f₁₄, f₁₅) were then introduced into the optimization model as new predictive inputs. The hyperparameters and optimal inputs of the KNN model used to predict the optimal model are shown in Table 7.
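The labeling rule and classifier described above can be sketched in plain Python. This is a simplified reading in which the classifier makes a hard choice between the two predictions; the feature order (LSTM prediction, SVR prediction, their difference), the unscaled Euclidean distance, the value of k, and all numbers are illustrative assumptions, not the authors' configuration.

```python
import math
from collections import Counter


def label_best_model(actual, svr_pred, lstm_pred):
    """Label 1 where SVR is closer to the observation, 0 otherwise."""
    return [
        1 if abs(s - a) < abs(l - a) else 0
        for a, s, l in zip(actual, svr_pred, lstm_pred)
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    nearest = sorted((dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(y for _, y in nearest[:k])
    return votes.most_common(1)[0][0]


def hybrid_prediction(svr_value, lstm_value, label):
    """Pick the prediction of the model chosen by the classifier."""
    return svr_value if label == 1 else lstm_value


# Hypothetical displacements in mm for four fitted time steps.
actual = [100.0, 110.0, 120.0, 130.0]
svr = [101.0, 115.0, 121.0, 138.0]
lstm = [105.0, 111.0, 126.0, 131.0]
labels = label_best_model(actual, svr, lstm)
features = [(l, s, l - s) for l, s in zip(lstm, svr)]

# New time step: choose between an SVR value of 112 and an LSTM value of 116.
query = (116.0, 112.0, 4.0)
choice = knn_predict(features, labels, query, k=1)
final = hybrid_prediction(112.0, 116.0, choice)
```

In a fuller pipeline the features would normally be normalized before computing distances, and the additional screened factors would be appended to the feature vector.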

There are 15 candidate input factors for optimizing the model: f₁ to f₁₅. The candidate factors were screened using the factor selection method described in Section 4.2, yielding the optimal input factors for the model.

A KNN-based optimization approach was employed to determine the optimal weighting scheme for the hybrid model [17]. In this section, the complete ZG111 dataset of 84 time steps was used for displacement prediction with both the SVR and LSTM models. Specifically, 60 time steps were used for model fitting, 12 time steps for validation and parameter tuning, and the final 12 time steps for testing, from which the hybrid model’s most suitable weights were estimated. Table 7 summarizes the best-fitting model parameters derived through the optimization process.

Figure 10 and Table 8 present a comparison between the predicted and observed displacements at point ZG111, generated by the KNN-optimized models. In the case of ZG111’s total displacement, the hybrid model achieved RMSE and MAPE values of 23.11 mm and 1.68%, respectively. In contrast, the SVR model yielded values of 30.71 mm and 2.15%, while the LSTM model recorded 24.73 mm and 1.87% for the same metrics.

Compared with the standalone LSTM and SVR models, the KNN-optimized hybrid model provides total displacement estimates that match the observed data more closely in period a’ and period b’. This indicates that when the outputs of LSTM and SVR disagree, the KNN-optimized hybrid model tends to assign higher weights to the estimate that lies closer to the ground truth.

Although the current evaluation focuses on historical data, the proposed KNN-optimized hybrid model features a modular structure that supports real-time prediction. With its modular design, the model can be adapted to incorporate continuous monitoring data through dynamic retraining and online learning strategies. In future applications, real-time updating mechanisms, adaptive normalization, and uncertainty quantification techniques will be integrated to enable robust prediction under evolving environmental and geological conditions. These enhancements will facilitate the deployment of the model for continuous landslide monitoring and early warning purposes.

Figure 10 presents a comparison of the displacement curves predicted by the refined model against real monitoring data, confirming its strong performance throughout the testing period. To improve the adaptability and explainability of the model, subsequent studies may employ sensitivity analysis methods to quantitatively evaluate how each input factor influences the prediction results. Additionally, utilizing the weight database from the neural network components for feature selection can improve model performance and stability, especially under complex geological conditions and dynamic environmental changes. With these enhancements, the proposed model is expected to exhibit greater adaptability and predictive reliability in practical applications.

6. Discussion

In this study, landslide displacement was decomposed into trend and periodic components, while stochastic elements were not taken into account. To date, limited research has focused on decomposing landslide displacement into its stochastic component, largely due to the fact that such variability arises from numerous uncontrollable external influences, including wind and traffic loads. The stochastic component of landslide displacement is difficult to isolate or predict using only computational methods applied to recorded cumulative deformation and corresponding temporal data [45]. Due to the limitations of current detection technologies, identifying the factors influencing the stochastic part of the slope movement continues to be a difficult task. Therefore, future investigations ought to prioritize the design of efficient frameworks that can forecast such random behavior.

Furthermore, the current model lacks formal uncertainty quantification in its outputs, which limits its applicability in risk-based geotechnical decision-making. To address this, recent studies have applied techniques such as Monte Carlo dropout [46,47], deep ensembles [48], and Bayesian recurrent networks [49] to estimate prediction confidence intervals. Integrating such methods into the SVR–LSTM–KNN framework would allow the model not only to output predicted displacements but also to quantify their associated uncertainty, enhancing interpretability and robustness. In future work, we plan to incorporate these techniques to improve the reliability of displacement forecasting in dynamic geological environments.

In recent years, many studies have used intelligent algorithms to forecast landslide movements, for example by employing different ensemble strategies [50]. In this work, a sequential hybrid model was developed by coupling LSTM and SVR and optimizing the combination with the KNN algorithm; it demonstrated strong predictive performance. Neural network-based models have also been applied extensively to landslide displacement prediction, and exploring more advanced deep learning algorithms remains essential for further improving prediction accuracy [51]. The approach introduced here therefore offers a novel perspective on using deep learning techniques to forecast landslide movements [52]. In future studies, a comparison with more recent integrated frameworks, such as CEEMDAN–LSTM or CNN–LSTM, together with visualized performance results, could further test the performance of the proposed model.

Furthermore, the choice of the moving average method for displacement decomposition is grounded in both geotechnical interpretation and methodological robustness. Compared to EMD, VMD, and wavelet-based approaches, the moving average method based on a 12-month cycle preserves the periodic structure of the data while minimizing the risk of boundary distortion and data leakage [53,54,55]. This choice aligns with the observed annual hydrological cycles caused by precipitation patterns and changes in reservoir water levels observed locally [56]. From a geotechnical standpoint, it ensures that the trend component captures long-term deformation driven by slope structure and creep, whereas the periodic term reflects cyclic environmental loading. This decomposition strategy not only facilitates physical interpretability but also maintains data consistency for predictive modeling.
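The decomposition described above can be sketched in a few lines. This is a minimal illustration, assuming a trailing 12-month window (the trend at month t is the mean of the 12 most recent observations, so no future values enter the calculation) and an additive split in which the periodic term is the residual. The function name `decompose` and the input series are hypothetical and are not the monitoring data.

```python
def decompose(cumulative, window=12):
    """Split a monthly cumulative-displacement series into trend and periodic parts.

    Assumption: the trend at month t is the mean of the `window` most recent
    observations up to and including t (a trailing window). The first
    window - 1 months therefore have no trend value and are skipped.
    """
    trend, periodic = [], []
    for t in range(window - 1, len(cumulative)):
        recent = cumulative[t - window + 1 : t + 1]
        trend_t = sum(recent) / window
        trend.append(trend_t)
        periodic.append(cumulative[t] - trend_t)  # residual after removing the trend
    return trend, periodic


# Hypothetical series (mm): steady creep plus a seasonal offset in months 6-11 of each year.
series = [10.0 * m + (30.0 if m % 12 >= 6 else 0.0) for m in range(36)]
trend, periodic = decompose(series)
print(trend[0], periodic[0])
```

Because the window only looks backward, the trend at any month can be computed with the data available at that time, which is one way to avoid leaking test-period information into training.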

In many studies, landslide displacement is decomposed into two or three components before prediction. For comparison, the SVR and LSTM algorithms were also applied directly to the total displacement without decomposition; the corresponding hyperparameters and results are presented in Table 9 and Table 10. The results show that predicting the total displacement with a single model and no decomposition is inadvisable: the RMSE rises to 386.93 mm for SVR and 453.59 mm for LSTM (Table 10), compared with 30.71 mm and 24.73 mm when decomposition is used (Table 5). Decomposition therefore plays a crucial role in the precise forecasting of displacement. Taking the decomposition method of this paper as an example, future research should investigate the differences between the trend and periodic components in the original decomposition and assess whether these differences become more or less pronounced in the reassembled total displacement.

In addition, current monitoring data are subject to uncertainties and noise. However, the model does not explicitly handle these aspects. Incorporating noise filtering techniques or uncertainty quantification methods, such as dropout-based variance estimation or ensemble Bayesian methods, would be beneficial for enhancing robustness.

Beyond assessing the performance of the coupled model, attention should also be paid to the sensitivity and uncertainty associated with its parameters. The sensitivity analysis of parameters such as C and Gamma for SVR, along with the number of layers and epochs for LSTM, indicates that these factors strongly influence model performance. In contrast, the KNN optimization process appears to be less sensitive to parameter changes. While this study demonstrates the effectiveness of the model, further research is needed to explore the uncertainty of the predictions through more comprehensive sensitivity analyses.

Despite its promising performance, the proposed model has limitations. The KNN component is sensitive to the choice of distance metrics and neighbor count, which may affect ensemble reliability. In addition, the model assumes temporal stationarity, which may not hold under extreme events. Its generalizability across different landslide sites still needs further validation.

Lastly, input factor selection based on gray correlation and tolerance analysis may still introduce site-specific biases. Employing adaptive feature selection or cross-domain learning strategies may improve the generalizability and robustness of the model.

Although it is widely acknowledged that integrated models tend to outperform individual models, comparative analysis remains essential. The purpose of comparing the proposed hybrid model with standalone SVR and LSTM models is not to dispute this consensus, but to quantify the specific performance improvement achieved through integration. Such comparisons are widely adopted in the literature to demonstrate the value and contribution of each component model and to justify the added complexity of hybrid approaches.

It is worth noting that this study does not seek to determine the optimal single model for a particular landslide scenario. Instead, we propose a hybrid modeling strategy that combines the advantages of static and dynamic models. By optimizing the SVR–LSTM framework with KNN, this study demonstrates a conceptual modeling approach for improving landslide displacement prediction rather than a direct competition among models. In future work, more advanced hybridization strategies and model architectures will be explored to enhance predictive generalizability across varied geotechnical scenarios.

Since the presented model assumes equal processing likelihood across entities, the validity of the sensitivity analysis and subsequent predictability can be further investigated using the weight database of the optimum model. To support this, sensitivity-based feature selection techniques can be employed to quantify the relative influence of input variables, enhance model interpretability, and improve generalization. Recent studies have applied global sensitivity analysis methods and neural network sensitivity tools for model refinement and input optimization [57,58,59,60]. Incorporating such methods in future work will enable the dynamic adjustment of input features and better calibration of the prediction framework under evolving environmental conditions.

In line with the above research directions, another promising avenue is to combine KNN with advanced SVR approaches and apply them to real-time prediction and multi-variable regression tasks. Abbaszadeh Shahri et al. [61] proposed a metaheuristic-optimized SVR model that demonstrated strong predictive performance in geotechnical applications, while Luo et al. [62] developed a simulation-enhanced SVR framework that also achieved excellent results. Incorporating KNN into these advanced SVR structures may further improve local adaptability and pattern recognition, especially under complex or evolving landslide conditions. This hybridization has the potential to achieve a better balance among model interpretability, robustness, and adaptability.

7. Conclusions

A hybrid SVR–LSTM model optimized using the K-Nearest Neighbor (KNN) algorithm was developed for landslide displacement prediction and applied to the Bazimen landslide in the Three Gorges Reservoir region.

Its performance was evaluated against the standalone SVR and LSTM models, and it demonstrated dependable predictive capabilities. The following conclusions can be drawn from this study:

(1) Overall, the LSTM framework achieved a lower error than SVR (RMSE of 24.73 mm versus 30.71 mm at ZG111), though not consistently in every month of the forecast period, as each model had its own advantages. In comparison, the KNN-optimized SVR-LSTM hybrid model developed in this research generated forecasts that were more closely aligned with the actual data in 8 out of 12 time steps, demonstrating improved prediction accuracy and generalization capability. By leveraging KNN for input feature optimization and integrating the complementary capabilities of SVR and LSTM, the proposed model successfully captured long-term deformation patterns and seasonal hydrological effects associated with landslides.

(2) In the model construction process, a displacement decomposition method based on a 12-month moving average was adopted to effectively separate trend and periodic components. This enhanced the physical interpretability and temporal modeling capability of the model, providing a clearer logical basis for understanding the association between landslide movements and the factors that govern them.

By demonstrating a strong performance on historical datasets, the proposed hybrid model has shown its reliability and potential for forecasting ground movement in the Three Gorges region and other landslide-prone areas. However, it currently lacks real-time updating and adaptive normalization capabilities. Future work will explore dynamic learning and automated normalization strategies to enhance its applicability to continuous monitoring and complex geotechnical conditions.

Author Contributions

Conceptualization, H.J. and H.Z.; methodology, H.J. and H.Z.; software, H.J., J.W., and H.Z.; formal analysis, H.J. and H.Z.; investigation, H.J., Y.W., and H.Z.; visualization, H.J., M.L., S.L., and H.Z.; writing—original draft preparation, H.J., H.Z., and J.W.; writing—review and editing, Y.W. and Y.G. All authors have read and agreed to the published version of the manuscript.

Funding

The research work was funded by the Research Fund of National Natural Science Foundation of China (NSFC) (Grant No. 52404113); the Changzhou Sci&Tech Program (Grant No. CJ20240042); the Open Fund of Badong National Observation and Research Station of Geohazards (No. BNORSG-202408); and the high-level Talent Introduction Project of Changzhou University (Grant No. ZMF22020036).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data used in this study are not directly included in the article. For access to the relevant data, please contact the corresponding author. We can provide related reference literature.

Acknowledgments

The authors thank the National Cryosphere Desert Data Center (http://www.ncdc.ac.cn, accessed on 15 March 2025) for providing the data from the landslide monitoring site for this article and for their strong support for this research. We also sincerely thank our group members for their contributions to experimental ideas and help with problems that arose during the experiments, which significantly accelerated the research process.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zeng, T.; Yin, K.; Jiang, H. Groundwater level prediction based on a combined intelligence method for the Sifangbei landslide in the Three Gorges Reservoir Area. Sci. Rep. 2022, 12, 11108.
2. Li, X.; Li, Q.; Wang, Y. Effect of slope angle on fractured rock masses under combined influence of variable rainfall infiltration and excavation unloading. J. Rock Mech. Geotech. Eng. 2024, 16, 4154–4176.
3. Xu, J.; Jiang, Y.; Yang, C. Landslide Displacement Prediction during the Sliding Process Using XGBoost, SVR and RNNs. Appl. Sci. 2022, 12, 6056.
4. Zhang, J.; Chen, C.; Wu, C. Development of An Image-based Borehole Flowmeter for Real-time Monitoring of Groundwater Flow Velocity and Direction in Landslide Boreholes. IEEE Sens. J. 2024, 24, 42079–42087.
5. Zhang, J.; Lin, C.; Tang, H. Input-parameter optimization using a SVR based ensemble model to predict landslide displacements in a reservoir area-A comparative study. Appl. Soft Comput. 2024, 150, 111107.
6. Liu, Z.; Guo, D.; Lacasse, S. Algorithms for intelligent prediction of landslide displacements. J. Zhejiang Univ. Sci. A 2020, 21, 412–429.
7. Zhang, J.; Tang, H.; Zhou, B. A new early warning criterion for landslides movement assessment: Deformation Standardized Anomaly Index. Bull. Eng. Geol. Environ. 2024, 83, 205.
8. Li, D.; Sun, Y.; Yin, K. Displacement characteristics and prediction of Baishuihe landslide in the Three Gorges Reservoir. J. Mt. Sci. 2019, 16, 2203–2214.
9. Zhang, J.; Tang, H.; Tan, Q. A generalized early warning criterion for the landslide risk assessment: Deformation probability index (DPI). Acta Geotech. 2024, 19, 2607–2627.
10. Jiang, H.; Wang, Y.; Guo, Z. Landslide Displacement Prediction Stacking Deep Learning Algorithms: A Case Study of Shengjibao Landslide in the Three Gorges Reservoir Area of China. Water 2024, 16, 3141.
11. Li, L.; Zhang, M.; Wen, Z. Dynamic prediction of landslide displacement using singular spectrum analysis and stack long short-term memory network. J. Mt. Sci. 2021, 18, 2597–2611.
12. Lin, Z.; Sun, X.; Ji, Y. Landslide Displacement Prediction Model Using Time Series Analysis Method and Modified LSTM Model. Electronics 2022, 11, 1519.
13. Zhang, M.; Han, Y.; Yang, P. Landslide displacement prediction based on optimized empirical mode decomposition and deep bidirectional long short-term memory network. J. Mt. Sci. 2023, 20, 637–656.
14. Jiang, H.; Li, Y.; Zhou, C. Landslide Displacement Prediction Combining LSTM and SVR Algorithms: A Case Study of Shengjibao Landslide from the Three Gorges Reservoir Area. Appl. Sci. 2020, 10, 7830.
15. Zhang, J.; Tang, H.; Wen, T. A Hybrid Landslide Displacement Prediction Method Based on CEEMD and DTW-ACO-SVR-Cases Studied in the Three Gorges Reservoir Area. Sensors 2020, 20, 4287.
16. Wen, H.; Xiao, J.; Xiang, X. Singular spectrum analysis-based hybrid PSO-GSA-SVR model for predicting displacement of step-like landslides: A case of Jiuxianping landslide. Acta Geotech. 2024, 19, 1835–1852.
17. Ma, J.; Liu, X.; Niu, X. Forecasting of Landslide Displacement Using a Probability-Scheme Combination Ensemble Prediction Technique. Int. J. Environ. Res. Public Health 2020, 17, 4788.
18. Luo, W.; Dou, J.; Fu, Y. A Novel Hybrid LMD–ETS–TCN Approach for Predicting Landslide Displacement Based on GPS Time Series Analysis. Remote Sens. 2022, 15, 229.
19. Zhang, Y.; Tang, J.; He, Z. A novel displacement prediction method using gated recurrent unit model with time series analysis in the Erdaohe landslide. Nat. Hazards 2020, 105, 783–813.
20. Zhou, C.; Yin, K.; Cao, Y. Application of time series analysis and PSO–SVM model in predicting the Bazimen landslide in the Three Gorges Reservoir, China. Eng. Geol. 2016, 204, 108–120.
21. Yang, B.; Yin, K.; Lacasse, S. Time series analysis and long short-term memory neural network to predict landslide displacement. Landslides 2019, 16, 677–694.
22. Zhang, Y.; Tang, J.; Cheng, Y. Prediction of landslide displacement with dynamic features using intelligent approaches. Int. J. Min. Sci. Technol. 2022, 32, 539–549.
23. Huang, F.; Cao, Z.; Guo, J. Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. Catena 2020, 191, 104580.
24. Chang, Z.; Du, Z.; Zhang, F. Landslide Susceptibility Prediction Based on Remote Sensing Images and GIS: Comparisons of Supervised and Unsupervised Machine Learning Models. Remote Sens. 2020, 12, 502.
25. Ma, J.; Xia, D.; Guo, H. Metaheuristic-based support vector regression for landslide displacement prediction: A comparative study. Landslides 2022, 19, 2489–2511.
26. Cao, Y.; Yin, K.; Zhou, C. Establishment of Landslide Groundwater Level Prediction Model Based on GA-SVM and Influencing Factor Analysis. Sensors 2020, 20, 845.
27. Nguyen, H.; Bui, X.-N.; Topal, E. Enhancing predictions of blast-induced ground vibration in open-pit mines: Comparing swarm-based optimization algorithms to optimize self-organizing neural networks. Int. J. Coal Geol. 2023, 275, 104294.
28. Lialestani, S.P.M.; Parcerisa, D.; Himi, M.; Abbaszadeh Shahri, A. A novel modified bat algorithm to improve the spatial geothermal mapping using discrete geodata in catalonia-spain. Model. Earth Syst. Environ. 2024, 10, 4415–4428.
29. Iraninezhad, R.; Asheghi, R.; Ahmadi, H. A new enhanced grey wolf optimizer to improve geospatially subsurface analyses. Model. Earth Syst. Environ. 2025, 11, 108.
30. Stork, J.; Eiben, A.E.; Bartz-Beielstein, T. A new taxonomy of global optimization algorithms. Nat. Comput. 2022, 21, 219–242.
31. Agrawal, P.; Abutarboush, H.F.; Ganesh, T.; Mohamed, A.W. Metaheuristic algorithms on feature selection: A survey of one decade of research (2009–2019). IEEE Access 2021, 9, 26766–26791.
32. Li, H.; Xu, Q.; He, Y. Temporal detection of sharp landslide deformation with ensemble-based LSTM-RNNs and Hurst exponent. Geomat. Nat. Hazards Risk 2021, 12, 3089–3113.
33. Gers, F.A.; Schmidhuber, J. LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Trans. Neural Netw. 2001, 12, 1333–1340.
34. Huang, F.; Yin, K.; Zhang, G. Landslide displacement prediction using discrete wavelet transform and extreme learning machine based on chaos theory. Environ. Earth Sci. 2016, 75, 1376.
35. Zeng, T.; Jiang, H.; Li, Q. Landslide displacement prediction based on Variational mode decomposition and MIC-GWO-LSTM model. Stoch. Environ. Res. Risk Assess. 2022, 36, 1353–1372.
36. Zhou, C.; Yin, K.; Cao, Y. A novel method for landslide displacement prediction by integrating advanced computational intelligence algorithms. Sci. Rep. 2018, 8, 7287.
37. Ye, X.; Zhu, H.; Cheng, G. Thermo-hydro-poro-mechanical responses of a reservoir-induced landslide tracked by high-resolution fiber optic sensing nerves. J. Rock Mech. Geotech. Eng. 2023, 16, 1018–1032.
38. Miao, F.; Wu, Y.; Xie, Y. Prediction of landslide displacement with step-like behavior based on multialgorithm optimization and a support vector regression model. Landslides 2017, 15, 475–488.
39. Krkač, M.; Bernat, G.; Arbanas, S. A comparative study of random forests and multiple linear regression in the prediction of landslide velocity. Landslides 2020, 17, 2515–2531.
40. Ye, C.; Wei, R.; Ge, Y. GIS-based spatial prediction of landslide using road factors and random forest for Sichuan-Tibet Highway. J. Mt. Sci. 2021, 19, 461–476.
41. Li, L.; Wang, C.; Wen, Z. Landslide displacement prediction based on the ICEEMDAN, ApEn and the CNN-LSTM models. J. Mt. Sci. 2023, 20, 1220–1231.
42. Ni, P.; Zhang, X.; Zhai, D.; Zhou, Y.; Li, T. Enhancing diversity and robustness of clustering ensemble via reliability weighted measure. Appl. Intell. 2023, 53, 30778–30802.
43. Zeng, P.; Zhang, L.; Li, T.; Sun, X.; Zhao, L.; Dong, X.; Xu, Q. Constructing a region-specific rheological parameter database for probabilistic run-out analyses of loess flowslides. Landslides 2023, 20, 1167–1185.
44. Li, X.; Kong, J.; Wang, Z. Landslide displacement prediction based on combining method with optimal weight. Nat. Hazards 2011, 61, 635–646.
45. Lin, Z.; Ji, Y.; Liang, W. Landslide Displacement Prediction Based on Time-Frequency Analysis and LMD-BiLSTM Model. Mathematics 2022, 10, 2203.
46. Ledda, E.; Fumera, G.; Roli, F. Dropout injection at test time for post hoc uncertainty quantification in neural networks. Inf. Sci. 2023, 645, 119356.
47. Yin, X.; Hu, Q.; Schaefer, G. Open set recognition through monte carlo dropout-based uncertainty. Int. J. Bio-Inspired Comput. 2021, 18, 210–220.
48. Xia, Y.; Zhang, J.; Jiang, T.; Gong, Z.; Yao, W.; Feng, L. HatchEnsemble: An efficient and practical uncertainty quantification method for deep neural networks. Complex Intell. Syst. 2021, 7, 2855–2869.
49. McDermott, P.L.; Wikle, C.K. Bayesian recurrent neural network models for forecasting and quantifying uncertainty in spatial-temporal data. Entropy 2019, 21, 184.
50. Pei, H.; Meng, F.; Zhu, H. Landslide displacement prediction based on a novel hybrid model and convolutional neural network considering time-varying factors. Bull. Eng. Geol. Environ. 2021, 80, 7403–7422.
51. Ma, Z.; Mei, G. Deep learning for geological hazards analysis: Data, models, applications, and opportunities. Earth-Sci. Rev. 2021, 223, 103858.
52. Huang, F.; Zhang, J.; Zhou, C. A deep learning algorithm using a fully connected sparse autoencoder neural network for landslide susceptibility prediction. Landslides 2019, 17, 217–229.
53. Ke, L. Denoising GPS-based structure monitoring data using hybrid EMD and wavelet packet. Math. Probl. Eng. 2017, 2017, 4920809.
54. Yang, X.; Li, J.; Jiang, X. Research on information leakage in time series prediction based on empirical mode decomposition. Sci. Rep. 2024, 14, 28362.
55. Liu, Q.; Lu, G.; Dong, J. Prediction of landslide displacement with step-like curve using variational mode decomposition and periodic neural network. Bull. Eng. Geol. Environ. 2021, 80, 3783–3799.
56. Yang, B.; Yin, K.; Xiao, T.; Chen, L.; Du, J. Annual variation of landslide stability under the effect of water level fluctuation and rainfall in the Three Gorges Reservoir, China. Environ. Earth Sci. 2017, 76, 564.
57. Zhang, P. A novel feature selection method based on global sensitivity analysis with application in machine learning-based prediction model. Appl. Soft Comput. 2019, 85, 105859.
58. Naik, D.L.; Kiran, R. A novel sensitivity-based method for feature selection. J. Big Data 2021, 8, 128.
59. Pizarroso, J.; Portela, J.; Munoz, A. NeuralSens: Sensitivity analysis of neural networks. J. Stat. Softw. 2022, 102, 1–36.
60. Asheghi, R.; Hosseini, S.A.; Saneie, M.; Shahri, A.A. Updating the neural network sediment load models using different sensitivity analysis methods: A regional application. J. Hydroinf. 2020, 22, 562–577.
61. Abbaszadeh Shahri, A.; Maghsoudi Moud, F.; Mirfallah Lialestani, S.P. A hybrid computing model to predict rock strength index properties using support vector regression. Eng. Comput. 2022, 38, 579–594.
62. Luo, C.; Zhu, S.P.; Keshtegar, B.; Niu, X.; Taylan, O. An enhanced uniform simulation approach coupled with SVR for efficient structural reliability analysis. Reliab. Eng. Syst. Saf. 2023, 237, 109377.

Figure 1. (a) A typical RNN module. (b) An LSTM architecture [14,33].

Figure 2. A schematic representation of the prediction framework introduced in this study.

Figure 3. (a) The regional context of the TGRA (Three Gorges Reservoir area) in China; (b) the spatial location of the Bazimen landslide; (c) a satellite-based overview of the Bazimen landslide (visual obtained via Google Earth).

Figure 4. The layout of monitoring points across the Bazimen landslide and its displacement zones. The red line represents the landslide boundary, the purple line indicates the cross-section I–I′, and the blue line denotes the coastline.

Figure 5. The geological profile diagram along section I–I′ illustrating the Bazimen slope failure.

Figure 6. The results of the decomposition of the displacement monitoring data at the ZG111 point.

Figure 7. Monitoring records of rainfall, reservoir water level, and cumulative displacement within the landslide zone.

Figure 8. Variation in rainfall, reservoir level, and cumulative displacement increases at ZG111.

Figure 9. Curves for the predicted displacement at point ZG111. (A) Trend term comparison of ZG111; (B) Periodic term comparison; (C) Total displacement prediction. The red dotted boxes indicate areas of significant variation; the arrows highlight the direction of displacement trends; letters “a” and “b” mark regions with noticeable differences between observed and predicted values.

Figure 10. The fitting performance of the model at point ZG111: observed and predicted displacements during January–December 2010. The red dotted boxes highlight key comparison areas. Labels a′ and b′ indicate regions with obvious differences among models.

Table 1. The stages of deformation of the Bazimen landslide.

Deformation Stage	Time Range	Remarks
1	March 2003–May 2003	The landslide deformation starting stage.
2	June 2003–September 2006	Cracks begin to appear with the first fluctuation of 135 m.
3	October 2006–August 2008	The deformation activity of cracks has intensified with an initial fluctuation of 156 m.
4	September 2008–December 2010	The first fluctuation of 175 m shown by cumulative time–displacement curves with a periodic step-like characteristic.

Table 2. Gray relational degree between the influencing variables and the periodic movement of the Bazimen landslide.

Candidate Factors	Description	ZG111
f₁	the precipitation during the current month	0.68
f₂	the precipitation during the past two months	0.63
f₃	the maximum daily rainfall during the current month	0.65
f₄	the number of rainy days during the current month	0.63
f₅	the maximum continuous number of rainfall days during the current month	0.67
f₆	the average reservoir level during the current month	0.63
f₇	the change in the reservoir level during the current month	0.73
f₈	the change in the reservoir level during the past two months	0.67
f₉	the number of days of reservoir water level decline during the current month	0.68
f₁₀	the accumulated decrease in reservoir water level during the current month	0.69
f₁₁	the number of days over which reservoir water levels rose during the current month	0.64
f₁₂	the accumulated increase in reservoir water levels during the current month	0.63
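The gray relational degrees in Table 2 measure how closely each candidate factor tracks the periodic displacement. The sketch below shows one common formulation of gray relational analysis; it is an illustration under stated assumptions, not a reproduction of the calculation behind Table 2. The min-max normalization, the distinguishing coefficient `rho = 0.5`, the function names, and the input series are all assumptions or hypothetical values.

```python
def normalise(seq):
    """Min-max scale a sequence to [0, 1] (assumed preprocessing step)."""
    lo, hi = min(seq), max(seq)
    return [(v - lo) / (hi - lo) for v in seq]


def gray_relational_grades(reference, factors, rho=0.5):
    """Return the gray relational grade of each factor with respect to `reference`.

    rho is the distinguishing coefficient; 0.5 is a conventional choice and
    is assumed here. The factors must not all coincide with the reference,
    otherwise every deviation is zero and the coefficient is undefined.
    """
    ref = normalise(reference)
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, normalise(seq))]
        for name, seq in factors.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    return {
        name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
        for name, ds in deltas.items()
    }


# Hypothetical monthly values, for illustration only.
periodic_displacement = [5.0, 20.0, 40.0, 15.0, 8.0, 3.0]
candidates = {
    "reservoir_change": [1.0, 4.0, 8.0, 3.0, 2.0, 0.5],
    "rainfall": [80.0, 20.0, 60.0, 10.0, 90.0, 30.0],
}
grades = gray_relational_grades(periodic_displacement, candidates)
print(grades)
```

A grade closer to 1 indicates that the factor's shape follows the reference series more closely, which is how the values in Table 2 can be read.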

Table 3. The result of the collinearity test in ZG111.

Candidate Factors	Tolerance (Initial Input Factor)	VIF (Initial Input Factor)	Tolerance (New Input Factor 1)	VIF (New Input Factor 1)	Tolerance (New Input Factor 2)	VIF (New Input Factor 2)
f₁	0.147	6.820	0.148	6.743	0.233	4.291
f₂	0.216	4.632	0.229	4.365	0.235	4.263
f₃	0.245	4.079	0.261	3.829	/	/
f₄	0.320	3.130	0.330	3.030	0.333	3.007
f₅	0.508	1.968	0.555	1.802	0.573	1.746
f₆	0.592	1.690	0.599	1.671	0.611	1.636
f₇	0.006	179.967	/	/	/	/
f₈	0.246	4.073	0.261	3.837	0.262	3.818
f₉	0.017	59.365	/	/	/	/
f₁₀	0.015	66.722	0.302	3.314	0.317	3.152
f₁₁	0.017	59.286	0.261	3.828	0.263	3.797
f₁₂	0.006	171.982	0.223	4.485	0.226	4.431
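Tolerance and the variance inflation factor (VIF) are reciprocals of each other, so either column of Table 3 can be used to screen for collinearity. The sketch below transcribes the initial values from Table 3 and flags factors above a threshold. The threshold of 10 is a widely used rule of thumb and is an assumption here, since the cutoff used in the study is not stated in this section.

```python
# Initial (tolerance, VIF) pairs transcribed from Table 3 for point ZG111.
initial = {
    "f1": (0.147, 6.820), "f2": (0.216, 4.632), "f3": (0.245, 4.079),
    "f4": (0.320, 3.130), "f5": (0.508, 1.968), "f6": (0.592, 1.690),
    "f7": (0.006, 179.967), "f8": (0.246, 4.073), "f9": (0.017, 59.365),
    "f10": (0.015, 66.722), "f11": (0.017, 59.286), "f12": (0.006, 171.982),
}

VIF_LIMIT = 10.0  # assumed rule-of-thumb threshold, not taken from the source


def vif_from_tolerance(tolerance):
    """VIF is the reciprocal of tolerance (tolerance = 1 - R^2 of the factor
    regressed on the other factors)."""
    return 1.0 / tolerance


flagged = [name for name, (_, vif) in initial.items() if vif > VIF_LIMIT]
print(flagged)
print(round(vif_from_tolerance(0.508), 3))
```

Under this assumed threshold, the flagged factors are the reservoir-level variables f7 and f9 to f12. Table 3 shows that once f7 and f9 are removed, the VIF values of f10, f11, and f12 fall to between 3 and 5, which is consistent with collinearity among that group. Tabulated tolerances are rounded to three decimals, so reciprocals of very small values only approximate the listed VIFs.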

Table 4. Optimal hyperparameter combination of SVRs and LSTMs at point ZG111.

Point	LSTM Number of Layers	LSTM Number of Epochs	LSTM Batch Size	LSTM Number of Neurons	SVR C	SVR Gamma
Trend term of ZG111	3	54	12	22	21	0.5099
Periodic term of ZG111	3	65	28	22	74.0	0.75

Table 5. Prediction accuracy of displacement at point ZG111 using LSTM and SVR models.

Time	Original Displacement (mm)	SVR Predicted Displacement (mm)	SVR Absolute Error (mm)	SVR Relative Error (%)	LSTM Predicted Displacement (mm)	LSTM Absolute Error (mm)	LSTM Relative Error (%)
January 2010	1091.10	1063.57	27.53	2.52	1051.56	39.54	3.62
February 2010	1089.50	1071.65	17.85	1.64	1067.39	22.11	2.03
March 2010	1101.70	1081.30	20.40	1.85	1081.33	20.37	1.85
April 2010	1111.40	1074.04	37.36	3.36	1121.60	10.20	0.92
May 2010	1109.80	1114.29	4.49	0.40	1140.80	31.00	2.79
June 2010	1121.40	1152.87	31.47	2.81	1162.53	41.13	3.67
July 2010	1189.40	1189.36	0.04	0.00	1206.89	17.49	1.47
August 2010	1232.90	1198.16	34.74	2.82	1214.94	17.96	1.46
September 2010	1253.50	1193.15	60.35	4.81	1217.85	35.65	2.84
October 2010	1268.30	1225.88	42.42	3.34	1258.32	9.98	0.79
November 2010	1264.20	1236.41	27.79	2.20	1261.92	2.28	0.18
December 2010	1262.00	1261.16	0.84	0.07	1272.52	10.52	0.83
Min			0.04	0.00		2.28	0.18
Max			60.35	4.81		41.13	3.67
Mean			25.44	2.15		21.52	1.87
RMSE		30.71			24.73
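The summary rows of Table 5 can be checked directly from the monthly values. The sketch below recomputes the RMSE and the mean relative error (MAPE) for both models from the tabulated displacements. The function names are illustrative, and the standard definitions of RMSE and MAPE are assumed.

```python
import math

# Monthly values for ZG111, January to December 2010, transcribed from Table 5 (mm).
observed = [1091.10, 1089.50, 1101.70, 1111.40, 1109.80, 1121.40,
            1189.40, 1232.90, 1253.50, 1268.30, 1264.20, 1262.00]
svr = [1063.57, 1071.65, 1081.30, 1074.04, 1114.29, 1152.87,
       1189.36, 1198.16, 1193.15, 1225.88, 1236.41, 1261.16]
lstm = [1051.56, 1067.39, 1081.33, 1121.60, 1140.80, 1162.53,
        1206.89, 1214.94, 1217.85, 1258.32, 1261.92, 1272.52]


def rmse(obs, pred):
    """Root mean square error in the units of the inputs."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error in percent."""
    return 100.0 * sum(abs(o - p) / o for o, p in zip(obs, pred)) / len(obs)


for name, pred in (("SVR", svr), ("LSTM", lstm)):
    print(f"{name}: RMSE = {rmse(observed, pred):.2f} mm, MAPE = {mape(observed, pred):.2f}%")
```

The recomputed values agree with the RMSE and mean relative error reported in the table to within rounding.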

Table 6. The accuracy of the predicted displacement in periodic and trend terms using LSTM and SVR models at point ZG111.

Model	RMSE in Trend Term (mm)	RMSE in Periodic Term (mm)
SVR	2.30	28.92
LSTM	3.52	23.61

Table 7. The selected hyperparameters and input features for the best-performing prediction model at point ZG111.

Inputs	N_NEIGHBORS	P
f₁, f₃, f₄, f₅, f₆, f₁₀, f₁₁, f₁₂, f₁₃, f₁₅	2	2
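The two hyperparameters in Table 7 correspond to the number of neighbors (2) and the order of the Minkowski distance (2, which is the Euclidean distance). The following sketch shows a KNN classifier with those settings in plain Python. It is a minimal illustration rather than the study's implementation: the training points are hypothetical, the label convention (0 for LSTM, 1 for SVR) is inferred from comparing Table 8 with Table 5, and the tie-breaking rule is an assumption, since an even neighbor count can produce split votes.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance of order p; p = 2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_x, train_y, query, n_neighbors=2, p=2):
    """Classify `query` by majority vote among its nearest training points."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:n_neighbors]).most_common()
    # Assumed tie-break: with a split vote, use the label of the nearest neighbor.
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return ranked[0][1]
    return votes[0][0]


# Hypothetical normalized feature vectors; label 0 = use LSTM, 1 = use SVR.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
train_y = [0, 0, 1, 1]

print(knn_predict(train_x, train_y, (0.15, 0.15)))
print(knn_predict(train_x, train_y, (0.85, 0.85)))
```

In the hybrid model, the query vector would hold the selected input factors for a given month, and the predicted label would determine which base model's forecast is used.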

Table 8. Prediction accuracy at the ZG111 point based on the refined model.

Time	Original Displacement (mm)	Predicted Displacement (mm)	Classification Output Results	Absolute Error (mm)	Relative Error (%)
January 2010	1091.10	1051.56	0	39.54	3.62
February 2010	1089.50	1071.65	1	17.85 ^	1.64
March 2010	1101.70	1081.33	0	20.37 *	1.85
April 2010	1111.40	1121.60	0	10.20 *	0.92
May 2010	1109.80	1114.29	1	4.49 ^	0.40
June 2010	1121.40	1152.87	1	31.47 ^	2.81
July 2010	1189.40	1206.89	0	17.49	1.47
August 2010	1232.90	1198.16	1	34.74	2.82
September 2010	1253.50	1217.85	0	35.65 *	2.84
October 2010	1268.30	1258.32	0	9.98 *	0.79
November 2010	1264.20	1261.92	0	2.28 *	0.18
December 2010	1262.00	1272.52	0	10.52	0.83
Min				2.28	0.18
Max				39.54	3.62
Mean				19.55	1.68
RMSE		23.11

* Better than SVR models. ^ Better than LSTM models.
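Comparing Table 8 with Table 5 suggests that a classification output of 0 selects the LSTM prediction and 1 selects the SVR prediction for that month. The sketch below rebuilds the hybrid series under that inferred convention and recomputes the summary statistics. The label mapping is an inference from the tabulated values, not something stated explicitly in this section.

```python
import math

# Values for ZG111, January to December 2010, transcribed from Tables 5 and 8 (mm).
observed = [1091.10, 1089.50, 1101.70, 1111.40, 1109.80, 1121.40,
            1189.40, 1232.90, 1253.50, 1268.30, 1264.20, 1262.00]
svr = [1063.57, 1071.65, 1081.30, 1074.04, 1114.29, 1152.87,
       1189.36, 1198.16, 1193.15, 1225.88, 1236.41, 1261.16]
lstm = [1051.56, 1067.39, 1081.33, 1121.60, 1140.80, 1162.53,
        1206.89, 1214.94, 1217.85, 1258.32, 1261.92, 1272.52]
labels = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0]  # classification output from Table 8

# Inferred convention: label 1 -> take the SVR forecast, label 0 -> take the LSTM forecast.
hybrid = [s if c == 1 else l for s, l, c in zip(svr, lstm, labels)]

errors = [abs(o - h) for o, h in zip(observed, hybrid)]
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
mape = 100.0 * sum(e / o for e, o in zip(errors, observed)) / len(errors)

# Months in which the selected forecast beats the forecast that was not selected.
better = sum(
    abs(o - h) < abs(o - (l if c == 1 else s))
    for o, h, s, l, c in zip(observed, hybrid, svr, lstm, labels)
)
print(f"RMSE = {rmse:.2f} mm, MAPE = {mape:.2f}%, better choice in {better} of 12 months")
```

The recomputed RMSE and MAPE match the values reported in Table 8, and the count of months with the better choice matches the eight marked rows.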

Table 9. Optimal hyperparameter combination of SVRs and LSTMs at point ZG111 without decomposition.

Point	LSTM Number of Layers	LSTM Number of Epochs	LSTM Batch Size	LSTM Number of Neurons	SVR C	SVR Gamma
Total displacement of ZG111	3	65	28	22	74.0	0.75

Table 10. The accuracy of the predicted displacement in total displacement by a single model at point ZG111 without decomposition.

Model	RMSE of Single Model in Total Displacement (mm)
SVR	386.93
LSTM	453.59

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
