Landslide Susceptibility Prediction Based on a CNN–LSTM–SAM–Attention Hybrid Model

Wu, Honggang; Niu, Jiabi; Li, Yongqiang; Wang, Yinsheng; Qiu, Daohong

doi:10.3390/app15137245

by Honggang Wu ^1,2, Jiabi Niu ^3,*, Yongqiang Li ^1,2, Yinsheng Wang ^1 and Daohong Qiu ^3

^1 China Railway Science Research Institute Group Co., Ltd., Chengdu 610032, China
^2 China Railway Northwest Scientific Research Institute Co., Ltd., Lanzhou 730070, China
^3 Institute of Geotechnical and Underground Engineering, Shandong University, Jinan 250061, China
^* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(13), 7245; https://doi.org/10.3390/app15137245

Submission received: 13 May 2025 / Revised: 16 June 2025 / Accepted: 24 June 2025 / Published: 27 June 2025

(This article belongs to the Section Civil Engineering)

Abstract

Accurate prediction of landslide susceptibility is a key component of disaster risk reduction and early warning systems. Traditional landslide susceptibility prediction methods often struggle to capture the complex nonlinear and spatio-temporal relationships inherent in geospatial data. In this study, we propose a hybrid deep learning model for spatial landslide susceptibility prediction that combines a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, and a Spatial Attention Mechanism (SAM). The model is trained on a dataset of 19,898 samples, constructed from landslide records and 16 influencing factors in Kumamoto Prefecture, Japan. The input dataset is processed in tabular format using Microsoft Excel and includes variables such as topography, meteorology, soil characteristics, and human activity. The proposed model uses the CNN to extract spatial features, the LSTM to model temporal dependencies, and the SAM to weight features dynamically. Experimental results demonstrate that the CNN–LSTM–SAM–Attention model significantly outperforms traditional machine learning approaches in terms of accuracy, precision, recall, F1 score, ROC–AUC, and PR–AUC. This improvement is attributed to the model’s enhanced capability to capture complex spatio-temporal patterns and to weight critical spatial features dynamically through the integrated SAM. This study highlights the potential of deep learning-based approaches for improving the reliability of spatial landslide susceptibility prediction in complex terrain and dynamic climatic conditions.

Keywords:

landslide susceptibility prediction; deep learning; CNN-LSTM-SAM-Attention model; model performance evaluation

1. Introduction

In the five years between 2019 and 2023, there were 28,109 geologic disasters in China, 16,209 of which were landslide disasters, according to statistical data from the China Statistical Yearbook 2024 edition. Since landslides are one of the most damaging geologic disasters and result in a significant number of fatalities and financial losses each year, landslide prediction is crucial for minimizing these losses [1].

Landslide prediction techniques include physical mechanics-based prediction, conventional statistical prediction, and machine learning model prediction. Techniques based on physical mechanics require additional geomechanical parameters, which are difficult to gather; they are also more computationally complex and have more stringent usage requirements [2]. In the field of landslide prediction, machine learning and conventional statistical techniques are currently gaining traction [3,4,5]. Weight of evidence (WOE), logistic regression, generalized linear models (GLM), generalized additive models (GAM), and multivariate statistical analysis are examples of traditional statistical prediction models that rely on historical data and preset thresholds [6,7]. While these models are simple to use, they are unable to capture the complex dependencies and nonlinear relationships present in landslide data [8]. Machine learning techniques such as decision trees [9], support vector machines (SVMs) [10], and random forests [11,12] address these shortcomings. However, these techniques still have difficulty handling the high dimensionality and temporal dynamics of the data, and although their predictive ability is generally superior to that of traditional statistical models, the difference in predictive performance has been very small in a number of real-world projects [13,14,15].

Deep learning techniques, known for their ability to automatically extract significant features from large and complex datasets, have become increasingly prevalent in landslide prediction models in recent years. Various studies have demonstrated the effectiveness of deep learning in enhancing prediction accuracy by handling both spatial and temporal data. For instance, Liu et al. (2022) compared the performance of convolutional neural networks (CNNs) and conventional machine learning methods in landslide susceptibility mapping, showing that CNN models outperform traditional approaches due to their ability to capture intricate spatial patterns [16,17,18]. Similarly, Nguyen et al. (2022) explored the potential of recurrent neural networks (RNNs), particularly Long Short-Term Memory (LSTM) networks, in modeling temporal dependencies and forecasting landslide occurrences triggered by rainfall and earthquakes. Additionally, hybrid models combining CNN and LSTM have been proposed to address the spatiotemporal complexity of landslide data, with promising results in both spatial feature extraction and temporal dependency modeling [17,18,19,20,21].

Long Short-Term Memory (LSTM) networks have shown notable advantages in modeling and forecasting landslide displacement due to their capability to capture long-term temporal dependencies and nonlinear relationships in sequential data. Unlike traditional neural networks, LSTM can retain and selectively forget past information, making it highly suitable for analyzing complex hydrometeorological conditions leading to landslides. Fang et al. (2021) applied LSTM to flood susceptibility prediction and found it outperformed traditional machine learning models in both sensitivity and specificity, thanks to its sequential data modeling capacity [22]. Similarly, Liu et al. (2025) used LSTM to predict landslide displacement in the Italian Alps by incorporating rainfall, temperature, humidity, and snowmelt data. The results showed that LSTM models effectively captured the delayed and cumulative effects of snowmelt and precipitation, outperforming other configurations in both early-stage and long-term prediction scenarios [23]. Overall, LSTM’s ability to model time-series data makes it a powerful tool for landslide forecasting, especially in environments where landslide movement is influenced by multiple interrelated factors such as rainfall, snow cover, and seasonal temperature variations.

Convolutional Neural Networks (CNNs) have gained prominence in landslide susceptibility mapping due to their powerful capabilities in spatial feature extraction and image interpretation. CNNs can automatically learn and extract complex features from satellite imagery and digital elevation models (DEMs), making them highly effective in capturing nonlinear and spatially heterogeneous triggering factors such as topographic curvature, slope, aspect, weathering, and hydrological conditions. For example, in the Gorzineh-khil region of Iran, a 15-layer CNN model integrating both topographic and hydrological factors achieved superior predictive performance—79% accuracy, 73% precision, 75% recall, and 77% F1-score—significantly outperforming conventional models such as SVM (70%), k-NN (65%), and decision trees (60%) [24]. Similarly, in a study conducted in Icheon, South Korea, CNNs combined with metaheuristic optimization algorithms like Grey Wolf Optimizer (GWO) and Imperialist Competitive Algorithm (ICA) demonstrated enhanced performance in susceptibility mapping by improving both spatial generalization and predictive precision [25]. These findings underline CNN’s robustness in processing high-dimensional data and modeling complex spatial patterns, making it a powerful deep learning approach for landslide prediction, particularly in regions with diverse geomorphological and hydrometeorological conditions.

Spatial Attention Mechanisms (SAM) have shown notable benefits in landslide detection and prediction by enhancing feature focus, reducing background noise, and improving boundary precision. Amankwah et al. (2022) demonstrated that attention-based models like STANet and SNUNet outperform traditional CNNs in segmentation accuracy [26]. Moghimi et al. (2024) combined SAM with LSTM, improving landslide susceptibility mapping in complex terrains [27]. Wei et al. (2022) proposed OC-ACNN, integrating statistical knowledge and attention to boost spatial prediction accuracy [28]. Zhang et al. (2022) introduced a multi-head self-attention LSTM model that captured abrupt displacement patterns more effectively [29]. Overall, SAM enhances landslide modeling by capturing key spatial-temporal features and improving generalization in varied and complex environments.

However, a critical limitation persists in existing studies: they predominantly apply CNN and LSTM in isolation, and therefore fail to capture the dynamic coupling of spatiotemporal features that drives landslide initiation [30,31,32]. Landslide triggering is frequently caused by complex multi-factor spatiotemporal interactions, including geological activity, rainwater penetration, and terrain stability, which demand integrated spatiotemporal analysis. To directly address this limitation, this study proposes a CNN–LSTM–SAM–Attention landslide prediction model. This integrated architecture explicitly captures the dynamic coupling of spatiotemporal features by combining spatial feature extraction (CNN), temporal dependency modeling (LSTM), and dynamic feature weighting (SAM-Attention mechanism) within a single spatio-temporal analytic framework [30,31,32]. The model first uses a CNN to extract important features from the original data, and then uses an LSTM to identify long-term dependencies in time-series data. Lastly, the self-attention mechanism highlights the important spatiotemporal aspects by weighting the LSTM’s output [33,34].

This study’s primary contributions are as follows: (1) It integrates CNN, LSTM, and the SAM self-attention mechanism within a single, end-to-end architecture. This specifically overcomes the key limitation of existing methods identified above (their inability to model the dynamic coupling of spatiotemporal features) by enabling simultaneous extraction of spatial patterns, modeling of temporal dependencies, and dynamic weighting of salient features. (2) It compiles information about the 2020 Kumamoto Prefecture landslides, builds a database, and uses it to train the proposed model and generate predictions. (3) It trains conventional machine learning models on the same dataset and compares their evaluation indices with those of the proposed model. Comparing evaluation indices such as accuracy, precision, recall, F1 score, ROC–AUC, and PR–AUC shows that the CNN–LSTM–SAM–Attention model can capture the spatio-temporal characteristics of the data, handle the complex and nonlinear relationships in geospatial and temporal data, and thereby improve the accuracy of landslide prediction [35,36].

The remainder of the paper is structured as follows. Section 2 presents the CNN–LSTM–SAM–Attention model’s process framework and each module’s underlying principles. Section 3 describes the engineering background of the landslides in Kumamoto Prefecture and constructs a database of landslide samples. Section 4 details the model’s training procedure and results, and analyzes and assesses the model in light of the findings. In Section 5, a traditional machine learning model is used to predict landslides on the same landslide sample database, and its performance is compared with that of the CNN–LSTM–SAM–Attention model proposed in this paper. The conclusion and prospects for further study are presented in Section 6.

2. Research Methodology (CNN–LSTM–SAM–Attention Model)

By integrating spatial feature extraction, time series modeling, and an attention-weighting mechanism, the CNN–LSTM–SAM–Attention model presented in this paper aims to increase the accuracy of landslide prediction. It consists of a convolutional neural network (CNN), a long short-term memory network (LSTM), and a spatial attention mechanism (SAM). With the help of the spatial attention mechanism, the model can better forecast outcomes by highlighting significant aspects and accurately capturing the intricate spatial and temporal connections found in landslide data.

2.1. Principles of Convolutional Neural Networks (CNN)

The Convolutional Neural Network (CNN) is a deep learning model that is particularly good at handling images and spatiotemporal data. Fundamentally, it uses convolution operations to extract local features from the input data. A basic CNN consists of an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer. The input data is represented as a 2D feature map: an m × n matrix in which each element holds a feature value. The three core components are the convolutional layer, the pooling layer, and the fully connected layer. The convolutional layer contains multiple convolutional kernels. A set of kernels captures local relationships among landslide feature vectors, and stacking additional convolutional layers lets the network build progressively more complex features from low-level ones. The pooling layer, in contrast, downsamples the feature map to reduce its dimensions and the quantity of feature information, which improves the model’s computational efficiency and lowers the risk of overfitting. The fully connected layer combines the local features extracted by the convolutional and pooling layers into a global feature representation and carries out the final classification or regression. By learning weights and biases, it captures the intricate patterns and correlations in the input data.

Rich spatial information, including topography, climate, and soil characteristics, is typically present in the raw data used for landslide prediction. The CNN uses convolution to extract useful spatial features from the raw data. Let $X \in R^{H \times W \times C}$ be the input data, where H, W, and C stand for the data’s height, width, and number of channels, respectively. The output feature map $Y \in R^{H^{'} \times W^{'} \times K}$ of the convolution operation is defined as

Y_{i, j, k} = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} \sum_{c = 0}^{C - 1} X_{i + m, j + n, c} \cdot W_{m, n, c, k} + b_{k}

(1)

where $Y_{i, j, k}$ is the output feature following the convolution process, $W_{m, n, c, k}$ is the convolution kernel (of spatial size M × N with K output channels, not to be confused with the width W), and $b_{k}$ is the bias term. By stacking convolution and pooling operations, the CNN can efficiently capture the local spatial properties of the input data and produce more informative inputs for time series modeling.
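Equation (1) can be evaluated directly with a few lines of NumPy. The sketch below is only an illustration and is not taken from the source: it assumes stride 1 and no padding (so the output has height H − M + 1 and width W − N + 1), and the function name `conv2d_valid` and the toy input are hypothetical.

```python
import numpy as np


def conv2d_valid(x, w, b):
    """Direct evaluation of Eq. (1), assuming stride 1 and no padding.

    x: input of shape (H, W, C)
    w: kernels of shape (M, N, C, K)
    b: biases of shape (K,)
    Returns y of shape (H - M + 1, W - N + 1, K).
    """
    H, W, C = x.shape
    M, N, C_w, K = w.shape
    assert C == C_w, "input and kernel channel counts must match"
    H_out, W_out = H - M + 1, W - N + 1
    y = np.zeros((H_out, W_out, K))
    for i in range(H_out):
        for j in range(W_out):
            patch = x[i:i + M, j:j + N, :]  # (M, N, C) window anchored at (i, j)
            # Sum over m, n, c for every kernel k at once, then add the bias b_k.
            y[i, j, :] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2])) + b
    return y


# Toy example: a 4 x 4 single-channel input and one 2 x 2 kernel of ones,
# so each output value is the sum of a 2 x 2 window of the input.
x = np.arange(16, dtype=float).reshape(4, 4, 1)
w = np.ones((2, 2, 1, 1))
b = np.zeros(1)
y = conv2d_valid(x, w, b)
print(y[:, :, 0])
```

The explicit loops mirror the indices i and j of Equation (1); deep learning frameworks compute the same quantity with optimized routines.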

2.2. Principles of Long Short-Term Memory Networks (LSTM)

The Long Short-Term Memory network (LSTM) is a special kind of Recurrent Neural Network (RNN). An RNN can relate information from past data to the current data; however, it suffers from vanishing or exploding gradients when the sequence length exceeds a certain point. To capture long-term dependencies in time series data, the LSTM regulates the information flow through three gating units (input gate, forget gate, and output gate). This design mitigates the vanishing and exploding gradient problems of RNNs on long sequences and preserves a hidden state, or memory, of previous inputs.

The cell state $C_{t}$ and hidden state $h_{t}$ make up the LSTM’s basic structure. Through three gates, each LSTM cell regulates how its state is updated and output. In particular, the forget gate determines how much of the previous cell state is discarded, the input gate determines how strongly the current input affects the cell state, and the output gate determines what is output at the current step. The following equations describe the LSTM’s update process:

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(2)

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(3)

\tilde{C_{t}} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})

(4)

C_{t} = f_{t} ⊙ C_{t - 1} + i_{t} ⊙ \tilde{C_{t}}

(5)

o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})

(6)

h_{t} = o_{t} ⊙ \tanh (C_{t})

(7)

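where $f_{t}$, $i_{t}$, and $o_{t}$ are the forget gate, input gate, and output gate, respectively; $\tilde{C_{t}}$ is the candidate cell state; $C_{t}$ is the cell state; $h_{t}$ is the hidden state; $W$ and $b$ with matching subscripts are the weight matrices and bias vectors of each gate; $σ$ is the sigmoid activation function; and $⊙$ denotes element-wise multiplication.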
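To make the update rule concrete, the following NumPy sketch applies Equations (2)–(7) once per time step. It is a minimal illustration under stated assumptions, not the authors’ implementation: the helper names `lstm_step` and `init_params`, the random initialization, and the input size of 4 are hypothetical, while the hidden size of 6 follows the number of LSTM units reported later in the paper.

```python
import numpy as np


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update following Eqs. (2)-(7).

    params maps each of "f", "i", "c", "o" to a (W, b) pair, where W has
    shape (hidden, hidden + input) and b has shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    (W_f, b_f), (W_i, b_i) = params["f"], params["i"]
    (W_c, b_c), (W_o, b_o) = params["c"], params["o"]
    f_t = sigmoid(W_f @ z + b_f)            # Eq. (2): forget gate
    i_t = sigmoid(W_i @ z + b_i)            # Eq. (3): input gate
    c_tilde = np.tanh(W_c @ z + b_c)        # Eq. (4): candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde      # Eq. (5): new cell state
    o_t = sigmoid(W_o @ z + b_o)            # Eq. (6): output gate
    h_t = o_t * np.tanh(c_t)                # Eq. (7): new hidden state
    return h_t, c_t


def init_params(n_input, n_hidden, seed=0):
    """Hypothetical small random initialization, for illustration only."""
    rng = np.random.default_rng(seed)
    shape = (n_hidden, n_hidden + n_input)
    return {g: (rng.normal(scale=0.1, size=shape), np.zeros(n_hidden))
            for g in "fico"}


# Run a short made-up sequence: 5 time steps, 4 input features, 6 hidden units.
n_input, n_hidden = 4, 6
params = init_params(n_input, n_hidden)
h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for x_t in np.random.default_rng(1).normal(size=(5, n_input)):
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

Because the cell state is updated additively in Equation (5), information can persist across many steps whenever the forget gate stays close to 1.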

By propagating temporal information through the sequence and capturing long-term temporal dependencies, the LSTM’s memory capacity allows it to handle time-series data in landslide prediction tasks. This helps the model follow how the data change over time, increasing prediction accuracy.

2.3. Spatial Attention Mechanism (SAM) Principles

The Spatial Attention Mechanism (SAM) is an adaptive mechanism that improves the model’s capacity to represent features at significant spatial locations by giving various features in the model varying weights to emphasize significant information and suppress unimportant ones. The following is the spatial attention mechanism’s concept and implementation:

(1): Input feature map: Assume that the input feature map is $F \in R^{H \times W \times C}$, where H is the height, W is the width, and C is the number of channels.
(2): Pooling operations: Max pooling is applied to the input feature map along the channel dimension to create a feature map $F_{max}$ of size H × W × 1. Average pooling is applied along the channel dimension in the same way to create a second feature map $F_{avg}$ of size H × W × 1. Pooling along the channel dimension keeps one value per spatial location, which is what the H × W attention map in step (4) requires.
(3): Fusion of pooling results: To obtain a complete feature representation, fuse $F_{max}$ and $F_{avg}$. This can be achieved by concatenation or by addition.
(4): Attention weight computation: To create a spatial attention weight map $A \in R^{H \times W}$, the fused features are passed through a convolutional layer or another transformation function.
(5): Applying attention weights: The computed attention weight map A is applied to the original feature map to weight the features at each spatial location. This can be achieved by element-wise multiplication, i.e., $\tilde{F} = F \cdot A$, where · denotes element-wise multiplication with A shared across all channels.
(6): Output weighted feature map: The weighted feature map $\tilde{F}$ is used as the input to subsequent network modules for tasks such as classification and detection.

Specifically, given an input feature map $F \in R^{H \times W \times C}$, the spatial attention mechanism generates an attention weight matrix $M_{s} \in R^{H \times W}$ by a convolution operation indicating the importance of each spatial location. The computational process can be represented as

M_{s} (F) = σ (\mathrm{Conv}_{2} (F))

(8)

where $\mathrm{Conv}_{2}$ is a 2D convolution operation, $σ$ is the sigmoid activation function, and $M_{s} (F)$ is the generated attention weight matrix.

Following the procedures above, the Spatial Attention Mechanism (SAM) draws attention to crucial regions and emphasizes the features at significant spatial locations in the input feature maps, enhancing the model’s prediction performance.
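The procedure can be sketched in NumPy as follows. This is an illustrative reading rather than the authors’ code. It assumes that the max and average pooling run along the channel axis, so that each pooled map keeps one value per spatial location (as an H × W attention map requires), that the two maps are fused by concatenation, and that the convolution uses zero padding to preserve the spatial size. The function name `spatial_attention` and the 3 × 3 kernel are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def spatial_attention(f, kernel, bias=0.0):
    """Spatial attention over a feature map f of shape (H, W, C).

    kernel: convolution weights of shape (k, k, 2) with odd k, applied to the
            concatenated [max, average] maps with zero padding.
    Returns (weighted feature map of shape (H, W, C), attention map of shape (H, W)).
    """
    H, W, _ = f.shape
    f_max = f.max(axis=2)                       # max pooling over channels -> (H, W)
    f_avg = f.mean(axis=2)                      # average pooling over channels -> (H, W)
    fused = np.stack([f_max, f_avg], axis=2)    # fusion by concatenation -> (H, W, 2)

    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(fused, ((pad, pad), (pad, pad), (0, 0)))  # zero padding
    logits = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            logits[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel) + bias

    a = sigmoid(logits)                         # attention weights in (0, 1)
    return f * a[:, :, None], a                 # same weight for every channel


# Made-up example: a 5 x 5 feature map with 8 channels and a random 3 x 3 kernel.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 5, 8))
weighted, attention = spatial_attention(features, rng.normal(size=(3, 3, 2)))
print(weighted.shape, attention.shape)
```

In a trained network the kernel is learned jointly with the other layers; here it is random only so that the example runs.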

2.4. Process Framework

The proposed CNN–LSTM–SAM–Attention framework for landslide prediction integrates four modules: CNN (feature extraction), SAM (spatial attention), LSTM (temporal modeling), and Classification. The overall workflow is depicted in Figure 1. Each module is detailed in subsequent subsections.

2.4.1. CNN Module

As illustrated in Figure 2, the CNN module architecture is designed to extract hierarchical features from raw sequences.

Function: Hierarchical feature extraction from raw sequences.

Selection justification: Convolutional layers efficiently capture local spatial patterns in sensor data. The multi-scale design (progressively increasing receptive fields) models both short- and mid-range dependencies critical for early landslide indicators.

2.4.2. SAM Module

Figure 3 provides a visual representation of the SAM module architecture, demonstrating the dynamic feature weighting mechanism based on spatial significance.

Function: Dynamic feature weighting based on spatial significance.

Selection justification: Attention mechanisms suppress noise in geospatial data by focusing computational resources on critical regions (e.g., unstable slopes). This adaptive weighting improves robustness against irrelevant terrain variations.

2.4.3. LSTM Module

The LSTM module architecture, shown in Figure 4, focuses on modeling long-term temporal dependencies critical for landslide prediction.

Function: Long-term temporal dependency modeling.

Selection justification: The LSTM module effectively models precursor event sequences where timing is critical. Its state compression mechanism balances memory retention with overfitting prevention by reducing dimensionality while preserving essential temporal patterns.

2.4.4. Classification Module

Figure 5 depicts the classification module, which maps the extracted features to a risk probability using a softmax classifier, providing probabilistic outputs for decision-making.

Function: Risk probability mapping.

Selection justification: A minimal classifier prevents overfitting given limited landslide event data. Softmax provides probabilistic outputs essential for risk-tiered decision making.

3. Engineering Background and Database Construction

In this paper, a database for landslide prediction is constructed using the landslide disaster data of Kumamoto Prefecture in 2020, combined with the actual geographic and meteorological data, as the input data for the CNN–LSTM–SAM–Attention model. The process of database construction and processing includes data cleaning, normalization, dimensionality reduction, and sample equalization to improve the predictive ability of the model.

3.1. Introduction to the Background of the Project

In July 2020, Kumamoto Prefecture was hit by a historically rare persistent rainstorm. The disaster was triggered by the stagnation of the Baiu (plum rain) front, with cumulative rainfall exceeding 800 mm in 72 h, locally reaching the extreme rainfall standard of “once in decades” as defined by the Japan Meteorological Agency (JMA). The heavy rainfall triggered large-scale landslides and flooding, resulting in more than 50 deaths and 20 missing people, as well as the destruction of more than 2000 houses and severe damage to infrastructure (e.g., roads and bridges). Kumamoto Prefecture is located in the central part of Kyushu, with predominantly mountainous and hilly topography and complex geologic conditions; typhoons and torrential rainfall have caused many landslides there in the past. The large amount of multidimensional data on topography, meteorology, soil, and human activities from this disaster provides a valuable resource for landslide disaster prediction and research. Based on these data, this study constructs a comprehensive sample database to provide rich input for model training.

3.2. Data Structure and Type

The dataset used in this study consists of tabular data, with each row representing a unique geographical location and time point, and each column representing a feature relevant to landslide prediction. The features include topographical, meteorological, and geological data, such as slope, distance from fault lines, and soil properties. This tabular structure was chosen for ease of processing and compatibility with machine learning algorithms.

A sample of the dataset, as shown in Table 1 below, is provided to illustrate the structure and format of the data used in the model. The data includes 16 features, such as slope and proximity to fault lines, which are key factors in landslide prediction.

3.3. Data Processing and Dataset Construction

To ensure data quality, consistency, and suitability for model training, the raw data underwent a structured preprocessing pipeline before the final training dataset was assembled. The primary goal was to create a well-structured, standardized, balanced, and optimized dataset ready for input into the landslide prediction model. The resulting training dataset consists of 19,898 samples, each characterized by 16 optimized features, with balanced representation of landslide (positive) and non-landslide (negative) classes.

3.3.1. Data Cleaning

Missing values within the raw data were addressed using appropriate techniques such as mean-filling or interpolation. Outliers were identified and corrected or filtered based on predefined, physically reasonable thresholds for each variable (e.g., maximum plausible slope angle, rainfall intensity).

This step is crucial for maintaining data integrity and preventing spurious results caused by incomplete or erroneous entries. Handling missing values ensures all samples can be used, while outlier treatment prevents extreme, potentially unrepresentative values from unduly influencing model training.

3.3.2. Data Standardization

All numerical features (e.g., distance from river, slope angle, rainfall amount) were transformed using Z-score standardization. This technique rescales each feature to have a mean (μ) of 0 and a standard deviation (σ) of 1.

Features naturally exhibit different units and scales (e.g., meters for distance, degrees for slope, mm for rainfall). Standardization eliminates these magnitude disparities. This prevents features with larger native ranges from dominating the model’s learning process solely due to their scale, leading to more stable and efficient training, and ensuring all features contribute proportionally to the model’s objective function.
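Z-score standardization maps each value x of a feature to z = (x − μ) / σ. The short sketch below illustrates this on made-up numbers. It assumes, as is common practice but not stated in the source, that μ and σ are estimated on the training split only and then reused for the other splits; the function names and sample values are hypothetical.

```python
import numpy as np


def fit_standardizer(x_train):
    """Estimate the per-feature mean and standard deviation from training data."""
    mu = x_train.mean(axis=0)
    sigma = x_train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)   # guard against constant columns
    return mu, sigma


def standardize(x, mu, sigma):
    """Apply z = (x - mu) / sigma feature by feature."""
    return (x - mu) / sigma


# Made-up example with two features on very different scales
# (e.g. a slope in degrees and a distance in meters).
x_train = np.array([[10.0, 200.0],
                    [20.0, 400.0],
                    [30.0, 600.0]])
mu, sigma = fit_standardizer(x_train)
z_train = standardize(x_train, mu, sigma)
z_new = standardize(np.array([[20.0, 400.0]]), mu, sigma)  # reuse training statistics
print(z_train)
```

After the transformation both columns have mean 0 and standard deviation 1 on the training data, so neither dominates purely because of its units.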

3.3.3. Feature Optimization

The Minimum Redundancy Maximum Relevance (mRMR) algorithm was employed to rigorously evaluate feature importance for landslide prediction. mRMR quantifies both the relevance of each feature to landslide occurrence and the redundancy among features. Analysis confirmed that all 16 original features exhibited significant predictive relevance and acceptably low mutual redundancy. Consequently, the complete feature set was retained for model training.

3.3.4. Sample Equalization (Class Balancing)

Landslide prediction inherently presents a severe class imbalance problem, with significantly fewer landslide occurrences (positive samples) compared to non-occurrences (negative samples). To address this, the SMOTE (Synthetic Minority Over-sampling Technique) algorithm was applied exclusively to the training set. SMOTE generates synthetic examples for the minority landslide class by interpolating between existing real minority samples.

This process structures the training dataset to have an approximately equal proportion of landslide and non-landslide samples. Severe class imbalance biases models towards predicting the majority class (non-landslide), as simply predicting “no landslide” yields high accuracy but fails the core prediction task. By balancing the classes within the training set using SMOTE, the model is exposed to sufficient examples of landslides during learning, enabling it to learn the characteristics of both classes effectively and reducing prediction bias towards the majority class.
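The interpolation idea behind SMOTE can be illustrated with a simplified NumPy sketch. This is not the implementation used in the study and omits details of library versions of the algorithm; the function name `smote_sketch`, the neighbor count, and the sample points are hypothetical. Each synthetic sample is placed at a random position on the line segment between a real minority sample and one of its nearest minority neighbors.

```python
import numpy as np


def smote_sketch(x_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbor interpolation.

    x_min: minority-class samples, shape (n, d), with n >= 2.
    """
    rng = np.random.default_rng(seed)
    x_min = np.asarray(x_min, dtype=float)
    n = len(x_min)
    k = min(k, n - 1)

    # Pairwise squared distances between minority samples; a sample is never
    # its own neighbor, so the diagonal is set to infinity.
    d = ((x_min[:, None, :] - x_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]            # (n, k) nearest neighbors

    base = rng.integers(0, n, size=n_new)                # random base samples
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                         # position along the segment
    return x_min[base] + gap * (x_min[chosen] - x_min[base])


# Made-up minority class with four samples in two dimensions.
x_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sketch(x_minority, n_new=6, k=2)
print(synthetic)
```

Because every synthetic point lies between two real minority samples, the new samples stay inside the region already occupied by the minority class. Applying the procedure to the training set only, as described above, keeps the validation and test sets free of synthetic data.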

After the above steps, the finally constructed database contains multi-dimensional and multi-temporal landslide prediction data, covering a variety of factors such as topography, meteorology, soil, etc., which provides high-quality data support for model training and prediction. The reasonable construction and preprocessing steps of this database lay the foundation for improving the performance of the CNN–LSTM–SAM–Attention model in landslide prediction.

4. Model Realization and Result Analysis

4.1. Experimental Parameter Optimization and Training Process

4.1.1. Parameter Calibration

Model hyperparameters were optimized as follows:

Learning rate: Initial 0.001 with decay factor 0.1 per 800 iterations (selected from {0.01, 0.001, 0.0001} via grid search)
Regularization: L2 factor 0.0001 optimized via Bayesian search (range: 0.00001 to 0.001)
LSTM units: 6 units validated through 5-fold cross-validation.
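As an illustration of the learning-rate setting, the sketch below assumes that “decay factor 0.1 per 800 iterations” means a piecewise-constant step decay, i.e. the rate is multiplied by 0.1 after every 800 iterations. The source does not spell out the exact schedule, and the function name is hypothetical.

```python
def step_decay_lr(iteration, initial_lr=1e-3, decay_factor=0.1, step=800):
    """Learning rate after `iteration` updates under step decay."""
    return initial_lr * decay_factor ** (iteration // step)


# The rate stays at 0.001 for iterations 0-799, then drops by a factor of 10.
for it in (0, 799, 800, 1600):
    print(it, step_decay_lr(it))
```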

4.1.2. Validation Process

Data partitioning: Stratified 60–10–30% split (training-validation-testing)
Early stopping: Triggered after 5 epochs of non-improving validation loss
Performance tracking: Validation metrics monitored every 30 iterations
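The partitioning and stopping rule above can be sketched as follows. This is a minimal illustration under assumptions: the helper names `stratified_split` and `should_stop` are hypothetical, the split is done per class so that each subset keeps the original class ratio, and early stopping is read as “stop once the validation loss has not improved on its earlier best for 5 consecutive epochs”.

```python
import numpy as np


def stratified_split(y, fractions=(0.6, 0.1, 0.3), seed=0):
    """Return (train, validation, test) index arrays with per-class proportions kept."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    parts = [[], [], []]
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))   # shuffle within the class
        n_train = int(round(fractions[0] * len(idx)))
        n_val = int(round(fractions[1] * len(idx)))
        parts[0].append(idx[:n_train])
        parts[1].append(idx[n_train:n_train + n_val])
        parts[2].append(idx[n_train + n_val:])               # remainder goes to test
    return tuple(np.concatenate(p) for p in parts)


def should_stop(val_losses, patience=5):
    """True once the last `patience` epochs all failed to beat the earlier best loss."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before


# Made-up labels: 80 negative and 20 positive samples.
labels = np.array([0] * 80 + [1] * 20)
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

With these made-up labels, each of the three subsets contains 20% positive samples, matching the full set.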

4.1.3. Training Outcome

Training converged with:

Validation accuracy: 91.2%
Validation loss: 0.214
<2% divergence between training/validation accuracy
Validation results demonstrate effective generalization capability (Figure 6).

4.2. Results and Analysis

4.2.1. Analysis of Training Dynamics and Convergence Patterns

Figure 6 details the evolution of training and validation metrics throughout the iterative process. The upper plot displays accuracy progression: Training accuracy increases rapidly within the first 200 iterations. Validation accuracy closely follows this trend without significant deviation. Both curves demonstrate stable convergence, maintaining levels above 90% after 800 iterations with only marginal final discrepancy.

The lower loss curve reveals finer optimization dynamics. Training loss decreases sharply during the initial 200 iterations, while validation loss exhibits synchronous reduction. After 400 iterations, both losses enter a phase of gradual and stable decline, ultimately converging near 0.2. Although the final training/validation loss stabilizes around 0.2, this value does not indicate model deficiency: Its consistency with high accuracy convergence confirms the model has reached its optimization limit, with no compromise to practical predictive capability.

The synchronous movement of curves—particularly the validation loss decreasing consistently with training loss without significant divergence—validates the effectiveness of regularization. Minor periodic fluctuations in validation metrics reflect normal sensitivity to data shuffling but demonstrate immediate recovery, indicating robust model behavior. Collectively, these patterns confirm an optimized learning process, exhibiting neither underfitting nor overfitting.

4.2.2. Classification Performance

The confusion matrices for the training data and test data are shown in Figure 7. The prediction accuracy reaches 93.35% on the training set and 91.66% on the test set, a gap of about 1.7 percentage points that indicates good generalization. This observation is further corroborated by the training curves (Figure 6), which show close convergence and synchronized behavior between the training and validation sets for both accuracy and loss metrics, confirming the absence of overfitting.

Comparison of key indicators:

As this study focuses on a binary classification task for landslide prediction, the following metrics are calculated:

Accuracy: The proportion of correctly predicted samples (both positive and negative classes) relative to the total number of samples. This metric provides an intuitive measure of the model’s overall predictive capability.

Precision: The ratio of true positive samples among all samples predicted as positive (i.e., “the accuracy of landslide predictions”). It reflects the model’s ability to “reduce false alarms.” A high precision indicates high reliability in the model’s landslide predictions.

Recall: The ratio of actual positive samples that are correctly predicted as positive (i.e., “the coverage of detected landslides”). It evaluates the model’s capability to “reduce missed detections.” A high recall suggests that the model is more likely to capture potential landslide risks.

F1-Score: The harmonic mean of precision and recall, which balances the trade-off between these two metrics. It provides a comprehensive evaluation of the model’s performance on imbalanced datasets, avoiding the limitations of relying on a single metric. In landslide prediction, a high F1-score indicates that the model achieves a better balance between reducing false alarms and missed detections, making it one of the core optimization objectives.
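The four definitions above can be written directly in terms of confusion-matrix counts. The following is a minimal sketch in plain Python. The counts passed to `classification_metrics` are hypothetical and serve only to exercise the formulas; the function names are illustrative and are not taken from the authors' code. The last lines apply the harmonic-mean formula to the test-set precision and recall quoted in this section.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted landslides that are real
    recall = tp / (tp + fn)      # share of real landslides that are detected
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to exercise the formulas
acc, prec, rec, f1 = classification_metrics(tp=80, fp=10, fn=20, tn=90)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")

# Test-set precision (92.73%) and recall (87.64%) as quoted in the text
f1_test = f1_from(0.9273, 0.8764)
print(f"F1 from quoted precision and recall: {f1_test:.4f}")
```

With the quoted precision and recall, the harmonic mean comes out at roughly 0.901, which agrees with the reported F1-score of 0.9012 up to rounding of the inputs.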

The landslide prediction model proposed in this study demonstrates exceptional performance on both the training and testing sets. A detailed comparison of key metrics is presented in Table 2. The model’s performance on the testing set compared to the training set highlights the following advantages:

Superior Generalization Capability: The testing set accuracy (91.66%) differs from the training set accuracy (93.35%) by less than 2 percentage points, indicating no overfitting and supporting the model’s applicability to real-world scenarios.
Balanced Precision and Coverage: On the testing set, the model achieves high precision (92.73%) and recall (87.64%), with an F1-score of 0.9012. This reflects an effective trade-off between reducing false alarms and missed detections, aligning with disaster prevention requirements.
Robustness to Class Imbalance: Despite the high proportion of negative class samples, the model stably captures about 88% of landslide events (recall: 87.64%), validating its adaptability to skewed data distributions.

In summary, the proposed model exhibits superior comprehensive performance (F1 > 0.9), combining reliability, generalizability, and practicality. It provides a highly credible solution for landslide early warning systems, effectively addressing critical challenges in disaster risk management.

4.2.3. ROC Curves and PR Curves

In landslide prediction tasks, the model’s performance is further validated through Receiver Operating Characteristic (ROC) curves and Precision-Recall (PR) curves.

ROC Curve and AUC Value

The ROC curve plots the false positive rate (FPR) on the horizontal axis against the true positive rate (TPR) on the vertical axis and reflects the model’s ability to discriminate between landslides (positive class) and non-landslides (negative class) under different thresholds. The closer the AUC (area under the curve) is to 1, the better the model’s classification performance. A high AUC value indicates that the model can keep the “misreporting of non-landslides as landslides” low (low FPR) while maximizing the “correct detection of real landslides” (high TPR), which is crucial for reliable disaster warning.
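To make the threshold sweep concrete, the sketch below builds the ROC points and integrates them with the trapezoidal rule, using only the Python standard library. The labels and scores are a hypothetical six-sample toy set, not data from this study, and the function names are illustrative.

```python
def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by lowering the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Visit samples from highest to lowest score
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    prev = None
    for i in order:
        # Emit a point only when the score changes, so ties share one threshold
        if prev is not None and scores[i] != prev:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev = scores[i]
    fpr.append(fp / neg)
    tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(x, y):
    """Area under a piecewise-linear curve given by points (x, y)."""
    return sum((x[k] - x[k - 1]) * (y[k] + y[k - 1]) / 2 for k in range(1, len(x)))


# Hypothetical labels (1 = landslide) and predicted probabilities
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

fpr, tpr = roc_curve_points(y_true, scores)
roc_auc = trapezoid_auc(fpr, tpr)
print(f"ROC-AUC = {roc_auc:.4f}")  # 8/9, about 0.8889
```

The result equals the fraction of (landslide, non-landslide) pairs in which the landslide sample receives the higher score: 8 of the 9 pairs in this toy set.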

PR Curve and AUC Value

The PR curve, with Recall on the horizontal axis and Precision on the vertical axis, focuses on the prediction quality of the positive category (landslide). The AUC value of the PR curve in this model reaches 0.9740 (close to 1), which indicates that the model can still achieve both high Precision (reduce false alarms) and high Recall (reduce missed alarms) in a scenario with a high degree of category imbalance (sparse landslide samples).
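The area under the PR curve can be summarized in several ways, and the text does not state which estimator was used. As one common choice, the sketch below computes average precision: the mean of the precision values at each rank where a true landslide is retrieved. The toy labels and scores are hypothetical, and the code assumes there are no tied scores.

```python
def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    pos = sum(y_true)
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            # Recall rises by 1/pos here; precision at this rank is tp / rank
            total += tp / rank
    return total / pos


# Hypothetical labels (1 = landslide) and predicted probabilities
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

pr_auc = average_precision(y_true, scores)
prevalence = sum(y_true) / len(y_true)
print(f"PR-AUC (average precision) = {pr_auc:.4f}, positive-class share = {prevalence:.2f}")
```

Unlike ROC–AUC, whose chance level is 0.5, the PR–AUC of a random classifier is approximately the share of positive samples, which is why PR–AUC is usually read against the class balance of the dataset.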

The ROC curve and PR curve of the model in this paper are shown in Figure 8.

The high AUC value (0.9703) of the ROC curve reflects the strong overall classification ability of the model, while the high AUC value (0.9740) of the PR curve further validates its excellent performance in category-imbalanced data. The combination of the two suggests that the model in this paper is not only able to accurately distinguish landslides from non-landslides but also maintains a high confidence in its early warning ability when landslide samples are scarce.

5. Discussion

5.1. Results of Landslide Prediction Based on Conventional Models

To verify the performance gains of the CNN–LSTM–SAM–Attention model in the landslide prediction task, this study compares it with four baseline models: Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), and CNN. All models use the same training set (2020 landslide data from Kumamoto Prefecture, Japan), feature inputs, and evaluation metrics (accuracy, precision, recall, F1 score, ROC–AUC, and PR–AUC) to ensure comparable results.

The performance metrics of the Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), and Convolutional Neural Network (CNN) models are summarized in Table 3.

The logistic regression (LR) model demonstrates substantially lower predictive accuracy (81.07%) compared to other models, suggesting that its linear assumption struggles to capture the nonlinear interactions among landslide-contributing factors such as slope gradient and rainfall patterns. With the lowest recall rate (70.90%) among conventional models, this approach exhibits critical deficiencies in identifying positive-class instances (landslides), potentially leading to systematic underreporting of disaster risks. The model’s comprehensive performance metrics further reveal limitations: both the F1-score (0.7624) and ROC–AUC (0.8809) remain suboptimal, particularly when handling datasets with moderate class imbalance (60% negative-class instances).

The support vector machine (SVM) achieves higher precision (90.12%) than logistic regression, demonstrating its enhanced capacity to process nonlinear features through kernel function optimization. Nevertheless, although its recall rate (79.58%) exceeds that of logistic regression, a 20.42% detection gap remains, which fundamentally compromises its suitability for high-reliability early-warning systems that require near-complete landslide identification.

The random forest (RF) model delivers the strongest overall performance among the classical machine learning models: its accuracy (88.88%) and ROC–AUC (0.9535) exceed those of both SVM and logistic regression, confirming that its ensemble learning mechanism effectively captures complex nonlinear relationships. Its PR–AUC (0.8917), however, exceeds that of logistic regression (0.8885) but not that of SVM (0.9487). A further limitation persists: the F1-score (0.8655) reflects a gap between precision (91.01%) and recall (82.51%), likely attributable to inherent decision tree overfitting tendencies or feature importance distribution imbalances within the ensemble architecture.

The convolutional neural network (CNN) demonstrates superior performance over logistic regression and SVM in both accuracy (88.14%) and recall (84.94%), confirming its efficacy in capturing localized spatial patterns within geospatial data. While achieving precision (87.37%) and ROC–AUC (0.952) comparable to Random Forest, the model’s F1-score (0.709) remains substantially lower than its counterparts, revealing systemic deficiencies in optimizing precision-recall equilibrium. Regarding class imbalance sensitivity, despite attaining an elevated PR–AUC (0.9562), the recall rate (84.94%) underperforms the integrated CNN-LSTM-SAM framework (87.64%) proposed in this work, highlighting the constrained capacity of standalone CNN architectures to sufficiently represent positive-class instances.

Core limitations of conventional modeling approaches:

Spatiotemporal feature integration deficit: No existing implementation systematically integrates multidimensional predictors (terrain rasters and rainfall sequences), so these models fail to capture the dynamic, interlocking mechanisms that govern landslide triggering.
Class imbalance vulnerability: Recall rates consistently below 85% expose a systemic bias toward conservative predictions that prioritize specificity over sensitivity, which compromises early-warning efficacy through underdetection.

These findings collectively reveal inherent performance constraints in traditional landslide prediction frameworks and motivate an architecture with dynamic feature recalibration mechanisms.

While deep learning models like CNN often demonstrate superior capability in complex domains such as image recognition or sequential data modeling, the CNN’s performance relative to the Random Forest (RF) model in this specific task warrants clarification, particularly given the introductory premise of deep learning’s potential advantages. The observed underperformance of the standalone CNN model (Accuracy: 88.14%, F1-Score: 0.7090) compared to RF (Accuracy: 88.88%, F1-Score: 0.8655) can be primarily attributed to the nature of our input data and the core operational principles of each model. Crucially, our landslide prediction task utilizes tabular data (Excel format), where each sample is represented by a fixed set of 16 distinct geospatial and climatic features per location point. This structure differs fundamentally from the dense, spatially correlated grids (e.g., images, rasters) where CNNs excel through local feature extraction via convolutional filters. Applying CNNs directly to tabular data, where explicit spatial locality between adjacent features is often absent or less meaningful, can be suboptimal. The convolutional operations may struggle to capture the complex, potentially non-local interactions between heterogeneous features (e.g., slope angle vs. antecedent rainfall) as effectively as RF’s ensemble of decision trees, which inherently perform robust feature selection and interaction modeling. Furthermore, RF’s bagging mechanism inherently mitigates overfitting, which can be a challenge for CNNs, especially with datasets of limited size relative to model complexity (as is common in geospatial applications). The CNN’s significantly lower F1-score starkly illustrates its difficulty in achieving a balanced trade-off between precision and recall on this tabular classification task under class imbalance.

This highlights a key nuance: different deep learning architectures are specialized for distinct data modalities and tasks. While CNNs dominate spatially structured data, their direct application to conventional tabular feature vectors might not always yield benefits over well-tuned ensemble methods like RF for that specific data type. This limitation underscores the rationale behind our proposed CNN–LSTM–SAM–Attention model, which strategically fuses deep learning components (CNN for potential spatial patterns within derived representations, LSTM for temporal sequences) and attention mechanisms to transcend the limitations of both standalone classical models and monolithic deep learning architectures applied to heterogeneous data.

5.2. Comparative Analysis of the CNN–LSTM–SAM–Attention Model and Traditional Models

The key metrics of the CNN–LSTM–SAM–Attention model and the traditional models are compared in Table 3, and Figure 9 compares the PR and ROC curves of each model.

According to the comparison, the CNN–LSTM–SAM–Attention model proposed in this study outperforms the traditional models in the landslide prediction task by a considerable margin. Its strong discriminative capability in high-dimensional feature spaces is confirmed by the fact that both ROC–AUC (0.9703) and PR–AUC (0.9740) are higher than those of the conventional models and near the theoretical maximum (1.0).

The CNN–LSTM–SAM–Attention model is compared with the conventional models on five fundamental metrics: accuracy, recall, F1 score, PR–AUC, and ROC–AUC. The analysis is as follows:

Accuracy: The CNN–LSTM–SAM–Attention framework achieves the highest accuracy (91.66%), outperforming all conventional models by 2.78–10.59 percentage points, including a 2.78-percentage-point improvement over the second-best model, Random Forest (88.88%). While Random Forest (88.88%) and CNN (88.14%) demonstrate competency in static spatial feature extraction, their exclusion of temporal dynamics critically compromises adaptability to evolving landslide conditions. Similarly, SVM (87.35%) and logistic regression (81.07%) exhibit fundamental constraints in modeling complex nonlinear interactions due to inherent architectural limitations. This performance hierarchy supports the proposed model’s enhanced predictive capacity through spatiotemporal fusion (CNN-LSTM) and adaptive attention mechanisms (SAM).

Recall: The CNN–LSTM–SAM–Attention model reaches 87.64%, which is 5.13 percentage points above Random Forest (82.51%) and 2.70 percentage points above the standalone CNN (84.94%), the baseline with the highest recall, so it covers more real landslide events. Logistic regression (70.90%) is suited only to linearly separable scenarios because of its simplicity, and its risk of underreporting is extremely high. SVM (79.58%) and CNN (84.94%) partially alleviate the problem through nonlinear modeling but do not explicitly enhance the response to positive-class samples. The SAM module of the proposed model significantly reduces underreporting by dynamically adjusting the feature weights.

F1-score: The CNN–LSTM–SAM–Attention model achieves a superior F1-score of 0.9012, outperforming the best traditional model (Random Forest: 0.8655) by 0.0357, or about 4.1% in relative terms. This demonstrates enhanced precision-recall balance (92.73% precision vs. 87.64% recall) compared to conventional approaches. Traditional models exhibit limitations: Random Forest (0.8655 F1) and SVM (0.8452 F1) show inadequate coordination between false alarms and missed detections due to unmodeled temporal dependencies, while CNN’s single spatial feature extraction results in substantially lower performance (0.709 F1). Our model’s high F1-score (>0.9) confirms its effectiveness in maintaining safety-economic equilibrium for disaster prevention applications.

PR–AUC: The CNN–LSTM–SAM–Attention model achieves a superior PR–AUC of 0.9740, substantially exceeding conventional methods (logistic regression: 0.8885; SVM: 0.9487). Notably, our solution maintains precision >90% in high-recall regimes (Recall > 0.8) (Figure 8), demonstrating effective coverage of landslide events with minimal false positives while preserving operational reliability.

ROC–AUC: The CNN–LSTM–SAM–Attention model achieves a near-optimal ROC–AUC of 0.9703, significantly outperforming conventional approaches (e.g., logistic regression: 0.8809; SVM: 0.9453). While Random Forest (0.9535) and CNN (0.9520) demonstrate relatively strong performance, their inability to model temporal features results in compromised dynamic event discrimination. Notably, our model maintains a true positive rate (TPR) > 0.9 at a low false positive rate (FPR = 0.2) (Figure 8), demonstrating precise differentiation between landslide and non-landslide events while effectively preventing unnecessary emergency response activations.

Comparison results show that the CNN–LSTM–SAM–Attention model comprehensively surpasses traditional models in five indices: accuracy, recall, F1 value, PR–AUC, and ROC–AUC, and provides a high-precision and high-reliability solution for landslide prediction tasks, which can be applied to disaster warning under complex terrain and dynamic climate conditions.
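The margins discussed above follow from simple differences between the quoted metrics. The short sketch below recomputes them so that percentage-point gaps (accuracy, recall) and absolute or relative F1 gaps can be told apart. The numbers are those quoted in Sections 4.2.2, 5.1, and 5.2; the dictionary layout and function name are illustrative only.

```python
# Metrics as quoted in the text (accuracy and recall in %, F1 as a fraction)
proposed = {"accuracy": 91.66, "recall": 87.64, "f1": 0.9012}
baselines = {
    "Random Forest": {"accuracy": 88.88, "recall": 82.51, "f1": 0.8655},
    "SVM": {"accuracy": 87.35, "recall": 79.58, "f1": 0.8452},
    "Logistic Regression": {"accuracy": 81.07, "recall": 70.90, "f1": 0.7624},
    "CNN": {"accuracy": 88.14, "recall": 84.94, "f1": 0.709},
}


def margins_over(baseline):
    """Absolute gain of the proposed model over one baseline, metric by metric."""
    return {name: proposed[name] - baseline[name] for name in proposed}


margins = {model: margins_over(values) for model, values in baselines.items()}

for model, gain in margins.items():
    relative_f1 = 100 * gain["f1"] / baselines[model]["f1"]
    print(
        f"{model:20s} accuracy +{gain['accuracy']:.2f} pp, "
        f"recall +{gain['recall']:.2f} pp, "
        f"F1 +{gain['f1']:.4f} ({relative_f1:.1f}% relative)"
    )
```

Reporting accuracy and recall gaps in percentage points and the F1 gap as an absolute difference, with the relative change given separately, avoids mixing the two kinds of improvement.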

5.3. Model Component Contribution Analysis

The proposed CNN-LSTM-SAM framework integrates three core modules to address the limitations of conventional landslide prediction models. Based on the experimental results and architectural mechanisms, the specific contributions of each component are analyzed as follows:

The CNN module processes the 16-dimensional tabular feature vectors by reorganizing them into spatial representations through feature embedding layers. This transformation enables convolutional operations to extract local feature interactions and nonlinear patterns within the 16 geospatial factors. As shown in Table 3, the standalone CNN achieves 88.14% accuracy and 84.94% recall—significantly outperforming logistic regression and SVM models. This demonstrates CNN’s capacity to capture complex relationships within tabular data, particularly for terrain-related features like slope gradient and soil composition.
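As a rough illustration of what a convolution over a 16-factor vector computes, the sketch below slides a width-3 kernel across one sample and applies a ReLU. It is a simplified stand-in written with the standard library only: the feature values and kernel weights are hypothetical, the embedding step mentioned above is omitted, and none of this reproduces the authors' layer configuration.

```python
def conv1d_valid(x, kernel, bias=0.0):
    """'Valid' 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        z = sum(w * v for w, v in zip(kernel, window)) + bias
        out.append(max(0.0, z))  # ReLU keeps only positive responses
    return out


# Hypothetical standardized sample with 16 influencing factors
features = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, 0.0, -0.7,
            0.9, 0.2, -0.3, 0.6, -1.0, 0.4, 0.1, -0.5]
kernel = [0.2, 0.5, -0.3]  # illustrative weights; in practice these are learned

feature_map = conv1d_valid(features, kernel)
print(len(feature_map))  # 16 - 3 + 1 = 14 positions
```

Each output position mixes only three neighboring columns, so what counts as a "local" interaction depends on how the 16 factors are ordered or embedded before the convolution.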

The LSTM module handles sequential dependencies within the feature set. By processing any inherent temporal patterns or ordered dependencies within the 16 feature dimensions, it captures dynamic interactions and temporal evolution patterns among factors such as vegetation index changes and soil moisture variations. The 3.52-percentage-point accuracy gain of the full framework (91.66%) over the standalone CNN (88.14%) supports LSTM’s role in modeling feature dynamics, as does the recall improvement of 5.13 percentage points over RF (87.64% vs. 82.51%).
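For readers unfamiliar with the gating that lets an LSTM carry information along an ordered sequence, the sketch below implements a single LSTM cell with scalar input and scalar state, using the standard gate equations. It is a toy under stated assumptions: the parameter values and the input sequence are hypothetical, and the hidden size of one is chosen for readability rather than to mirror the authors' configuration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step with scalar input, hidden state and cell state."""
    pre = {}
    for name in ("i", "f", "o", "g"):
        w_x, w_h, b = params[name]
        pre[name] = w_x * x + w_h * h_prev + b
    i = sigmoid(pre["i"])    # input gate: how much new information to write
    f = sigmoid(pre["f"])    # forget gate: how much old cell state to keep
    o = sigmoid(pre["o"])    # output gate: how much of the cell state to expose
    g = math.tanh(pre["g"])  # candidate cell update
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical (input weight, recurrent weight, bias) for each gate
params = {
    "i": (0.6, 0.1, 0.0),
    "f": (0.2, 0.3, 1.0),
    "o": (0.5, -0.2, 0.0),
    "g": (0.9, 0.4, 0.0),
}

# Hypothetical ordered sequence of 16 standardized feature values
sequence = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, 0.0, -0.7,
            0.9, 0.2, -0.3, 0.6, -1.0, 0.4, 0.1, -0.5]

h, c = 0.0, 0.0
for x in sequence:
    h, c = lstm_step(x, h, c, params)
print(f"final hidden state: {h:.4f}")
```

The final hidden state depends on the order in which the values are fed in, which is the property the module relies on when features are treated as a sequence.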

The SAM (Spatial Attention Mechanism) dynamically recalibrates feature importance across the spatial dimensions derived from CNN outputs. This mechanism focuses computational resources on the most discriminative features while suppressing noise from less relevant factors. As demonstrated in Figure 9a, the framework maintains precision >90% across recall levels >0.8, a capability attributed to SAM’s selective weighting of high-impact spatial patterns in complex terrain environments.
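The reweighting idea can be illustrated with generic soft attention: relevance scores are turned into weights with a softmax, and each feature is scaled by its weight. This is a sketch of the general technique, not the authors' SAM; the feature values and scores below are hypothetical, and in a trained model the scores would come from learned layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Scale each feature by its attention weight."""
    weights = softmax(scores)
    weighted = [w * x for w, x in zip(weights, features)]
    return weights, weighted


# Hypothetical feature-map values and relevance scores
features = [0.58, 0.10, 0.92, 0.05]
scores = [1.5, -0.5, 2.0, -1.0]

weights, weighted = attend(features, scores)
print([round(w, 3) for w in weights])
```

Positions with high scores keep most of their magnitude, while low-scoring positions are pushed toward zero, which is the sense in which attention suppresses less relevant factors.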

Collectively, these components establish a synergistic processing chain: CNN extracts localized feature representations, LSTM models dynamic interactions, and SAM optimizes feature weighting. This integrated architecture overcomes the feature integration limitations of conventional approaches while achieving state-of-the-art performance.

5.4. Model Limitations

While the CNN–LSTM–SAM–Attention model provides significant improvements over traditional machine learning models, it is not without limitations. One of the primary constraints is the model’s reliance on a comprehensive dataset of landslide occurrences and related factors. The performance of the model is directly tied to the quality and quantity of the input data. Although we have taken extensive measures to clean, standardize, and balance the dataset, any inconsistencies or biases in the dataset could affect the model’s accuracy and generalizability.

Furthermore, the model assumes that spatial and temporal relationships remain relatively consistent within the training data’s geographic region. This assumption may limit the model’s ability to predict landslide susceptibility accurately in areas with significantly different environmental, climatic, or geological conditions. The CNN and LSTM components, while powerful, may also struggle with spatial data that does not exhibit clear local patterns or temporal sequences. In such cases, the convolutional layers may not extract meaningful features, and the LSTM may fail to capture temporal dependencies.

Additionally, despite the integration of the Spatial Attention Mechanism (SAM), which helps focus on relevant spatial regions, the model may still be sensitive to background noise and less significant features in certain cases. This can lead to occasional misclassifications in complex terrains where multiple variables interact in unpredictable ways.

5.5. Transferability to Other Regions

The transferability of the CNN–LSTM–SAM–Attention model to regions beyond Kumamoto Prefecture remains a critical area for future exploration. While the model has demonstrated excellent performance on the Kumamoto dataset, its success in other regions would depend on the availability of high-quality data and the similarity of spatial and temporal patterns. Landslide susceptibility is highly context-dependent, and factors such as topography, rainfall patterns, soil composition, and human activities can vary significantly from region to region.

In order to apply the model to other areas, additional data collection would be required to ensure that the model can adapt to different geographical conditions. It may be necessary to fine-tune the model or retrain it using region-specific datasets to achieve optimal performance. Future work should focus on testing the model in various regions, including regions with differing climatic and geological conditions, to assess its robustness and adaptability. Techniques like transfer learning may also be explored to improve the model’s ability to generalize across diverse environments.

In conclusion, while the CNN–LSTM–SAM–Attention model shows promising results in Kumamoto Prefecture, further validation and adaptation to different regions are essential to fully assess its transferability. A more in-depth analysis of model limitations and its ability to predict landslide susceptibility in diverse regions will help enhance the model’s robustness and reliability in real-world applications.

6. Conclusions

This study proposes the CNN–LSTM–SAM–Attention model for landslide susceptibility assessment, with systematic performance evaluation conducted using 2020 Kumamoto landslide event data against benchmark models (Random Forest, SVM, logistic regression, and CNN). The principal findings are as follows:

The proposed CNN–LSTM–SAM–Attention model demonstrates superior landslide prediction capabilities in real-world engineering applications. Trained on a rigorously curated dataset from the 2020 Kumamoto landslide (16-dimensional features, 19,898 samples after outlier removal), the model achieves 91.66% accuracy, a 0.9012 F1-score, and 87.64% recall, indicating robust predictive performance that balances false positives against false negatives, which is critical for preventing the underreporting of disasters. ROC–AUC (0.9703) and PR–AUC (0.9740) scores close to the theoretical maximum confirm its discriminative power in high-dimensional feature spaces and class-imbalanced scenarios.
The CNN–LSTM–SAM–Attention model significantly outperforms the four baseline models (random forest, SVM, logistic regression, and CNN), reflecting the effectiveness of spatio-temporal feature fusion and dynamic optimization. All five models were trained on the same training set and evaluated with the same metrics (accuracy, precision, recall, F1 score, ROC–AUC, and PR–AUC). The proposed model outperforms the baselines on the five indicators analyzed in Section 5.2 (accuracy, recall, F1 score, PR–AUC, and ROC–AUC), which shows that spatio-temporal feature fusion by the CNN and LSTM modules and dynamic adjustment of feature weights by the SAM module effectively improve discriminative ability in high-dimensional feature space.
Practical application value: The high precision (92.73%) of the model can reduce the cost of misjudgment in landslide warning, while the high recall (87.64%) can cover most of the potential landslide events, providing high confidence support for local disaster prevention planning and emergency response. Compared with traditional models, its comprehensive performance is more adaptable to the prediction needs of complex terrain (e.g., steep slopes, broken rock formations) and dynamic climate (e.g., sudden rainfall in the rainy season).

In conclusion, this study proposes a CNN–LSTM–SAM–Attention model grounded in actual landslide disasters. The model predicts landslide occurrence under multidimensional, complex influencing factors with high accuracy and reliability, and it provides a practical reference for further applications of deep learning in the field of geohazards.

Author Contributions

Conceptualization, H.W., J.N. and Y.L.; methodology, H.W. and Y.L.; validation, H.W., J.N., Y.W. and D.Q.; formal analysis, Y.L. and Y.W.; investigation, H.W., Y.L. and D.Q.; resources, J.N. and D.Q.; data curation, Y.W. and Y.L.; writing—original draft preparation, H.W. and Y.L.; writing—review and editing, J.N. and D.Q.; visualization, Y.W.; supervision, J.N.; project administration, J.N.; funding acquisition, H.W. and J.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Science and Technology Research and Development Program of China Railway Group Limited (2022-Major Special Project-07), the National Natural Science Foundation of China (Grant No. 41772298), and the Natural Science Foundation of Shandong Province (Grant No. ZR2023MD051).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

Authors Honggang Wu, Yongqiang Li and Yinsheng Wang were employed by the company China Railway Science Research Institute Group Co., Ltd. Honggang Wu and Yongqiang Li were employed by the company China Railway Northwest Scientific Research Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

References

Dikshit, A.; Sarkar, R.; Pradhan, B.; Segoni, S.; Alamri, A.M. Rainfall induced landslide studies in indian himalayan region: A critical review. Appl. Sci. 2020, 10, 2466. [Google Scholar] [CrossRef]
Lin, J.; Chen, W.; Qi, X.; Hou, H. Risk assessment and its influencing factors analysis of geological hazards in typical mountain environment. J. Clean. Prod. 2021, 309, 127077. [Google Scholar] [CrossRef]
Cao, J.; Zhang, Z.; Du, J.; Zhang, L.; Song, Y.; Sun, G. Multi-geohazards susceptibility mapping based on machine learning-a case study in Jiuzhaigou, China. Nat. Hazards 2020, 102, 851–871. [Google Scholar] [CrossRef]
Hegde, J.; Rokseth, B. Applications of machine learning methods for engineering risk assessment—A review. Saf. Sci. 2020, 122, 104492. [Google Scholar] [CrossRef]
Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225. [Google Scholar] [CrossRef]
Dikshit, A.; Pradhan, B.; Alamri, A.M. Pathways and challenges of the application of artificial intelligence to geohazards modelling. Gondwana Res. 2021, 100, 290–301. [Google Scholar] [CrossRef]
Goyes-Penafiel, P.; Hernandez-Rojas, A. Landslide susceptibility index based on the integration of logistic regression and weights of evidence: A case study in popayan, colombia. Eng. Geol. 2021, 280, 105958. [Google Scholar] [CrossRef]
Chen, C.; Fan, L. Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models. Stoch. Environ. Res. Risk Assess. 2023. [Google Scholar] [CrossRef]
Chen, W.; Zhang, S.; Li, R.; Shahabi, H. Performance evaluation of the gis-based data mining techniques of best-first decision tree, random forest, and naïve bayes tree for landslide susceptibility modeling. Sci. Total Environ. 2018, 644, 1006–1018. [Google Scholar] [CrossRef]
Huang, Y.; Zhao, L. Review on landslide susceptibility mapping using support vector machines. CATENA 2018, 165, 520–529. [Google Scholar] [CrossRef]
He, Q.; Wang, M.; Liu, K. Rapidly assessing earthquake-induced landslide susceptibility on a global scale using random forest. Geomorphology 2021, 391, 107889. [Google Scholar] [CrossRef]
Tanyu, B.F.; Abbaspour, A.; Alimohammadlou, Y.; Tecuci, G. Landslide susceptibility analyses using random forest, c4.5, and c5.0 with balanced and unbalanced datasets. CATENA 2021, 203, 105355. [Google Scholar] [CrossRef]
Chen, W.; Xie, X.; Wang, J.; Pradhan, B.; Hong, H.; Bui, D.T.; Duan, Z.; Ma, J. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 2017, 151, 147–160. [Google Scholar] [CrossRef]
Goetz, J.N.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput. Geosci. 2015, 81, 1–11. [Google Scholar] [CrossRef]
Viet-Ha, N.; Zandi, D.; Shahabi, H.; Chapi, K.; Shirzadi, A.; Al-Ansari, N.; Singh, S.K.; Dou, J.; Nguyen, H. Comparison of support vector machine, bayesian logistic regression, and alternating decision tree algorithms for shallow landslide susceptibility mapping along a mountainous road in the west of iran. Appl. Sci. 2020, 10, 5047. [Google Scholar] [CrossRef]
Dieu Tien, B.; Tsangaratos, P.; Viet-Tien, N.; Ngo Van, L.; Phan Trong, T. Comparing the prediction performance of a deep learning neural network model with conventional machine learning models in landslide susceptibility assessment. CATENA 2020, 188, 104426. [Google Scholar] [CrossRef]
Nguyen Viet, T.; Nguyen, D.D.; Nguyen Duc, M.; Cao Trong, C.; Hung, M.S.; Le, H.V.; Prakash, I.; Pham, B.T. Exploring deep learning models for roadside landslide prediction: Insights and implications from comparative analysis. Phys. Chem. Earth Parts A/B/C 2024, 136, 103741. [Google Scholar] [CrossRef]
Liu, R.; Yang, X.; Xu, C.; Wei, L.; Zeng, X. Comparative study of convolutional neural network and conventional machine learning methods for landslide susceptibility mapping. Remote Sens. 2022, 14, 321. [Google Scholar] [CrossRef]
Yang, S.; Wang, Y.; Wang, P.; Mu, J.; Jiao, S.; Zhao, X.; Wang, Z.; Wang, K.; Zhu, Y. Automatic identification of landslides based on deep learning. Appl. Sci. 2022, 12, 8153. [Google Scholar] [CrossRef]
Yao, J.; Qin, S.; Qiao, S.; Che, W.; Chen, Y.; Su, G.; Miao, Q. Assessment of landslide susceptibility combining deep learning with semi-supervised learning in jiaohe county, jilin province, china. Appl. Sci. 2020, 10, 5640. [Google Scholar] [CrossRef]
Yu, R.; Guo, R.; Jiang, L.; Shao, Y.; Zhou, Z. Susceptibility assessment of glacier-related debris flow on the southeastern tibetan plateau using different hybrid machine learning models. Sci. Total Environ. 2024, 954, 176400. [Google Scholar] [CrossRef] [PubMed]
Fang, Z.; Wang, Y.; Peng, L.; Hong, H. Predicting flood susceptibility using lstm neural networks. J. Hydrol. 2021, 594, 125734. [Google Scholar] [CrossRef]
Liu, Y.; Brezzi, L.; Liang, Z.; Gabrieli, F.; Zhou, Z.; Cola, S. Image analysis and lstm methods for forecasting surficial displacements of a landslide triggered by snowfall and rainfall. Landslides 2024, 22, 619–635. [Google Scholar] [CrossRef]
Nikoobakht, S.; Azarafza, M.; Akgun, H.; Derakhshani, R. Landslide susceptibility assessment by using convolutional neural network. Appl. Sci. 2022, 12, 5992. [Google Scholar] [CrossRef]
Hakim, W.L.; Rezaie, F.; Nur, A.S.; Panahi, M.; Khosravi, K.; Lee, C.-W.; Lee, S. Convolutional neural network (cnn) with metaheuristic optimization algorithms for landslide susceptibility mapping in icheon, south korea. J. Environ. Manag. 2022, 305, 114367. [Google Scholar] [CrossRef] [PubMed]
Amankwah, S.O.Y.; Wang, G.; Gnyawali, K.; Hagan, D.F.T.; Sarfo, I.; Zhen, D.; Nooni, I.K.; Ullah, W.; Zheng, D. Landslide detection from bitemporal satellite imagery using attention-based deep neural networks. Landslides 2022, 19, 2459–2471. [Google Scholar] [CrossRef]
Moghimi, A.; Singha, C.; Fathi, M.; Pirasteh, S.; Mohammadzadeh, A.; Varshosaz, M.; Huang, J.; Li, H. Hybridizing genetic random forest and self-attention based cnn-lstm algorithms for landslide susceptibility mapping in darjiling and kurseong, india. Quat. Sci. Adv. 2024, 14, 100187. [Google Scholar] [CrossRef]
Wei, R.; Ye, C.; Ge, Y.; Li, Y. An attention-constrained neural network with overall cognition for landslide spatial prediction. Landslides 2022, 19, 1087–1099. [Google Scholar] [CrossRef]
Zhang, Z.-K.; Zhang, D.-M.; Li, J.; Wu, Y.-P. Lstm-mh-sa landslide displacement prediction model based on multi-head self-attention mechanism. Rock Soil Mech. 2022, 43, 477–486. [Google Scholar] [CrossRef]
Jiang, H.; Li, Y.; Zhou, C.; Hong, H.; Glade, T.; Yin, K. Landslide displacement prediction combining lstm and svr algorithms: A case study of shengjibao landslide from the three gorges reservoir area. Appl. Sci. 2020, 10, 7830. [Google Scholar] [CrossRef]
Jiang, Z.; Wang, M.; Liu, K. Comparisons of convolutional neural network and other machine learning methods in landslide susceptibility assessment: A case study in pingwu. Remote Sens. 2023, 15, 798. [Google Scholar] [CrossRef]
Lin, Z.; Sun, X.; Ji, Y. Landslide displacement prediction model using time series analysis method and modified lstm model. Electronics 2022, 11, 1519. [Google Scholar] [CrossRef]
Ji, S.; Yu, D.; Shen, C.; Li, W.; Xu, Q. Landslide detection from an open satellite imagery and digital elevation model dataset using attention boosted convolutional neural networks. Landslides 2020, 17, 1337–1352. [Google Scholar] [CrossRef]
Xi, L.; Yu, J.; Ge, D.; Pang, Y.; Zhou, P.; Hou, C.; Li, Y.; Chen, Y.; Dong, Y. Sam-cffnet: Sam-based cross-feature fusion network for intelligent identification of landslides. Remote Sens. 2024, 16, 2334. [Google Scholar] [CrossRef]
Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2021, 17, 168–192. [Google Scholar] [CrossRef]
Zhao, Z.A.; He, Y.; Yao, S.; Yang, W.; Wang, W.; Zhang, L.; Sun, Q. A comparative study of different neural network models for landslide susceptibility mapping. Adv. Space Res. 2022, 70, 383–401. [Google Scholar] [CrossRef]

Figure 1. The process framework diagram of the CNN–LSTM–SAM–Attention model.

Figure 2. CNN module architecture diagram.

Figure 3. SAM module architecture diagram.

Figure 4. LSTM module architecture diagram.

Figure 5. Classification module architecture diagram.

Figure 6. Accuracy and loss curves during training of the CNN–LSTM–SAM–Attention model.

Figure 7. Confusion matrices for training set and test set.

Figure 8. The ROC curve and PR curve of the CNN–LSTM–SAM–Attention model.

Figure 9. Comparison of PR and ROC curves across multiple models.

Table 1. A sample of the dataset.

No.	Land Use	Lithology	DEM	Slope	Curvature	Aspect	NDVI	SPI	TWI
1	2	3	341	23.255	38.652	273.366	7062	1.517	2.374
2	3	3	280	24.953	51.294	192.529	6961	0.844	4.412
3	2	3	127	25.893	34.655	242.103	7861	2.966	2.293
4	2	2	39	26.783	48.139	233.130	6815	0.926	2.454
5	2	3	373	38.975	35.483	266.423	6918	1.397	1.821
No.	dis2road	dis2drainage	dis2catchment	dis2faults	DV	X	Y	label
1	6.796	911.046	1629.682	1251.960	0.000001	804,235	810,807	1
2	102.833	172.909	4.685	1512.070	−0.26099664	857,547.4	831,943.4	1
3	27.917	319.300	6903.066	2803.783	0.343779743	819,718	825,885	0
4	131.746	306.661	5351.267	510.760	1.061006904	833,572.8	839,678.7	0
5	29.458	424.368	1853.138	261.500	14.7746563	834,785.8	827,973.5	0

Table 2. Comparison of key metrics between training and test sets.

	Accuracy	Precision	Recall	F1-Score
Training Set	93.35%	94.57%	89.84%	0.9214
Test Set	91.66%	92.73%	87.64%	0.9012

Table 3. Performance comparison of different models.

	LR	SVM	RF	CNN	CLSA *
Accuracy	81.07%	87.35%	88.88%	88.14%	91.66%
Precision	82.45%	90.12%	91.01%	87.37%	92.73%
Recall	70.90%	79.58%	82.51%	84.94%	87.64%
F1-Score	0.7624	0.8452	0.8655	0.7090	0.9012
PR–AUC	0.8809	0.9453	0.9535	0.9520	0.9703
ROC–AUC	0.8885	0.9487	0.8917	0.9562	0.9740

* CLSA denotes the CNN–LSTM–SAM–Attention model.
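
The F1-scores in Tables 2 and 3 can be cross-checked against the listed precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that check, not part of the authors' pipeline; the names `f1_score`, `TABLE3`, and `inconsistent_columns`, and the tolerance of 5e-4 (chosen to absorb rounding of the published percentages), are illustrative assumptions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) transcribed from Table 3.
TABLE3 = {
    "LR": (0.8245, 0.7090, 0.7624),
    "SVM": (0.9012, 0.7958, 0.8452),
    "RF": (0.9101, 0.8251, 0.8655),
    "CNN": (0.8737, 0.8494, 0.7090),
    "CLSA": (0.9273, 0.8764, 0.9012),
}


def inconsistent_columns(table, tol=5e-4):
    """Return the models whose reported F1 differs from the recomputed value by more than tol."""
    return [
        name
        for name, (p, r, reported) in table.items()
        if abs(f1_score(p, r) - reported) > tol
    ]


# Print the recomputed value next to the reported one for each model.
for name, (p, r, reported) in TABLE3.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.4f}, reported = {reported:.4f}")
```

Under these assumptions, the LR, SVM, RF, and CLSA columns agree with the harmonic mean to within rounding. The CNN column does not: its listed precision (87.37%) and recall (84.94%) imply an F1-score of about 0.861, whereas the table reports 0.7090, which coincides with the LR recall figure and may be a transcription error.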

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
