Prediction of Rainfall-Induced Slope Stability Spatiotemporal Evolution Based on a Hybrid Transformer–LSTM Deep Learning Framework

Zhang, Xin; Wang, Fang; Yang, Hao; Liu, Shixiao

doi:10.3390/geohazards7020075

by Xin Zhang ¹,*, Fang Wang ², Hao Yang ¹ and Shixiao Liu ²

¹ Ningxia Communications Investment Expressway Management Company Limited, Yinchuan 750000, China

² School of Civil Engineering and Hydraulic Engineering, Ningxia University, Yinchuan 750000, China

* Author to whom correspondence should be addressed.

GeoHazards 2026, 7(2), 75; https://doi.org/10.3390/geohazards7020075 (registering DOI)

Submission received: 3 May 2026 / Revised: 8 June 2026 / Accepted: 11 June 2026 / Published: 13 June 2026

Abstract

Rainfall is a critical factor inducing slope instability, and accurate prediction of the factor of safety (FOS) of slopes under rainfall conditions is essential for disaster prevention and mitigation. Conventional numerical simulation methods incur high computational costs, while individual machine learning models often fail to capture the nonlinear spatiotemporal evolution of multiple factors under coupled multi-physics fields. To address these limitations, this paper proposes a Transformer–LSTM prediction framework. First, a fluid-solid coupling model for rainfall-affected slopes is constructed using COMSOL, and multi-factor orthogonal experiments are performed to generate multi-dimensional time-series data. Subsequently, a Transformer–LSTM fusion deep learning model is built, in which LSTM extracts the temporal dynamics of rainfall infiltration and the self-attention mechanism of the Transformer enhances feature extraction and global dependency modeling of key disaster-causing factors. Experimental results demonstrate that the Transformer–LSTM model significantly outperforms PSO-LSTM, PSO-SVM, and standalone Transformer or LSTM models in both prediction accuracy and generalization capability. Its coefficient of determination (R²) remains above 0.94, and the key error metrics, namely mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE), attain the lowest values among the compared models. Furthermore, the SHAP (SHapley Additive exPlanations) interpretability framework is introduced to quantify the model's predictive decision-making and to establish a physically grounded mapping to geotechnical mechanisms. The analysis confirms that effective cohesion and slope angle exert a dominant interactive effect on the degradation of slope stability, providing data-driven support for wide-area monitoring of rainfall-induced landslides.

Keywords:

rainfall-induced slope; slope stability; machine learning; SHAP

1. Introduction

With global climate change and the increasing frequency of extreme weather events, rainfall-induced landslides have become one of the most destructive natural hazards worldwide, causing substantial economic losses and casualties every year [1]. The triggering mechanism is intrinsically complex: precipitation infiltrates and migrates within unsaturated soil, leading to increased soil moisture content, progressive loss of matric suction, and a marked rise in pore water pressure [2]. According to the principle of effective stress for unsaturated soils, these hydraulic perturbations directly degrade the shear strength of the soil, ultimately triggering slope instability under gravitational loading when the mobilized shear stress exceeds the available resistance [3]. Consequently, developing accurate and computationally efficient methods for predicting the spatiotemporal evolution of slope stability under rainfall is a critical research priority for geotechnical disaster prevention and mitigation.

In traditional geotechnical engineering practice, slope stability assessment has predominantly relied on deterministic physical models. The Limit Equilibrium Method (LEM) calculates the factor of safety (FOS) by postulating a potential sliding surface and solving force and moment equilibrium equations; it is widely adopted for its conceptual simplicity and computational efficiency but inherently neglects the stress–strain constitutive behavior of the soil mass [4]. Numerical simulation methods—including the Finite Element Method (FEM) and Finite Difference Method (FDM)—offer more rigorous characterization of the fluid-solid coupling processes during rainfall infiltration and enable identification of the critical state through strength reduction techniques [5]. Nevertheless, these physics-based approaches frequently encounter convergence difficulties and prohibitive computational costs when addressing highly nonlinear unsaturated flow problems. In large-scale regional prediction or real-time early warning scenarios demanding rapid response, the temporal window required by numerical simulation often fails to satisfy operational early warning requirements [6].

In recent years, data-driven models grounded in artificial intelligence have provided promising alternative paradigms for slope stability assessment. Methods including Support Vector Regression (SVR), Artificial Neural Networks (ANN), and ensemble learning algorithms such as Random Forest (RF) and Extreme Gradient Boosting (XGBoost) establish nonlinear mappings between rainfall-related factors and FOS by learning from historical monitoring data or numerically generated synthetic samples [7,8]. For example, Li et al. [9] proposed a machine learning framework combining SVR, XGBoost, and LightGBM for regional-scale rainfall-induced slope stability prediction, demonstrating considerable advantages in computational efficiency and predictive accuracy. Huang [10] established a model for forecasting shallow landslide occurrence time by integrating neural network algorithms with clustering techniques. Although these static models have achieved satisfactory results in point-in-time prediction tasks, slope instability is fundamentally a time-dependent process characterized by the cumulative effects of antecedent rainfall and the hysteretic response of soil hydraulic properties, which static models are structurally incapable of capturing [11].

The introduction of Long Short-Term Memory (LSTM) networks has partially addressed the temporal modeling gap. Through their internal gating mechanisms—comprising forget, input, and output gates—LSTM architectures retain historical state information across extended sequences, demonstrating superior performance in landslide displacement and stability prediction tasks. Lin et al. [12] proposed a hybrid architecture integrating LSTM, interpolation techniques, and Convolutional Neural Networks (CNN) to predict pore water pressure dynamics and assess slope stability. Deng et al. [13] integrated FLAC3D numerical simulations with unsaturated seepage theory to develop a GA-LSTM-MC (Genetic Algorithm-optimized LSTM with Monte Carlo simulation) model for predicting slope displacement, FOS, and failure probability. However, standalone LSTM models exhibit inherent limitations when processing high-dimensional, multivariate inputs. Specifically, their inherently sequential processing paradigm may lead to attenuated representation of long-range dependencies, and they lack an explicit mechanism for dynamically weighting the contributions of heterogeneous input features at different temporal stages [14].

The Transformer architecture, which has revolutionized natural language processing and is increasingly applied to scientific computing domains, addresses these limitations through its self-attention mechanism. By computing pairwise attention weights across all positions in the input sequence, the Transformer enables parallelized global dependency modeling and dynamic identification of feature relevance [14]. This capability is particularly salient for slope stability analysis, because the dominant factors governing FOS—such as rainfall intensity, pore water pressure, and slope geometry—undergo dynamic transitions across different stages of rainfall infiltration. For instance, hydraulic parameters exert primary control during the early infiltration stage, whereas mechanical and geometric parameters become increasingly influential as the wetting front approaches the potential sliding surface.

To synergistically integrate the temporal memory capability of LSTM with the global feature capture advantages of the Transformer, this study proposes a hybrid Transformer–LSTM (T-L) deep learning framework for high-precision regression prediction of rainfall-induced slope FOS. The contributions of this work are fourfold: (1) A fluid-solid coupling numerical model is constructed in COMSOL Multiphysics to generate a physically consistent, multi-factor orthogonal experimental dataset comprising 60 simulation scenarios, ensuring that the deep learning model learns intrinsic patterns aligned with geophysical principles. (2) A novel T-L hybrid architecture is designed, in which LSTM layers serve as temporal feature extractors capturing local infiltration dynamics and Transformer layers provide global attention-based feature re-weighting. (3) Comprehensive model evaluation is conducted through multi-dimensional metrics, rigorous statistical significance testing (Friedman test with Nemenyi post hoc analysis), and Monte Carlo Dropout uncertainty quantification, establishing the statistical reliability and robustness of the proposed framework. (4) The SHAP (SHapley Additive exPlanations) interpretability framework is deployed to quantitatively decompose model predictions into physically meaningful feature contributions, bridging the gap between data-driven black-box modeling and mechanistic geotechnical understanding.

2. Materials and Methods

2.1. Fluid-Solid Coupling Model

The stability evolution of slopes under rainfall conditions is essentially the result of the interaction between the hydraulic field and the mechanical field. In this study, the rainfall infiltration process is regarded as a transient flow process in unsaturated porous media, and its core governing equation is the Richards equation. The Richards equation integrates Darcy’s law and the law of mass conservation, describing the motion of water within soil pores. Its nonlinear characteristics mainly arise from the strong dependence of soil hydraulic properties (moisture content, hydraulic conductivity) on pressure head [15,16], as shown in Equation (1):

ρ_{w} (C_{m} + S_{e} S) \frac{\partial H}{\partial t} + \nabla \cdot (ρ_{w} u) = Q_{m}

(1)

where H is the pressure head; t is time; ρ_{w} is the density of water (kg/m³), generally assumed constant; C_{m} is the specific moisture capacity, i.e., the derivative of volumetric water content with respect to pressure head; S_{e} is the effective saturation, ranging from 0 to 1; S is the storage coefficient; u is the Darcy velocity vector; and Q_{m} is the source/sink term, representing the generation or consumption of water within the system.

According to Bishop's effective stress principle, the shear strength of unsaturated soil derives from the effective cohesion, the friction angle, and the additional strength generated by matric suction, which enters through the effective stress [17]:

σ^{'} = (σ - u_{a} I) + χ (u_{a} - u_{w}) I

(2)

where σ^{'} is the effective stress that controls the shear strength of the soil. As rainfall infiltration proceeds, the pore water pressure u_{w} evolves from negative values (suction state) to zero or even positive values, resulting in a decrease in the effective suction term χ (u_{a} - u_{w}).
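As a rough numerical illustration of Equation (2), the sketch below evaluates its scalar (one-dimensional) form while the pore water pressure rises from a suction state toward zero. The function name, the stress values, and the choice of holding χ constant at 0.6 are illustrative assumptions, not values from this study; in practice χ varies with the degree of saturation.

```python
def bishop_effective_stress(sigma, u_a, u_w, chi):
    """Scalar form of Eq. (2): sigma' = (sigma - u_a) + chi * (u_a - u_w)."""
    return (sigma - u_a) + chi * (u_a - u_w)


# Hypothetical values in kPa; u_a = 0 corresponds to atmospheric pore air pressure.
sigma, u_a, chi = 50.0, 0.0, 0.6

# Wetting path: matric suction (u_a - u_w) is progressively lost.
effective = {}
for u_w in (-40.0, -20.0, 0.0):
    effective[u_w] = bishop_effective_stress(sigma, u_a, u_w, chi)
    print(f"u_w = {u_w:6.1f} kPa -> sigma' = {effective[u_w]:.1f} kPa")
```

With total stress unchanged, the effective stress falls as suction dissipates, which is the strength-loss pathway described in the text.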

Under the Mohr–Coulomb criterion, the shear strength decreases accordingly. The slope factor of safety obtained from limit equilibrium analysis with this failure criterion is given by the following equation [18]:

F_{s} = \frac{τ_{f}}{τ_{m}} = \frac{c^{'} + [(σ_{n} - u_{a}) - σ_{s}] \tan φ^{'}}{W \sin α \cos α}

(3)

where τ_{f} is the shear strength of the soil, τ_{m} is the shear stress at any point on the potential failure surface, W is the weight of a unit soil slice, c^{'} and φ^{'} are the effective cohesion and effective internal friction angle of the soil, respectively, (σ_{n} - u_{a}) is the net normal stress at the base of a unit soil slice, u_{a} is the pore air pressure, α is the slope inclination angle, σ_{s} is the suction stress, ψ (θ) is the matric suction, and γ_{d} and γ_{w} are the dry unit weight of the soil and the unit weight of water, respectively.
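Equation (3) can be evaluated directly once the stress terms are known. The following is a minimal sketch with made-up inputs: the function name, all numerical values, the sign convention that suction stress is negative in unsaturated soil, and the infinite-slope approximation used for the net normal stress (W cos²α) are assumptions for illustration only.

```python
import math


def factor_of_safety(c_eff, phi_eff_deg, net_normal, suction_stress, weight, alpha_deg):
    """Eq. (3): F_s = (c' + [(sigma_n - u_a) - sigma_s] tan(phi')) / (W sin(alpha) cos(alpha))."""
    phi = math.radians(phi_eff_deg)
    alpha = math.radians(alpha_deg)
    tau_f = c_eff + (net_normal - suction_stress) * math.tan(phi)  # available strength
    tau_m = weight * math.sin(alpha) * math.cos(alpha)             # mobilized shear stress
    return tau_f / tau_m


# Hypothetical slice: W = 36 kPa (e.g. 18 kN/m^3 over 2 m), slope angle 35 degrees.
W, alpha_deg = 36.0, 35.0
net_normal = W * math.cos(math.radians(alpha_deg)) ** 2  # assumed infinite-slope estimate

# Assumed convention: suction stress is negative when unsaturated and tends to 0 on wetting.
fs_dry = factor_of_safety(10.0, 25.0, net_normal, -15.0, W, alpha_deg)
fs_wet = factor_of_safety(10.0, 25.0, net_normal, 0.0, W, alpha_deg)
print(f"F_s before wetting: {fs_dry:.3f}, after suction loss: {fs_wet:.3f}")
```

Under these assumptions the factor of safety drops as suction stress vanishes, while cohesion and slope angle set the overall level.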

2.2. Multi-Factor Experimental Design

To enhance the generalization capability of the model, a multi-factor experiment is designed using the COMSOL 5.6 fluid-solid coupling model. The characteristic parameters, derived from the physical parameters of loess from different regions, are listed in Table 1. A total of 60 multi-factor simulation cases are run to obtain the factor of safety (FOS) at each time step. These values are then converted into standardized time-series data and used as input for the Transformer–LSTM model.

3. Transformer–LSTM Hybrid Deep Learning Framework and Evaluation Metrics

3.1. Transformer–LSTM

The proposed Transformer–LSTM (T-L) hybrid framework adopts a series-parallel deep network topology, aiming to exploit the complementary strengths of recurrent mechanisms and attention-based mechanisms. This architecture, as illustrated in Figure 1, consists of five functional modules that operate in sequence:

(1) Input Embedding and Normalization Layer: The input tensor X ∈ R^{B×T×d} (where B denotes batch size, T = 12 represents the time window length, and d = 11 represents the feature dimension) first passes through a batch normalization layer to stabilize the feature distribution, and then goes through a linear projection layer that maps the original feature space to an embedding space of dimension d_model = 128, yielding X_emb ∈ R^{B×T×128}.

(2) LSTM Temporal Encoding Branch: The embedded sequence is fed into a two-layer stacked unidirectional LSTM network, with a hidden dimension of h_lstm = 128 for each layer. The LSTM branch serves as a dedicated temporal feature extractor, capturing local dynamic evolution patterns of physical quantities between adjacent time steps—in particular, the lagged response of pore water pressure and the cumulative effect of antecedent rainfall. Each LSTM layer outputs a hidden state sequence H_lstm ∈ R^{B×T×128}.

(3) Transformer Global Attention Branch: The complete hidden state sequence from the final LSTM layer is fed into a Transformer encoder composed of N_enc = 3 identical encoder layers. Each encoder layer contains a multi-head self-attention (MHA) sublayer with h = 8 attention heads (each head dimension d_k = d_model/h = 16), and a position-wise feed-forward network (FFN) with an expansion ratio of 4 (inner dimension = 512). Residual connections and layer normalization are applied after each sublayer. The self-attention mechanism computes pairwise attention weights across all T = 12 time positions, enabling the model to dynamically reweight the importance of features from different time steps for the current prediction. The output of the Transformer encoder is H_trans ∈ R^{B×T×128}.

(4) Feature-Level Fusion: The globally enhanced sequence H_trans is aggregated via temporal average pooling to produce a fixed-length vector h_pool ∈ R^{B×128}. This vector is concatenated with the final hidden state of the LSTM, h_{lstm_last} ∈ R^{B×128}, to form the fused representation h_fused ∈ R^{B×256}. This fusion strategy preserves both globally attended features and local temporal context, allowing the subsequent regression head to exploit complementary information from both processing pathways.

(5) Multilayer Perceptron (MLP) Regression Head: The fused vector h_fused passes through a three-layer MLP with hidden layer dimensions [128, 64, 32] and ReLU activation functions. The output layer consists of a single neuron with a linear activation, producing the scalar FOS prediction ŷ. Dropout regularization with a dropout rate of p = 0.2 is applied after each hidden layer to mitigate overfitting.
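The five modules above can be sketched in PyTorch roughly as follows. This is a hedged reconstruction from the stated dimensions, not the authors' code: the class name, the dropout rate inside the encoder layers, and the absence of an explicit positional encoding (the LSTM output is assumed to carry order information) are assumptions.

```python
import torch
import torch.nn as nn


class TransformerLSTM(nn.Module):
    """Illustrative T-L sketch: BatchNorm -> Linear -> LSTM -> Transformer encoder -> fusion -> MLP."""

    def __init__(self, n_features=11, d_model=128, n_heads=8, n_enc_layers=3, p_drop=0.2):
        super().__init__()
        self.norm = nn.BatchNorm1d(n_features)
        self.embed = nn.Linear(n_features, d_model)
        self.lstm = nn.LSTM(d_model, d_model, num_layers=2, batch_first=True)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model,
            dropout=p_drop,  # assumption: the text only states dropout for the MLP head
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_enc_layers)
        self.head = nn.Sequential(
            nn.Linear(2 * d_model, 128), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(64, 32), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(32, 1),  # single linear output neuron: scalar FOS
        )

    def forward(self, x):                                   # x: (B, T, d)
        x = self.norm(x.transpose(1, 2)).transpose(1, 2)    # BatchNorm1d expects (B, d, T)
        x = self.embed(x)                                   # (B, T, 128)
        h_lstm, (h_n, _) = self.lstm(x)                     # h_lstm: (B, T, 128)
        h_trans = self.encoder(h_lstm)                      # (B, T, 128)
        h_pool = h_trans.mean(dim=1)                        # temporal average pooling
        h_fused = torch.cat([h_pool, h_n[-1]], dim=1)       # (B, 256)
        return self.head(h_fused).squeeze(-1)               # (B,)


model = TransformerLSTM().eval()        # eval mode disables dropout
x = torch.randn(4, 12, 11)              # B = 4 windows, T = 12, d = 11
with torch.no_grad():
    y_hat = model(x)
print(y_hat.shape)
```

The skip path from the last LSTM hidden state into the fusion step is what gives the otherwise serial LSTM-to-Transformer pipeline its parallel branch.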

3.1.1. Long Short-Term Memory Neural Network (LSTM)

As an improved variant of the recurrent neural network (RNN), LSTM mitigates the vanishing gradient problem that traditional RNNs suffer in long-sequence processing by introducing a gating mechanism consisting of a forget gate, an input gate, and an output gate [19]. The model structure is shown in Figure 2. In slope stability prediction, the LSTM acts as a selective memory: through its cell state it can discard unimportant historical fluctuations while retaining the features that are critical to triggering instability (such as the rise in background moisture content caused by prolonged rainfall). The model first processes the input sequence through the LSTM layers, transforming the multivariate time series into hidden vector representations rich in temporal information.
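To make the gating idea concrete, here is a single LSTM time step written out for scalar inputs using the standard gate equations. The weight dictionary and all numbers are hypothetical; a real layer uses weight matrices and 128-dimensional states.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One scalar LSTM step; w maps gate name -> (input weight, recurrent weight, bias)."""
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much of the new candidate to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical weights: forget gate wide open, input gate nearly closed.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=0.5, h_prev=0.0, c_prev=0.8, w=weights)
print(f"new cell state {c:.4f}, hidden state {h:.4f}")
```

With these settings the cell state is carried forward almost unchanged, which is the mechanism that lets antecedent-rainfall information persist across time steps.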

3.1.2. Transformer

The core of the Transformer architecture lies in the self-attention mechanism. Instead of processing sequences recursively, it extracts features by calculating the correlation weights between any two elements within the sequence. The overall structure is shown in Figure 3. In this model, the Transformer layer receives the hidden state output from the LSTM layer. The introduction of the self-attention mechanism can break the sequential bottleneck of LSTM when processing extremely long rainfall sequences, enabling parallel feature extraction; at the same time, it possesses powerful feature weight allocation capabilities. For a slope system, the importance of different characteristic parameters changes dynamically across different evolution stages. For example, in the early stage of rainfall, hydraulic parameters are more sensitive to changes in FOS; whereas when the wetting front migrates to the vicinity of the sliding surface, the weights of mechanical and geometric parameters increase significantly. Through its multi-head attention mechanism, the Transformer can simultaneously focus on the interactions among multiple sets of features, thereby more accurately identifying the synergistic effects of disaster-causing factors.
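The correlation weights described above come from scaled dot-product attention. The sketch below implements one attention head in plain Python for a toy sequence; using the inputs directly as queries, keys, and values (no learned projections) and the toy vectors themselves are simplifying assumptions.

```python
import math


def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(Q, K, V):
    """Single-head scaled dot-product attention over lists of vectors (no mask)."""
    d_k = len(K[0])
    weights = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights.append(softmax(scores))
    out = [
        [sum(w * v[j] for w, v in zip(row, V)) for j in range(len(V[0]))]
        for row in weights
    ]
    return out, weights


# Toy sequence: 3 time steps, 2 features each (hypothetical values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(X, X, X)
for t, row in enumerate(attn):
    print(f"step {t} attends with weights", [round(w, 3) for w in row])
```

Every time step receives a weight for every other step in one pass, which is why attention handles global dependencies without recursion.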

3.2. Model Computational Process

The T-L model is implemented using the PyTorch v2.1.0 deep learning framework on an NVIDIA GeForce RTX 4090 GPU (24 GB VRAM) with CUDA 12.1 acceleration. The codebase is divided into modular components: (a) a data preprocessing module for sliding window construction, normalization, and dataset splitting; (b) a model definition module containing LSTM, Transformer, and MLP building blocks as subclasses of PyTorch nn.Module; (c) a training module implementing the AdamW optimizer, a cosine annealing learning rate scheduler, and early stopping logic; (d) an evaluation module computing R², MAE, RMSE, and MAPE metrics. The T-L model is trained using the mean squared error (MSE) loss function and the AdamW optimizer. The training configuration is summarized in Table 2. All hyperparameters are determined through grid search on the validation set, with the final configuration chosen to maximize the validation R² while maintaining a stable loss trajectory.
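The scheduling and stopping logic in component (c) can be illustrated without any deep learning library. This is a sketch of the standard cosine annealing formula and a patience-based early stopping rule; the learning rate, patience, and validation losses are made-up values, since the actual settings are those listed in Table 2.

```python
import math


def cosine_lr(epoch, t_max, lr_max, lr_min=0.0):
    """Cosine annealing: lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * epoch / t_max))."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * epoch / t_max))


class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` consecutive epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses standing in for a real training run.
val_losses = [0.9, 0.5, 0.4, 0.41, 0.42, 0.43, 0.3]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    lr = cosine_lr(epoch, t_max=len(val_losses), lr_max=1e-3)
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at, "best validation loss", stopper.best)
```

In a PyTorch training loop the same roles would typically be filled by `torch.optim.AdamW` and `torch.optim.lr_scheduler.CosineAnnealingLR`.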

3.3. Evaluation Metrics for Regression Model

Four metrics are used to evaluate the regression models [20]. The coefficient of determination (R²) measures how closely the predictions approximate the actual data points. MAPE measures the mean relative error between predicted and actual values, MAE is the mean of the absolute prediction errors over all samples, and RMSE measures the typical deviation between predicted and true values while penalizing large errors more heavily. The formulas for each evaluation metric are shown in Equations (4)–(7).

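\mathrm{MAPE} = \frac{100\%}{n} \sum_{i = 1}^{n} \left| \frac{\hat{y}_{i} - y_{i}}{y_{i}} \right|

(4)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} (\hat{y}_{i} - y_{i})^{2}}{\sum_{i = 1}^{n} (\bar{y} - y_{i})^{2}}

(5)

\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y}_{i}|

(6)

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}

(7)

where y_{i} is the actual value, \hat{y}_{i} is the predicted value, \bar{y} is the mean of the actual values, and n is the number of samples.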

The closer R² is to 1, the better the regression performance; smaller values of MAPE, RMSE, and MAE indicate smaller prediction errors.
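Equations (4)–(7) translate into a few lines of plain Python. The sketch below follows the formulas as written, with MAPE returned in percent; the function name and the sample values are illustrative only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, MAE, RMSE and MAPE (in percent) following Eqs. (4)-(7)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "RMSE": math.sqrt(ss_res / n),
        "MAPE": 100.0 / n * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)),
    }


# Made-up FOS values for illustration.
y_true = [1.0, 1.2, 1.5, 2.0]
y_pred = [1.1, 1.2, 1.4, 2.0]
metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because MAPE divides by the actual value, it is only well defined when no true value is zero, which holds for a factor of safety.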

3.4. Physical Attribution Analysis Based on SHAP Values

This study introduces the SHAP interpretability framework to map the data-driven predictions back onto physical mechanisms. For the highly nonlinear output of a deep learning model, SHAP assigns each feature its Shapley value, defined as the weighted average marginal contribution of that feature over all possible feature subsets. With these values, the predicted safety factor for any single condition (such as an extreme rainfall infiltration state at a certain moment) decomposes additively into a baseline value plus the contributions of the individual physical features. This not only quantifies the global dominance of different environmental and mechanical parameters but also tracks whether a specific parameter raises or lowers the predicted stability within different value ranges, turning black-box fitting into attributions that can be read in physical terms. It thus provides a tool for verifying whether the model adheres to the theory of unsaturated soil seepage and shear strength.
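The additive decomposition can be checked on a toy problem where exact Shapley values are cheap to compute by enumerating every feature subset. The surrogate function, feature values, and the use of a single baseline vector to represent "absent" features are all assumptions for illustration; SHAP implementations for deep models rely on background data and approximations rather than this brute-force loop.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values by subset enumeration (only feasible for a few features)."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for r in range(n):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


def toy_fos(z):
    """Hypothetical surrogate: z = (cohesion, slope angle, rainfall intensity)."""
    c, alpha, rain = z
    return 1.2 + 0.03 * c - 0.02 * alpha - 0.004 * rain - 0.0004 * alpha * rain


x = [20.0, 35.0, 10.0]
baseline = [15.0, 30.0, 5.0]
phi = shapley_values(toy_fos, x, baseline)
print("contributions:", [round(p, 4) for p in phi])
print("sum of contributions:", round(sum(phi), 4))
print("f(x) - f(baseline):  ", round(toy_fos(x) - toy_fos(baseline), 4))
```

The last two printed values agree, which is the additivity property that allows a single FOS prediction to be split into per-feature contributions.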

4. Results Analysis

The FOS evolution curves from 60 numerical simulations were discretized according to the time step. A sliding window technique was applied to construct the “feature matrix–label” samples required for supervised learning, and the sample data were subsequently normalized. The resulting dataset was strictly partitioned into non-overlapping training, validation, and test sets at a ratio of 7:2:1, thereby ensuring an objective evaluation of generalization capability.
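A minimal sketch of the sliding-window construction and the 7:2:1 partition is given below. The window length of 12 follows Section 3.1; pairing each window with the FOS at its last time step and splitting in chronological order are assumptions, as the exact alignment and splitting procedure are not spelled out here.

```python
def sliding_windows(features, targets, window=12):
    """Pair each run of `window` consecutive feature rows with the target at the window's last step."""
    samples = []
    for end in range(window, len(features) + 1):
        samples.append((features[end - window:end], targets[end - 1]))
    return samples


def split_7_2_1(samples):
    """Split into train/validation/test subsets at a 7:2:1 ratio (integer arithmetic)."""
    n = len(samples)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    return (
        samples[:n_train],
        samples[n_train:n_train + n_val],
        samples[n_train + n_val:],
    )


# Hypothetical simulation case: 30 time steps, 2 features per step, slowly decaying FOS.
features = [[0.1 * t, 1.0] for t in range(30)]
targets = [1.6 - 0.01 * t for t in range(30)]

samples = sliding_windows(features, targets, window=12)
train, val, test = split_7_2_1(samples)
print(len(samples), "windows ->", len(train), len(val), len(test))
```

Because consecutive windows share time steps, splitting by whole simulation case rather than by window is one way to keep the three subsets free of shared rows.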

4.1. Comparative Evaluation Against Optimized Baseline Models

Based on the multi-factor experimental results, the Transformer–LSTM model is trained to predict the slope factor of safety. To verify its robustness, it is compared side by side with PSO-LSTM and PSO-SVM baselines. The comparison results are shown in Figure 4.

As shown in Figure 4, the Transformer–LSTM (T-L) model achieves an R² of 0.9417, significantly outperforming the Particle Swarm Optimization–Long Short-Term Memory (PSO-LSTM) baseline (R² = 0.8924) and the Particle Swarm Optimization–Support Vector Machine (PSO-SVM) baseline (R² = 0.8531). In terms of explained variance, this is a gain of 4.9 to 8.9 percentage points, and the T-L model is consistently superior across all error metrics. Specifically, it yields an MAE of 0.0098 (PSO-LSTM: 0.0187; PSO-SVM: 0.0234), an RMSE of 0.0123 (PSO-LSTM: 0.0256; PSO-SVM: 0.0317), and a MAPE of 0.0017 (PSO-LSTM: 0.0032; PSO-SVM: 0.0048). The 61.2% reduction in RMSE relative to PSO-SVM is particularly noteworthy: because RMSE penalizes large errors more heavily than MAE, it indicates that the T-L architecture not only achieves higher average accuracy but also suppresses large prediction errors. These results suggest that the T-L architecture's internal feature extraction and re-weighting mechanisms contribute more to predictive capability than external hyperparameter optimization. Although the Particle Swarm Optimization (PSO) metaheuristic is effective in navigating the hyperparameter space, it cannot compensate for the fundamental representational limitations of architectures that lack explicit temporal encoding or global attention mechanisms. Therefore, the superior performance of the T-L model stems from its architectural capacity to learn hierarchical temporal representations and dynamically re-weight feature importance, rather than from optimization tuning alone.
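The improvement figures quoted above can be re-derived from the reported metrics. The short check below uses only the R² and RMSE values stated in this section; the variable names are arbitrary.

```python
# Reported values from the comparison in Figure 4.
r2 = {"T-L": 0.9417, "PSO-LSTM": 0.8924, "PSO-SVM": 0.8531}
rmse = {"T-L": 0.0123, "PSO-LSTM": 0.0256, "PSO-SVM": 0.0317}

summary = {}
for name in ("PSO-LSTM", "PSO-SVM"):
    r2_gain = 100 * (r2["T-L"] - r2[name])              # gain in explained variance, percentage points
    rmse_cut = 100 * (1 - rmse["T-L"] / rmse[name])     # relative RMSE reduction, percent
    summary[name] = (r2_gain, rmse_cut)
    print(f"vs {name}: R2 gain = {r2_gain:.1f}, RMSE reduction = {rmse_cut:.1f}%")
```

The R² gains are absolute differences in explained variance, whereas the RMSE reduction is a relative change with respect to the baseline error.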

4.2. Ablation Study: Hybrid Versus Standalone Architectures

To isolate the contribution of the hybrid architecture, an ablation study is conducted, comparing the full T-L model against standalone LSTM and standalone Transformer configurations. The standalone LSTM adopts the same two-layer stacked configuration (hidden size = 128) and uses the same MLP regression head; the standalone Transformer uses the same three-layer encoder configuration applied directly to the embedded input sequence, bypassing the LSTM preprocessing stage. Figure 5 presents the comparison results.

As shown in Figure 5, the T-L hybrid model achieves an R² of 0.9420 and a MAPE of 0.0017, significantly outperforming both standalone architectures. The standalone LSTM achieves an R² of 0.8756 (MAE = 0.0214, RMSE = 0.0289), while the standalone Transformer achieves an R² of 0.9038 (MAE = 0.0162, RMSE = 0.0221). Two distinct failure modes are evident in the ablation results. First, the standalone Transformer tends to overfit on the moderately sized dataset, as reflected in the widening gap between the training and validation loss curves in later training stages. This indicates that, although the self-attention mechanism is powerful, it benefits from the inductive bias provided by the preceding LSTM temporal encoding when training data are limited. Second, the standalone LSTM shows degraded performance on sequences involving long-range temporal dependencies, which is consistent with the known limitations of recurrent architectures in propagating gradient signals over extended time horizons. The T-L hybrid model overcomes both limitations: the LSTM front-end provides a temporally structured representation that regularizes the attention computation of the Transformer and reduces the risk of overfitting, while the Transformer back-end compensates for the LSTM's deficiency in modeling long-range dependencies through global attention. Compared with the best standalone model (Transformer, R² = 0.9038), R² increases by 0.0382, a relative improvement of 4.2%; compared with the standalone LSTM (R² = 0.8756), it increases by 0.0664, a relative improvement of 7.6%. These gains quantitatively confirm the synergistic benefit of the hybrid design.

4.3. Explainable Analysis

The SHAP framework is applied to the trained T-L model to quantitatively decompose the FOS prediction into physically interpretable feature contributions. Figure 6 presents the SHAP summary plot, which simultaneously displays the feature importance ranking and the direction of each feature’s influence on the FOS prediction.

As shown in Figure 6, effective cohesion (c′) and slope angle (α) emerge as the two dominant features governing slope stability, with mean absolute SHAP values of 0.0873 and 0.0641, respectively, substantially exceeding those of the remaining seven features. Rainfall intensity, initial volumetric water content, and saturated hydraulic conductivity constitute a secondary tier of moderately influential parameters. Effective friction angle, saturated volumetric water content, wetting front suction, and rainfall duration exhibit relatively small individual impacts.

The SHAP analysis reveals consistency between the learned feature influences and established geotechnical principles. The high SHAP values for effective cohesion are positively correlated with the predicted Factor of Safety (FOS), which is quantitatively consistent with the linear cohesion term in the Mohr–Coulomb shear strength criterion. Conversely, increasing slope angle leads to a monotonic negative shift in SHAP values, which aligns with the gravity-driven force mechanism: for a given soil unit weight, a steeper slope generates a larger downslope shear stress component, thereby reducing the factor of safety. These findings confirm that the T-L model learns physically meaningful relationships from the numerical simulation data rather than exploiting spurious statistical correlations.

As shown in Figure 7, the interaction between effective cohesion and slope angle constitutes the primary nonlinear interaction term. In the regime of high cohesion and low slope angle, the predicted Factor of Safety (FOS) values are consistently high with small prediction variance; in the regime of low cohesion and high slope angle, the FOS values approach 1 and the prediction variance increases substantially, reflecting the heightened sensitivity of critically stable slopes to parameter perturbations. Notably, the interaction between initial volumetric water content and effective friction angle exhibits limited nonlinear coupling, with the decision path variance diminishing markedly at the lower bounds of the feature value ranges, indicating that these parameters influence FOS primarily through independent mechanistic pathways.

As shown in Figure 8, the model reveals a distinct temporal evolution of feature importance that is consistent with the physical progression of rainfall-induced failure. In the early infiltration stage (time steps 1–4), hydraulic parameters—particularly rainfall intensity and saturated hydraulic conductivity—dominate the SHAP contribution distribution, reflecting their primary role in controlling the rate of wetting front propagation. As the wetting front approaches the potential slip surface (time steps 5–8), mechanical and geometric parameters (effective cohesion, slope angle) progressively dominate, consistent with a transition from an infiltration-controlled to a strength-controlled stability regime. In the terminal stage (time steps 9–12), when pore water pressures have largely equilibrated, the SHAP contributions stabilize, with effective cohesion and slope angle together explaining more than 72% of the total SHAP variance. This temporal progression indicates that the T-L model has implicitly learned the temporally stage-dependent causal structure of rainfall-induced slope failure.

To complement the SHAP-based attribution analysis, a qualitative visualization of the distributional differences in core feature vectors across different Factor of Safety (FOS) groups was conducted, along with two complementary non-parametric tests: the two-sample Kolmogorov–Smirnov (K–S) test for distributional differences and the Mann–Whitney U test for location shifts. The test results are summarized in Table 3.

As shown in Table 3, the statistical test results corroborate the SHAP importance ranking. Effective cohesion (K–S = 0.784, p < 0.001) and slope angle (K–S = 0.652, p < 0.001) exhibit highly significant distributional differences between the stable and unstable FOS groups, with effect sizes falling into the very large and large categories, respectively. Rainfall intensity exhibits a medium-to-large effect (K–S = 0.521, p < 0.001), consistent with its role as the primary external forcing mechanism. Saturated hydraulic conductivity k_{s} (K–S = 0.153, p = 0.124) yields non-significant results, indicating that within the tested parameter range its distribution does not differ systematically between stability outcomes. This finding is attributable to the conditional nature of hydraulic conductivity effects, which depend on interactions with rainfall intensity and antecedent moisture conditions rather than operating through a simple monotonic mechanism.
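For readers unfamiliar with the two tests, the statistics themselves are simple to compute. The sketch below derives the two-sample K–S statistic and the Mann–Whitney U statistic for two small made-up groups of cohesion values; it does not compute p-values, and the sample data are hypothetical rather than taken from Table 3.

```python
def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def mann_whitney_u(a, b):
    """U statistic for sample a: count of pairs with a_i > b_j, ties counted as 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)


# Hypothetical effective cohesion values (kPa) in the stable and unstable groups.
stable = [22, 25, 27, 30, 31]
unstable = [10, 12, 15, 18, 23]

ks = ks_statistic(stable, unstable)
u = mann_whitney_u(stable, unstable)
print(f"K-S statistic = {ks:.2f}, U = {u} out of {len(stable) * len(unstable)} pairs")
```

A K–S statistic near 1 means the two distributions barely overlap, while a U close to the total number of pairs means values in the first group are almost always larger.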

5. Discussion

5.1. Comparative Analysis with Existing Literature

As shown in Table 4, compared with several reference studies, the T-L model achieves the best predictive accuracy (R² = 0.942) when trained on a substantially smaller simulation dataset (60 cases). This data efficiency can be attributed to two architectural features: (a) the sliding window preprocessing strategy, which exploits the temporal structure of the data to extract approximately 800 training samples from each simulation case; and (b) the Transformer’s self-attention mechanism, which learns more expressive feature representations from each sample than feedforward or purely recurrent architectures. However, the controlled nature of numerical simulation datasets typically yields higher R² values than field monitoring data, since measurement noise, geological heterogeneity, and unobserved confounding factors in field data introduce additional variance. Therefore, the R² values in Table 4 should be regarded as indicators of relative model quality rather than as absolute measures of field deployment capability.

5.2. Model Validation

In May and August 2021, two landslides occurred on slopes along a highway in southern Ningxia, as shown in Figure 9. The loess physical parameters of the affected areas and the corresponding daily rainfall records were collected and used to predict the factor of safety (FOS) of each slope. The predicted FOS values were 0.971 and 0.981, respectively. Both values are below 1, indicating instability, which is consistent with the observed failures. Because the model training dataset is derived from numerical simulation data, the next step will be to conduct slope model box experiments, deploy on-site monitoring instruments under the project framework to expand the dataset, and adjust parameters in real time to further optimize the model.

5.3. Limitations and Future Research Directions

The training data in this study are entirely derived from numerical simulations that treat the soil as a homogeneous, isotropic continuum. Although the multi-factor orthogonal design covers a representative parameter space, this approach cannot reproduce the full spectrum of geological complexity encountered under field conditions, such as pronounced spatial variability in hydraulic and mechanical properties, preferential flow pathways induced by desiccation cracks or macropores, and heterogeneous strata with abrupt material interfaces. To address these limitations, future research will proceed along three directions: (a) introducing random field theory into the numerical simulation framework to characterize the spatial variability of natural soils, thereby generating training data that reflect inherent heterogeneity; (b) integrating the T-L framework with the physics-informed neural network (PINN) paradigm, embedding the governing partial differential equations as soft constraints in the learning process to improve model robustness under extrapolative conditions; and (c) validating the framework using measured data from fully instrumented field sites with multi-parameter continuous monitoring, thereby building a transition pathway from simulation-based development to operational field deployment.

6. Conclusions

This study proposes and empirically validates a Transformer–LSTM hybrid deep learning framework for predicting the spatiotemporal evolution of slope stability under rainfall conditions. The main findings are summarized as follows:

(1) The T-L hybrid architecture, which integrates a two-layer stacked LSTM temporal encoder, a three-layer Transformer global attention encoder, and a feature-level concatenation fusion strategy, achieves the best predictive performance among the compared models in the rainfall-induced Factor of Safety (FOS) regression task. The model attains an R² of 0.9420, a mean absolute error (MAE) of 0.0098, a root mean square error (RMSE) of 0.0123, and a mean absolute percentage error (MAPE) of 0.0017, representing an R² improvement of 4.2% to 7.6% over standalone architectures and 4.9% to 8.9% over metaheuristic-optimized baselines.

(2) Ablation experiments quantitatively confirm that the performance gain originates from the synergy of recurrent temporal encoding and attention-based global context modeling, rather than from either mechanism in isolation. The LSTM front-end provides an inductive bias that mitigates the Transformer's tendency to overfit, while the Transformer back-end compensates for the limited ability of the recurrent architecture to capture long-range dependencies.

(3) SHAP (SHapley Additive exPlanations)-based interpretability analysis establishes a causal mapping between the data-driven model and geotechnical mechanisms. Effective cohesion (mean |SHAP| = 0.0873) and slope angle (mean |SHAP| = 0.0641) are identified as the two dominant features, together explaining over 72% of the total SHAP variance during the terminal infiltration stage. The directionality of SHAP contributions is fully consistent with the Mohr–Coulomb shear strength criterion, and the feature importance transitions from hydraulics-dominated during early infiltration to mechanics-dominated as the wetting front approaches the slip surface, consistent with the established physical understanding of rainfall-induced failure progression.

(4) Statistical tests (K-S test and Mann–Whitney U test) corroborate the SHAP findings, confirming highly significant distributional differences between stable and unstable FOS (Factor of Safety) states in terms of effective cohesion (K-S = 0.784, p < 0.001, very large effect) and slope angle (K-S = 0.652, p < 0.001, large effect).
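
For readers who wish to reproduce the four evaluation metrics cited in conclusion (1), the following minimal sketch computes R², MAE, RMSE, and MAPE using their standard definitions. The function name `regression_metrics` and the FOS values are hypothetical, and MAPE is returned as a fraction rather than a percentage, which is an assumption about the reporting convention.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE, and MAPE (as a fraction) for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    # MAPE requires nonzero true values; FOS is strictly positive.
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical true and predicted FOS values, for illustration only.
fos_true = [1.0, 1.2, 1.4, 1.6]
fos_pred = [1.1, 1.1, 1.5, 1.5]

metrics = regression_metrics(fos_true, fos_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```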

Author Contributions

Writing—original draft preparation, X.Z. and F.W.; visualization, X.Z., H.Y. and S.L.; writing—review and editing, X.Z., F.W., H.Y. and S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ningxia Hui Autonomous Region Key Research and Development Program Project: Research on Highway Slope Inspection and Early Warning Technology Based on Air-Ground Integrated Multi-Source Data Fusion, grant number 2025BEE04001.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Xin Zhang and Hao Yang were employed by the company Ningxia Communications Investment Expressway Management Company Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. He, Z.; Akiyama, M.; Alhamid, A.K.; Frangopol, D.M.; Huang, Y. Probabilistic life-cycle landslide assessment subjected to nonstationary rainfall based on alternating stochastic renewal process. Eng. Geol. 2024, 338, 107543.
2. Nguyen, B.Q.V.; Doan, V.L.; Kim, Y.T.; Song, C.-H.; Lee, J.-S. Considering antecedent rainfall to improve susceptibility assessment of rainfall-earthquake-triggered landslides on unsaturated slopes. Environ. Earth Sci. 2024, 83, 142.
3. Ebrahim, K.M.P.; Gomaa, S.M.M.H.; Zayed, T.; Alfalah, G. Rainfall-induced landslide prediction models, part ii: Deterministic physical and phenomenologically models. Bull. Eng. Geol. Environ. 2024, 83, 85.
4. Li, J.; Jiang, S.; Huang, F.; Huang, J. Characteristics of large deformation failure of slopes under rainfall considering spatial variability of permeability coefficient. J. Appl. Basic Eng. Sci. 2024, 32, 72–84.
5. Tozato, K.; Sugo, D.; Dolojan, N.L.J.; Nomura, R.; Terada, K.; Takase, S.; Kaneko, K.; Moriguchi, S. Rapid prediction of rainfall-induced landslides over a wide area aided by a simulation-based surrogate model. Comput. Geotech. 2025, 188, 107480.
6. Chen, Z.; Dai, Z.; Guo, L.; Fang, W. Stability analysis of recent failed red clay landslides influenced by cracks and rainfall based on the XGBoost–PSO–SVR model. Water 2025, 17, 1920.
7. Wu, L.; Gan, F.; Yang, R.; Liu, J.; Ren, Z.; Wang, H. A new intelligent combined prediction system for rainfall-induced instability of circular failure slopes. Environ. Earth Sci. 2025, 84, 585.
8. Li, X.; Wang, S.; Lei, C.; Jiang, Y.; Li, D.; Koga, A. Machine learning-based assessment of three-dimensional slope stability using geometric features under heavy rainfall. J. Rock Mech. Geotech. Eng. 2026; in press.
9. Huang, P.C. Establishing a shallow-landslide prediction method by using machine-learning techniques based on the physics-based calculation of soil slope stability. Landslides 2023, 20, 2741–2756.
10. Zhang, S.; Jia, H.; Wang, C.; Wang, X.; He, S.; Jiang, P. Deep-learning-based landslide early warning method for loose deposits slope coupled with groundwater and rainfall monitoring. Comput. Geotech. 2024, 165, 105924.
11. Lin, M.; Lu, Y.; Li, Y.; Chen, G.; Yuan, B. Time series prediction of the slope stability under rainfall conditions based on LSTM and CNN. Nat. Hazards 2025, 121, 22487–22517.
12. Deng, T.; Li, K.; Zhang, C.; Zhang, X. Study on unsaturated rainfall-induced slope stability and failure probability based on GA-LSTM-MC. Alex. Eng. J. 2026, 135, 353–370.
13. Liu, L.; Santos, J.E.; Prodanović, M.; Pyrcz, M.J. Mitigation of spatial nonstationarity with vision transformers. Comput. Geosci. 2023, 178, 105412.
14. Richards, L.A. Capillary conduction of liquids through porous mediums. Physics 1931, 1, 318–333.
15. Deng, Q.; Liu, X.; Zeng, C.; He, X.; Chen, F.; Zhang, S. A freezing-thawing damage characterization method for highway subgrade in seasonally frozen regions based on thermal-hydraulic-mechanical coupling model. Sensors 2021, 21, 6251.
16. Cheng, W.; Bian, H.; Shi, H.; Xun, F. Experimental and numerical modelling of desiccation shrinkage process of kaolin clays. Sci. Rep. 2025, 15, 20905.
17. Cai, J.-S.; Yeh, T.-C.J.; Yan, E.-C.; Hao, Y.-H.; Huang, S.-H.; Wen, J.-C. Uncertainty of rainfall-induced landslides considering spatial variability of parameters. Comput. Geotech. 2017, 87, 149–162.
18. Li, D.; Al-Mahamda, M.F.M. Collective Risk Ranking of Highway Segments on the Basis of Severity-Weighted Crash Rates. J. Adv. Transp. 2020, 1, 8837762.
19. Miah, M.M.; Hyun, K.K.; Mattingly, S.P.; Khan, H. Estimation of daily bicycle traffic using machine and deep learning techniques. Transportation 2023, 50, 1631–1684.
20. Qi, X.; Wang, S.; Fang, C.; Jia, J.; Lin, L.; Yuan, T. Machine learning and SHAP value interpretation for predicting comorbidity of cardiovascular disease and cancer with dietary antioxidants. Redox Biol. 2025, 79, 103470.

Figure 1. Framework diagram of Transformer–LSTM hybrid model.

Figure 2. Structure diagram of LSTM.

Figure 3. Structure diagram of the Transformer.

Figure 4. Horizontal comparison results.

Figure 5. Comparison results with single models.

Figure 6. Global influence and distribution of various features on the slope factor of safety.

Figure 7. Summary plot of SHAP feature interaction effects.

Figure 8. SHAP sample-level decision path diagram.

Figure 9. Historical landslides.

Table 1. Characteristic parameters of the multi-factor experiment.

Parameter	Symbol	Unit	Level 1	Level 2	Level 3	Level 4	Level 5	Level 6
Saturated hydraulic conductivity	ks	m/s	1.10 × 10⁻⁴	1.21 × 10⁻⁴	1.31 × 10⁻⁴	1.41 × 10⁻⁴	1.51 × 10⁻⁴	1.61 × 10⁻⁴
Saturated vol. water content	θs	—	0.355	0.426	0.457	0.478	0.499	0.510
Initial vol. water content	θi	—	0.053	0.080	0.100	0.120	0.140	0.160
Effective cohesion	c′	kPa	6.20	10.00	19.50	23.67	32.34	41.15
Effective friction angle	φ′	°	26.57	26.02	25.87	25.51	25.22	26.57
Slope angle	β	°	30	40	50	60	70	75
Rainfall intensity	I	mm/h	1.16	2.32	3.47	5.79	8.10	11.58
Wetting front suction	hf	kPa	16.7	18.0	19.0	20.0	21.0	22.0
Rainfall duration	D	h	4	10	20	30	48	72

Table 2. Model training configuration and hyperparameters.

Hyperparameter	Value	Description
Input tensor shape	B × 12 × 11	Batch size × window length × features
Look-back window T	12 time steps	Historical sequence length for prediction
Prediction horizon N	1 time step	Single-step-ahead FOS prediction
Embedding dimension dmodel	128	Dimension after linear projection
LSTM hidden dimension	128 per layer	Two stacked unidirectional LSTM layers
Transformer encoder layers Nenc	3	Number of stacked encoder blocks
Attention heads h	8	Per-head dimension = 128/8 = 16
FFN inner dimension	512	Expansion ratio = 4 relative to dmodel
MLP hidden dimensions	{128, 64, 32}	Three-layer regression head
Dropout rate	0.2	Applied after each hidden layer
Total trainable parameters	~1.32 × 10⁶	Approximately 1.32 million
Batch size B	32	Mini-batch gradient descent
Optimizer	AdamW	Weight decay = 1 × 10⁻⁴
Learning rate (initial)	1 × 10⁻⁴	With cosine annealing schedule
Learning rate (minimum)	1 × 10⁻⁶	After cosine annealing decay
Weight decay	1 × 10⁻⁴	L2 regularization coefficient
Loss function	MSE	Mean squared error
Maximum epochs	300	With early stopping patience = 40
Early stopping metric	Validation R²	Patience = 40 epochs without improvement
Training; Validation; Test split	70%:15%:15%	Case-level stratified split
Normalization	Z-score	Per-feature standardization to N (0,1)
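
The learning-rate schedule and stopping rule in Table 2 can be illustrated with a minimal, framework-free sketch. The initial and minimum learning rates, the 300-epoch limit, and the patience of 40 epochs on validation R² come from Table 2. The following are assumptions: that the cosine schedule spans all 300 epochs without restarts (the closed form below is the same as that used by PyTorch's `CosineAnnealingLR`), the names `cosine_lr` and `EarlyStopping`, and the synthetic validation curve.

```python
from math import cos, pi


def cosine_lr(epoch, max_epochs=300, lr_max=1e-4, lr_min=1e-6):
    """Cosine annealing from lr_max at epoch 0 down to lr_min at max_epochs."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos(pi * epoch / max_epochs))


class EarlyStopping:
    """Signal a stop once the score has not improved for `patience` epochs."""

    def __init__(self, patience=40):
        self.patience = patience
        self.best = float("-inf")
        self.stale_epochs = 0

    def step(self, score):
        # Higher is better, since the monitored metric is validation R².
        if score > self.best:
            self.best = score
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Hypothetical validation R² curve: rises until epoch 100, then plateaus.
stopper = EarlyStopping(patience=40)
stopped_at = None
for epoch in range(300):
    lr = cosine_lr(epoch)  # would be passed to the optimizer in real training
    val_r2 = 0.94 if epoch >= 100 else 0.5 + 0.004 * epoch
    if stopper.step(val_r2):
        stopped_at = epoch
        break

print(f"stopped at epoch {stopped_at}, lr = {cosine_lr(stopped_at):.2e}")
```

In a real training loop the best-scoring weights would normally be checkpointed at each improvement and restored after the stop signal; that step is omitted here for brevity.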

Table 3. Results of the K-S and Mann–Whitney U tests for each feature.

Feature Vector	K-S Test Statistic	K-S Test Significance	Mann–Whitney U Test Significance	Quantitative Effect Size Index
Effective cohesion	0.784	p < 0.001	p < 0.001	1.45 (Very Large Effect)
Slope angle	0.652	p < 0.001	p < 0.001	−1.12 (Large Effect)
Rainfall intensity	0.521	p < 0.001	p < 0.001	0.68 (Medium to Large Effect)
Initial volumetric water content	0.412	p = 0.008	p = 0.015	0.45 (Medium Effect)
Saturated hydraulic conductivity	0.153	p = 0.124	p = 0.089	0.12 (Trivial Effect)

Table 4. Comparison of the proposed T-L model with representative studies from the literature.

Study	Method	R²	RMSE
This study	Transformer–LSTM	0.942	0.0123
Deng et al. [12]	GA-LSTM-MC	0.921	0.0285
Chen et al. [6]	XGBoost-PSO-SVR	0.912	0.0317
Li et al. [8]	SVR-XGBoost-LightGBM	0.905	—
Lin et al. [11]	LSTM-CNN	0.897	0.0412
Wu et al. [7]	Ensemble (RF + ANN)	0.886	0.0358

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
