A Hybrid Physics–Machine Learning Framework for Landslide Susceptibility Assessment with an Improved Non–Landslide Sampling Strategy

Peng, Dalei; Chen, Maoyuan; Zhou, Yeping; Li, Pinliang; Xiao, Shihao; Shen, Yuyang; Tan, Boren; Kong, Linghao; Xu, Qiang

doi:10.3390/rs18030408


by Dalei Peng ¹, Maoyuan Chen ¹,*, Yeping Zhou ¹, Pinliang Li ¹, Shihao Xiao ², Yuyang Shen ¹, Boren Tan ¹, Linghao Kong ¹ and Qiang Xu ¹

¹ State Key Laboratory of Geohazard Prevention and Geoenvironment Protection, Chengdu University of Technology, Chengdu 610059, China

² Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(3), 408; https://doi.org/10.3390/rs18030408

Submission received: 10 December 2025 / Revised: 21 January 2026 / Accepted: 22 January 2026 / Published: 26 January 2026

(This article belongs to the Topic AI for Natural Disasters Detection, Prediction and Modeling)


Highlights

What are the main findings?

An improved non–landslide sampling strategy increases landslide susceptibility prediction accuracy by 16.46%, with the MLP model outperforming RF, SVM, and XGBoost.
The proposed hybrid physics–machine learning framework integrates key factors like rainfall, groundwater, and human activity, aligning with SHAP analysis findings.

What are the implications of the main findings?

The improved non–landslide sampling strategy coupled with MLP offers a superior alternative to the traditional buffering method for evaluating landslide susceptibility.
The proposed framework successfully incorporates physical mechanisms of clustered landslides into regional assessments, as demonstrated by the 2024 Typhoon Gaemi event.

Abstract

Rainfall–triggered clustered landslides pose severe risks to communities and infrastructure in mountainous regions. High–precision susceptibility assessment is essential for early warning and hazard mitigation. The traditional buffering method neglects physical slope stability mechanisms, leading to the misclassification of potentially unstable areas. To improve susceptibility model accuracy, we propose an improved non–landslide sampling strategy that integrates the physics–based model TRIGRS (Transient Rainfall Infiltration and Grid–based Regional Slope–Stability Model) with 50 m buffering constraints. A hybrid physics–machine learning framework is used to evaluate the performance of landslide susceptibility assessment across four machine learning models: Multi–Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). Among the four models, the TRIGRS model integrated with MLP achieves the highest accuracy in susceptibility mapping. The improved non–landslide sampling strategy increased the average Area Under the Curve (AUC) by 16.46% in random cross–validation and improved spatial generalization capability by 29% in spatial cross–validation, demonstrating its robustness in unseen areas. SHAP factor analysis further confirms rainfall, groundwater table, and human activity as the primary influencing factors, which aligns with physical mechanisms and improves model interpretability. Therefore, the proposed non–landslide sampling strategy coupled with the TRIGRS and MLP models outperforms the traditional buffering method in evaluating regional landslide susceptibility, providing a more physically grounded basis for geohazard risk assessment.

Keywords:

rainfall–induced landslides; landslide susceptibility; physics–based model; non–landslide sample; sampling strategy

1. Introduction

The increasing frequency and severity of typhoons and extreme precipitation in recent years have led to a notable escalation in rainfall–triggered landslides and their associated impacts [1]. The escalating impacts of global climate change have further intensified the frequency and severity of these extreme weather events [2,3]. Consequently, areas surrounding cities, major infrastructure developments, and world heritage sites now face heightened landslide risk [4,5]. To address this challenge, landslide susceptibility assessment serves as a vital tool for estimating the spatial likelihood of landslide occurrence, thereby offering a scientific basis for risk evaluation and hazard mitigation [6,7,8].

The accuracy of landslide susceptibility assessments is strongly dependent on multiple factors, including the quality and resolution of input data, the choice of predictive model, and the methodological framework employed. The quality of training samples is a crucial element among them [9,10,11]. Previous research has made significant progress in identifying and sampling landslides, while the importance of non–landslide samples has been largely overlooked [12,13]. Non–landslide samples are crucial for delineating the decision boundaries between stable and unstable conditions in susceptibility modeling [14]. The inclusion of non–landslide samples through a random selection process has the potential to introduce incorrect labels, which can result in misclassifying potentially unstable areas as stable, despite their unfavorable geological conditions [15]. This systematic bias reduces the model capacity to generalize and weakens the robustness of predictions [16]. Therefore, achieving a scientifically sound and representative selection of non–landslide samples has emerged as a critical challenge in enhancing the accuracy of landslide susceptibility models [17].

Common non–landslide sampling approaches include random selection, susceptibility–based sampling, and buffer–zone exclusion [18]. Random sampling fails to account for spatial patterns, potentially placing non–landslide samples in highly susceptible areas [19]. Susceptibility–based sampling generates training samples from a preliminary susceptibility model [20]. However, this sampling approach does not account for the movement scope and affected areas of slope failure [21,22]. Buffer–zone sampling is commonly used because, despite its simplicity, it partially reflects the spatial impact of slope failure [23]. However, the precision of this method relies on indirect indicators and subjectively defined buffer radii [24]. Because these non–landslide sampling strategies do not consider the underlying physical mechanisms of slope failure, they may inadvertently include unstable slopes as non–landslide samples, which increases model uncertainty and reduces performance [25]. In this study, the traditional buffering method is explicitly retained as a benchmark to rigorously evaluate the performance improvements offered by the improved non–landslide sampling strategy.

This study aims to: (1) develop an improved non–landslide sampling strategy by coupling a physics–based model (TRIGRS) with four machine learning algorithms; and (2) evaluate the resulting gains in landslide susceptibility model accuracy relative to traditional buffering method. Using the landslide event triggered by the 2024 Typhoon Gaemi in Zixing County as a case study, we propose an improved strategy that integrates the TRIGRS model with 50 m buffering constraints. This method directly filters stable areas from a mechanistic perspective by simulating rainfall infiltration and slope stability, while also incorporating spatial exclusion to minimize interference. The findings provide a new solution to the challenge of non–landslide sample selection. High–precision susceptibility maps can directly inform dynamic regional landslide risk assessments, thereby strengthening evidence–based disaster prevention and mitigation efforts.

2. Materials

2.1. Study Area

The study area lies in southeastern Hunan Province, at the convergence of the western Luoxiao Mountains and the northern Nanling Mountains. The topography generally slopes downward from the southeast to the northwest, with elevations varying between 268 m and 2042 m (Figure 1a,b). The southeastern area is characterized by steep mountainous topography, while the northwestern region is relatively flat. The local geology is predominantly composed of Sinian, Cambrian, Devonian, and Caledonian rocks. Figure 1c depicts the dominant lithology (granite (Gr), metasandstone (Mss), siliceous rock (Crt), limestone (Ls), and quartz sandstone (Qss)), along with major faults, roads, and rivers. The hydrological system is part of the Xiangjiang River Basin. The Dongjiang Reservoir serves as a major water management hub, with a total storage capacity of 8.12 × 10⁹ m³ and a surface area of approximately 160 km² [26]. The study area experiences a humid subtropical monsoon climate, receiving an average annual precipitation of 1487.6 mm, most of which occurs between March and August. A notable event occurred from 26 to 28 July 2024, when Typhoon Gaemi passed through Xingning Town, bringing over 500 mm of rainfall and triggering numerous landslides (Figure 1d).

2.2. Data Source

Table 1 summarizes the sources, acquisition periods, and spatial resolutions of the multi–source geospatial data on the geological environment and human activity used in this study. The Digital Elevation Model (DEM) (spatial resolution 5 m) and pre–event satellite imagery (spatial resolution 2 m) were acquired on 1 May 2023, while post–event imagery (spatial resolution 0.7 m) was obtained on 24 October 2024. For vegetation analysis, the normalized difference vegetation index (NDVI) between May 2023 and October 2024 was derived from Sentinel–2 multispectral imagery with a spatial resolution of 10 m and a temporal resolution of 3–5 d [27]. Rainfall data were derived from radar–based precipitation estimates provided by Caiyun Technology Company (Chengdu, China), with a spatial resolution of 1 km and a temporal resolution of 1 h. Monthly groundwater table data (spatial resolution 1 km) and a geological map (scale 1:200,000) were obtained from the National Geological Data Center [28]. Soil thickness was calculated using the linear slope–dependent model (Equation (1)) proposed by Saulnier et al. (1997) [29], which assumes that soil thickness decreases linearly with the tangent of the slope gradient:

d_{i} = d_{\max} \left\{ 1 - \left[ \frac{\tan β_{i} - \tan β_{\min}}{\tan β_{\max} - \tan β_{\min}} \left( 1 - \frac{d_{\min}}{d_{\max}} \right) \right] \right\}

(1)

where d_{i} represents the soil thickness at grid cell i; d_{\max} and d_{\min} are the maximum and minimum soil thickness in the study area, determined as 5 m and 0.8 m based on field surveys; β_{i} is the slope gradient at grid cell i; and β_{\max} and β_{\min} denote the maximum and minimum slope angles in the study area, respectively.
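As an illustration of Equation (1), the following minimal sketch evaluates the soil thickness for a single grid cell. The function name `soil_thickness` and the 0–60° slope range in the example are assumptions made for illustration only; the 5 m and 0.8 m bounds are the field-survey values quoted above.

```python
import math


def soil_thickness(slope_deg, slope_min_deg, slope_max_deg, d_min=0.8, d_max=5.0):
    """Soil thickness (m) from Equation (1): linear in tan(slope)."""
    t = math.tan(math.radians(slope_deg))
    t_min = math.tan(math.radians(slope_min_deg))
    t_max = math.tan(math.radians(slope_max_deg))
    # Normalized position of this cell between the flattest and steepest cells.
    ratio = (t - t_min) / (t_max - t_min)
    return d_max * (1.0 - ratio * (1.0 - d_min / d_max))


# Hypothetical slope range of 0-60 degrees, chosen only for illustration.
flat = soil_thickness(0.0, 0.0, 60.0)    # flattest cell receives d_max
steep = soil_thickness(60.0, 0.0, 60.0)  # steepest cell receives d_min
mid = soil_thickness(30.0, 0.0, 60.0)
```

By construction, the flattest cell receives the maximum thickness and the steepest cell the minimum, with intermediate slopes falling in between.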

3. Methods

3.1. Modelling Procedure

The methodological framework of this study is illustrated in Figure 2. The workflow was developed to enhance the accuracy of landslide susceptibility assessment and was structured into three primary stages: (1) data acquisition and processing, (2) machine learning model development, and (3) landslide probability prediction and mapping.

Initially, a comprehensive spatial database was established by integrating the landslide inventory with 15 potential influencing factors. To ensure scale consistency across heterogeneous datasets, all input layers were resampled to a uniform spatial resolution of 5 m × 5 m. This resolution was selected to align with the high–precision DEM (5 m) and high–resolution satellite imagery (0.7 m) used in the study, which is essential for accurately capturing topographic controls on shallow landslides. Specifically, continuous variables (e.g., rainfall) were processed using bilinear interpolation to preserve spatial continuity, while categorical variables (e.g., lithology) used nearest neighbor assignment.

In this study, regular grid cells were adopted as the basic mapping units. The study area was discretized into grid cells with a spatial resolution of 5 m × 5 m, consistent with the DEM data. This grid–based approach was selected to preserve the fine–scale spatial heterogeneity captured by the high–resolution data. Following grid cell discretization, training datasets were constructed by combining positive samples (17,529 landslides extracted from landslide crown areas) with an equal number of negative samples (17,529 non–landslides) selected via two distinct strategies, maintaining a balanced 1:1 ratio. These datasets were employed to train four machine learning models: MLP, SVM, RF, and XGBoost. Model performance was assessed using both random and spatial cross–validation.

3.2. Landslide Influencing Factors

Landslide development and failure mechanisms over a large area are complex and diverse, and are mainly influenced by five categories of factors: topography, hydrology, land cover, geology, and other factors such as human activity. A total of 15 potential influencing factors were identified for the landslide susceptibility model (Figure 3), namely elevation, slope, aspect, stream power index, topographic wetness index, terrain undulation, surface cutting depth, NDVI, soil thickness, lithology, rainfall, groundwater table, distance to fault, distance to river, and distance to road.

To minimize information redundancy and ensure model robustness, a two–stage screening strategy was employed. First, a Pearson correlation analysis was conducted, indicating that pairwise correlations among most factors were low (<0.5) (Figure 4). Second, to detect potential multivariate collinearity that pairwise comparisons might miss, a variance inflation factor (VIF) analysis was performed. As presented in Table 2, the VIF values for all 15 factors range from 1.02 to 4.97, all below the threshold of 5. These results indicate that the selected factors are free of severe multicollinearity and suitable for multivariate modeling.
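The VIF screening can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it uses the standard identity that the VIF of factor j equals the j-th diagonal entry of the inverse Pearson correlation matrix. The helper names (`pearson`, `invert`, `vif`) and the toy columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per factor: diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inv = invert(corr)
    return [inv[j][j] for j in range(k)]


# Toy data (hypothetical): x and y are uncorrelated, x and z nearly collinear.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, -1.0, -1.0, 1.0]
z = [1.0, 2.0, 3.0, 5.0]
vif_independent = vif([x, y])
vif_collinear = vif([x, z])
```

In this toy case the uncorrelated pair yields VIF values of 1, whereas the nearly collinear pair far exceeds the threshold of 5 and would be flagged for removal.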

3.3. Non–Landslide Sampling Strategies

3.3.1. Strategy I: Improved Non–Landslide Sampling Strategy

Strategy I optimizes non–landslide sample selection by integrating physical slope stability analysis using the TRIGRS model with 50 m buffering constraints. This approach aims to identify areas that are both physically stable and spatially distinct from existing landslides.

The TRIGRS model was employed to simulate the factor of safety (FoS) for slope grid cells under rainfall conditions [30]. The model calculates slope stability based on the infinite slope analysis method, expressed as:

F_{s}(Z, t) = \frac{\tan φ}{\tan α} + \frac{c - ψ(Z, t) γ_{w} \tan φ}{γ_{s} Z \sin α \cos α}

(2)

where c is the effective cohesion, φ is the effective internal friction angle, α is the slope angle, γ_w is the unit weight of water, γ_s is the unit weight of soil, Z is the soil thickness, and ψ(Z, t) is the pressure head in the unsaturated layer calculated via Equation (3):

ψ(Z, t) = \frac{\cos α}{δ \cos^{2} α} \ln \left[ \frac{K(Z, t)}{K_{s}} \right] + ψ_{0}

(3)

where δ is the fitting parameter for the soil–water characteristic curve [31], K(Z, t) is the unsaturated permeability coefficient, and K_s is the saturated permeability.
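To make the infinite-slope expression in Equation (2) concrete, the sketch below evaluates the factor of safety for one cell given a pressure head. It does not reproduce the TRIGRS infiltration solution; the pressure head is simply passed in. The function name and all parameter values are hypothetical and are not taken from Table 3.

```python
import math


def factor_of_safety(c, phi_deg, slope_deg, z, psi, gamma_s, gamma_w=9.81):
    """Infinite-slope factor of safety following Equation (2).

    c: effective cohesion (kPa); phi_deg: friction angle (degrees);
    slope_deg: slope angle (degrees); z: depth (m); psi: pressure head (m);
    gamma_s, gamma_w: unit weights of soil and water (kN/m^3).
    """
    phi = math.radians(phi_deg)
    slope = math.radians(slope_deg)
    frictional = math.tan(phi) / math.tan(slope)
    cohesive = (c - psi * gamma_w * math.tan(phi)) / (
        gamma_s * z * math.sin(slope) * math.cos(slope)
    )
    return frictional + cohesive


# Hypothetical cell: 30 degree slope, 2 m depth, illustrative soil parameters.
fos_dry = factor_of_safety(c=10.0, phi_deg=30.0, slope_deg=30.0, z=2.0, psi=0.0, gamma_s=19.0)
fos_wet = factor_of_safety(c=10.0, phi_deg=30.0, slope_deg=30.0, z=2.0, psi=1.0, gamma_s=19.0)
```

For these illustrative values, a positive pressure head of 1 m lowers the factor of safety from above the 1.5 threshold to below it, which is the mechanism by which rainfall infiltration moves a cell out of the stable class.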

To implement the model, the parameterization for the five lithological types present in the study area is summarized in Table 3. Soil physical and mechanical parameters were derived from regional literature and geological reports [32]. Notably, as siliceous rock (Crt) predominantly consists of exposed bedrock with minimal soil cover, rock mass parameters were applied to reflect its high stability, whereas soil parameters were used for the other weathered lithologies. Regarding the hydraulic parameters, due to the lack of basin–wide in situ testing, representative empirical values were assigned based on the dominant soil properties in the region. Specifically, the saturated permeability (K_s) was set to 3.58 × 10⁻⁵ m/s, and the hydraulic diffusivity (D₀) was set to 7.16 × 10⁻³ m²/s. The saturated (θ_s) and residual (θ_r) water contents were set to 0.40 and 0.05, respectively [33]. The model was initialized with a steady–state background infiltration rate (I_Z0) of 1.0 × 10⁻⁷ m/s, representing antecedent moisture conditions. The initial groundwater table was estimated spatially using a depth–dependent scaling factor (d_w = 0.8Z) based on regional hydrogeological surveys. Regarding the boundary conditions, the finite–depth analytical solution was employed (series expansion terms N_max = 30, M_max = 20). This configuration enforces a zero–flux (impermeable) boundary at the soil–bedrock interface to capture the formation of perched water tables, while the surface uses a flux boundary equal to the time–varying rainfall intensity.

While we acknowledge that assigning regional averages simplifies local heterogeneity caused by weathering or structural discontinuities, this approach provides a representative baseline for regional–scale screening. In terms of model configuration, the model was driven by a 72 h rainfall event of uniform intensity. A conservative stability threshold of FoS > 1.5 was adopted to define stable areas (see Table 4). This threshold aligns with standard geotechnical practice for permanent slopes [34], providing a safety margin to account for parameter uncertainties. Areas with FoS ≤ 1.5 were conservatively excluded to minimize false negatives.

The final non–landslide samples for Strategy I were selected from grid cells that satisfied two conditions: (1) being classified as stable by the TRIGRS model (FoS > 1.5), and (2) being located outside a 50 m buffer zone surrounding historical landslides. The 50 m constraint was applied to account for mapping uncertainties and to exclude potential tension cracks or unstable transition zones immediately adjacent to landslide bodies.
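The two selection conditions can be sketched as a simple filter followed by random sampling. This is a minimal illustration under simplifying assumptions: landslides are represented as points rather than polygons, distances are Euclidean in projected metres, and the function name, cell tuples, and seed are hypothetical.

```python
import math
import random


def select_non_landslide_cells(cells, landslide_points, n_samples,
                               fos_threshold=1.5, buffer_m=50.0, seed=0):
    """Pick non-landslide cells that are stable (FoS > threshold) and
    lie outside the buffer around every mapped landslide point."""
    candidates = [
        (x, y)
        for x, y, fos in cells
        if fos > fos_threshold
        and all(math.hypot(x - lx, y - ly) > buffer_m for lx, ly in landslide_points)
    ]
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(candidates, min(n_samples, len(candidates)))


# Hypothetical cells as (x, y, FoS) and one landslide point at the origin.
cells = [
    (0.0, 0.0, 2.0),    # stable but on the landslide: rejected
    (10.0, 0.0, 2.0),   # stable but inside the 50 m buffer: rejected
    (100.0, 0.0, 1.2),  # outside the buffer but FoS <= 1.5: rejected
    (200.0, 0.0, 1.8),  # accepted
    (300.0, 0.0, 1.6),  # accepted
]
selected = select_non_landslide_cells(cells, [(0.0, 0.0)], n_samples=2)
```

Dropping the FoS condition from the filter reduces this sketch to the buffer-only selection used by the traditional method.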

3.3.2. Strategy II: Traditional Buffering Method

Strategy II represents the traditional buffering method and serves as a benchmark. In this strategy, non–landslide samples were randomly selected based on a spatial constraint: they must be located outside a 50 m buffer zone of historical landslides [35]. This method relies on the assumption that areas spatially distant from known landslides are stable, without considering the physical and mechanical properties of the slope.

3.4. Machine Learning Models and Hyperparameter Optimization

To evaluate landslide susceptibility through diverse algorithmic perspectives, this study employs four representative machine learning models: multi–layer perceptron (MLP), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGBoost). These algorithms were specifically selected to cover the primary learning paradigms in current geospatial modeling: neural networks, bagging ensembles, kernel–based learning, and gradient boosting. By incorporating these distinct mechanisms, the study establishes a robust comparative baseline that transcends the limitations of any single algorithmic family [35,36,37,38,39].

To ensure optimal generalization and fair comparison, the hyperparameters for all models were tuned using Bayesian optimization implemented via the Optuna framework. The optimization process was configured as follows:

(1): Objective: minimize the validation loss to prevent overfitting.
(2): Search Strategy: the tree–structured Parzen estimator (TPE) algorithm was used to efficiently explore the hyperparameter space (300 trials per model).
(3): Validation: a hold–out validation strategy (20% data split) was used within the optimization loop.

The optimal hyperparameters are summarized in Table 5.

3.4.1. Multi–Layer Perceptron Model

MLP is a feedforward neural network capable of capturing complex non–linear relationships [40]. In this study, the MLP classifier architecture was rigorously optimized via Bayesian optimization (Optuna), resulting in a structure comprising three primary fully connected hidden layers. The first hidden layer consists of 832 neurons with GELU activation and a dropout rate of 0.50, followed by batch normalization. The second hidden layer contains 416 neurons with GELU activation and a dropout rate of 0.30. The third hidden layer comprises 416 neurons with Leaky ReLU activation, a dropout rate of 0.10, and batch normalization. Additionally, a dense layer with 32 neurons (ReLU activation, dropout 0.20) was implemented before the final sigmoid output layer. The model was trained using the RMSprop optimizer with a learning rate of 0.0015 and a batch size of 128.

3.4.2. Support Vector Machine Model

SVM classifies data by constructing an optimal hyperplane in a high–dimensional space [41]. Based on the optimization results, the radial basis function (RBF) kernel was selected with a regularization parameter C = 1, balancing the trade–off between margin maximization and classification error.

3.4.3. Random Forest Model

RF is a bagging–based ensemble algorithm that aggregates predictions from multiple decision trees to reduce variance [42]. The optimal configuration for this dataset included an ensemble of 1000 trees with a maximum depth of 15 and a feature sampling ratio of 0.45.

3.4.4. Extreme Gradient Boosting Model

XGBoost is an efficient implementation of gradient boosting designed for high performance with high–dimensional data [43]. The model was tuned with a learning rate of 0.02 and a maximum tree depth of 10. To prevent overfitting, subsample and column sampling ratios were set to 0.95 and 0.73, respectively.

3.5. Validation Method

3.5.1. Receiver Operating Characteristics

Model performance was primarily evaluated using the area under the receiver operating characteristic (ROC) curve (AUC) [44]. The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1–specificity) at various thresholds. AUC values range from 0 to 1, where 0.5 corresponds to random guessing and values exceeding 0.9 are generally considered to indicate excellent predictive performance.
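As a minimal sketch of what the AUC measures, the snippet below computes it directly from its rank interpretation: the probability that a randomly chosen landslide sample receives a higher score than a randomly chosen non-landslide sample, with ties counted as one half. The function name and the four toy samples are hypothetical; the pairwise loop is quadratic and intended only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: one landslide sample (0.4) is ranked below a non-landslide (0.5).
auc_toy = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
auc_perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

In the toy example, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75.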

3.5.2. Confusion Matrix

To provide a comprehensive assessment, statistical metrics derived from the confusion matrix were calculated, including accuracy, precision, recall, and F1–score [45]. Accuracy (ACC) represents the proportion of correctly classified samples and is calculated as:

\mathrm{ACC} = \frac{TP + TN}{TP + FN + FP + TN}

(4)

where TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives, respectively.
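The metrics listed above follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions of precision, recall, specificity, and F1–score alongside Equation (4); the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for a binary classifier."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)  # Equation (4)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity, true positive rate
    specificity = tn / (tn + fp)     # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a balanced test fold of 200 samples.
metrics = classification_metrics(tp=80, tn=70, fp=30, fn=20)
```

Specificity is the metric most sensitive to the quality of non–landslide samples, because it depends only on how the negative class is classified.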

3.5.3. Random and Spatial Cross–Validation

This study employs both random and spatial cross–validation to rigorously evaluate model performance. Random 5–fold cross–validation was used for hyperparameter optimization and baseline assessment, while spatial cross–validation was adopted to evaluate the model’s transferability to unseen areas. Model performance was quantified using ROC curves and confusion matrices.

Random 5–fold cross–validation involves randomly partitioning the entire dataset into five equal subsets, iteratively using four for training and one for validation. While effective for baseline testing, this method ignores spatial dependencies. To address this, we implemented spatial cross–validation based on administrative boundaries. The study area was divided into five distinct towns. In each iteration, data from four towns served as the training set, while the remaining town functioned as a spatially independent testing set. By strictly separating training and testing samples geographically, this method robustly assesses the model’s ability to predict landslides in unseen regions.
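The town-based partitioning amounts to leave-one-group-out splitting. A minimal sketch is given below, assuming each sample carries a town label; the generator name and the single-letter town labels are hypothetical placeholders.

```python
def leave_one_town_out(town_labels):
    """Yield (held_out_town, train_indices, test_indices), one fold per town."""
    for held_out in sorted(set(town_labels)):
        train_idx = [i for i, t in enumerate(town_labels) if t != held_out]
        test_idx = [i for i, t in enumerate(town_labels) if t == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical town label for each of seven samples.
towns = ["A", "B", "A", "C", "B", "D", "E"]
folds = list(leave_one_town_out(towns))
```

Because every sample from the held-out town is excluded from training, no test sample has a training neighbour from the same town, unlike in random 5–fold splitting.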

4. Results

4.1. Landslide Inventories

The landslide inventory database for the study area was established through manual visual interpretation of changes in image color tone and topographic features between pre– and post–event high–resolution satellite imagery and digital elevation models (DEMs) (Figure 5a,b). To ensure the reliability of the database and filter out noise, the minimum mapping unit (MMU) was set to 50 m² (approximately 100 pixels in the 0.7 m post–event imagery). For the sampling strategy, the highest point of the scarp (crown area) was extracted for each landslide polygon. This location was selected to characterize the initiation zone because topographic and geological conditions vary little within the scarp area. Hence, positive samples could be taken from the highest point of each landslide [46].

The 2024 Typhoon Gaemi triggered a total of 17,529 landslides in Zixing County (Figure 5a). Geographically, landslides are predominantly concentrated in Bamianshanyaozu Town and Zhoumensi Town. Field investigations were conducted and UAV images (0.05 m resolution) were acquired to validate the accuracy of the database. Based on the verification results for three representative areas (Figure 6) and the high resolution of the remote sensing imagery (0.7 m), the positional accuracy is estimated to be within 1 m, and the interpretation accuracy exceeds 90%.

This study adopts two widely used indicators—landslide number density (LND), defined as the number of landslides per square kilometer, and landslide area percentage (LAP), defined as the ratio of landslide area to the total area of each factor category—to analyze the spatial patterns of rainfall–induced landslides and their relationships with controlling factors (Figure 5c,d) [47]. By integrating LND, LAP, and the absolute landslide count, we conduct a comprehensive assessment of landslide development intensity across different factor classes. The interpretation of these metrics reveals distinct scenarios: high LND with low LAP indicates numerous but small–scale landslides; low LND with high LAP suggests fewer but large–area failures; and concurrent high values of both LND and LAP signify severe overall landslide activity. Notably, the magnitude of LND and LAP is influenced by the size of classification intervals, underscoring the necessity of combining these indices with landslide counts for a robust and balanced evaluation.
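The two indicators can be computed per factor class as shown in the minimal sketch below. The function name and the example class (120 landslides covering 0.06 km² within a 40 km² class) are hypothetical; LAP is expressed here as a percentage.

```python
def lnd_lap(landslide_count, landslide_area_km2, class_area_km2):
    """Return (LND in landslides per km^2, LAP in percent) for one factor class."""
    lnd = landslide_count / class_area_km2
    lap = 100.0 * landslide_area_km2 / class_area_km2
    return lnd, lap


# Hypothetical factor class, for illustration only.
lnd, lap = lnd_lap(landslide_count=120, landslide_area_km2=0.06, class_area_km2=40.0)
```

Both indicators are normalized by the class area, which is why narrow classification intervals with small areas can inflate them and why the absolute landslide count is reported alongside.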

Elevation of Zixing County ranges from 268 m to 2042 m. The elevation was classified into intervals of 100 m (for elevations below 1000 m) and 200 m (for higher elevations) to analyze the landslide distribution. Statistical results indicate that a total of 15,417 landslides (87.9% of all recorded events) occurred at elevations ≤ 900 m. As shown in Figure 5c, both landslide number density (LND) and landslide area percentage (LAP) exhibit a non–monotonic pattern with elevation—initially decreasing, then increasing, and finally decreasing again—with the peak LND and LAP values occurring in the (700, 800] m interval. Regarding slope, the maximum gradient in Zixing reaches 89°.

The slope gradient was classified into seven intervals using the natural breaks (Jenks) method to reflect the natural grouping of data values. Statistical analysis reveals that 15,924 landslides (90.8% of the total) occurred on slopes ≤ 34°. Analysis of Figure 5d reveals that LND increases with slope up to a maximum in the (9°, 16°] interval and then declines, whereas LAP first decreases, then increases, and subsequently decreases again, peaking at slopes ≤ 9°.

4.2. Distribution of Non–Landslide Samples

Analysis of the factor of safety (FoS) distribution map produced by the TRIGRS model reveals that 14,270 of the recorded landslides (81%) are concentrated in unstable areas with an FoS below 1.5, whereas only 3347 landslide points (19%) occur in highly stable zones (FoS > 1.5) (Figure 7a,c). This pronounced spatial alignment between landslide occurrences and low–FoS regions underscores the physical credibility of using TRIGRS outputs to guide non–landslide sample selection. To minimize bias arising from mapping uncertainties and spatial autocorrelation, a supplementary 50 m buffer zone is excluded around each landslide perimeter, yielding the final set of non–landslide samples used for model development (Figure 7b,d).

4.3. Landslide Susceptibility Mapping

Landslide susceptibility maps are produced using four machine learning algorithms: multi–layer perceptron (MLP), random forest (RF), support vector machine (SVM), and XGBoost. The number of grid cells corresponding to each susceptibility level for the different models under Strategy I is presented in Table 6.

The susceptibility maps are classified into five levels based on index values: 0–0.2 (very low), 0.2–0.4 (low), 0.4–0.6 (medium), 0.6–0.8 (high), and 0.8–1.0 (very high) (Figure 8). Given that an effective model should assign high probabilities to actual landslide locations, we focused our evaluation on the cumulative proportion of landslides falling within the high to very high susceptibility classes (0.6–1.0).

The analysis reveals that the capture rates for the RF, SVM, and XGBoost models in this critical range were relatively limited, accounting for only 62.1%, 62.7%, and 65.4% of the historical landslides, respectively. In contrast, the MLP model demonstrated superior sensitivity, placing 90.0% of the inventory landslides within these high–risk regions. This represents a margin of 24.6 to 27.9 percentage points over the other three models. These findings indicate that the MLP model generates a probability distribution that aligns more closely with the actual spatial distribution of landslides, effectively concentrating high–risk predictions on unstable slopes (Figure 9).

4.4. Comparison of Model Performance

To evaluate both the learning capability and the spatial transferability of the models, we conducted a comprehensive performance assessment using two distinct validation protocols: (1) random 5–fold cross–validation, which establishes the baseline model accuracy; and (2) spatial cross–validation, which tests the model’s ability to generalize to unseen geographic regions. The results from these two approaches are presented below.

4.4.1. Random Cross–Validation

Under random 5–fold cross–validation, the performance of the four machine learning models was evaluated using ROC curves (Figure 10a) and detailed statistical metrics (Table 7). All models performed robustly with Strategy I, with the MLP model achieving the highest mean AUC of 0.934, outperforming XGBoost (0.907), RF (0.902), and SVM (0.886). Paired t–tests confirmed that these differences were statistically significant (p < 0.001). The MLP model also achieved the highest accuracy (0.86) and F1–score (0.87), demonstrating strong capability in handling the complex non–linear relationships inherent in landslide susceptibility modeling (Table 8).

To validate the effectiveness of the proposed Strategy I, we compared it against Strategy II (traditional buffering method) across four machine learning algorithms. To quantify the performance enhancement, the relative improvement rate for each metric (e.g., AUC, accuracy, specificity) was calculated as:

\mathrm{Improvement} = \frac{\mathrm{Metric}_{\mathrm{Strategy\ I}} - \mathrm{Metric}_{\mathrm{Strategy\ II}}}{\mathrm{Metric}_{\mathrm{Strategy\ II}}} \times 100\%

(5)

Under random cross–validation, Strategy I consistently improved AUC across all models. For the MLP model, the AUC increased from 0.812 to 0.934, a 15.02% improvement (Figure 10). Similarly, the RF, SVM, and XGBoost models exhibited AUC gains of 13.03%, 17.66%, and 20.13%, respectively. The average improvement in AUC across the four models is 16.46%.
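The arithmetic behind these figures can be checked with a short sketch of Equation (5). The function name is hypothetical; the inputs are the AUC values and per-model gains reported in the text.

```python
def relative_improvement(metric_strategy_1, metric_strategy_2):
    """Relative improvement (%) of Strategy I over Strategy II, Equation (5)."""
    return (metric_strategy_1 - metric_strategy_2) / metric_strategy_2 * 100.0


# MLP AUC values reported in the text: 0.934 (Strategy I), 0.812 (Strategy II).
mlp_gain = relative_improvement(0.934, 0.812)

# Reported AUC gains (%) for MLP, RF, SVM, and XGBoost.
reported_gains = [15.02, 13.03, 17.66, 20.13]
mean_gain = sum(reported_gains) / len(reported_gains)
```

Rounded to two decimals, the MLP gain evaluates to 15.02% and the mean of the four reported gains to 16.46%, matching the values stated in the text.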

For the optimal MLP model, specificity improved from 61.9% (Strategy II) to 79.8% (Strategy I) (Table 8), indicating a significant reduction in false positives. This improvement stems from the ability of Strategy I to filter out physically unstable areas mislabeled as non–landslides in Strategy II [48,49]. Consequently, Strategy I correctly classified 2812 non–landslide samples, an increase of 667 over the 2145 identified under Strategy II. The improved non–landslide sampling strategy (Strategy I) thus enhanced predictive accuracy in comparison with the traditional buffering method (Strategy II) (Figure 11). This performance improvement is primarily attributed to the integration of physical constraints, which enhances data reliability by filtering out physically unstable areas that the traditional buffering method overlooks. By employing the TRIGRS model to exclude these ambiguous samples, which may be geologically unstable despite lacking historical failures [50,51], Strategy I prevents the inclusion of noise that would otherwise disrupt classification boundaries, thereby ensuring high–quality training samples [52,53].

4.4.2. Spatial Cross–Validation

Results under spatial cross–validation further confirmed the generalizability of Strategy I (Table 9). Across five spatially independent administrative regions, the MLP model achieved a mean AUC of 0.854 under Strategy I versus 0.665 under Strategy II, with a mean per–region AUC improvement of 29.66%. This pronounced improvement underscores the particular efficacy of Strategy I in addressing the spatial overfitting that commonly affects the traditional buffering method. It is worth noting that the mean AUC under Strategy I decreased from 0.934 (random CV) to 0.854 (spatial CV). This drop is consistent with the administrative–based partitioning reducing spatial autocorrelation and information leakage, and it suggests that the test towns possessed sufficient geological and topographical heterogeneity to rigorously challenge the model’s generalizability. Critically, Strategy I mitigated the high false–positive rate observed under Strategy II. For instance, in Town M, specificity improved from 0.163 under Strategy II to 0.891 under Strategy I. These findings indicate that the physics–integrated strategy not only enhances global accuracy but also improves transferability to spatially independent areas.
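The administrative partitioning can be illustrated with a leave–one–region–out split, in which every sample from one town is held out for testing while the remaining towns are used for training. The sketch below assumes that each town is held out exactly once; the function name and the sample–to–town assignment are hypothetical, and only the town letters follow Table 9.

```python
from collections import defaultdict

def leave_one_region_out(regions):
    """Yield (held_out_region, train_idx, test_idx) with each region held out once."""
    by_region = defaultdict(list)
    for idx, region in enumerate(regions):
        by_region[region].append(idx)
    for held_out in sorted(by_region):
        test_idx = by_region[held_out]
        train_idx = [i for r, idxs in by_region.items() if r != held_out for i in idxs]
        yield held_out, sorted(train_idx), test_idx

# Hypothetical town label for each sample; the letters follow Table 9.
sample_towns = ["K", "F", "G", "D", "M", "K", "G", "M", "D", "F"]

folds = list(leave_one_region_out(sample_towns))
for town, train_idx, test_idx in folds:
    print(town, "train:", train_idx, "test:", test_idx)
```

Because no town contributes samples to both sides of a split, spatially adjacent cells cannot leak information from the training set into the test set.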

5. Discussion

5.1. The Mechanism of Strategy I in Enhancing Model Performance

The SHAP analysis was employed to interpret the global importance of influencing factors. The importance ranking was determined based on the mean absolute SHAP value, which represents the average marginal contribution of each feature to the model’s prediction output. As shown in Figure 12, rainfall ranks first with a mean absolute SHAP value of 0.145, followed closely by groundwater table (0.144) and distance to road (0.084). Quantitative analysis reveals that the cumulative contribution of the top six factors (rainfall, groundwater table, distance to road, aspect, distance to river, and distance to fault) accounts for approximately 80.1% of the total feature importance. This finding indicates that heavy precipitation and hydrological conditions are the primary triggers of landslides in the study area (Figure 13a,b), while anthropogenic activities (distance to roads) and geological structures also exert significant influence [54].
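The ranking procedure described above (mean absolute SHAP value per feature, followed by a cumulative share of total importance) can be sketched as follows. The SHAP matrix is a toy example with illustrative values only, and the function name is hypothetical; the real analysis would use the SHAP values of all fifteen factors.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean |SHAP| and report the cumulative share of importance."""
    n = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    ranked = sorted(zip(feature_names, mean_abs), key=lambda item: item[1], reverse=True)
    cumulative, out = 0.0, []
    for name, value in ranked:
        cumulative += value
        out.append((name, value, cumulative / total))
    return out

# Toy SHAP matrix (rows = samples, columns = features); values are illustrative only.
features = ["rainfall", "groundwater table", "distance to road"]
toy_shap = [
    [0.20, -0.10, 0.05],
    [-0.10, 0.20, -0.05],
    [0.30, -0.15, 0.02],
]

ranking = rank_by_mean_abs_shap(toy_shap, features)
for name, value, share in ranking:
    print(f"{name}: mean|SHAP| = {value:.3f}, cumulative share = {share:.1%}")
```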

The clustered landslides triggered by Typhoon Gaemi in 2024 evolved through three distinct stages [55,56] (Figure 13c–e): (a) an initial stage with favorable drainage conditions and a thinner accumulation mass; (b) an infiltration stage, in which rainfall infiltrates the slope and rising soil moisture content gradually undermines slope stability; and (c) a failure stage, in which continuous rainfall saturates the unsaturated zone and significantly reduces the shear strength of the slope. The critical physical and mechanical processes in the TRIGRS model include rainfall infiltration and groundwater dynamics, which are consistent with the key factors identified by the SHAP analysis. This alignment suggests that Strategy I effectively captures the fundamental physical processes of slope instability. The mutual validation between the physical mechanisms and the data–driven approach supports the inferred landslide failure mechanisms. Furthermore, coupling the physics–based model with machine learning enhances the interpretability and reliability of the landslide susceptibility assessment.
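The effect of rising pore pressure on stability can be illustrated with the infinite–slope factor of safety commonly associated with TRIGRS, FoS = tan φ′ / tan δ + (c′ − ψ γ_w tan φ′) / (γ_s Z sin δ cos δ). The sketch below assumes this standard form; the strength parameters are those of metasandstone (Mss) in Table 3, whereas the slope angle δ, depth Z, and pressure heads ψ are hypothetical values chosen only for illustration.

```python
import math

GAMMA_W = 9.81  # unit weight of water, kN/m^3

def factor_of_safety(c, phi_deg, gamma_s, slope_deg, depth, psi):
    """Infinite-slope factor of safety at depth `depth` (m) for pressure head `psi` (m).

    c: effective cohesion (kN/m^2); phi_deg: effective friction angle (deg);
    gamma_s: soil unit weight (kN/m^3); slope_deg: slope angle (deg).
    """
    phi = math.radians(phi_deg)
    delta = math.radians(slope_deg)
    frictional = math.tan(phi) / math.tan(delta)
    cohesive = (c - psi * GAMMA_W * math.tan(phi)) / (
        gamma_s * depth * math.sin(delta) * math.cos(delta)
    )
    return frictional + cohesive

# Strength parameters for metasandstone (Mss) from Table 3; the slope angle,
# depth, and pressure heads are hypothetical values chosen for illustration.
mss = dict(c=22.0, phi_deg=37.0, gamma_s=26.8)
fos_by_head = {
    psi: factor_of_safety(slope_deg=35.0, depth=2.0, psi=psi, **mss)
    for psi in (0.0, 1.0, 2.0)
}
for psi, fos in fos_by_head.items():
    print(f"pressure head {psi:.1f} m -> FoS = {fos:.2f}")
```

As the pressure head increases, the cohesive term shrinks and the factor of safety falls, which mirrors the progression from the initial stage to the failure stage described above.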

5.2. Comparative Analysis of SHAP Importance and Statistical Correlation

To further investigate the model’s interpretability, we compared the SHAP feature importance with the Pearson correlation analysis (Figure 12). Discrepancies between the two metrics highlight the MLP model’s advantage in capturing complex non–linear relationships that traditional linear statistics may overlook.

First, regarding factors with low correlation but high importance, aspect and distance to road are prime examples. Although aspect typically shows a weak linear correlation with landslide distribution owing to its cyclical nature (0–360°), it ranks 4th in SHAP importance. This indicates that the model captured the windward slope effect of Typhoon Gaemi, where specific slope orientations received significantly more precipitation and wind load. Similarly, distance to road (rank 3) exhibits a threshold–based non–linear impact: slope stability is compromised mainly in the immediate vicinity of road cuts, a pattern that is often underestimated by linear correlation coefficients.
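The weak linear correlation of a cyclical variable can be demonstrated with a small synthetic example. Here the response depends entirely on slope orientation and peaks at 180°, yet its Pearson correlation with the raw aspect angle is essentially zero. The data are synthetic, and the cosine encoding is shown only to expose the hidden dependence; it is not claimed to be part of the authors' preprocessing.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Synthetic example: a response that peaks on one slope orientation (here 180 deg).
aspect = [5.0 + 10.0 * k for k in range(36)]  # 5, 15, ..., 355 degrees
response = [math.cos(math.radians(a - 180.0)) for a in aspect]

r_raw = pearson(aspect, response)
# Encoding the angle as a cosine component exposes the cyclic dependence.
cos_aspect = [math.cos(math.radians(a)) for a in aspect]
r_encoded = pearson(cos_aspect, response)

print(f"raw aspect vs response:  r = {r_raw:.3f}")
print(f"cos(aspect) vs response: r = {r_encoded:.3f}")
```

A non–linear model such as the MLP can learn this kind of orientation dependence directly, which is one way a factor can rank high in SHAP importance despite a negligible Pearson coefficient.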

Second, some factors that appear highly correlated or physically important rank lower, such as TWI (rank 12) and SPI (rank 15); this can be attributed to the model’s handling of information redundancy. As shown in the correlation matrix (Figure 4), these morphological indices share information with slope and elevation. Critically, since the model already identified rainfall (rank 1) and groundwater table (rank 2) as the dominant features, the marginal contribution of static hydrological proxies such as TWI was reduced. This suggests that the physics–informed sampling strategy (Strategy I) guides the model to prioritize direct physical drivers (dynamic hydrology) over indirect static indicators.

6. Conclusions

This study proposes an improved non–landslide sampling strategy for landslide susceptibility assessment. Its performance was evaluated by comparing ROC curves and the associated AUC values across four machine learning models. The key findings are summarized below:

(1): The improved non–landslide sampling strategy, which integrates TRIGRS–derived stability (FoS > 1.5) with 50 m buffering constraints, significantly improved model performance over the traditional buffering method. Using a 0.5 classification threshold, it yielded average improvements of 16.46% in AUC, 9.75% in specificity, and 16.78% in overall accuracy across all models.
(2): Under the improved non–landslide sampling strategy, the MLP model outperformed RF, SVM, and XGBoost, achieving an AUC of 0.934 and an accuracy of 86.2%.
(3): SHAP analysis identified rainfall, groundwater table, and human engineering activities as dominant factors controlling landslide initiation during Typhoon Gaemi. This confirms that the improved strategy effectively captures the fundamental mechanisms of slope instability.

Future work could aim to: (1) integrate high–precision deformation monitoring data for identifying actively unstable slopes; and (2) refine input parameters for physical simulations using probabilistic methods to quantify uncertainties and improve model generalization.

Author Contributions

Conceptualization, M.C. and P.L.; methodology, D.P., M.C., P.L., S.X. and Y.S.; software, M.C. and Y.S.; validation, M.C., Y.Z. and Q.X.; formal analysis, D.P.; investigation, M.C.; data curation, D.P., Y.Z. and Y.S.; writing—original draft preparation, D.P. and M.C.; writing—review and editing, D.P., M.C., Y.Z., P.L., S.X., B.T. and L.K.; project administration, D.P. and Q.X.; funding acquisition, D.P. and Q.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (No. 2022YFC3003205), the National Natural Science Foundation of China (No. 42377196), the Sichuan Science and Technology Program (No. 2024NSFSC1997), Chengdu University of Technology Postgraduate Innovative Cultivation Program (No. 2024BJCX009), State Key Laboratory of Geohazard Prevention and Geoenvironment Protection Independent Research Project (No. SKLGP2022Z028), and Joint Research Foundation of Gansu Province (Nos. 24JRRA800 and 25JRRA1158).

Data Availability Statement

Data will be made available on request.

Acknowledgments

Thanks to Caiyun Technology Company for providing us with the rainfall radar data. We also thank the editor and all reviewers for their valuable comments.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

Gr	Granite
Mss	Metasandstone
Crt	Siliceous rock
Ls	Limestone
Qss	Quartz sandstone
MLP	Multi–layer Perceptron
RF	Random Forest
SVM	Support Vector Machine
XGBoost	Extreme Gradient Boosting
AUC	Area Under the Curve
SHAP	SHapley Additive exPlanations
NDVI	Normalized Difference Vegetation Index
FoS	Factor of safety
RBF	Radial Basis Function
ROC	Receiver Operating Characteristic
TPR	True positive rate
FPR	False positive rate
TPs	True Positives
TNs	True Negatives
FPs	False Positives
FNs	False Negatives
ACC	Accuracy
TRIGRS	Transient Rainfall Infiltration and Grid–based Regional Slope–Stability Model

References

1. Iverson, R. Landslide triggering by rain infiltration. Water Resour. Res. 2000, 36, 1897–1910.
2. Shan, K.; Lin, Y.; Chu, P.-S.; Yu, X.; Song, F. Seasonal advance of intense tropical cyclones in a warming climate. Nature 2023, 623, 83–89.
3. Tollefson, J. Severe weather linked more strongly to global warming. Nature 2015, 520, 20.
4. Zhao, H.; Xu, Q.; Chen, W.; Xu, F.; Peng, D.; Pu, C.; Liu, R.; Shen, Y. Improving rainfall-triggered landslide susceptibility mapping through source-area boundary sampling and multi-dimensional feature analysis. Gondwana Res. 2025, 153, 265–283.
5. Kong, L.; Feng, W.; Yi, X.; Xue, Z.; Bai, L. Enhanced landslide susceptibility mapping in data-scarce regions via unsupervised few-shot learning. Gondwana Res. 2025, 138, 31–46.
6. Mirus, B.B.; Jones, E.S.; Baum, R.L.; Godt, J.W.; Slaughter, S.; Crawford, M.M.; Lancaster, J.; Stanley, T.; Kirschbaum, D.B.; Burns, W.J.; et al. Landslides across the USA: Occurrence, susceptibility, and data limitations. Landslides 2020, 17, 2271–2285.
7. Tsangaratos, P.; Ilia, I.; Hong, H.; Chen, W.; Xu, C. Applying Information Theory and GIS-based quantitative methods to produce landslide susceptibility maps in Nancheng County, China. Landslides 2017, 14, 1091–1111.
8. Huang, W.; Ding, M.; Li, Z.; Yu, J.; Ge, D.; Liu, Q.; Yang, J. Landslide susceptibility mapping and dynamic response along the Sichuan-Tibet transportation corridor using deep learning algorithms. Catena 2023, 222, 106866.
9. Yang, C.; Liu, L.-L.; Huang, F.; Huang, L.; Wang, X.-M. Machine learning-based landslide susceptibility assessment with optimized ratio of landslide to non-landslide samples. Gondwana Res. 2023, 123, 198–216.
10. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91.
11. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
12. Kudaibergenov, M.; Nurakynov, S.; Iskakov, B.; Iskaliyeva, G.; Maksum, Y.; Orynbassarova, E.; Akhmetov, B.; Sydyk, N. Application of artificial intelligence in landslide susceptibility assessment: Review of recent progress. Remote Sens. 2024, 17, 34.
13. Zhang, L.; Guo, Z.; Qi, S.; Zhao, T.; Wu, B.; Li, P. Landslide susceptibility evaluation and determination of critical influencing factors in eastern Sichuan mountainous area, China. Ecol. Indic. 2024, 169, 112911.
14. Youssef, A.M.; Pourghasemi, H.R. Landslide susceptibility mapping using machine learning algorithms and comparison of their performance at Abha Basin, Asir Region, Saudi Arabia. Geosci. Front. 2021, 12, 639–655.
15. Chang, Z.; Catani, F.; Huang, F.; Liu, G.; Meena, S.R.; Huang, J.; Zhou, C. Landslide susceptibility prediction using slope unit-based machine learning models considering the heterogeneity of conditioning factors. J. Rock Mech. Geotech. Eng. 2023, 15, 1127–1143.
16. Wei, X.; Zhang, L.; Gardoni, P.; Chen, Y.; Tan, L.; Liu, D.; Du, C.; Li, H. Comparison of hybrid data-driven and physical models for landslide susceptibility mapping at regional scales. Acta Geotech. 2023, 18, 4453–4476.
17. Wang, J.; Wang, Y.; Li, M.; Qi, Z.; Li, C.; Qi, H.; Zhang, X. Improved landslide susceptibility assessment: A new negative sample collection strategy and a comparative analysis of zoning methods. Ecol. Indic. 2024, 169, 112948.
18. Liu, L.; Duan, C.; Gao, J.; Xiao, H.; Zhu, W.; Yang, C. Landslide susceptibility assessment using machine learning with a novel SHAP-based sampling strategy. Geosci. Front. 2025, 102188.
19. Lyu, H.-M.; Yin, Z.-Y.; Hicher, P.-Y.; Laouafa, F. Incorporating mitigation strategies in machine learning for landslide susceptibility prediction. Geosci. Front. 2024, 15, 101869.
20. Deng, Z.; Lan, H.; Li, L.; Liu, Y.; Tian, N. A catchment-scale landslide hydro-mechanical coupling model considering spatial heterogeneity. J. Hydrol. 2025, 134855.
21. Hong, H.; Wang, D.; Zhu, A.-X.; Wang, Y. Landslide susceptibility mapping based on the reliability of landslide and non-landslide sample. Expert Syst. Appl. 2024, 243, 122933.
22. Zhou, X.; Wen, H.; Zhang, Y.; Xu, J.; Zhang, W. Landslide susceptibility mapping using hybrid random forest with GeoDetector and RFE for factor optimization. Geosci. Front. 2021, 12, 101211.
23. Wu, M.; Zhang, G.; Zhang, S.; Huang, W.; Zou, H. Spatiotemporal decoupling of rainfall patterns and shallow landslide stability: Lessons learned from calibrated ensemble physically-based models. J. Rock Mech. Geotech. Eng. 2025.
24. Xiao, S.; Xiao, T.; Jiang, R.; Wang, H.; Ju, L.; Zhang, L. Two-phase strategy for rapid and unbiased assessment of earthquake-induced landslides. Eng. Geol. 2024, 336, 107562.
25. Liu, S.; Wang, L.; Zhang, W.; Sun, W.; Wang, Y.; Liu, J. Physics-informed optimization for a data-driven approach in landslide susceptibility evaluation. J. Rock Mech. Geotech. Eng. 2024, 16, 3192–3205.
26. Mo, W.; Zhao, Y.; Yang, N.; Xu, Z.; Zhao, W.; Li, F. Effects of climate and land use/land cover changes on water yield services in the Dongjiang Lake Basin. ISPRS Int. J. Geo-Inf. 2021, 10, 466.
27. Yang, J.; Dong, J.; Xiao, X.; Dai, J.; Wu, C.; Xia, J.; Zhao, G.; Zhao, M.; Li, Z.; Zhang, Y.; et al. Divergent shifts in peak photosynthesis timing of temperate and alpine grasslands in China. Remote Sens. Environ. 2019, 233, 111395.
28. Wang, M.; Yao, J.; Chang, H.; Liu, R.; Cao, Y.; Zhao, Y. Monthly Groundwater Level Grid Dataset of China Region (2005-2022). Natl. Tibet. Plateau/Third Pole Environ. Data Cent. 2024. Available online: https://cstr.cn/18406.11.Terre.tpdc.301342 (accessed on 20 January 2025).
29. Leonarduzzi, E.; McArdell, B.W.; Molnar, P. Rainfall-induced shallow landslides and soil wetness: Comparison of physically-based and probabilistic predictions. Hydrol. Earth Syst. Sci. Discuss. 2020, 25, 5937–5950.
30. Pradhan, B.; Lee, S. Delineation of landslide hazard areas on Penang Island, Malaysia, by using frequency ratio, logistic regression, and artificial neural network models. Environ. Earth Sci. 2010, 60, 1037–1054.
31. Gardner, W.R. Some steady-state solutions of the unsaturated moisture flow equation with application to evaporation from a water table. Soil Sci. 1958, 85, 228–232.
32. Srivastava, R.; Yeh, T.-C.J. Analytical solutions for one-dimensional, transient infiltration toward the water table in homogeneous and layered soils. Water Resour. Res. 1991, 27, 753–762.
33. Yang, L.; Cui, Y.; Xu, C.; Ma, S. Application of coupling physics-based model TRIGRS with random forest in rainfall-induced landslide-susceptibility assessment. Landslides 2024, 21, 2179–2193.
34. Weidner, L.; Oommen, T.; Escobar-Wolf, R.; Sajinkumar, K.S.; Samuel, R.A. Regional-scale back-analysis using TRIGRS: An approach to advance landslide hazard modeling and prediction in sparse data regions. Landslides 2018, 15, 2343–2356.
35. Dagdelenler, G.; Nefeslioglu, H.A.; Gokceoglu, C. Modification of seed cell sampling strategy for landslide susceptibility mapping: An application from the Eastern part of the Gallipoli Peninsula (Canakkale, Turkey). Bull. Eng. Geol. Environ. 2016, 75, 575–590.
36. Godt, J.W.; Baum, R.L.; Savage, W.Z.; Salciarini, D.; Schulz, W.H.; Harp, E.L. Transient deterministic shallow landslide modeling: Requirements for susceptibility and hazard assessments in a GIS framework. Eng. Geol. 2008, 102, 214–226.
37. Wu, Y.; Lan, H.; Gao, X.; Li, L.; Meng, Y. A regional slope stability assessment model based on Bayesian theory. J. Eng. Geol. 2014, 22, 1227–1233.
38. Tan, D.; Xu, X.; Wang, L.; Xu, J.; Shi, Q. Deformation evolution and failure mechanism of rainfall-induced granite residual soil landsliding event in Northern Guangdong, China. Landslides 2025, 22, 925–941.
39. Chen, T.; Sanjou, M.; Hiraishi, T.; Xu, G. Failure mechanisms of carbonate rock landslides: Structure, karst generation and reservoir water fluctuations. Geomorphology 2025, 109975.
40. Sahin, E.K. Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl. Sci. 2020, 2, 1308.
41. Huang, F.; Cao, Z.; Guo, J.; Jiang, S.-H.; Li, S.; Guo, Z. Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. Catena 2020, 191, 104580.
42. Tien Bui, D.; Tuan, T.A.; Klempe, H.; Pradhan, B.; Revhaug, I. Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 2016, 13, 361–378.
43. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
44. Zhao, F.; Miao, F.; Wu, Y.; Ke, C.; Gong, S.; Ding, Y. Refined landslide susceptibility mapping in township area using ensemble machine learning method under dataset replenishment strategy. Gondwana Res. 2024, 131, 20–37.
45. Wu, R.Z.; Hu, X.D.; Mei, H.B.; He, J.; Yang, J. Spatial susceptibility assessment of landslides based on random forest: A case study from Hubei section in the Three Gorges Reservoir area. Earth Sci. 2021, 46, 321–330.
46. Huang, F.; Tao, S.; Chang, Z.; Huang, J.; Fan, X.; Jiang, S.-H.; Li, W. Efficient and automatic extraction of slope units based on multi-scale segmentation method for landslide assessments. Landslides 2021, 18, 3715–3731.
47. Xie, C.; Huang, Y.; Li, L.; Li, T.; Xu, C. Detailed inventory and spatial distribution analysis of rainfall-induced landslides in Jiexi County, Guangdong Province, China in August 2018. Sustainability 2023, 15, 13930.
48. Kavzoglu, T.; Sahin, E.K.; Colkesen, I. Landslide susceptibility mapping using GIS-based multi-criteria decision analysis, support vector machines, and logistic regression. Landslides 2014, 11, 425–439.
49. Li, M.; Wang, H.; Chen, J.; Zheng, K. Assessing landslide susceptibility based on the random forest model and multi-source heterogeneous data. Ecol. Indic. 2024, 158, 111600.
50. Li, Y.; Hu, X.; Zhang, H.; Zheng, H.; Li, N. Displacement prediction and failure mechanism analysis of rainfall-induced colluvial landslides. J. Hydrol. 2025, 133361.
51. Xu, Q.; Zhao, B.; Dai, K.; Dong, X.; Li, W.; Zhu, X.; Yang, Y.; Xiao, X.; Wang, X.; Huang, J.; et al. Remote sensing for landslide investigations: A progress report from China. Eng. Geol. 2023, 321, 107156.
52. Guo, Z.; Tian, B.; Zhu, Y.; He, J.; Zhang, T. How do the landslide and non-landslide sampling strategies impact landslide susceptibility assessment?—A catchment-scale case study from China. J. Rock Mech. Geotech. Eng. 2024, 16, 877–894.
53. Li, C.; Feng, P.; Meng, J.; Catani, F.; Hellevang, H.; Tang, H.; Sun, X.; Huang, D. Physics-informed deep learning for revealing the evolutionary characteristics of landslides induced by rainfall process. Geophys. Res. Lett. 2025, 52, e2025GL117356.
54. Zhang, J.; Ma, X.; Zhang, J.; Sun, D.; Zhou, X.; Mi, C.; Wen, H. Insights into geospatial heterogeneity of landslide susceptibility based on the SHAP-XGBoost model. J. Environ. Manag. 2023, 332, 117357.
55. Huang, J.; Wen, H.; Zhou, X.; Xiao, J. Is there difference in landslide susceptibility model based on explainable artificial intelligence from the perspective of slope units with different scales? Reliab. Eng. Syst. Saf. 2025, 111701.
56. Xu, F.; Xu, Q.; Pu, C.; Wang, X.; Xu, P. Can different machine learning methods have consistent interpretations of DEM-based factors in shallow landslide susceptibility assessments? J. Rock Mech. Geotech. Eng. 2025, 17, 7864–7881.

Figure 1. Location, topography, lithology, and accumulated rainfall in the study area. (a) Location of Zixing County, Hunan Province, China; (b) Topography and main towns in Zixing County; (c) Lithology and main faults; (d) Accumulated rainfall (26–28 July 2024) triggered by Typhoon Gaemi.

Figure 2. Methodological flowchart of this study.

Figure 3. Spatial distribution of the fifteen landslide–influencing factors. (a) Elevation; (b) Slope; (c) Aspect; (d) Stream power index; (e) Topographic wetness index; (f) Terrain undulation; (g) Surface cutting depth; (h) Normalized Difference Vegetation Index (NDVI); (i) Soil thickness; (j) Lithology; (k) Accumulated rainfall during 26–28 July 2024; (l) Groundwater table; (m) Distance to fault; (n) Distance to river; (o) Distance to road.

Figure 4. Heatmap of Pearson correlation coefficients among the influencing factors. The color scale indicates the correlation coefficient values, with red representing positive correlation and blue representing negative correlation. Asterisks indicate statistical significance: * p < 0.05, ** p < 0.01, and *** p < 0.001.

Figure 5. Spatial distribution and statistics of the landslide inventory. (a) Spatial clustering of landslides triggered by the extreme rainstorm event; (b) Landslide reporting statistics for towns within the study area; (c) Statistical analysis of landslides and elevation; (d) Statistical analysis of landslides and slope.

Figure 6. Validation of the landslide inventory using UAV images in three typical villages. (a,b) Pre– and post–event interpretation in Yanwo Village; (c,d) Pre– and post–event interpretation in Qingyao Village; (e,f) Pre– and post–event interpretation in Lianhua Village.

Figure 7. Illustration of the improved non–landslide sampling strategy (Strategy I). (a) Factor of safety (FoS) in the study area; (b) Areas with FoS > 1.5 and located outside landslide buffer zones; (c) Detailed view showing the overlay of FoS values on landslide locations; (d) Detailed view of areas with FoS > 1.5 overlaid on landslide buffer zones.

Figure 8. Landslide susceptibility maps generated by four machine learning models using the improved non–landslide sampling strategy (Strategy I). (a) MLP model; (b) RF model; (c) SVM model; (d) XGBoost model.

Figure 9. Distribution of landslides across different susceptibility zones under Strategy I.

Figure 10. Performance evaluation of eight susceptibility models. (a) ROC curves of eight models; (b) Radar chart showing secondary performance metrics for the eight models.

Figure 11. Comparison of two non–landslide sampling methods for constructing MLP–based susceptibility maps. (a) Strategy I: Non–landslide samples are selected from physically stable areas (FoS > 1.5) excluding landslide buffer zones; (b) Strategy II: traditional buffering method where non–landslide samples are selected randomly from any area outside the landslide buffer zones, without physical stability constraints. (Locations as shown in Figure 7c).

Figure 12. Feature importance from SHAP analysis.

Figure 13. Schematic diagram of the physical mechanism driving rainfall–induced landslides. (a) Pre–event remote sensing image; (b) Post–event remote sensing image; (c) Before rainfall: stable slope with no visible signs of movement; (d) During rainfall: water accumulation, groundwater rise, and saturated soil; (e) After rainfall: stripped vegetation and scarred landscape.

Table 1. Data sources and spatial resolutions.

Type	Indicators	Spatial Resolution	Source
Remote sensing imagery	Pre–event images	2.0 m	ZY–3 satellite–FWD
Remote sensing imagery	Post–event images	0.7 m	Jilin–1 satellite–PMS
Topography	Elevation (m)	5 m	ZY–3 satellite–FWD
Geological environment	Normalized difference vegetation index	10 m	Sentinel–2
	Soil thickness (m)	0.5 m	Calculated based on empirical Equation (1)
	Lithology	1:200,000	National Geological Data Center (https://www.resdc.cn/)
	Accumulated rainfall (mm)	1 km	Radar–based precipitation data produced by Caiyun Technology Company
	Groundwater table (m)	1 km	National Geological Data Center (https://www.resdc.cn/)
	Distance to fault (m)	5 m	Derived from Euclidean distance to fault lines referring to https://www.resdc.cn/
	Distance to river (m)	5 m	Derived from Euclidean distance to river lines referring to https://www.resdc.cn/
Human activity	Distance to road (m)	5 m	Derived from Euclidean distance to road lines referring to https://www.resdc.cn/

Table 2. Multicollinearity analysis of the conditioning factors.

Factor Category	Conditioning Factor	Variance Inflation Factor	Result
Topography	Terrain undulation	4.97	No multicollinearity
	Surface cutting depth	4.32	No multicollinearity
	Elevation	3.21	No multicollinearity
	Slope	2.02	No multicollinearity
	Aspect	1.02	No multicollinearity
	TWI	1.22	No multicollinearity
	SPI	1.15	No multicollinearity
Hydrology	Distance to river	1.84	No multicollinearity
	Rainfall	2.01	No multicollinearity
	Groundwater table	1.88	No multicollinearity
Geology	Lithology	1.08	No multicollinearity
Geology	Soil thickness	2.33	No multicollinearity
Environment	Distance to fault	1.22	No multicollinearity
	Distance to road	2.37	No multicollinearity
	NDVI	1.23	No multicollinearity
Conclusion	All values	<5	Independent

Table 3. Soil physical and mechanical parameters of five lithological types used in TRIGRS model.

Parameter	Unit	Lithological Types
		Mss	Gr	Crt	Ls	Qss
Cohesion (c′)	kN/m²	22	33.28	2340	800	247
Friction angle (φ′)	°	37	41.37	35	40	40
Unit weight of soil (γ_s)	kN/m³	26.8	18.2	27	26	24.9

Table 4. Classification of slope stability based on the factor of safety (FoS).

Stability	Unstable	Slightly Unstable	Basically Stable	Stable
Factor of safety	FoS ≤ 1.0	1.0 < FoS ≤ 1.25	1.25 < FoS ≤ 1.5	FoS > 1.5

Table 5. Hyperparameter search spaces and optimal settings for the four ML models.

ML Models	Hyperparameter	Search Space (Range)	Optimal Value
MLP	Hidden Layers	[3, 7]	3
	Learning Rate	[5 × 10⁻⁵, 5 × 10⁻²] (log)	0.001
	Batch Size	[16, 128]	128
	Optimizer	[Adam, SGD]	Adam
SVM	Kernel	[RBF]	RBF
	C (Regularization)	[0.1, 100] (log)	1
	Gamma	[0.001, 1] (log)	Scale
RF	n_estimators	[100, 2000]	1000
	Max Depth	[5, 30]	15
	Max Features	[0.1, 0.9]	0.45
XGBoost	Learning Rate	[0.01, 0.3]	0.02
	Max Depth	[3, 15]	10
	Subsample	[0.5, 1.0]	0.95
	Colsample_bytree	[0.5, 1.0]	0.73

Table 6. The number of grid cells at each susceptibility level for the four models under sampling strategy I.

Sampling Strategy	LSM Class
	Very Low	Low	Medium	High	Very High
MLP–Strategy I	269,343	28,468	36,111	40,826	60,169
RF–Strategy I	150,643	115,865	94,983	51,003	22,423
SVM–Strategy I	156,116	108,768	94,472	50,106	25,455
XGBoost–Strategy I	222,863	65,797	64,644	49,491	32,122
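The grid–cell counts in Table 6 are easier to compare across models when expressed as a percentage of the study area. The sketch below converts the counts to class shares and confirms that every model classifies the same total number of cells; the variable and function names are illustrative.

```python
# Grid-cell counts per susceptibility class under Strategy I, copied from Table 6.
CLASSES = ["Very Low", "Low", "Medium", "High", "Very High"]
cell_counts = {
    "MLP": [269_343, 28_468, 36_111, 40_826, 60_169],
    "RF": [150_643, 115_865, 94_983, 51_003, 22_423],
    "SVM": [156_116, 108_768, 94_472, 50_106, 25_455],
    "XGBoost": [222_863, 65_797, 64_644, 49_491, 32_122],
}

def class_shares(counts):
    """Return the percentage of grid cells falling in each class."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]

shares = {model: class_shares(counts) for model, counts in cell_counts.items()}
for model, values in shares.items():
    row = ", ".join(f"{name}: {v:.1f}%" for name, v in zip(CLASSES, values))
    print(f"{model} (n = {sum(cell_counts[model]):,}): {row}")
```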

Table 7. Quantitative comparison of model performance (AUC) under sampling strategy I using random 5–fold cross–validation and paired t–tests.

Fold	MLP (Proposed)	RF	SVM	XGBoost
Fold 1	0.932	0.900	0.879	0.905
Fold 2	0.933	0.901	0.883	0.910
Fold 3	0.935	0.901	0.887	0.899
Fold 4	0.937	0.903	0.890	0.915
Fold 5	0.931	0.905	0.891	0.906
Mean AUC	0.934	0.902	0.886	0.907
Std. dev	±0.002	±0.002	±0.005	±0.006
p–value (vs. MLP)	–	<0.001	<0.001	<0.001

Table 8. Detailed performances of four machine learning models under two sampling strategies.

Model	Strategy	AUC	Accuracy	Precision	Recall	Specificity	F1–Score
MLP	Strategy I	0.934	0.862	0.821	0.927	0.798	0.871
MLP	Strategy II	0.812	0.734	0.695	0.845	0.619	0.763
RF	Strategy I	0.902	0.823	0.798	0.864	0.781	0.830
RF	Strategy II	0.798	0.683	0.729	0.598	0.771	0.654
SVM	Strategy I	0.886	0.840	0.810	0.887	0.792	0.847
SVM	Strategy II	0.753	0.684	0.690	0.683	0.684	0.687
XGBoost	Strategy I	0.907	0.837	0.810	0.882	0.793	0.844
XGBoost	Strategy II	0.755	0.687	0.783	0.530	0.849	0.632
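As a consistency check on Table 8, the F1–score can be recomputed as the harmonic mean of precision and recall. The sketch below does this for the two MLP rows; because the tabulated precision and recall are themselves rounded, recomputed values for other rows may differ from the reported ones in the third decimal place.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# (precision, recall, reported F1) for the MLP model, copied from Table 8.
mlp_rows = {
    "Strategy I": (0.821, 0.927, 0.871),
    "Strategy II": (0.695, 0.845, 0.763),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in mlp_rows.items()}
for name, (_, _, reported) in mlp_rows.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.3f}, reported = {reported:.3f}")
```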

Table 9. Spatial cross–validation results of the MLP model under two sampling strategies.

Test Region (Town)	Strategy I AUC	Strategy I Accuracy	Strategy I Specificity	Strategy II AUC	Strategy II Accuracy	Strategy II Specificity	Improvement (AUC)
K	0.799	0.622	0.830	0.589	0.539	0.589	35.65%
F	0.767	0.631	0.774	0.636	0.459	0.733	20.60%
G	0.908	0.855	0.931	0.585	0.494	0.322	55.21%
D	0.879	0.754	0.634	0.762	0.608	0.348	15.35%
M	0.916	0.788	0.891	0.754	0.547	0.163	21.49%
Mean ± Std.	0.854 ± 0.067	0.730 ± 0.101	0.812 ± 0.116	0.665 ± 0.087	0.529 ± 0.057	0.431 ± 0.227	29.66% ± 16.14%
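The summary row of Table 9 can be retraced from the per–town AUC values. The sketch below shows that the reported 29.66% ± 16.14% is the mean and sample standard deviation of the five per–town improvements, which is not the same quantity as the relative improvement of the mean AUC (0.854 versus 0.665). The variable names are illustrative.

```python
from statistics import mean, stdev

# MLP AUC per held-out town under each strategy, copied from Table 9.
auc_spatial = {
    "K": (0.799, 0.589),
    "F": (0.767, 0.636),
    "G": (0.908, 0.585),
    "D": (0.879, 0.762),
    "M": (0.916, 0.754),
}

per_town = {
    town: (auc_1 - auc_2) / auc_2 * 100.0 for town, (auc_1, auc_2) in auc_spatial.items()
}
values = list(per_town.values())
mean_of_improvements = mean(values)
std_of_improvements = stdev(values)  # sample standard deviation

mean_auc_1 = mean([a for a, _ in auc_spatial.values()])
mean_auc_2 = mean([b for _, b in auc_spatial.values()])
improvement_of_means = (mean_auc_1 - mean_auc_2) / mean_auc_2 * 100.0

print(f"Mean of per-town improvements: {mean_of_improvements:.2f}% ± {std_of_improvements:.2f}%")
print(f"Improvement of the mean AUC:   {improvement_of_means:.2f}%")
```

The two summaries differ because towns with a low Strategy II baseline, such as Town G, carry more weight when the per–town percentages are averaged.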
