Article

A Dynamic Landslide Susceptibility Assessment Method Based on Multi-Source Remote Sensing, XGBoost, and SHAP: A Case Study in Yongsheng County, Yunnan Province

China Aero Geophysical Survey and Remote Sensing Center for Natural Resources, Beijing 100083, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(6), 845; https://doi.org/10.3390/rs18060845
Submission received: 31 January 2026 / Revised: 1 March 2026 / Accepted: 8 March 2026 / Published: 10 March 2026

Highlights

What are the main findings?
  • A dynamic landslide susceptibility framework integrating multi-source remote sensing, XGBoost, and SHAP is proposed.
  • Distance to roads and the maximum deformation rate derived from InSAR are identified as the dominant controlling factors of landslide occurrence.
What are the implications of the main findings?
  • Integrating multi-source remote sensing for landslide identification improves the completeness of the landslide inventory and the performance of the model.
  • SHAP-based interpretation enhances the transparency and practical applicability of machine learning-based susceptibility assessments.

Abstract

Landslide susceptibility assessment (LSA) heavily depends on the completeness of landslide inventories and the interpretability of predictive models. Conventional inventories, based solely on historical records, often fail to identify newly occurring or slow-moving landslides, leading to biased susceptibility estimates. To address this limitation, this study proposes a dynamic LSA framework that integrates multi-source remote sensing data, Extreme Gradient Boosting (XGBoost) modeling, and Shapley Additive Explanations (SHAP), with a case study in Yongsheng County, Yunnan Province, China. This study jointly uses multi-temporal optical remote sensing imagery and Sentinel-1 InSAR (Interferometric Synthetic Aperture Radar) deformation data to update the landslide inventory. Compared with the historical inventory containing 334 landslide points, the updated inventory incorporates an additional 140 deformation-related landslide hazard points. XGBoost models were developed using conditioning factors selected through multicollinearity analysis to evaluate the influence of inventory completeness on model performance. Results show that the model based on the updated inventory achieves a significant improvement in predictive accuracy. SHAP-based interpretation reveals that distance to roads and maximum deformation rate are the dominant factors controlling landslide occurrence, reflecting the combined effects of human activities and dynamic ground deformation. The resulting susceptibility map shows that the Area Under the Curve (AUC) value for susceptibility zoning of the updated sample increases from 0.857 to 0.928, with high and very high susceptibility zones occupying 8.28% of the study area. Overall, the proposed framework improves both the accuracy and interpretability of LSA and demonstrates the effectiveness of multi-source remote sensing data for dynamic landslide hazard assessment in mountainous regions.

1. Introduction

Landslides are among the most destructive geological hazards in mountainous regions, posing serious threats to human life, infrastructure, and socio-economic development worldwide [1,2,3,4]. Their occurrence results from the complex interactions of geological conditions, topography, hydrological processes, vegetation cover, and human activities [5,6]. In tectonically active and geomorphologically complex areas like southwestern China, landslides exhibit high frequency and strong spatial heterogeneity, greatly complicating effective hazard assessment and risk mitigation.
Yongsheng County, located in Yunnan Province, is a high-risk area for landslides. Historical records show that 489 landslide events have occurred in the region, resulting in significant economic losses, including direct losses of 31.80 million RMB and indirect losses of 39.01 million RMB, with 8 fatalities. The population in high-risk areas is estimated to exceed 20,000, highlighting the urgent need for landslide hazard assessment and mitigation measures.
LSA aims to estimate the spatial probability of landslide occurrence under specific environmental conditions and has become a fundamental component of regional landslide risk management and land-use planning [2,7,8,9]. Existing LSA approaches can generally be classified into qualitative expert-based methods, semi-quantitative statistical models, and physically based models [10]. Expert-based approaches heavily rely on subjective judgment, while physically based models require detailed geotechnical and hydrological parameters that are often unavailable at regional scales. As a result, data-driven methods have increasingly become the dominant approach in recent LSA studies [11,12].
With advances in remote sensing and geospatial technologies, machine learning algorithms have been widely applied to landslide susceptibility mapping due to their ability to capture complex nonlinear relationships among environmental factors [13,14]. Ensemble learning models, such as Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and XGBoost, have shown superior performance in handling high-dimensional and heterogeneous datasets [15,16]. In particular, XGBoost integrates gradient boosting optimization with regularization strategies, effectively reducing overfitting while improving computational efficiency and prediction accuracy [17,18]. Therefore, XGBoost has been increasingly adopted for landslide susceptibility assessment in diverse geomorphological settings [19,20].
Despite the strong predictive capability of existing models, two critical challenges remain in current LSA research. First, the reliability of susceptibility models is largely dependent on the completeness and accuracy of landslide inventories [5,21]. Most existing studies primarily rely on historical inventories compiled from field surveys [22]. These inventories are often incomplete in densely vegetated or inaccessible mountainous areas and tend to overlook slow-moving or newly occurring landslides, leading to biased training samples and reduced model generalization performance [23,24]. Second, many high-performance machine learning models, including XGBoost, are commonly regarded as “black-box” models because their internal decision-making mechanisms are difficult to interpret. This lack of transparency limits the practical applicability of susceptibility results in hazard prevention, engineering design, and policy-making [25,26].
Recent developments in remote sensing technologies offer promising solutions to these challenges. Multi-temporal optical remote sensing imagery can identify landslide-related surface features and geomorphological changes [27], while InSAR technology can detect subtle ground deformation with millimeter-level accuracy [28,29,30,31]. InSAR is particularly effective for identifying active or potential landslide hazards characterized by slow and continuous deformation, which may not be visually detectable in optical imagery [32,33]. By integrating optical interpretation with InSAR-derived deformation information, dynamic updating of landslide inventories can be achieved, thus improving their completeness and representativeness [34,35].
At the same time, model interpretability has become an important research focus in geohazard modeling. The SHAP framework, based on cooperative game theory, provides a unified and theoretically sound method for quantifying the contribution of individual input variables to model predictions [36]. SHAP not only enables global interpretation of variable importance but also allows for local explanation of individual predictions, offering valuable insights into the nonlinear relationships, interaction effects, and threshold behaviors of landslide controlling factors [37,38,39].
Based on these considerations, this study proposes an integrated framework for dynamic landslide susceptibility assessment, combining multi-source remote sensing data, XGBoost modeling, and SHAP-based interpretation. Yongsheng County, Yunnan Province, China, serves as the case study. The main objectives of this research are:
(1) To construct an updated landslide inventory by integrating multi-temporal optical remote sensing interpretation with Sentinel-1 InSAR deformation monitoring;
(2) To evaluate the impact of inventory updating on susceptibility modeling performance through a comparative analysis of historical and integrated inventories;
(3) To identify key controlling factors and interpret their nonlinear and interactive effects on landslide occurrence using SHAP;
(4) To generate a landslide susceptibility map with enhanced accuracy and interpretability, supporting regional landslide hazard prevention and management.

2. Materials and Methods

2.1. Study Area

Yongsheng County is located in the northwestern part of Yunnan Province, with geographical coordinates of 100°22′–101°32′E and 25°59′–27°05′N (Figure 1). The total area is approximately 5099 square kilometers. The region lies at the junction of the Hengduan Mountains and the Northwestern Yunnan Plateau. The terrain is higher in the northeast and lower in the southwest, and elevations range from 1056 to 3953 m.
The region is tectonically active, dominated by deeply incised river systems, including the Jinsha River and its tributaries. Geological structures and lithological contrasts strongly control slope stability and geomorphological evolution. The climate is affected by the southwest monsoon, with distinct wet and dry seasons. Rainfall is highly variable, with the majority occurring from June to September, often in short-duration, high-intensity rainfall events that promote landslide initiation.
In addition to natural factors, human activities such as road construction, slope excavation, and land-use changes have significantly altered slope stability. The combination of steep topography, fractured lithology, intense fluvial erosion, monsoonal rainfall, and anthropogenic disturbance makes Yongsheng County highly prone to landslide hazards.

2.2. Data Sources

2.2.1. Multi-Source Remote Sensing and Thematic Data

To support landslide susceptibility modeling, various datasets from multiple sources were collected, including optical remote sensing imagery, Synthetic Aperture Radar (SAR) data, digital elevation models (DEMs), geological maps, land cover data, and infrastructure information. All datasets were preprocessed, resampled to a 30 m spatial resolution, and projected to the same coordinate reference system to ensure consistency.
DEM Data: Derived from ALOS/PRISM global DSM products, these data provided detailed topographic information for slope-related factor extraction.
Geological Data: Engineering geological rock groups and fault distribution maps were obtained from 1:50,000 regional geological maps, supplemented by field survey interpretations.
Hydrological and Anthropogenic Data: River networks and road systems were sourced from the National Geographical Information Resources Directory Service.
Deformation Data: In this study, surface deformation information was extracted using Stacking-InSAR processing based on Sentinel-1 satellite data (2017–2024). A total of 80 scenes were analyzed by combining both ascending and descending orbit data, with a temporal interval between acquisitions ranging from 6 to 12 days. The selection of interferometric pairs was based on the shortest temporal-spatial baseline, and interferograms with coherence values lower than 0.3 were excluded to enhance the reliability of deformation monitoring. Ultimately, deformation rate data were extracted.
A summary of the datasets used in this study, along with their sources, is provided in Table 1.

2.2.2. Landslide Conditioning Factors

Based on the geomorphological characteristics of the study area, previous landslide susceptibility studies, and data availability, a total of 11 candidate conditioning factors were initially selected. These factors were grouped into five categories: topography, geology, hydrology, human activities, and deformation indicators.
The selected conditioning factors are summarized as follows:
  • Topography: elevation, slope, and aspect
  • Geology: engineering geological rock groups and distance to faults
  • Hydrology: distance to rivers
  • Human Activities: distance to roads
  • Deformation Indicators: four InSAR-derived deformation parameters, including maximum, minimum, and mean deformation rates, as well as the standard deviation of deformation rate.
All continuous variables were resampled to a spatial resolution of 30 m and normalized to a range of [0, 1] using min–max normalization prior to model construction. This preprocessing ensured consistency among input variables and improved model convergence and performance.
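The min–max normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function name and the sample elevations are hypothetical, and mapping a constant input to zeros is an assumption made here to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant factor carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical elevation samples in metres (the study area spans 1056-3953 m).
elevation_m = [1056.0, 2000.0, 3953.0]
elevation_scaled = min_max_normalize(elevation_m)
```

The minimum and maximum would normally be taken over the full raster of each factor, so that training samples and the mapped study area share the same scale.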
The spatial distribution of the environmental factors in the study area, including slope, aspect, elevation, and distances to faults, rivers, and roads, is shown in Figure 2. The candidate factors were subsequently screened for multicollinearity (Section 2.3.2) to ensure an acceptable level of statistical independence among the selected conditioning factors, thereby enhancing the robustness, interpretability, and predictive reliability of the landslide susceptibility models.

2.3. Methods

This study follows a systematic framework consisting of three main stages: (1) data preprocessing and sample construction, (2) model development and interpretation, and (3) model evaluation and susceptibility mapping (Figure 3).
In the first stage, multi-source remote sensing data and geological information were integrated to construct two landslide sample datasets, namely, the historical inventory and the integrated inventory. In the second stage, landslide susceptibility models were developed using the XGBoost algorithm, and the models were evaluated using multiple performance metrics. SHAP was applied to interpret the model outputs at both global and local levels. In the third stage, landslide susceptibility maps were generated, the main influencing factors were identified, and the model was evaluated using the AUC metric.
The core innovation of this framework lies in the comparative analysis of susceptibility models constructed with different inventory completeness levels and the integration of SHAP-based interpretability into the modeling process.

2.3.1. Landslide Hazard Identification and Methodology

(1) Stacking-InSAR
The Stacking-InSAR technique is a relatively simple form of time-series InSAR. It operates by linearly stacking and weighting multiple unwrapped differential interferograms acquired over the study period. Compared with traditional D-InSAR, this approach more effectively suppresses atmospheric delays, DEM errors, and other noise sources, thereby yielding more accurate estimates of the annual average deformation rate.
The method relies on two key assumptions: atmospheric phase delays across individual interferograms are random and approximately equal, and ground deformation can be approximated as linear over time. Based on these assumptions, multiple interferogram combinations are generated, and high-quality pairs are selected according to the shortest temporal–spatial baseline criterion. These selected interferograms are then stacked to estimate deformation. Therefore, Stacking-InSAR is particularly suitable for regions dominated by linear deformation, and the annual average deformation rate during the study period can be expressed as:
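$$ ph\_rate = \frac{\sum_{i=1}^{N} \varphi_i}{\sum_{i=1}^{N} \Delta t_i} $$
where $\Delta t_i$ is the temporal baseline of the $i$-th interferogram, $\varphi_i$ is its unwrapped differential interferometric phase, $N$ is the number of stacked interferograms, and $ph\_rate$ is the linear phase rate.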
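For a single pixel, the stacking expression above reduces to a ratio of two sums. The sketch below illustrates it with hypothetical phase values. The conversion from phase rate to line-of-sight displacement is not given in the text: the factor of minus wavelength over 4π, the approximate Sentinel-1 C-band wavelength of 0.0555 m, and the sign convention are all assumptions.

```python
import math


def stacked_phase_rate(phases_rad, baselines_yr):
    """Return sum(phi_i) / sum(dt_i) for one pixel, in radians per year."""
    if not phases_rad or len(phases_rad) != len(baselines_yr):
        raise ValueError("need exactly one temporal baseline per interferogram")
    return sum(phases_rad) / sum(baselines_yr)


def phase_rate_to_los_mm_per_yr(rate_rad_per_yr, wavelength_m=0.0555):
    """Convert a phase rate to a line-of-sight rate in mm/yr.

    Assumption: repeat-pass geometry (factor 4*pi) and a sign convention in
    which increasing phase means motion away from the sensor.
    """
    return -wavelength_m / (4.0 * math.pi) * rate_rad_per_yr * 1000.0


# Hypothetical pixel: three unwrapped interferograms with 12-day baselines.
dt = 12.0 / 365.25
phases = [0.10, 0.12, 0.08]
rate = stacked_phase_rate(phases, [dt, dt, dt])
los_rate = phase_rate_to_los_mm_per_yr(rate)
```

In practice the same ratio would be evaluated for every coherent pixel, after interferograms below the coherence threshold have been discarded.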
To ensure the reliability of deformation estimates, all interferometric pairs with coherence < 0.3 were excluded before phase unwrapping. Furthermore, deformation results in densely vegetated regions (NDVI > 0.2) were masked due to temporal decorrelation, and only pixels with adequate coherence were retained for subsequent processing. The workflow is illustrated in Figure 4.
Stacking-InSAR was selected over PS-InSAR and SBAS-InSAR due to its computational efficiency. Previous studies have confirmed that with a sufficient number of high-quality interferometric pairs, Stacking-InSAR achieves deformation results comparable to PS-InSAR and SBAS-InSAR, while requiring significantly less processing time, making it suitable for large-area deformation monitoring.
(2) Satellite Remote Sensing
Landslides that have already occurred typically exhibit distinct morphological features and readily identifiable elements, making them common targets for optical remote sensing interpretation. In contrast, potential or unfailed landslides lack well-developed characteristics and must be identified through subtle geomorphic indicators or early deformation signs associated with slope instability, such as localized collapses, rockfalls, surface cracks, or anomalies in vegetation cover.
In this study, multi-temporal optical imagery acquired from the Gaofen-2 satellite and Google Earth is employed for landslide identification. This approach not only effectively reduces the interference caused by cloud cover in the study area but also enables the observation of historical imagery to track landslide evolution over time, thereby providing a richer and more reliable dataset for analysis.
(3) Principles of Landslide Hazard Identification
Landslide hazard identification was conducted by integrating deformation concentration zones from InSAR, optical remote sensing interpretation results, and geological data. The identification criteria are as follows:
  • For confirmed historical landslides, if deformation concentration zones or typical geomorphological indicators associated with slope instability are present, the site is identified as a landslide hazard.
  • For slopes without recorded landslides, if both deformation concentration zones and clear micro-landform indicators appear and spatially coincide, the slope is identified as a potential landslide hazard.
  • If only one indicator is present (either deformation concentration or morphological signatures), but favorable geological conditions and triggering factors exist, the slope may still be classified as a potential hazard.
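The three criteria above amount to a small decision rule, sketched below. The function and argument names are hypothetical, and the "not identified" outcome for slopes that meet none of the criteria is an assumption, since the text does not name that case.

```python
def identify_hazard(recorded_landslide, deformation_zone, morphological_signs,
                    favorable_conditions=False):
    """Apply the three identification criteria to one slope unit.

    All inputs are booleans assumed to be evaluated at the same location, so
    "both indicators present" also means that they spatially coincide.
    """
    if recorded_landslide:
        # Criterion 1: a confirmed landslide needs only one supporting indicator.
        if deformation_zone or morphological_signs:
            return "landslide hazard"
        return "not identified"  # assumption: the text does not name this case
    if deformation_zone and morphological_signs:
        # Criterion 2: unrecorded slope with both indicators.
        return "potential landslide hazard"
    if (deformation_zone or morphological_signs) and favorable_conditions:
        # Criterion 3: one indicator plus favorable geology and triggers.
        return "potential landslide hazard"
    return "not identified"


example = identify_hazard(recorded_landslide=False, deformation_zone=True,
                          morphological_signs=False, favorable_conditions=True)
```

The third criterion is stated permissively in the text ("may still be classified"), so a real workflow would likely route such slopes to expert review rather than classify them automatically.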

2.3.2. Landslide Inventory Construction and Conditioning Factor Analysis

(1) Landslide Inventory Construction
A reliable landslide inventory is essential for landslide susceptibility assessment. In this study, two landslide inventory datasets were constructed to evaluate the impact of inventory completeness on susceptibility modeling performance.
Historical Landslide Inventory (I1): The historical inventory consists of 334 landslide points derived from previous geological hazard surveys, historical records, and visual interpretations of high-resolution optical remote sensing images. These landslides represent confirmed historical events and form the baseline for susceptibility modeling.
Integrated Landslide Inventory (I2): This inventory was constructed by integrating multi-temporal optical remote sensing interpretation with Sentinel-1 InSAR deformation analysis. Areas exhibiting persistent deformation signals and geomorphological characteristics consistent with landslide processes were classified as newly identified landslide hazards. The integrated inventory includes both historical landslides and deformation-related potential hazards, providing a more comprehensive representation of landslide activity in the study area.
(2) Non-Landslide Sample Selection
Non-landslide samples were selected through a rigorous and systematic procedure to minimize sampling bias:
  • Candidate zones were defined as areas with slopes < 10°, located more than 500 m from documented landslides, and excluding water bodies and construction land. Spatial random sampling was applied with a minimum point spacing of 200 m to reduce spatial autocorrelation.
  • A 1:2 positive-to-negative sample ratio (landslide: non-landslide) was used during training to enhance the model’s ability to learn background features and mitigate class imbalance.
  • Multi-temporal optical imagery was analyzed to verify that the selected non-landslide points showed no evidence of historical or ongoing slope activity. Additionally, five-fold cross-validation was performed to assess the robustness of the sample selection.
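The spatial constraints in this procedure can be illustrated with a simple rejection sampler. This is a sketch under stated assumptions, not the authors' implementation: coordinates are taken to be projected metres, the `is_candidate` callback is a hypothetical stand-in for the slope, water-body, and construction-land masks, and the bounding box and landslide location are invented for the example.

```python
import math
import random


def sample_non_landslide_points(n_points, bounds, landslides, is_candidate,
                                min_landslide_dist=500.0, min_spacing=200.0,
                                seed=0, max_tries=100_000):
    """Draw random points that satisfy the distance and spacing constraints."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    picked = []
    for _ in range(max_tries):
        if len(picked) == n_points:
            break
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if not is_candidate(p):
            continue  # e.g. slope >= 10 degrees, water body, construction land
        if any(math.dist(p, q) <= min_landslide_dist for q in landslides):
            continue  # must lie more than 500 m from documented landslides
        if any(math.dist(p, q) < min_spacing for q in picked):
            continue  # enforce the 200 m minimum point spacing
        picked.append(p)
    return picked


# Hypothetical 5 km x 5 km window with one documented landslide at its centre.
landslides = [(2500.0, 2500.0)]
negatives = sample_non_landslide_points(
    n_points=2 * len(landslides),      # 1:2 landslide to non-landslide ratio
    bounds=(0.0, 0.0, 5000.0, 5000.0),
    landslides=landslides,
    is_candidate=lambda p: True,       # placeholder for the raster masks
)
```

Fixing the random seed keeps the sample reproducible, which matters when two inventories are compared against each other.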
All conditioning factors were resampled to a 30 m spatial resolution to ensure consistency. However, this resolution introduces scale limitations: a single 30 m × 30 m pixel covers 900 m², which is the minimum detectable landslide area for square-shaped features, whereas elongated or irregular landslides typically require 3–5 pixels (2700–4500 m²) to be reliably captured. As a result, small-scale landslides may be underrepresented, potentially affecting the model’s performance at fine scales. This limitation is further discussed in Sections 3 and 4.
(3) Conditioning Factor Analysis
To assess the intercorrelations among the conditioning factors and ensure the stability and interpretability of the model, we conducted a multicollinearity analysis. The Variance Inflation Factor (VIF) was used to quantify the degree of multicollinearity among the 11 candidate conditioning factors. To perform the VIF analysis, we used the “statsmodels” library in Python 3.10, with the specific steps outlined as follows:
  • Data Preprocessing: All variables were standardized prior to the VIF analysis to ensure consistency in their measurement scales.
  • VIF Calculation: The VIF for each candidate variable was calculated using the formula:
$$ VIF_i = \frac{1}{1 - R_i^2} $$
where $R_i^2$ is the coefficient of determination obtained by regressing the $i$-th candidate variable on all the other candidate variables.
  • Threshold for Elimination: A threshold of VIF = 10 was adopted as the criterion for variable elimination. Variables with VIF values exceeding this threshold were considered to exhibit strong multicollinearity and were therefore excluded from the model.
The analysis revealed that three deformation-related variables (mean deformation rate, minimum deformation rate, and standard deviation of deformation rate) exhibited very high VIF values and were thus excluded from the model. These variables are derived from the same deformation time series and exhibit strong linear dependence on one another, resulting in high multicollinearity. In contrast, the maximum deformation rate, representing an extreme value feature, showed weak linear dependence on the other statistical measures, leading to its relatively low VIF (1.092). This pattern reflects the inherent structural relationships among the deformation statistics rather than any issues related to data quality or methodological procedures.
The remaining eight conditioning factors, with acceptable VIF values, were retained as input variables for the model. The VIF values for all candidate variables are shown in Table 2.
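The VIF screening can be illustrated without external libraries by using the identity that the VIF of each variable equals the corresponding diagonal element of the inverse correlation matrix. The study reports using statsmodels, so the code below is a swapped-in, standard-library sketch. The factor names and the toy numbers are hypothetical and are not the values in Table 2.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of equally long numeric columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear factors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF_i = 1 / (1 - R_i^2), read off the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]


# Toy data: factor_c is nearly a copy of factor_a, factor_b is unrelated.
factors = {
    "factor_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "factor_b": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "factor_c": [1.1, 2.1, 2.8, 3.9, 5.1, 6.0],
}
vifs = dict(zip(factors, vif(list(factors.values()))))
retained = [name for name, value in vifs.items() if value <= 10.0]
```

Note that this sketch removes every factor above the threshold in one pass. An iterative scheme that drops the worst factor and recomputes is a common alternative, and the text does not say which was used.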

2.3.3. Landslide Susceptibility Assessment Using XGBoost and SHAP-Based Interpretation

(1) XGBoost-Based Landslide Susceptibility Modeling and Model Evaluation
XGBoost was employed to develop landslide susceptibility models due to its strong predictive capability and effectiveness in capturing complex nonlinear relationships and feature interactions [17].
Data Partitioning: Landslide and non-landslide samples were randomly divided into training (70%) and testing (30%) datasets. To mitigate class imbalance, non-landslide samples were selected from stable areas using a 1:1 ratio with landslide samples [23].
Hyperparameter Optimization: Key hyperparameters, including the number of trees, learning rate, maximum tree depth, and subsampling ratio, were optimized through grid search combined with cross-validation [11,19]. In addition, regularization parameters were incorporated to reduce overfitting and enhance model generalization.
Model Construction and Comparison: Two independent XGBoost models were constructed using the I1 and I2 landslide inventories, respectively, allowing for a direct comparison of model performance under different inventory completeness scenarios [8,15].
Model Evaluation: Model performance was quantitatively evaluated using accuracy, precision, recall, F1-score, and the AUC. All evaluation metrics were calculated based on the independent testing dataset to ensure an objective and unbiased assessment [9,20].
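The five evaluation metrics named above can be computed from test labels and predicted probabilities as sketched below. The 0.5 decision threshold and the toy labels and scores are assumptions for illustration, and the AUC is computed from its rank interpretation (the probability that a randomly chosen landslide sample scores higher than a randomly chosen non-landslide sample).

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall and F1 at a fixed probability threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy test set: 1 = landslide, 0 = non-landslide (hypothetical scores).
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
metrics = classification_metrics(y_true, y_score)
metrics["auc"] = auc(y_true, y_score)
```

Unlike the other four metrics, the AUC does not depend on the chosen threshold, which is one reason it is a convenient basis for comparing the I1 and I2 models.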
(2) Model Interpretation Using SHAP
To improve the interpretability of the XGBoost models, the SHAP framework was applied. SHAP values quantify the contribution of each conditioning factor to the model output, thereby providing a consistent and theoretically grounded interpretation of feature importance and interactions.
Model interpretation was conducted at three levels:
  • Global Interpretation: Assessing the overall importance of conditioning factors across the entire dataset.
  • Local Interpretation: Explaining individual model predictions to reveal site-specific landslide triggering mechanisms.
  • Interaction Analysis: Exploring nonlinear interactions between key conditioning factors.
The integration of SHAP with XGBoost effectively reduces the black-box nature of machine learning models and provides transparent insights into the dominant factors controlling landslide occurrence.
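To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values by brute-force enumeration of feature subsets for a two-feature toy model. It is not the tree-based algorithm implemented in the shap library, and the toy model, its coefficients, and the all-zero baseline are hypothetical choices for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical score with an interaction between the two inputs."""
    road_proximity, deformation = z
    return road_proximity + 2.0 * deformation + road_proximity * deformation


x = [1.0, 1.0]          # sample to explain
baseline = [0.0, 0.0]   # reference input (assumption)
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and the interaction term is split equally between the two features. This additivity is what allows the per-factor contributions to be read as a decomposition of a single prediction.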
(3) Susceptibility Mapping and Assessment
Landslide susceptibility maps were generated by applying the trained XGBoost models to the entire study area. The resulting continuous susceptibility index was classified into five susceptibility classes—Very Low, Low, Moderate, High, and Very High—using the Natural Breaks (Jenks) classification method. This classification scheme facilitates practical landslide hazard management and enables effective comparison between susceptibility maps derived from different landslide inventories [38,39,40,41].
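The Natural Breaks classification chooses class limits that minimize the within-class squared deviation. A minimal exact dynamic-programming sketch is given below. GIS packages typically use optimized implementations, so this only illustrates the principle, and the susceptibility values in the example are hypothetical.

```python
from bisect import bisect_left

LABELS = ["Very Low", "Low", "Moderate", "High", "Very High"]


def jenks_breaks(values, n_classes):
    """Upper class limits minimising the within-class sum of squared deviations."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")
    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting the first j values into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], split[k][j] = c, i
    # Walk back through the stored split points to recover the class limits.
    breaks, j = [], n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]


def classify(value, breaks):
    """Map a susceptibility index to one of the five class labels."""
    return LABELS[min(bisect_left(breaks, value), len(LABELS) - 1)]


# Hypothetical susceptibility indices for ten pixels.
index = [0.02, 0.05, 0.20, 0.22, 0.50, 0.52, 0.80, 0.82, 0.95, 0.97]
breaks = jenks_breaks(index, n_classes=5)
```

Because the breaks depend on the distribution of the index itself, maps derived from different inventories will generally have different class limits, which should be kept in mind when comparing class areas between them.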
The predictive performance of the susceptibility maps was further evaluated using the AUC, providing an overall measure of model discrimination capability [12,20].

3. Results

3.1. Landslide Inventory

As shown in Figure 4, the updated landslide inventory in the study area includes a total of 474 landslides. The basic information of the landslide inventory was obtained through two main methods:
(1) Historical inventory information, which consists of 334 landslide locations.
(2) Identification based on multi-temporal optical imagery and InSAR deformation monitoring results, which identified a total of 140 landslides.
Figure 5 shows the spatial distribution of landslides in the study area, revealing a widespread distribution with localized concentrations. The newly identified landslides via remote sensing technologies exhibit belt-like clustering features in regions (a), (b), (c), and (d), with new landslide line densities reaching 0.83/km, 0.51/km, 0.85/km, and 0.74/km, respectively.
These four subsets (a, b, c, and d) are located along the Wulang River, Maguo River, and Jinsha River. Surface deformation monitoring results indicate relatively low slope stability in these areas, with numerous deformation anomalies. Landslides there are generally active, and many display clear deformation and damage features in optical imagery or InSAR monitoring results (Figure 6). Overall, landslide activity is ongoing across much of the study region, with several hotspots exhibiting particularly frequent occurrences.

3.2. Model Performance Evaluation

The performance of the XGBoost-based landslide susceptibility models constructed using the two landslide inventories (I1 and I2) was evaluated using accuracy, precision, recall, F1 score, and AUC. All metrics were calculated on an independent test set.
As shown in Figure 7 and Figure 8, the model trained using the historical landslide inventory achieved an AUC of 0.858, indicating good predictive capability. In contrast, the model based on the integrated landslide inventory exhibited significantly improved performance, with an AUC of 0.881. The absolute increase in AUC by 0.023 demonstrates that the integration of landslide hazard points identified through multi-source remote sensing data effectively enhanced the model’s discriminative ability.
In addition to AUC, other evaluation metrics consistently showed improvement for the I2 model. Accuracy, precision, recall, and F1 score were all higher than those of the I1 model, indicating that the updated landslide inventory not only improved overall classification accuracy but also reduced false positives and false negatives. These results confirm that inventory completeness plays a crucial role in data-driven susceptibility modeling.

3.3. SHAP-Based Interpretation of Model Results

3.3.1. Global Importance of Conditioning Factors

To interpret the results of the XGBoost model, SHAP was applied to quantify the contribution of each environmental factor to landslide susceptibility. The SHAP feature importance plot (Figure 9) and the SHAP summary plot (Figure 10) illustrate the relative importance and direction of influence of the input variables.
The results show that the distance to roads and the maximum deformation rate are the two most important factors controlling landslide occurrence in the study area. Shorter distances to roads and higher deformation rates are associated with increased landslide susceptibility. Other important factors include slope, elevation, and distance to rivers, while factors such as slope aspect and curvature have relatively low contributions.
These findings suggest that human activities, particularly road construction and slope modifications, combined with ground deformation captured by InSAR, play a dominant role in triggering landslides in Yongsheng County.

3.3.2. Nonlinear Effects and Feature Interactions

As shown in Figure 11a, SHAP values exhibit a strong positive correlation with deformation rate, where higher deformation rates correspond to greater positive SHAP contributions (with a peak of approximately 2.4). This result suggests that InSAR-derived deformation monitoring is an effective indicator for identifying active landslides and assessing potential hazards. It is noteworthy that this relationship is not strictly linear [29,35,36,37].
Figure 11b illustrates the relationship between distance to roads and landslide susceptibility. The influence of road distance shows a clear spatial decay pattern. Within 500 m of roads, SHAP values show a significant negative contribution (approximately −0.2 to −0.8), indicating that proximity to roads significantly increases the likelihood of landslide occurrence. This effect is primarily attributed to engineering activities, such as slope excavation, changes in load conditions, and hydrological disturbances. Beyond 500 m, the negative influence rapidly diminishes and becomes negligible at approximately 1500 m, suggesting that road-induced disturbances have a clear spatial impact threshold.
Further analysis reveals important interaction effects among features. In Figure 11b, scatter points colored by elevation indicate that sensitivity to road distance varies with topographic height. Low-elevation areas (red points) exhibit stronger negative SHAP values near roads, reflecting the heightened susceptibility of river valleys and basin landforms to engineering disturbances. In contrast, high-elevation areas (blue points) show weaker responses, suggesting that increased topographic relief may partially buffer the effects of anthropogenic disturbance.
Additionally, a clear interaction exists between maximum deformation rate and road distance. The results indicate that in areas close to roads (<500 m), high deformation rates are often associated with markedly higher positive SHAP values (exceeding 2.0), demonstrating that engineering disturbances and persistent deformation jointly intensify landslide risk. Conversely, in natural slope environments distant from roads (>1500 m), even moderate to high deformation rates correspond to substantially lower SHAP contributions (generally below 0.5). This finding highlights a synergistic relationship between human engineering activities and dynamic deformation signals: on slopes affected by engineering interventions, the hazard-inducing effect of deformation is significantly amplified, whereas in relatively undisturbed natural systems, its influence becomes more independent and attenuated.
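The distance-band comparison described above can be sketched as a simple grouping of per-sample SHAP values. This is only an illustration: the band edges (500 m and 1500 m) come from the text, but the function names and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean


def road_band(distance_m):
    """Assign a sample to one of the three road-distance bands used in the text."""
    if distance_m < 500:
        return "<500 m"
    if distance_m <= 1500:
        return "500-1500 m"
    return ">1500 m"


def mean_shap_by_band(samples):
    """Average the SHAP value of the deformation-rate feature within each band.

    `samples` is an iterable of (distance_to_road_m, shap_value) pairs.
    """
    groups = {}
    for distance_m, shap_value in samples:
        groups.setdefault(road_band(distance_m), []).append(shap_value)
    return {name: mean(values) for name, values in groups.items()}


# Hypothetical values for illustration only (not taken from the study).
samples = [
    (120, 2.1), (300, 2.3), (450, 1.9),   # close to roads
    (800, 1.0), (1200, 0.8),              # intermediate distance
    (1800, 0.4), (2500, 0.3),             # far from roads
]
summary = mean_shap_by_band(samples)
for name, value in summary.items():
    print(f"{name}: mean SHAP = {value:.2f}")
```

A larger mean in the near-road band than in the far band is the pattern the text describes as amplification of the deformation signal by engineering disturbance.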

3.4. Landslide Susceptibility Mapping

Based on the trained XGBoost model, landslide susceptibility maps were generated for the entire study area. The continuous susceptibility index was classified into five levels using the natural breaks classification method: very low, low, medium, high, and very high susceptibility.
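For readers unfamiliar with natural breaks, the sketch below shows one way such a five-level classification could be computed. It is an exact one-dimensional dynamic-programming partition that minimizes the within-class sum of squared deviations. The study does not state which implementation was used, so the function names and the toy scores here are assumptions for illustration only.

```python
from bisect import bisect_left

LEVELS = ["very low", "low", "medium", "high", "very high"]


def natural_breaks(values, n_classes):
    """Return the upper bound of each class for an optimal 1-D partition."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j] (j > i)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best total SSD when the first j values form k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the split table to recover each class's upper bound.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def classify(score, upper_bounds):
    """Map a susceptibility score to a level name; scores above the top bound are clamped."""
    idx = min(bisect_left(upper_bounds, score), len(upper_bounds) - 1)
    return LEVELS[idx]


# Hypothetical susceptibility scores, not taken from the study.
scores = [0.02, 0.03, 0.05, 0.20, 0.22, 0.41, 0.43, 0.45, 0.70, 0.72, 0.93, 0.95]
bounds = natural_breaks(scores, 5)
print(bounds)
print(classify(0.44, bounds))
```

This routine costs O(k·n²) time, so for a full raster it would normally be run on a random sample of cell values rather than on every pixel.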
The susceptibility maps derived from the two inventories are broadly similar but differ in local detail (Figure 12). In the I1-based map, high-susceptibility areas are relatively dispersed, lying mainly along major river valleys and road corridors. Some newly identified deformation-prone areas were classified as medium or low susceptibility in this map, reflecting the limitations of relying solely on historical landslide records.
In contrast, the susceptibility map generated using the I2 inventory presents a more coherent spatial pattern. High- and very-high-susceptibility areas are more distinctly concentrated in regions with low elevation, steep slopes, proximity to rivers and roads, and significant InSAR deformation signals. The distribution of these areas better matches both historical landslides and newly detected hazard points, improving the spatial representativeness of the susceptibility map.
Quantitatively, the high- and very-high-susceptibility areas in the I2-based map account for 8.28% of the total study area, while very-low- and low-susceptibility areas dominate regions with relatively flat terrain and limited human disturbance. This distribution is consistent with the known geomorphological and environmental conditions in Yongsheng County. The resulting susceptibility zone table is shown in Table 3.
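An area share such as the 8.28% figure above is, in essence, a ratio of classified cells. The following minimal sketch assumes equal-area raster cells and uses `None` for cells outside the study area. The grid is a hypothetical toy example and does not reproduce the study's values.

```python
from collections import Counter

LEVELS = ["very low", "low", "medium", "high", "very high"]


def area_share(class_grid):
    """Return the percentage of valid cells in each susceptibility level."""
    counts = Counter(c for row in class_grid for c in row if c is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("grid contains no classified cells")
    return {level: 100.0 * counts.get(level, 0) / total for level in LEVELS}


# Hypothetical 4 x 4 classified grid; None marks a cell outside the study area.
grid = [
    ["very low", "very low", "low", "medium"],
    ["very low", "low", "low", "high"],
    ["low", "medium", "high", "very high"],
    [None, "very low", "low", "medium"],
]
shares = area_share(grid)
high_total = shares["high"] + shares["very high"]
print(f"high + very high: {high_total:.2f}% of classified cells")
```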
As shown in Figure 13, the susceptibility map generated by the model trained on the historical landslide inventory achieved an AUC of 0.857, indicating good predictive capability. In contrast, the susceptibility map based on the integrated landslide inventory performed markedly better, with an AUC of 0.928. The absolute increase in AUC of 0.071 indicates that integrating the landslide hazard points identified through multi-source remote sensing data effectively enhanced the model's discriminative ability.
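To make the AUC values concrete, the sketch below computes AUC from its rank interpretation: the probability that a randomly chosen landslide sample receives a higher score than a randomly chosen non-landslide sample, with ties counted as one half. The labels and scores are hypothetical toy values. Only the two AUC figures in the final line come from the text.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = landslide, 0 = non-landslide), not from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"toy AUC = {auc:.3f}")

# Absolute gain reported in the text: I2-based map versus I1-based map.
gain = round(0.928 - 0.857, 3)
print(f"reported AUC gain = {gain}")
```

The pairwise loop is quadratic in the number of samples. It is meant to show the definition, not to replace an optimized library routine.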

4. Discussion

4.1. Effect of Landslide Inventory Updating on Model Performance

The results of this study clearly demonstrate that updating the landslide inventory by integrating multi-source remote sensing data substantially improves landslide susceptibility modeling performance. Compared to the model trained with the historical inventory (I1), the model based on the integrated inventory (I2) exhibits significant improvements in AUC, as well as consistent gains in accuracy, precision, recall, and F1-score.
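The four threshold-based metrics named above all derive from a single confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not the values behind Figures 7 and 8.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: tp = landslides correctly predicted, fp = false alarms,
# fn = missed landslides, tn = non-landslides correctly predicted.
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall is the metric most directly affected by an incomplete inventory, since missed landslides enter the calculation as false negatives.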
This performance enhancement can primarily be attributed to the increased representativeness and completeness of the training samples. Historical landslide inventories typically focus on large, catastrophic, and easily identifiable events, while slow-moving or newly developed landslides are often underrepresented, especially in densely vegetated or inaccessible mountainous regions. As a result, models trained solely on historical inventories often underestimate susceptibility in areas with ongoing deformation but no visible surface manifestations.
By incorporating deformation-related landslide hazards identified through InSAR and multi-temporal optical imagery, the integrated inventory captures both past landslide events and currently active or potential landslide processes. This reduces sample bias and enables the model to better learn the environmental conditions associated with landslide initiation. Consequently, it enhances the model’s generalization ability across the study area. These findings highlight the importance of treating landslide inventories as dynamic datasets, rather than static historical records, in data-driven susceptibility assessments.

4.2. Advantages of Multi-Source Remote Sensing Integration

The integration of multi-source remote sensing data plays a crucial role in enhancing landslide detection and susceptibility modeling. Optical remote sensing imagery provides important information on surface morphology, land cover changes, and geomorphological features associated with landslides. In contrast, InSAR deformation feature extraction is particularly effective in detecting subtle ground deformation that may precede slope instability.
In this study, InSAR-derived deformation indicators contribute unique and complementary information to traditional conditioning factors. Many deformation-related hazard points identified in the integrated inventory are not clearly distinguishable from stable slopes using optical imagery alone. This is particularly relevant in regions where landslides exhibit slow deformation behavior or are obscured by dense vegetation. The improved performance of the I2-based model confirms that deformation signals extracted from InSAR effectively enhance the model’s ability to identify landslide-prone areas.
Compared to susceptibility studies that rely on a single data source, the multi-source approach used here provides a more comprehensive characterization of landslide processes by combining surface features with ground deformation dynamics. This integration demonstrates the strong potential of combining optical and radar remote sensing for large-scale landslide susceptibility assessments, especially in tectonically active and mountainous regions.

4.3. Interpretation of Controlling Factors Based on SHAP

The application of SHAP provides valuable insights into the factors influencing landslide occurrence, enabling an interpretable analysis of the XGBoost model. The results indicate that the distance to roads and maximum deformation rate are the two most influential factors governing landslide susceptibility in Yongsheng County.
Proximity to roads has a significant impact, emphasizing the critical role of human activities in destabilizing slopes. Road construction typically involves slope excavation, redistribution of loads, and modifications to hydrological conditions, all of which can significantly reduce slope stability. The SHAP dependence plot reveals a spatial decay pattern, with strong negative SHAP contributions within approximately 500 m of roads, indicating a notably higher landslide probability in these areas. Beyond 500 m, the influence of roads diminishes rapidly and becomes negligible at approximately 1500 m. This suggests a clear spatial threshold for road-induced disturbances, consistent with previous studies indicating that landslides often cluster near transportation corridors [25,26].
Maximum deformation rate, derived from InSAR, is identified as another dominant controlling factor, emphasizing the importance of ongoing ground movement in landslide initiation. Higher deformation rates correspond to increasingly positive SHAP values, confirming that InSAR-based deformation signals are effective indicators of active or potentially unstable slopes [19,20,21,23]. However, the SHAP dependence relationship reveals a nonlinear trend, with positive contributions tending to saturate or slightly decline beyond a certain deformation rate threshold. This behavior suggests that the transition from deformation accumulation to failure probability is not purely linear and may involve threshold effects or damage saturation processes.
SHAP interaction analysis further reveals a significant synergistic effect between the maximum deformation rate and distance to roads. By quantifying the interaction strength across different distance ranges, we found that within 500 m of roads, higher deformation rates significantly amplify landslide susceptibility, with SHAP values increasing rapidly in this range, indicating a stronger impact of roads. As the distance increases, SHAP values stabilize, and the influence of deformation rate diminishes, especially beyond approximately 1500 m, where the road influence becomes negligible. This suggests that in more distant areas, the contribution of deformation rate to landslide susceptibility is smaller. When the deformation rate approaches zero, the impact of roads becomes significant, especially within 500 m, with SHAP values rising sharply. When the deformation rate exceeds 2, the influence of roads gradually decreases, showing a saturation effect. This threshold effect indicates that in areas with high deformation rates, road proximity plays a dominant role in increasing the probability of landslide occurrence.
By capturing nonlinear responses and interaction effects, SHAP effectively bridges the gap between high-performance machine learning models and the physical understanding of landslide processes, enhancing both the interpretability and practical applicability of landslide susceptibility assessments.

4.4. Implications for Landslide Hazard Management

The integrated framework proposed in this study has significant implications for landslide hazard prevention and risk management. The susceptibility map generated using the updated inventory more accurately delineates high-risk zones, particularly in areas influenced by both natural and anthropogenic factors.
Areas identified as high susceptibility zones due to proximity to roads and the presence of deformation signals should be prioritized for slope monitoring, engineering reinforcement, and drainage optimization along transportation corridors. Specifically, it is recommended to conduct a comprehensive slope risk assessment for roads passing through the “high” and “very high” susceptibility zones in the susceptibility map. Real-time monitoring equipment such as GNSS or ground-based InSAR should be installed, particularly in areas with significant deformation within 500 m of the road. During the road planning phase, the susceptibility map generated in this study should be used as a basis for defining red-line avoidance zones, avoiding large-scale excavation in high-risk areas. For existing roads, a graded warning system and emergency response plan should be developed based on the deformation thresholds identified by SHAP analysis. Furthermore, the integration of InSAR-derived deformation data aids in the early identification of potentially unstable slopes, supporting a shift from passive response to proactive risk management strategies.
From a methodological perspective, the combined use of multi-source remote sensing, machine learning, and model interpretability offers a transferable framework that can be applied to other landslide-prone regions. The transparent interpretation of model results also facilitates communication between scientists, engineers, and decision-makers, thereby enhancing the practical utility of susceptibility assessments.

4.5. Model Limitations and Future Directions

Despite the encouraging results of this study, several limitations should be acknowledged, which also point the way forward for future research.
Firstly, the spatial resolution of the input datasets (30 m) presents an inherent limitation. This resolution may not be sufficient to fully capture the features of small-scale landslides, particularly those with a spatial scale smaller than the pixel size, potentially introducing uncertainty into susceptibility estimates at local scales. Future research should focus on integrating higher-resolution remote sensing data sources (such as 10 m or sub-meter optical imagery and LiDAR data) to improve the identification of small-scale landslides, thereby enhancing model performance at finer scales.
Secondly, although deformation signals obtained from InSAR significantly enhance hazard detection capabilities, the technique itself has inherent limitations. InSAR measurements are susceptible to decorrelation, atmospheric delays, and vegetation cover, all of which can introduce noise and uncertainty into deformation measurements. Vegetation, in particular, has a significant impact on interferometric coherence: areas with high NDVI values (indicating dense vegetation) typically exhibit low coherence, leading to increased temporal decorrelation and reduced reliability of surface displacement estimates. While this study mitigated these effects by selecting high-quality interferograms with sufficiently strong coherence, vegetation interference remains a primary challenge in applying InSAR in densely vegetated mountainous areas. Future research may consider incorporating supplementary data from other sources, such as high-resolution optical imagery or LiDAR data, to cross-check and complement InSAR results, thereby enhancing deformation analysis accuracy, especially in vegetation-dense regions.
More broadly, future research should refine deformation signal extraction techniques and explore spatiotemporal susceptibility modeling approaches to better capture landslide evolution over time. Integrating real-time monitoring systems with remote sensing data could also allow dynamic and continuous assessment of landslide hazards, supporting more timely and effective risk management.

5. Conclusions

This study proposes a dynamic landslide susceptibility assessment framework that integrates multi-source remote sensing data, XGBoost modeling, and SHAP interpretability, with a case study conducted in Yongsheng County, Yunnan Province, China. The main conclusions are summarized as follows:
(1)
Through the integration of multi-temporal optical remote sensing interpretation and Sentinel-1 InSAR time-series deformation monitoring, the landslide inventory was effectively updated, and an additional 140 landslide hazard points were identified. This dynamic inventory more comprehensively reflects both historical and currently active landslide processes.
(2)
The completeness of the landslide inventory significantly impacts the performance of susceptibility modeling. Compared to the model based on historical inventories, the model trained using the integrated inventory achieved higher predictive accuracy, with the test set AUC improving from 0.858 to 0.881 (an increase of 0.023). The AUC of the generated susceptibility map improved from 0.857 to 0.928 (an increase of 0.071), demonstrating the effectiveness of integrating multi-source remote sensing information.
(3)
SHAP-based interpretation revealed that the distance to roads and the maximum deformation rate are the dominant controlling factors for landslide occurrence in the study area. The results highlight the combined effects of human activities and dynamic ground deformation on landslide development and confirm the value of InSAR deformation signals in susceptibility evaluation.
(4)
The landslide susceptibility map generated using the integrated inventory shows a more reasonable spatial pattern. High- and very-high-susceptibility zones are mainly distributed in low-elevation areas close to rivers and roads. These areas account for 8.28% of the total study area, providing practical guidance for regional landslide disaster prevention and mitigation.
Overall, this study demonstrates that combining dynamic landslide inventory updating with interpretable machine learning provides a robust and practical solution for improving landslide susceptibility assessment in complex mountainous environments.

Author Contributions

Conceptualization, S.Y.; methodology, S.Y.; software, S.Y.; validation, S.Y., S.W., Y.G. and S.Y.; formal analysis, S.Y.; investigation, S.Y., S.W. and X.R.; resources, S.Y., S.W. and Y.G.; data curation, S.Y.; writing—original draft preparation, S.Y.; writing—review and editing, S.Y.; visualization, S.Y.; supervision, S.W.; project administration, S.Y., S.W., Y.G., X.R., W.L. and D.Z.; funding acquisition, S.W. and S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Ministry-Provincial Cooperation Pilot Project (2023ZRBSHZ048).

Data Availability Statement

The data presented in this study are available upon request from the corresponding author due to the confidential nature of the data.

Conflicts of Interest

All the authors declare that they have no financial interests.

References

  1. Malamud, B.D.; Turcotte, D.L.; Guzzetti, F.; Reichenbach, P. Landslide inventories and their statistical properties. Earth Surf. Process. Landf. 2004, 29, 687–711.
  2. Guzzetti, F.; Reichenbach, P.; Cardinali, M.; Galli, M.; Ardizzone, F. Probabilistic landslide hazard assessment at the basin scale. Geomorphology 2005, 72, 272–299.
  3. Van Westen, C.J.; Castellanos, E.; Kuriakose, S.L. Spatial data for landslide susceptibility, hazard and vulnerability assessment: An overview. Eng. Geol. 2008, 102, 112–131.
  4. Sidle, R.C.; Ochiai, H. Landslides: Processes, Prediction, and Land Use; American Geophysical Union: Washington, DC, USA, 2006.
  5. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91.
  6. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
  7. Liu, S.; Wang, L.; Zhang, W.; He, Y.; Saha, P. A comprehensive review of machine learning-based methods in landslide susceptibility mapping. Geol. J. 2023, 58, 2283–2301.
  8. Kalantar, B.; Pradhan, B.; Naghibi, S.A.; Motevalli, A.; Mansor, S. Assessment of the effects of training data selection on the landslide susceptibility mapping: A comparison between support vector machine (SVM), logistic regression (LR) and artificial neural networks (ANN). Geomat. Nat. Hazards Risk 2018, 9, 49–69.
  9. Sevgen, E.; Kocaman, S.; Nefeslioglu, H.A.; Gokceoglu, C. A Novel Performance Assessment Approach Using Photogrammetric Techniques for Landslide Susceptibility Mapping with Logistic Regression, ANN and Random Forest. Sensors 2019, 19, 3940.
  10. Azarafza, M.; Azarafza, M.; Akgün, H.; Atkinson, P.M.; Derakhshani, R. Deep learning-based landslide susceptibility mapping. Sci. Rep. 2021, 11, 24112.
  11. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  12. Can, R.; Kocaman, S.; Gokceoglu, C. A Comprehensive Assessment of XGBoost Algorithm for Landslide Susceptibility Mapping in the Upper Basin of Ataturk Dam, Turkey. Appl. Sci. 2021, 11, 4993.
  13. Kavzoglu, T.; Teke, A. Predictive Performances of Ensemble Machine Learning Algorithms in Landslide Susceptibility Mapping Using Random Forest, Extreme Gradient Boosting (XGBoost) and Natural Gradient Boosting (NGBoost). Arab J. Sci. Eng. 2022, 47, 7367–7385.
  14. Pawłuszek-Filipiak, K.; Lewandowski, T. The Impact of Feature Selection on XGBoost Performance in Landslide Susceptibility Mapping Using an Extended Set of Features: A Case Study from Southern Poland. Appl. Sci. 2025, 15, 8955.
  15. Oliveira, S.C.; Zêzere, J.L.; Garcia, R.A.; Pereira, S.; Vaz, T.; Melo, R. Landslide susceptibility assessment using different rainfall event-based landslide inventories: Advantages and limitations. Nat. Hazards 2024, 120, 9361–9399.
  16. Wasowski, J.; Bovenga, F. Investigating landslides and unstable slopes with satellite Multi Temporal Interferometry: Current issues and future perspectives. Eng. Geol. 2014, 174, 103–138.
  17. Intrieri, E.; Raspini, F.; Fumagalli, A.; Lu, P.; Del Conte, S.; Farina, P.; Allievi, J.; Ferretti, A.; Casagli, N. The Maoxian landslide as seen from space: Detecting precursors of failure with Sentinel-1 data. Landslides 2018, 15, 123–133.
  18. Chang, M.; Sun, W.; Xu, H.; Tang, L. Identification and deformation analysis of potential landslides after the Jiuzhaigou earthquake by SBAS-InSAR. Environ. Sci. Pollut. Res. 2023, 30, 39093–39106.
  19. Ma, T.; Yi, X.; Ci, H.; Wang, R.; Yang, H.; Yan, Z. Landslide Susceptibility Evaluation Integrating Machine Learning and SBAS-InSAR-Derived Deformation Characteristics: A Case Study of Yining County, Xinjiang. Sensors 2026, 26, 707.
  20. Zhang, K.; Xiao, W.; Ning, S.; Zhou, Y.; Sun, X.; Zhu, H.; Rong, A.; Huang, Y.; Yuan, H.; Thapa, B.R. Assessing landslide susceptibility with dynamic deformation monitoring and explainable machine learning: A case study in Longhua County, China. Geomat. Nat. Hazards Risk 2026, 17, 1.
  21. Lundberg, S.M.; Erion, G.; Chen, H.; DeGrave, A.; Prutkin, J.M.; Nair, B.; Katz, R.; Himmelfarb, J.; Bansal, N.; Lee, S.I. From local explanations to global understanding with explainable AI for trees. Nature Mach. Intell. 2020, 2, 56–67.
  22. Sun, D.; Chen, D.; Zhang, J.; Mi, C.; Gu, Q.; Wen, H. Landslide Susceptibility Mapping Based on Interpretable Machine Learning from the Perspective of Geomorphological Differentiation. Land 2023, 12, 1018.
  23. Qiu, H.; Xu, Y.; Tang, B.; Su, L.; Li, Y.; Yang, D.; Ullah, M. Interpretable Landslide Susceptibility Evaluation Based on Model Optimization. Land 2024, 13, 639.
  24. Khan, D.; Akram, W.; Ullah, S. Enhancing landslide susceptibility predictions with XGBoost and SHAP: A data-driven explainable AI method. Geocarto Int. 2025, 40, 2514725.
  25. Wen, H.; Liu, B.; Di, M.; Li, J.; Zhou, X. A SHAP-enhanced XGBoost model for interpretable prediction of coseismic landslides. Adv. Space Res. 2024, 74, 3826–3854.
  26. Wen, H.; Yan, F.; Huang, J.; Li, Y. Interpretable machine learning models and decision-making mechanisms for landslide hazard assessment under different rainfall conditions. Expert Syst. Appl. 2025, 270, 126582.
  27. Yao, Z.; Chen, M.; Zhan, J.; Zhuang, J.; Sun, Y.; Yu, Q.; Yu, Z. Refined Landslide Susceptibility Mapping by Integrating the SHAP-CatBoost Model and InSAR Observations: A Case Study of Lishui, Southern China. Appl. Sci. 2023, 13, 12817.
  28. Geng, H.; Wang, W.; Liu, J.; Benson, D. Landslide susceptibility modeling based on SHAP interpretability and ensemble learning: A case study in Fuyuan County, Southwest China. Front. Earth Sci. 2025, 13, 1731872.
  29. Zhao, Z.; Liu, Z.Y.; Xu, C. Slope Unit-Based Landslide Susceptibility Mapping Using Certainty Factor, Support Vector Machine, Random Forest, CF-SVM and CF-RF Models. Front. Earth Sci. 2021, 9, 589630.
  30. Biswas, B.; Rahaman, A.; Barman, J. Comparative Assessment of FR and AHP Models for Landslide Susceptibility Mapping for Sikkim, India and Preparation of Suitable Mitigation Techniques. J. Geol. Soc. India 2023, 99, 791–801.
  31. Ali, A.; Teku, D.; Sisay, T.; Mihret, B. A combined analysis of frequency ratio and analytical hierarchy process for landslide susceptibility assessment in Tenta, South Wollo, Ethiopia. Sci. Rep. 2025, 15, 17899.
  32. Duan, M.; Li, Z.; Xu, B.; Jiang, W.; Cao, Y.; Xiong, Y.; Wei, J. Turbulent atmospheric phase correction for SBAS-InSAR. J. Geod. 2024, 98, 81.
  33. Yin, C.; Li, H.; Che, F.; Li, Y.; Hu, Z.; Liu, D. Susceptibility mapping and zoning of highway landslide disasters in China. PLoS ONE 2020, 15, e0235780.
  34. Dandridge, C.; Stanley, T.; Kirschbaum, D.; Amatya, P.; Lakshmi, V. The influence of land use and land cover change on landslide susceptibility in the Lower Mekong River Basin. Nat. Hazards 2023, 115, 1499–1523.
  35. Meng, Y.; Yang, N.; Qian, Z.; Zhang, G. What Makes an Online Review More Helpful: An Interpretation Framework Using XGBoost and SHAP Values. J. Theor. Appl. Electron. Commer. Res. 2021, 16, 466–490.
  36. Abdi, A.M. Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data. GISci. Remote Sens. 2020, 57, 1–20.
  37. Rondinone, M.; Dal Sasso, S.F.; Aung, H.H.; Contillo, L.; Dimola, G.; Schiattarella, M.; Fiorentino, M.; Telesca, V. Assessing Flood and Landslide Susceptibility Using XGBoost: Case Study of the Basento River in Southern Italy. Appl. Sci. 2025, 15, 5290.
  38. Badapalli, P.K.; Nakkala, A.B.; Kottala, R.B.; Gugulothu, S.; Hasher, F.F.B.; Mishra, V.N.; Zhran, M. Landslide Susceptibility Level Mapping in Kozhikode, Kerala, Using Machine Learning-Based Random Forest, Remote Sensing, and GIS Techniques. Land 2025, 14, 1453.
  39. Ma, B.; Yin, C.; Gao, F.; Song, X.; Li, M. Landslide Susceptibility Mapping Using Remote Sensing Interpretation and a Blending-XGBoost-CNN Model. Appl. Sci. 2025, 15, 11969.
  40. Xu, Q.; Yordanov, V.; Amici, L.; Brovelli, M.A. Landslide susceptibility mapping using ensemble machine learning methods: A case study in Lombardy, Northern Italy. Int. J. Digit. Earth 2024, 17, 2346263.
  41. Sharma, N.; Saharia, M.; Ramana, G.V. High resolution landslide susceptibility mapping using ensemble machine learning and geospatial big data. Catena 2024, 235, 107653.
Figure 1. Overview of the study area: (a) Geographic location. (b) Main fault distribution. (c) Landslide point distribution and engineering geological units.
Figure 2. Spatial distribution of environmental factors in the study area. (a) Slope. (b) Aspect. (c) Elevation. (d) Distance to faults. (e) Distance to rivers. (f) Distance to roads.
Figure 3. Research Methodology Flowchart.
Figure 4. Stacking-InSAR workflow.
Figure 5. Updated landslide distribution pattern within the study region (red circles represent newly identified landslides through multi-source remote sensing technologies; black rounded squares represent historical landslides). (a) Zone: Landslide concentration along the Wulang River. (b) Zone: Landslide concentration along the Jinsha River. (c) Zone: Newly identified landslide concentration along the Maguo River. (d) Zone: Landslide concentration along the Ludila Hydroelectric Station section of the Jinsha River.
Figure 6. Typical surface deformation features. (a,b) InSAR deformation rate map of the landslide concentration zone. (c,d) Interferometric pair characteristics of typical landslide InSAR deformation monitoring results. (e–g) Landslide deformation damage features visible in optical imagery and field surveys.
Figure 7. Performance Evaluation of the XGBoost Model Based on Historical Landslide Inventory (I1 Group). (a) Confusion matrix showing the classification results between landslide and non-landslide samples. (b) Radar chart of evaluation metrics (Accuracy, Precision, Recall, F1-Score).
Figure 8. Performance Improvement of the Landslide Susceptibility Model after Incorporating Multi-Source Remote Sensing Samples (I2 Group). (a) Confusion matrix showing the classification results between landslide and non-landslide samples. (b) Radar chart of evaluation metrics (Accuracy, Precision, Recall, F1-Score).
Figure 9. Decomposition of SHAP values for landslide conditioning factors by effect direction. (The label (a) shows the model based on I1 samples; the label (b) shows the model based on I2 samples).
Figure 10. SHAP summary plot: (a) the model based on I1 samples; (b) the model based on I2 samples. The plot shows each feature's contribution to the model output (landslide occurrence probability) as SHAP values along the horizontal axis. Each point represents a sample, and the color indicates the magnitude of the feature value (red for high values, blue for low values). The horizontal distribution of points reveals the direction (positive or negative) and strength of the feature's influence.
Figure 11. SHAP dependence plots for (a) Interaction between Distance to Roads and Maximum Deformation Rate and (b) Interaction between Distance to Roads and Elevation.
Figure 12. Landslide susceptibility maps based on (a) historical landslide inventory (I1) and (b) integrated landslide inventory (I2).
Figure 13. Receiver operating characteristic (ROC) curves of landslide susceptibility models based on I1 and I2 inventories.
Table 1. Datasets used in this study and their sources.
| Parameter | Specific Data | Source/Resolution |
| --- | --- | --- |
| Topographic | DEM | ALOS/PRISM (Global DSM), 12.5 m resolution |
| Geological | Engineering Geological Rock Group; Faults | 1:50,000 regional geological map and field survey interpretation |
| Hydrological & Anthropogenic | Rivers; Roads | National Geographical Information Resources Directory Service |
| Deformation | Surface Deformation Velocity (Ascending/Descending Orbit) | Sentinel-1 SAR images (2017–2024), extracted using the Stacking InSAR technique |
| Landslide Inventory | Historical landslide points; Remotely Sensed Landslide Points | Yunnan Geological Hazard Database; integrated interpretation of multi-temporal optical imagery (GF-2/Google Earth, 2017–2024) and InSAR results |
Table 2. Multicollinearity analysis results (VIF) of conditioning factors.
| Feature | Tolerance | VIF |
| --- | --- | --- |
| Elevation | 0.75 | 1.333 |
| Engineering Geological Rock Group | 0.879 | 1.138 |
| Distance to Faults | 0.848 | 1.179 |
| Distance to Roads | 0.766 | 1.305 |
| Slope | 0.854 | 1.171 |
| Aspect | 0.957 | 1.045 |
| Distance to Rivers | 0.962 | 1.039 |
| Maximum Deformation Rate | 0.916 | 1.092 |
| Average Deformation Rate | 0.145 | 6.904 |
| Minimum Deformation Rate | 0.081 | 12.245 |
| Standard Deviation of Deformation Rate | 0.098 | 10.225 |
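Tolerance and VIF are reciprocals of each other (Tolerance = 1 − R², VIF = 1/Tolerance), so the two columns of Table 2 can be cross-checked against one another. The minimal sketch below does this with the reported values and lists the factors whose VIF exceeds 10. The cutoff of 10 is a commonly used rule of thumb and is an assumption here, not a threshold stated in this section; the variable names are illustrative.

```python
# (feature, reported tolerance, reported VIF) copied from Table 2.
TABLE_2 = [
    ("Elevation", 0.75, 1.333),
    ("Engineering Geological Rock Group", 0.879, 1.138),
    ("Distance to Faults", 0.848, 1.179),
    ("Distance to Roads", 0.766, 1.305),
    ("Slope", 0.854, 1.171),
    ("Aspect", 0.957, 1.045),
    ("Distance to Rivers", 0.962, 1.039),
    ("Maximum Deformation Rate", 0.916, 1.092),
    ("Average Deformation Rate", 0.145, 6.904),
    ("Minimum Deformation Rate", 0.081, 12.245),
    ("Standard Deviation of Deformation Rate", 0.098, 10.225),
]

# Assumed rule-of-thumb cutoff; the section itself does not state one.
VIF_CUTOFF = 10.0

# Largest gap between the reported tolerance and 1/VIF across all rows.
max_gap = max(abs(tol - 1.0 / vif) for _, tol, vif in TABLE_2)

# Factors whose VIF exceeds the assumed cutoff.
high_vif = [name for name, _, vif in TABLE_2 if vif > VIF_CUTOFF]

print(f"largest |tolerance - 1/VIF| = {max_gap:.4f}")
print("VIF above cutoff:", high_vif)
```

Under these assumptions the reported columns agree to within rounding, and only the minimum and standard-deviation deformation-rate factors exceed the cutoff.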
Table 3. Landslide susceptibility zonation based on the two landslide inventories (I1 and I2).
| Landslide Susceptibility Zonation | (I1) Area/km² | (I1) Proportion/% | (I2) Area/km² | (I2) Proportion/% |
| --- | --- | --- | --- | --- |
| Very-Low-Susceptibility Zone | 1358.68 | 26.65 | 1249.99 | 24.51 |
| Low-Susceptibility Zone | 1934.89 | 37.95 | 2043.49 | 40.07 |
| Moderate-Susceptibility Zone | 1424.04 | 27.93 | 1383.24 | 27.13 |
| High-Susceptibility Zone | 284.95 | 5.59 | 321.46 | 6.30 |
| Very-High-Susceptibility Zone | 96.43 | 1.89 | 100.82 | 1.98 |
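The proportions in Table 3 are each zone's area divided by the total mapped area. As a rough check, the sketch below recomputes them from the area columns and sums the high and very high zones for each inventory; the dictionary layout and function names are illustrative assumptions.

```python
# Zone areas in km², copied from Table 3, ordered from very low to very high.
ZONES = ["Very-Low", "Low", "Moderate", "High", "Very-High"]
AREA_KM2 = {
    "I1": [1358.68, 1934.89, 1424.04, 284.95, 96.43],
    "I2": [1249.99, 2043.49, 1383.24, 321.46, 100.82],
}


def proportions(areas):
    """Return each zone's share of the total area, in percent."""
    total = sum(areas)
    return [100.0 * a / total for a in areas]


def high_and_very_high_share(areas):
    """Combined share (percent) of the last two zones: high and very high."""
    return sum(proportions(areas)[-2:])


share_i1 = high_and_very_high_share(AREA_KM2["I1"])
share_i2 = high_and_very_high_share(AREA_KM2["I2"])

for inventory, areas in AREA_KM2.items():
    for zone, pct in zip(ZONES, proportions(areas)):
        print(f"{inventory} {zone}: {pct:.2f}%")
print(f"High + very high: I1 = {share_i1:.2f}%, I2 = {share_i2:.2f}%")
```

The recomputed I2 share of high and very high zones comes to about 8.28%, which matches the figure quoted in the abstract; the corresponding I1 share is about 7.48%.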

MDPI and ACS Style

Yan, S.; Wang, S.; Guo, Y.; Rong, X.; Zhao, D.; Li, W. A Dynamic Landslide Susceptibility Assessment Method Based on Multi-Source Remote Sensing, XGBoost, and SHAP: A Case Study in Yongsheng County, Yunnan Province. Remote Sens. 2026, 18, 845. https://doi.org/10.3390/rs18060845