An Optimized Heterogeneous Ensemble Learning Algorithm for InSAR Landslide Susceptibility Mapping Based on the Adaptive Sampling Strategy

Li, Lu; Cheng, Hongyan; Guo, Yuhua; Liu, Shangqiang; Yin, Jianyong; Wang, Jili

doi:10.3390/rs18121985


by Lu Li ¹,*, Hongyan Cheng ¹, Yuhua Guo ¹, Shangqiang Liu ¹, Jianyong Yin ¹ and Jili Wang ²

¹ Space Star Technology Co., Ltd., China Academy of Space Technology, Beijing 100095, China

² Department of Space Microwave Remote Sensing System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China

* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(12), 1985; https://doi.org/10.3390/rs18121985 (registering DOI)

Submission received: 15 March 2026 / Revised: 27 April 2026 / Accepted: 28 April 2026 / Published: 15 June 2026

(This article belongs to the Special Issue Landslide Detection Using Machine and Deep Learning)


Highlights

What are the main findings?

An adaptive sampling strategy that integrates InSAR-derived information with Getis–Ord Gi-based hotspot analysis was proposed to refine the samples.
A Monte Carlo-based frequency ratio analysis was proposed to address low computational efficiency and optimize conditioning factors.

What are the implications of the main findings?

Surface deformation information serves as an effective indicator for refining non-stable samples related to geological disasters and generating high-quality training samples for model learning.
The proposed framework provides support for regional geohazard susceptibility assessment under dynamic environments.

Abstract

Landslide susceptibility algorithms demonstrate high reliability in quantifying the likelihood of landslide occurrence. However, traditional methods are often limited by computationally intensive sampling strategies and models with limited adaptability. In this study, we propose an adaptive sampling strategy based on hotspot analysis to enhance the reliability of the generated samples. Additionally, we develop an improved meta-ensemble (IME) stacking-based heterogeneous framework for landslide susceptibility assessment by integrating a support vector machine (SVM), random forest (RF), and XGBoost. To further reduce factor complexity, a Monte Carlo-based frequency ratio analysis is employed. The Baihetan Reservoir area along the Jinsha River was selected as the study area. A total of 26 conditioning factors were considered, supplemented by 120 Sentinel-1A images to cover the study area. The proposed sampling strategy was then used to generate high-quality samples. Finally, to evaluate the performance of the proposed method, the proposed ensemble learning framework was applied to assess landslide susceptibility with eight models using five evaluation metrics. The experimental results demonstrated that: (1) the adaptive sampling strategy improved both the quantity and quality of the training samples; (2) the adoption of the Monte Carlo strategy increased the sample partitioning rate; and (3) despite the formally highest IME metrics, the inclusion of InSAR information did not lead to a statistically significant improvement in the forecast compared to the high-quality basic sampling strategy. Overall, the proposed methodology provides valuable support for regional geohazard susceptibility assessment in dynamic environments.

Keywords:

landslide susceptibility; ensemble learning algorithm; sampling strategy; synthetic aperture radar interferometry (InSAR); hotspot analysis

1. Introduction

Geological disaster evolution is a long-term and dynamic process controlled by multiple factors. Variations in external triggering conditions at different evolutionary stages can influence the likelihood of geological disaster occurrence. Landslide susceptibility assessment quantifies the likelihood of geological disasters, such as landslides, occurring within a specific region over a given period. These algorithms typically draw on historical information from the study area and employ complex models to forecast future land conditions [1]. Over the past few decades, landslide susceptibility algorithms have exhibited high reliability in assessing potential landslides and ground subsidence, making them increasingly robust and widely applicable [2].

Geological disasters are significantly influenced by extreme weather events and intensive anthropogenic activities. As a result, they exhibit broad distribution, frequent occurrences, and severe impacts [3]. Moreover, landslides commonly develop in a concealed manner and occur suddenly, posing risks to human life and property. Due to the varying geological environments across different regions, landslide susceptibility assessment faces significant challenges [4]. Consequently, research on landslide susceptibility assessment is essential. With the rapid advancement of computer science, remote sensing technologies, and geographic information systems (GISs), numerous models have been developed. Through expansion into diverse scenarios, these models have progressively matured. In general, these methods are categorized into two types: knowledge-driven methods and data-driven methods [4]. Knowledge-driven methods rely on hierarchical expert analysis, but often exhibit limited practical applicability. For the latter, numerous studies have incorporated geological, meteorological, and other multi-source datasets to assess landslide susceptibility. These studies highlight the critical role of multi-source data and advanced methods in enhancing predictive accuracy. Rapid advances in computing technologies have driven the evolution of assessment algorithms. Initially, basic statistical methods, such as frequency ratio analysis [5] and weight-of-evidence models [6], were commonly used. Subsequently, advanced methods, including logistic regression [7], support vector machines (SVMs) [8], and decision trees [9], have been increasingly adopted. These methods have exhibited robust performance across various scenarios. Notably, artificial intelligence-based algorithms have attracted considerable attention due to their superior accuracy, computational efficiency, and cost-effectiveness. Machine learning algorithms are a branch of artificial intelligence. 
They facilitate thorough and nuanced assessment of geological disasters by analyzing high-dimensional datasets. Xie et al. [10] employed logistic regression and SVMs for landslide susceptibility assessment in 2017. Chen et al. [11] employed the random forest (RF) model for landslide risk assessment in Taibai County in 2018. Hakim et al. [2] employed four models to assess geological hazard susceptibility in Jakarta, Indonesia, in 2020. To improve predictive accuracy, numerous enhanced machine learning methods have been proposed [12]. In 2023, Liu et al. [8] refined the SVM by substituting the Euclidean distance with the Mahalanobis distance. In 2024, Ouyang et al. [13] optimized the loss function by introducing the PU-PullBaggingDT model. In addition, GIS-based models that integrate multiple meta-ensembles have been developed in recent years. Numerous experiments have demonstrated that machine learning methods outperform conventional methods in landslide susceptibility assessment [14,15].

The reliability of landslide susceptibility assessment is closely related to the selection of models and the quality of input samples [16]. On the one hand, different models exhibit distinct strengths and limitations, making it difficult for any one of them to adapt to dynamic geological environments. For example, decision trees are well-suited for discrete data, whereas SVMs excel in high-dimensional data [17]. At this stage, a single model presents limitations for diverse scenarios [18]. Over the past few decades, meta-ensemble algorithms have attracted widespread attention as novel methods. By integrating multiple weak classifiers, the meta-ensemble algorithm leverages the strengths of baseline models to improve generalization [19]. Meta-ensemble algorithms can be categorized into homogeneous and heterogeneous types. In homogeneous ensemble algorithms, such as bagging and boosting [20], the same classifier is used as the baseline model. In contrast, heterogeneous ensemble algorithms (e.g., stacking) employ diverse classifiers as baseline models. By relying on diverse classifiers to extract features from multiple perspectives, heterogeneous ensemble algorithms realize complementary advantages [21,22]. On the other hand, the reliability of the model depends on the quality of the input samples. Bias introduced by sampling can lead to significant differences in predictions [23].

The threshold-based sampling strategy randomly selects negative samples from features that fall below a specified threshold [18]. The similarity-based sampling strategy selects negative samples based on similarity to the stable area [24]. For instance, the geographical similarity-based sampling strategy is innovative but computationally intensive due to point-by-point calculations [25]. Statistical analysis was used to identify non-stable samples from low-susceptibility areas. However, the method is highly affected by regional variability [26]. Hong et al. improved the quality of training samples through reliability scoring [27]. Huang et al. introduced the self-organizing map, an unsupervised model that does not require labeled data. The method focuses on spatial coverage and feature representativeness, conducting clustering or pattern mapping but not yielding landslide probabilities or classifications [28]. Random sampling strategies are easier to implement than other methods. The random sampling strategy selects negative samples from areas free of geological disasters or from buffer zones surrounding a known hazard position [29]. A typical example is double-buffer-based sampling methods, which extract samples from non-stable locations or buffer zones around known hazard sites [30]. However, such methods may include potential geological hazard zones in the samples [31]. Non-stable samples are closely related to the occurrence of geological disasters in the corresponding ground surface. Surface deformation information serves as an effective indicator for locating non-stable samples associated with geological disasters [32]. At this point, synthetic aperture radar interferometry (InSAR) offers high-precision, large-scale, and spatially continuous deformation information, thereby aiding in geological disaster identification, impact assessment, risk early warning, and disaster prevention. 
Combined with automatic detection methods, such as hotspot analysis (HSA), surface deformation derived from InSAR can be used to identify the non-stable samples and further enhance the quality of samples. The effectiveness of HSA has been validated across various scenarios. In 2012, Lu et al. [33] applied HSA to detect slow-moving landslides with low displacement rates. In 2021, Zhu et al. [34] employed HSA to identify clusters of anomalous deformation along the Minjiang River in Mao County. In addition, sample deficiencies may affect the model accuracy [35]. High collinearity among conditioning factors increases sample complexity, degrades model performance, and may lead to erroneous predictions. Redundancy among conditioning factors is a key consideration in model construction. Therefore, in dynamic monitoring environments, developing a robust model along with an effective sampling strategy is crucial for studying the landslide susceptibility.

The Baihetan Hydropower Station reservoir area, situated along the Jinsha River’s main channel, features steep slopes, intersecting gullies, and unevenly distributed hazardous rock masses on both banks. Such terrain is prone to landslides, debris flows, and other geological hazards [36]. Disasters with complex underlying mechanisms pose serious risks to human life, property, and infrastructure. Given the intricate geological conditions and the presence of multiple triggering factors, the Baihetan Reservoir area serves as a suitable site for validating the proposed method.

Building on previous research and the existing challenges, this paper proposes an optimized heterogeneous ensemble learning algorithm combined with an adaptive sampling strategy based on hotspot analysis to assess landslide susceptibility. The Baihetan reservoir area is selected as the study area. To validate the effectiveness of our proposed method, a comparative analysis was conducted with eight models using five metrics.

2. Study Area and Datasets

2.1. Study Area

The study area covers the middle and lower reaches of the Jinsha River in China, with its western bank lying in Sichuan Province and its eastern bank lying in Yunnan Province. The area covers approximately 12,760.9 km², with elevations ranging from 403 to 4071 m. The coverage of the study area is illustrated in Figure 1. The study area is characterized by complex terrain, including steep slopes, rugged roads, strong river erosion, and V-shaped valleys, with prominent ridges and deeply incised valleys [37]. Multiple fault zones are distributed throughout the entire area, predominantly consisting of reverse faults [38]. Some faults are associated with compressive fracture zones with widths of several tens of meters. The unique geological conditions, shaped by tectonic activity, human activities, and soil erosion, have made the region susceptible to geological disasters [39]. The proportions of geological disaster types in the study area are shown in Figure 1b. Over 700 disasters have been recorded in the study area, predominantly comprising landslides and debris flows. These geological disasters are primarily distributed along fault zones and river corridors, posing threats to nearby villages.

2.2. Conditioning Factors

Given the complexity of the triggering mechanisms, selecting the appropriate factors is crucial for assessing landslide susceptibility [40]. To comprehensively understand how conditioning factors trigger geological disasters, various datasets were collected from multiple sources. The collected datasets are divided into six categories, comprising a total of 27 data layers: 26 conditioning factors (7 topographic, 4 anthropogenic, 4 geological, 7 hydro-meteorological, and 4 InSAR-derived) and 1 historical disaster inventory. The statistics of the conditioning factors covering the study area are listed in Table 1, while their sources and resolutions are listed in Table 2.

Environmental changes encompass variations in precipitation and temperature, topography, human activities, geological features, and hydro-meteorological conditions. Topographic data were derived using the functional relationship between elevation and the factors. A time-series InSAR technique was utilized to derive four types of InSAR-related information. Anthropogenic data were calculated using vector data of roads, buildings, and others. To standardize the spatial resolution, factor layers with different cell sizes were resampled to a 30 m resolution using the cubic interpolation method. Figure 2 depicts the distribution of all 26 conditioning factors of the study area. The color scale follows a red–yellow–blue gradient, where red represents lower values and blue represents higher values.

Topographic data encompass elevation and derived metrics, including slope, aspect, and curvature. Higher slope values indicate steeper terrain, which is prone to high-elevation rock landslides, whereas lower slope values correspond to flatter terrain. Aspect represents the direction of slope orientation. Curvature describes the degree of change in terrain surface orientation and reflects the mechanisms underlying disaster evolution. Buildings and roads are commonly used as representative indicators of human activities [16]. Land subsidence induced by building loads primarily occurs in developed areas [41]. The normalized difference vegetation index (NDVI) reflects vegetation density and health status [42]. Higher NDVI values indicate denser and healthier vegetation, which plays a vital role in soil stability and erosion control. Well-developed root systems improve soil stability and reduce the risk of landslides [43]. Lithology refers to the rock properties of geological formations. Rock masses near active faults are more likely to develop joints and fractures. Landslide susceptibility is closely associated with the distribution and activity of fault zones [43]. Vegetation coverage provides essential information for the assessment of ecological quality. Areas with sparse vegetation cover are vulnerable to hydraulic erosion and surface runoff. Hydro-meteorological data reflect the slope stability. Therefore, calculating the distance to rivers and drainage networks is crucial for investigating geological disasters. Soil erosion is one of the primary drivers of land degradation and desertification, especially in arid and semi-arid regions [44]. Soil monitoring is essential for shaping conservation policies and informing restoration efforts in areas prone to erosion.

2.3. SAR Datasets

A total of 120 Sentinel-1A SAR images, including 60 images from frame 499 and 60 images from frame 504, were used to cover the study area. The coverage period was from January 2020 to May 2022. A 12 × 3 multilooking ratio was applied to reduce speckle noise and improve phase quality. The SAR image acquired on 17 January 2021 was selected as the master image, with the other SAR images registered and resampled accordingly. The spatial–temporal baseline distribution is shown in Figure 3. The terrain visibility of the study area can be quantified based on the cosine of the angle between the local terrain surface and the radar beam [45]. The average coherence of all interferograms was calculated based on Pearson’s coherence [46]. In this study, we utilized the standard SBAS-InSAR to derive the surface deformation of the study area [47]. Two kinds of deformation information were derived, including the annual deformation rate and the time-series cumulative deformation. For the latter, only the maximum cumulative deformation on the last date was selected as representative and used as the model input.

3. Methodology

Figure 4 illustrates the proposed framework for InSAR landslide susceptibility mapping. The framework comprises five main steps: (1) dataset acquisition, (2) conditioning factor optimization, (3) training sample construction using an adaptive sampling strategy, (4) landslide susceptibility model construction by integrating an SVM, RF, and XGBoost, and (5) model performance evaluation using multiple metrics.

3.1. Conditioning Factor Optimization

The conditioning factors triggering landslides differ among regions [48]. Before modeling, it is essential to investigate the conditioning factors that contribute to disasters in the study area. Sample deficiencies affect the model accuracy [41]. High-quality factors should be collected, while low-reliability factors should be eliminated. This section analyzes the conditioning factors from three aspects: collinearity analysis, importance assessment, and frequency ratio analysis.

3.1.1. Collinearity Analysis

Highly correlated samples in the assessment model can lead to inaccurate results and reduce predictive performance [49]. Selecting independent conditioning factors reduces factor complexity, thereby improving model performance [50]. Common methods for collinearity analysis include the Pearson correlation coefficient (PCC), tolerance (TOL), and variance inflation factor (VIF) [51]. PCC measures the degree of linear correlation between two variables [52]. For the i-th conditioning factor $T_{i}$ and the j-th conditioning factor $T_{j}$, the correlation coefficient $\gamma_{i,j}$ can be expressed as

$$\gamma_{i,j} = \frac{\sum_{n=1}^{N} (T_{i,n} - \bar{T}_{i}) (T_{j,n} - \bar{T}_{j})}{\sqrt{\sum_{n=1}^{N} (T_{i,n} - \bar{T}_{i})^{2}} \sqrt{\sum_{n=1}^{N} (T_{j,n} - \bar{T}_{j})^{2}}}$$

(1)

where $\gamma_{i,j} > 0$ indicates a positive correlation between the two variables, with a larger value representing a stronger positive correlation. $\gamma_{i,j} < 0$ indicates a negative correlation, with a smaller value representing a stronger negative correlation. When $\gamma_{i,j}$ approaches 0, the two variables become less correlated.

TOL and VIF are used to assess the independence between conditioning factors.

$$\mathrm{VIF}_{i,j} = \frac{1}{1 - \gamma_{i,j}} = \frac{1}{\mathrm{TOL}_{i,j}}$$

(2)

A $\mathrm{VIF}_{i,j}$ greater than 10 or a $\mathrm{TOL}_{i,j}$ less than 0.1 indicates strong collinearity between the variables.
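The collinearity screening in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function names `pearson`, `tol_vif`, and `is_collinear` are hypothetical, and the sketch applies Eq. (2) exactly as printed (pairwise, using $\gamma_{i,j}$). A more common definition of VIF uses the coefficient of determination obtained by regressing one factor on all the others, so this should be read as an illustration under that stated assumption, not as the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences (Eq. 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mean_y) ** 2 for b in y)
    )
    if den == 0:
        raise ValueError("a constant factor has no defined correlation")
    return num / den


def tol_vif(gamma):
    """TOL and VIF from a pairwise correlation, following Eq. (2) as printed."""
    tol = 1.0 - gamma
    vif = math.inf if tol == 0 else 1.0 / tol
    return tol, vif


def is_collinear(gamma, vif_max=10.0, tol_min=0.1):
    """Flag strong collinearity using the thresholds quoted in the text."""
    tol, vif = tol_vif(gamma)
    return vif > vif_max or tol < tol_min
```

For example, `is_collinear(pearson(factor_a, factor_b))` would flag a pair of factor columns whose correlation pushes VIF above 10, where `factor_a` and `factor_b` stand for any two sampled factor layers.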

3.1.2. Importance Assessment Based on Gini Index

Different conditioning factors contribute differently to the assessment model. Common methods for measuring feature importance include the information gain ratio, Gini index, and others [53]. The Gini index offers clear advantages over the information gain ratio, particularly in handling continuous variables and maintaining robust performance under noisy conditions. The Gini index measures feature importance through split-induced reduction and remains robust even under low signal-to-noise ratios. When the variables are continuous and uncorrelated, the Gini index is unbiased. Assume that there are n variables $T_{1}$, $T_{2}$, …, $T_{n}$, and J classes; then, the impurity $\mathrm{Gini}_{k}$ at node k can be expressed as

$$\mathrm{Gini}_{k} = \sum_{j=1}^{J} \hat{p}_{j,k} (1 - \hat{p}_{j,k})$$

(3)

where $\hat{p}_{j,k}$ denotes the probability that a sample at node k belongs to class j.

When the problem is binary, i.e., $J = 2$, the Gini impurity $\mathrm{Gini}_{k}$ at node k can be expressed as

$$\mathrm{Gini}_{k} = 2 \hat{p}_{k} (1 - \hat{p}_{k})$$

(4)

where $\hat{p}_{k}$ represents the probability of a sample belonging to a specific class at node k.

The change in Gini impurity at node k caused by splitting on variable $T_{i}$ can be expressed as

$$I_{i,k} = \mathrm{Gini}_{k} - \mathrm{Gini}_{k_{1}} - \mathrm{Gini}_{k_{2}} - \dots - \mathrm{Gini}_{k_{o}}$$

(5)

where $k_{1}$, $k_{2}$, …, $k_{o}$ are the o child nodes resulting from the split of node k, and $\mathrm{Gini}_{k_{1}}$, $\mathrm{Gini}_{k_{2}}$, …, $\mathrm{Gini}_{k_{o}}$ are the corresponding Gini impurities.

Assume that variable $T_{i}$ appears M times in the l-th decision tree; then, the importance $I_{i,l}$ can be expressed as

$$I_{i,l} = \sum_{k=1}^{M} I_{i,k}$$

(6)

Assume that there are L decision trees in the random forest; then, the overall importance $I_{i}$ of variable $T_{i}$ can be expressed as

$$I_{i} = \frac{1}{L} \sum_{l=1}^{L} I_{i,l}$$

(7)

where $I_{i}$ represents the importance of the i-th variable $T_{i}$. In the RF model, $I_{i}$ denotes the average reduction in node impurity during splitting.
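The impurity and importance definitions in Eqs. (3) to (7) can be checked with a small Python sketch. All function names are hypothetical. One deliberate difference is labeled in the code: the split decrease weights each child node by its share of the parent's samples, as is common in CART-style trees, whereas Eq. (5) as printed subtracts the child impurities without weights.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node from its class labels (Eq. 3)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def gini_binary(p):
    """Binary special case (Eq. 4), with p the share of one class."""
    return 2.0 * p * (1.0 - p)


def impurity_decrease(parent, children):
    """Impurity decrease of one split.

    Assumption: children are weighted by sample share (CART convention),
    which differs from the unweighted form of Eq. (5) as printed.
    """
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


def forest_importance(per_tree_importance):
    """Average the per-tree importances of one variable over L trees (Eq. 7)."""
    return sum(per_tree_importance) / len(per_tree_importance)
```

A balanced binary node has impurity 0.5 under both Eq. (3) and Eq. (4), and a split that separates the two classes perfectly removes all of it.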

3.1.3. Frequency Ratio Analysis Based on Monte Carlo Method

Frequency ratio (FR) analysis was used to evaluate the relationship between the proportion of non-stable samples and the conditioning factors. The method facilitates the interpretation of the statistical relationships between the distributions of conditioning factors and geological hazards [16]. The natural breaks method incorporates a clustering concept to maximize within-class similarity and between-class dissimilarity. It ensures that the range and number of elements in each class remain balanced [54]. In this study, the natural breaks method was employed to partition the conditioning factors. For large datasets, natural breaks may be computationally inefficient. To address this, the Monte Carlo method was adopted. The relationship between the number of iterations and running time in this study is illustrated in Figure 5. The number of groups was set to 5 and the number of samples to 4,628,437. The details of the enhanced FR analysis are presented as follows:

Assume that there are n features $T_{1}$, $T_{2}$, …, $T_{n}$, and the i-th feature $T_{i}$ is divided into M groups. The natural breaks $\lambda_{i,m}$ are calculated using the Monte Carlo method, and the natural breaks index $\mathrm{Jenks}_{i}$ can be expressed as

$$\mathrm{Jenks}_{i} = \frac{1}{M} \sum_{m=1}^{M} \lambda_{i,m} (T_{i,m})$$

(8)

where $T_{i,m}$ represents the feature $T_{i}$ in the m-th group.

Based on $\mathrm{Jenks}_{i}$, $T_{i}$ is grouped into $TJ_{i,\mathrm{Jenks}_{i}}$, and the proportion $\sigma_{t,i}$ of samples in the corresponding partition is calculated. The spatial location $Loc_{i,\mathrm{Jenks}_{i}}$ of the partition samples is determined as follows:

$$\begin{aligned} TJ_{i,\mathrm{Jenks}_{i}} &= \mathrm{Jenks}_{i}^{-1} (T_{i}) \\ \sigma_{t,i} &= \frac{\sum (TJ_{i,\mathrm{Jenks}_{i}})}{\sum (T_{i})} \end{aligned}$$

(9)

Based on the location of the partition samples $Loc_{i,\mathrm{Jenks}_{i}}$ and the non-stable samples $GH$, the non-stable samples that fall within each partition, $GH|Loc_{i,\mathrm{Jenks}_{i}}|$, are determined. The proportion of non-stable samples in each natural breaks partition can be expressed as

$$\sigma_{gh,i} = \frac{\sum (GH|Loc_{i,\mathrm{Jenks}_{i}}|)}{\sum (GH|Loc_{i}|)}$$

(10)

Here, $\sigma_{gh,i}$ is the proportion of non-stable samples in each partition relative to the total number of non-stable samples, and $\sigma_{t,i}$ is the proportion of samples in each partition relative to the overall total. For the i-th feature, the frequency ratio $\sigma_{i}$ can be expressed as

$$\sigma_{i} = \frac{\sigma_{gh,i}}{\sigma_{t,i}}$$

(11)
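The frequency ratio of Eqs. (9) to (11) can be sketched in a few lines of Python once the class breaks are known. The function name `frequency_ratio` and its arguments are hypothetical, and the breaks are passed in as given values: the Monte Carlo natural-breaks step itself is not reproduced here.

```python
from bisect import bisect_right


def frequency_ratio(values, is_hazard, breaks):
    """Frequency ratio per class of one conditioning factor.

    values    : factor value of every sample (e.g. every pixel)
    is_hazard : True/1 where the sample is a non-stable (hazard) sample
    breaks    : ascending interior class boundaries (assumed to be given);
                a value equal to a boundary falls into the upper class
    """
    n_classes = len(breaks) + 1
    total = [0] * n_classes
    hazard = [0] * n_classes
    for value, flag in zip(values, is_hazard):
        cls = bisect_right(breaks, value)
        total[cls] += 1
        if flag:
            hazard[cls] += 1

    n_total = sum(total)
    n_hazard = sum(hazard)
    if n_hazard == 0:
        raise ValueError("no non-stable samples supplied")

    ratios = []
    for t, g in zip(total, hazard):
        sigma_t = t / n_total      # share of all samples in the class (Eq. 9)
        sigma_gh = g / n_hazard    # share of non-stable samples in the class (Eq. 10)
        ratios.append(sigma_gh / sigma_t if t else float("nan"))  # Eq. 11
    return ratios
```

A class whose ratio exceeds 1 holds a larger share of the non-stable samples than of the area, which is the reading used for the factor plots later in the paper.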

3.2. Adaptive Sampling Strategy Based on Hotspot Analysis Method

Traditional sampling methods use historical data on geological disasters and the double-buffer sampling method. Whether such events are currently occurring remains uncertain. Geohazard-prone areas identified from InSAR-derived surface deformation can be classified as non-stable samples. Based on InSAR-derived deformation, hotspot analysis can be used to automatically identify the deformation areas [55]. Based on this, an adaptive sampling strategy is proposed, and the detailed workflow is illustrated in Figure 6.

Candidate stable areas are initially obtained using the double-buffer sampling method. For each geohazard point, an inner buffer with a radius of a km and an outer buffer with a radius of b km are established. Generally, a is set to 1 km and b is set to 15 km [34]. The candidate stable area $S_{a,b}$, defined as the region between the two buffers, can be expressed as

$$S_{a,b} = \{ x \mid a\ \mathrm{km} < \min_{g_{i} \in G} d(x, g_{i}) \leq b\ \mathrm{km} \}$$

(12)

where $a = 1$, $b = 15$, $G = \{ g_{i} \mid i = 1, 2, \dots, n \}$ represents the set of geohazard points, $g_{i}$ represents the i-th geohazard point, x represents an arbitrary location, and $d(x, g_{i})$ represents the Euclidean distance from the point x to the i-th geohazard point.
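The membership test in Eq. (12) is simple enough to sketch directly. The function name is hypothetical, and the sketch assumes that all coordinates are in a projected system expressed in kilometers, so that Euclidean distance is meaningful.

```python
import math


def in_candidate_stable_area(x, hazards, a_km=1.0, b_km=15.0):
    """Return True if location x lies between the inner and outer buffers.

    x       : (easting, northing) in km (assumed projected coordinates)
    hazards : list of geohazard point coordinates in the same system
    """
    d_min = min(math.dist(x, g) for g in hazards)
    return a_km < d_min <= b_km
```

With hazards at `(0, 0)` and `(30, 0)`, a location 0.5 km from the first point is rejected as too close, a location 5 km away is kept, and a location 20 km from the nearest point is rejected as lying outside the outer buffer.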

Ground deformation information covering the study area was derived using the SBAS-InSAR technique. The Getis–Ord $G_{i}$ statistic, which quantifies the clustering of deformation points within a specified distance, can be used to identify these points [33]. For each point, a Z-score threshold of ±1.96, corresponding to a 95% confidence level, is applied to identify hotspots [55]. These hotspots represent the deformation points. For the i-th point and its surrounding points j, the $G_{i}^{*}(d)$ within a distance d can be expressed as

$$G_{i}^{*}(d) = \frac{\sum_{j} v_{j} + v_{i} - n_{ij} \times \bar{v}}{S^{*} \sqrt{[(n \times n_{ij}) - n_{ij}^{2}] / (n - 1)}}$$

(13)

where $n_{ij}$ is the number of surrounding points, $v$ is the deformation rate, $\bar{v}$ is the mean deformation rate, $n$ is the total number of points, and $S^{*}$ is the standard deviation of the deformation rates. By applying a distance threshold, these hotspots are delineated as potential deformation areas. Generally, the distance threshold is set to 100 m [56]. Each pixel within the candidate stable area was screened against these deformation areas, and the remaining region was designated as the refined stable area, with the corresponding pixels treated as stable samples. Meanwhile, geohazard points were treated as non-stable samples. By applying a deformation rate threshold, the non-stable points were further refined. Typically, a rate threshold of 10 mm/year was established to differentiate between the stable and non-stable samples [56].
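The hotspot statistic can be illustrated with a brute-force Python sketch of the standard binary-weight Getis–Ord Gi* form. The function name is hypothetical, and two assumptions are made explicit in the code: the neighbor count includes the point itself, and the standard deviation is the population standard deviation of all deformation rates. The pairwise distance loop is quadratic, so this is only suitable for small illustrative datasets.

```python
import math


def getis_ord_gi_star(points, rates, d):
    """Gi* z-score for every point, using binary weights within distance d.

    points : list of (x, y) coordinates in meters
    rates  : deformation rate of each point (e.g. mm/year)
    d      : neighborhood distance in meters
    """
    n = len(points)
    mean = sum(rates) / n
    # Assumption: population standard deviation over all points.
    std = math.sqrt(sum(v * v for v in rates) / n - mean ** 2)

    z_scores = []
    for i in range(n):
        # Assumption: the neighborhood includes point i itself.
        neighbors = [j for j in range(n) if math.dist(points[i], points[j]) <= d]
        n_ij = len(neighbors)
        local_sum = sum(rates[j] for j in neighbors)
        denom = std * math.sqrt((n * n_ij - n_ij ** 2) / (n - 1))
        z_scores.append((local_sum - n_ij * mean) / denom if denom > 0 else 0.0)
    return z_scores


def is_hotspot(z, z_threshold=1.96):
    """Significant clustering at the 95% confidence level."""
    return abs(z) >= z_threshold
```

In a toy dataset with four tightly grouped points moving at -20 mm/year and sixteen isolated points with zero rate, the grouped points receive strongly negative z-scores and are flagged, while the isolated points are not.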

Random sampling was conducted to extract equal numbers of non-stable and stable samples. It should be noted that conditioning factor optimization should be completed beforehand. The values of the conditioning factors were then extracted at the sample locations to construct the sample dataset. Finally, the non-stable and stable samples were divided into training and validation sets according to a specific ratio, typically 7:3 or 8:2.
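The balanced sampling and the train/validation split described above can be sketched as follows. The function name, the fixed seed, and the label convention (1 for non-stable, 0 for stable) are assumptions made for illustration. The split here is a plain shuffled cut and is not stratified by class.

```python
import random


def balanced_split(non_stable, stable, train_ratio=0.7, seed=0):
    """Draw equal numbers of both classes, then split into train and validation.

    non_stable, stable : lists of sample identifiers (e.g. pixel indices)
    Returns two lists of (sample, label) pairs.
    """
    rng = random.Random(seed)
    k = min(len(non_stable), len(stable))
    samples = [(s, 1) for s in rng.sample(non_stable, k)]
    samples += [(s, 0) for s in rng.sample(stable, k)]
    rng.shuffle(samples)
    cut = round(train_ratio * len(samples))
    return samples[:cut], samples[cut:]
```

Passing `train_ratio=0.8` gives the 8:2 variant mentioned in the text.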

3.3. Stacking-Based Improved Meta-Ensemble Model

Single learners exhibit limited capability due to the diversity of features. The integration of multiple base learners, known as meta-ensemble learning (MEL), has become a trend in machine learning. MEL is typically categorized into three types [4]: bagging, boosting, and stacking. Bagging and boosting methods belong to a homogeneous MEL model and have been widely applied in various fields [57]. RF, AdaBoost, and XGBoost are representative examples. The stacking method, a type of heterogeneous MEL model, provides greater robustness compared to the homogeneous model. It mitigates model limitations and provides flexibility for various scenarios. The structure of the stacking-based meta-ensemble model is illustrated in Figure 7.

The stacking method employs a two-layer structure that utilizes an RF, an SVM, and XGBoost as base learners. The learners used in the first layer are referred to as primary learners, while those in the second layer used for fitting are called secondary learners [57]. The primary learners receive the training samples, generate a feature set from their predictions, and then pass it to the secondary learner. The data fed into the secondary learner is known as the secondary training set. In the second layer, the secondary learner analyzes the outputs of the primary learners and makes adjustments to produce the final prediction. It should be noted that as the number of model layers increases, model complexity grows while computational efficiency declines. Given the potential of InSAR-derived deformation, we propose an improved landslide susceptibility assessment model that integrates InSAR-derived information with a stacking-based meta-ensemble learning method. InSAR-derived information, including the deformation rate, cumulative deformation, coherence, and visibility, is also incorporated into the model. In addition, a grid search strategy is utilized to determine the optimal parameters.
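A two-layer stacking structure of this kind can be sketched with scikit-learn. Several choices below are assumptions and not taken from the paper: the data are synthetic, `GradientBoostingClassifier` stands in for XGBoost so that the example needs only scikit-learn, and logistic regression is used as the secondary learner because the text does not name one here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the sample table (rows: samples, columns: factors).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# First layer: heterogeneous primary learners.
primary_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
]

# Second layer: the secondary learner is trained on out-of-fold
# probabilities produced by the primary learners.
stack = StackingClassifier(
    estimators=primary_learners,
    final_estimator=LogisticRegression(),  # assumed secondary learner
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

# Probability of the positive (non-stable) class, usable as a susceptibility score.
susceptibility = stack.predict_proba(X_val)[:, 1]
```

The internal cross-validation is what keeps the secondary training set free of predictions made on samples the primary learners have already seen.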

3.4. Performance Evaluation

The performance of the model is typically evaluated by comparing predicted results with actual values, using the counts of true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN). In addition, a K-fold cross-validation strategy is utilized to enhance the reliability of the results. In general, K is set to 10 [13]. The process is repeated K times, and the average of the evaluation metrics is computed to yield the final measure. The ROC curve and AUC value are two additional metrics used to assess the performance of models. The ROC curve plots the FP rate on the x-axis and the TP rate on the y-axis. By adjusting the thresholds, a series of FP and TP rates can be obtained. The AUC value is the area under the ROC curve, with a maximum value of 1. The closer the AUC value is to 1, the better the model’s performance.
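The confusion-matrix counts, the ROC curve, and the AUC can be computed from scratch as in the sketch below. The function names are hypothetical, and the 0.5 default threshold is an illustrative assumption. The AUC is obtained with the trapezoidal rule over the ROC points.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Return (TP, FP, TN, FN) for scores thresholded at `threshold`."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and not truth:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(y_true, y_score):
    """(FP rate, TP rate) pairs obtained by sweeping the threshold downward."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(y_score), reverse=True):
        tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ranked correctly, which corresponds to an AUC of 0.75.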

4. Experimental Results and Analysis

A total of 26 conditioning factors were selected, including elevation, aspect, slope, contour, mean curvature, plan curvature, profile curvature, distance to road, distance to building, population density, land cover, NDVI, lithology, distance to fault, vegetation coverage, soil erosion, topographic factor, VPD, AVP, distance to Jinsha River, distance to river network, land surface temperature, visibility, coherence, deformation rate, and cumulative deformation.

4.1. Results and Analysis of Conditioning Factors

4.1.1. Collinearity Detection

High collinearity among conditioning factors increases sample complexity, degrades model performance, and may lead to erroneous predictions. Therefore, collinearity analysis should be conducted before modeling. The independence of the conditioning factors is evaluated using the Pearson correlation coefficient (PCC), tolerance (TOL), and variance inflation factor (VIF). The TOL and VIF values of the 26 conditioning factors are listed in Table 3.
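For reference, the three indicators are commonly defined as follows. The notation is assumed here rather than taken from the article: $x_{ij}$ is the value of factor $j$ at sample $i$, and $R_j^2$ is the coefficient of determination obtained by regressing factor $j$ on all remaining factors.

```latex
r_{jk} = \frac{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)}
              {\sqrt{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}\,\sqrt{\sum_{i=1}^{n} (x_{ik} - \bar{x}_k)^2}},
\qquad
\mathrm{TOL}_j = 1 - R_j^2,
\qquad
\mathrm{VIF}_j = \frac{1}{\mathrm{TOL}_j}
```

A frequently quoted rule of thumb treats $\mathrm{VIF}_j > 10$ (equivalently $\mathrm{TOL}_j < 0.1$) as a sign of serious collinearity; whether this exact threshold was applied in the present study is not stated in this section.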

Compared with TOL and VIF, PCC values describe the pairwise similarity among all conditioning factors, as illustrated in Figure 8. Among the 26 conditioning factors, cumulative deformation exhibits the strongest correlation with deformation rate, with a PCC value of 0.88. Of the two, the deformation rate describes the surface deformation more accurately, so cumulative deformation was excluded. Land surface temperature shows strong correlations with four variables, namely elevation, VPD, AVP, and distance to the river network, with PCC values of −0.76, 0.71, 0.70, and −0.56, respectively, and was excluded. AVP is highly correlated with both elevation and VPD, with PCC values of −0.74 and 0.79, respectively, and was excluded. In addition, the correlation between slope and the topographic factor reaches 0.79. The sums of the absolute correlation coefficients between each of these two variables and the other factors are 5.47 and 5.22, respectively. Slope exhibits a higher degree of correlation with the other variables and was excluded.
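The rule used above for choosing between two collinear factors, namely removing the one whose summed absolute PCC with the other factors is larger, can be sketched as follows. The helper names `pearson` and `factor_to_drop` are hypothetical, and the factor values are invented toy numbers that do not reproduce the sums of 5.47 and 5.22 reported above.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def summed_abs_pcc(factors, name):
    """Sum of |PCC| between one factor and every other factor."""
    return sum(abs(pearson(factors[name], values))
               for other, values in factors.items() if other != name)

def factor_to_drop(factors, first, second):
    """Of two collinear factors, return the one more correlated with the rest."""
    if summed_abs_pcc(factors, first) >= summed_abs_pcc(factors, second):
        return first
    return second

# Toy values only (hypothetical); real inputs would be per-pixel factor values.
factors = {
    "slope": [1, 2, 3, 4, 5],
    "topographic_factor": [2, 4, 6, 8, 11],
    "ndvi": [5, 3, 4, 1, 2],
}
pair_pcc = pearson(factors["slope"], factors["topographic_factor"])
dropped = factor_to_drop(factors, "slope", "topographic_factor")
print(round(pair_pcc, 3), dropped)
```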

4.1.2. Frequency Ratio Analysis

The relationship between the occurrence rate of geological disasters and conditioning factors is illustrated in Figure 9. In each subplot, the red histogram represents the proportion of geological disasters within the classification intervals of the conditioning factor. The black histogram represents the proportion of the conditioning factor within each interval. The blue line represents the ratio of the disaster proportion to the factor proportion in each interval. A ratio greater than 1 indicates a stronger correlation between the category of the conditioning factor and the occurrence of geological disasters [1]. The relationship between elevation and disasters shows that geological disasters are prevalent in low- to mid-elevation areas, with a lower occurrence at higher elevations. For the three curvature-related factors, disasters occur frequently in the middle intervals. Because only a limited number of pixels carry contour values, it is difficult to identify frequency-based patterns for the contour factor. For distance to road, the frequency of disasters gradually decreases as the distance increases. Disasters occur frequently in sparsely populated areas, with occurrences reaching 70%, while densely populated areas show no disasters. Regarding distance to faults, the occurrence of disasters decreases with increasing distance from the fault. Active fault zones destabilize nearby rock masses, promoting the development of fractures, which facilitate disaster formation. Higher vegetation coverage contributes to soil and water stabilization, reducing disaster risk. Specifically, in the 80.21∼99.48 interval, disaster occurrence is low, whereas it doubles in the 70.46∼80.21 interval. For VPD, the proportion of geological disasters gradually increases with the index. In terms of distance to the Jinsha River, disasters decrease as distance from the river increases, indicating that rivers also promote disaster development.
Finally, for the InSAR-derived features of cumulative deformation and deformation rate, areas with low deformation rates show little disaster occurrence. Regions with significant subsidence are typically located in mining areas, which are not included in the disaster inventory.
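The ratio plotted as the blue line can be reproduced with a short calculation. The sketch below is a plain frequency ratio on invented counts; the function name `frequency_ratio` and the numbers are assumptions, and the Monte Carlo component of the proposed analysis is not reproduced here.

```python
def frequency_ratio(disaster_counts, pixel_counts):
    """Per-interval ratio of disaster proportion to factor (pixel) proportion."""
    total_disasters = sum(disaster_counts)
    total_pixels = sum(pixel_counts)
    ratios = []
    for disasters, pixels in zip(disaster_counts, pixel_counts):
        if pixels == 0:
            ratios.append(float("nan"))  # interval not present in the study area
            continue
        disaster_share = disasters / total_disasters
        pixel_share = pixels / total_pixels
        ratios.append(disaster_share / pixel_share)
    return ratios

# Hypothetical counts for three intervals of one conditioning factor.
disaster_counts = [30, 50, 20]
pixel_counts = [1000, 1000, 2000]
fr = frequency_ratio(disaster_counts, pixel_counts)
print([round(v, 2) for v in fr])  # [1.2, 2.0, 0.4]
```

In this toy case the second interval holds 50% of the disasters but only 25% of the pixels, giving a ratio of 2.0, whereas the third interval (ratio 0.4) is less disaster-prone than its areal share would suggest.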

4.2. Prediction of Landslide Susceptibility

After performing collinearity analysis, importance evaluation, and frequency ratio analysis, 21 factors are ultimately selected to construct the samples. Among these, two conditioning factors are derived from InSAR: coherence and deformation rate. The remaining 19 factors are: elevation, aspect, slope, mean curvature, plan curvature, profile curvature, distance to road, distance to building, population density, land cover, NDVI, lithology, distance to fault, vegetation coverage, the atmospheric humidity index VPD, distance to Jinsha River, distance to river network, soil erosion, and topographic factor. Cumulative deformation is excluded due to its high correlation of 0.88 with the deformation rate. The contour lines are excluded due to their minimal contribution to the assessment model. The landslide susceptibility map generated by the IME-InSAR model is illustrated in Figure 10.

Using the Monte Carlo-based natural breaks method, the degree of risk is classified into five categories: very high, high, moderate, low, and very low. The color scale ranges from red to green. The greener the pixel, the less susceptible the area is to landslides, while the redder the pixel, the more susceptible it is. Model performance is assessed by determining whether the recorded disasters fall within very-high- or high-risk zones. The prediction results obtained from the IME-InSAR model are further analyzed, with Areas S1 and S2 of the study area selected as representative examples.

The landslide susceptibility map obtained by the IME-InSAR model and the optical imagery of Area S1 are illustrated in Figure 11. Area S1 includes Jin County in Liangshan Yi Autonomous Prefecture, Sichuan Province, and Zhaoyang District in Zhaotong City, Yunnan Province, along with other surrounding areas. These areas are separated by the Jinsha River, which is characterized by steep terrain, high mountains, and narrow valleys. On the western bank of the Jinsha River, the terrain in Jin County slopes from northwest to southeast, with the highest elevation of 4076 m at Shizi Mountain in the north, and the lowest elevation of 430 m in Ludian County. Most areas have an elevation above 2100 m, and eight types of landforms are present, such as flat plains, low hills, and mountain plateaus. Area S1 is characterized by widespread fault zones and a complex geological structure. Rainfall is the main conditioning factor. In addition, human activities have disturbed surface vegetation, leading to the occurrence of geohazards, particularly in the rainy season. On the eastern side, Zhaotong City is characterized by a higher western region and a lower eastern region, with a plateau landform, numerous ravines, and frequent seismic activity. Due to the abundant rainfall during the rainy season, Area S1 has become one of the regions severely affected by geological hazards in the upper Yangtze River Basin. Disasters in Area S1 are mainly distributed along the flow direction of the Jinsha River. There are a total of 153 geological disasters, 140 of which are located in very-high-risk zones. Most unidentified geological disasters are distributed far from the Jinsha River, with some in flat terrain areas that receive little rainfall and have high population density, which is beyond the capacity of the model.

Figure 12 presents the landslide susceptibility map for Area S2, obtained by the IME-InSAR model, along with its optical imagery. Area S2 is located in the southwest of the study area, within Ningnan County, Liangshan Yi Autonomous Prefecture, Sichuan Province. The area is rich in water resources, including 15 rivers, such as the Heishui River and Jinsha River. It also has abundant mineral resources, including iron, lead, copper, and limestone. Precipitation is mainly concentrated between June and October. Figure 12 illustrates the distribution of five fault zones across the area, including the Muhe Fault Zone and the Ningnan–Huili Fault Zone. Due to intense tectonic activity and frequent seismic events, the rocks of the area are loose. Furthermore, activities such as mining and the development of the Baihetan Hydroelectric Station have made the region susceptible to disasters. There are a total of 61 geological disasters in Area S2. Among these, the landslide at Luojia Slope, located along the fault zone in the northwest of the study area, was correctly predicted by the model. All of these disasters are situated in very-high-risk areas, corresponding to an accuracy of 100%. Disasters in the central area are concentrated, forming a high-susceptibility geohazard-prone zone that requires widespread attention.

5. Discussion

5.1. Importance Ranking of Conditioning Factors

Conditioning factors influence the assessment model to varying degrees, so it is essential to assess their importance. Statistics of the importance of conditioning factors are presented in Table 4. Figure 13 shows the importance ranking of all conditioning factors based on the Gini index. Factors with higher importance values, such as distance to roads, elevation, and AVP, contribute more to the model. This ranking aligns with the geohazard-prone conditions observed in the study area. The study area is characterized by steep terrain and rugged roads. Road construction disrupts the surface stress balance, loosens the soil, and produces unstable rock, making the area prone to landslides. Features such as contour lines and land cover provide limited information and exert only a minor influence on the model [9]. In particular, the contour data, derived from elevation at 500 m intervals, are largely redundant with the elevation. Among the InSAR-derived factors, coherence, visibility, and deformation rate are ranked relatively high, while the importance of cumulative deformation is relatively low. The importance ranking is combined with the collinearity analysis and frequency ratio analysis to determine the final factors.
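For readers unfamiliar with Gini-based importance, one common formulation is given below. It assumes that the ranking follows the usual mean-decrease-in-impurity definition for tree ensembles, which this section does not state explicitly; the symbols ($p_k(t)$ for the class-$k$ proportion at node $t$, $n_t$ for the number of samples at node $t$, $t_L$ and $t_R$ for its child nodes, $T$ for the number of trees) are introduced here for illustration.

```latex
G(t) = 1 - \sum_{k=1}^{K} p_k(t)^2,
\qquad
\Delta G(t) = G(t) - \frac{n_{t_L}}{n_t}\, G(t_L) - \frac{n_{t_R}}{n_t}\, G(t_R),
\qquad
I(f) = \frac{1}{T} \sum_{\text{trees}} \;\sum_{t \,:\, \text{split on } f} \frac{n_t}{n}\, \Delta G(t)
```

Under this definition, a factor such as distance to roads receives a high score when splits on it repeatedly produce purer child nodes across the ensemble.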

5.2. Performance Evaluation of Sampling Method

Traditional methods generate training samples by using historical geological disasters and the double-buffer sampling method. However, historical data only represent locations where disasters have occurred in the past, and whether changes have occurred since then remains unknown. Traditional sampling methods are therefore often inadequate for dynamic monitoring environments. The adaptive sampling strategy proposed in this study, based on the hotspot analysis method, leverages the high temporal and spatial resolution as well as the high reliability of InSAR results. Figure 14 illustrates the distributions of stable and non-stable samples in the study area, generated by the traditional sampling method and the proposed sampling strategy, respectively. By incorporating potential deformation points identified through hotspot analysis, the extent of the risk zones is substantially increased, thereby refining the stable samples. To evaluate the effect of optimization on sample quality, the deformation-rate statistics with and without an absolute value transformation were compared. The comparison focuses on three parameters: quantity, mean, and standard deviation (Std). The statistics of the samples’ deformation rates before and after optimization are listed in Table 5. The number of risk samples increased from 857,650 to 2,143,226, approximately 2.5 times the original number. With the inclusion of the absolute value operation, both the mean and Std increased accordingly. This indicates that additional deformation samples were incorporated, including samples exhibiting subsidence and samples exhibiting uplift. In contrast, the parameters of stable pixels exhibited trends opposite to those of the risk samples, which reflects a reduction in the quantity of deformation samples among them. In addition, the quantity of residual samples decreased from 506,713 to 6946, and their proportion decreased from 10.95% to 0.15%. The proposed sampling method accurately classified the residual samples, thereby refining the training samples.
In summary, the proposed sampling method contributes to improving the quality of the training samples.
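To make the hotspot step more concrete, the sketch below evaluates the Getis–Ord $G_i^*$ z-score with binary neighbourhood weights on a one-dimensional toy profile of absolute deformation rates. The function name, the three-pixel window, the 1.96 threshold, and the data are all assumptions for illustration; the study's actual neighbourhood definition and significance level are not given in this section.

```python
import math

def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores; neighbors[i] lists the indices (including i) with weight 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        w_sum = len(neighbors[i])  # equals both sum(w) and sum(w^2) for binary weights
        local_sum = sum(values[j] for j in neighbors[i])
        denom = std * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        z_scores.append((local_sum - mean * w_sum) / denom)
    return z_scores

# Hypothetical absolute deformation rates along a transect (arbitrary units).
rates = [1, 1, 1, 1, 9, 10, 9, 1, 1, 1]
# Each pixel's neighbourhood: itself plus the adjacent pixel on each side.
windows = [[j for j in (i - 1, i, i + 1) if 0 <= j < len(rates)]
           for i in range(len(rates))]
z = getis_ord_gi_star(rates, windows)
hotspots = [i for i, score in enumerate(z) if score > 1.96]
print([round(score, 2) for score in z])
print(hotspots)
```

Only the centre of the deforming cluster exceeds the threshold in this toy profile; pixels flagged in this way would be candidates for the non-stable class, while pixels inside the buffer that are not flagged remain candidates for the stable class.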

5.3. Performance Evaluation of Predictive Models

To evaluate the performance of the proposed method, the experimental results were compared using four models: RF, SVM, XGBoost, and IME. The impact of InSAR-derived information was further analyzed. Overall, two experimental settings were designed in this study: one excluding InSAR-derived information and the other including it. Figure 15 presents the landslide susceptibility maps generated by eight models. A K-fold cross-validation method was adopted to avoid bias caused by insufficient samples. Four evaluation metrics were employed, including accuracy, precision, recall, and F1-score, as listed in Table 6.
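The four metrics and the fold-wise averaging can be written out directly from the confusion-matrix counts. The sketch below uses invented counts and hypothetical helper names (`classification_metrics`, `k_fold_indices`, `average_metrics`); it illustrates the definitions rather than reproducing the values in Table 6.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold i holds out every k-th sample from i."""
    for i in range(k):
        test = list(range(i, n_samples, k))
        held_out = set(test)
        train = [j for j in range(n_samples) if j not in held_out]
        yield train, test

def average_metrics(per_fold):
    """Average each metric over the folds to obtain the final measure."""
    return {key: sum(m[key] for m in per_fold) / len(per_fold) for key in per_fold[0]}

# Hypothetical counts for one fold.
metrics = classification_metrics(tp=90, fn=10, fp=5, tn=95)
folds = list(k_fold_indices(n_samples=10, k=5))
averaged = average_metrics([metrics, metrics])
print({name: round(value, 4) for name, value in metrics.items()})
```

In a full run, a model would be trained on each `train` index list, evaluated on the matching `test` list, and the per-fold dictionaries passed to `average_metrics`.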

Overall, these models exhibit high accuracy, validating the reliability of the proposed sampling strategy. This strategy effectively mitigated the effect of low-quality samples by employing an improved double-buffer sampling method. To mitigate the bias in the results arising from random partitioning, a K-fold cross-validation method was adopted. InSAR-derived information enhanced the performance of all models. InSAR deformation was not directly used as the training samples. Instead, the models relied on actual disasters, which made it challenging to establish the relationship between the samples and the InSAR-derived information. This was validated through the frequency ratio analysis. InSAR-derived deformation can identify potential sites for geological hazard investigations, but it cannot be used as the sole criterion. Among the eight models, the SVM model achieved an accuracy of 0.932, which can be attributed to the high quality of the samples. In contrast, the IME-InSAR model outperforms the other models as it integrates heterogeneous learners to explore the relationships between features and predictions from multiple aspects. Figure 16a illustrates the ROC curves of all models. Figure 16b presents an enlarged view of the curves at SS1. All curves indicate that as the FP rate increases, the TP rate steadily increases. Among these models, the IME and IME-InSAR models exhibit the fastest increase, while the SVM and SVM-InSAR models show a slower rise.

A comparison of the AUC values for the eight models is listed in Table 7. All models yielded AUC values greater than 0.95. The IME model achieved the highest AUC value of 0.97772, whereas the SVM model exhibited the lowest AUC value of 0.95654. These results are consistent with the ROC curves. The IME model demonstrates superior performance compared with other models, although the differences in AUC values among the models are small. The DeLong test was used to assess the statistical significance of the difference between the ROC curves of IME and IME-InSAR. The p-value from the DeLong test was 0.7068, indicating that the difference between the two AUCs is not statistically significant. This suggests that the inclusion of InSAR-derived information provides limited improvement to the model. This can be attributed to the high quality of the training samples generated by the double-buffer sampling strategy with InSAR-derived deformation information. As a result, models such as RF also achieve high AUC values, with only slight differences compared with the proposed method.
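For readers who wish to reproduce this kind of comparison, a compact version of the DeLong test for two correlated ROC curves is sketched below. It is a minimal pure-Python illustration on invented scores, with hypothetical function names, and is not the code that produced the p-value of 0.7068 reported above.

```python
import math

def _components(pos, neg):
    """AUC plus DeLong structural components for positives (v10) and negatives (v01)."""
    def psi(p, q):
        return 1.0 if p > q else 0.5 if p == q else 0.0
    v10 = [sum(psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01

def _cov(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models scored on the same samples."""
    def split(scores):
        pos = [s for s, t in zip(scores, labels) if t == 1]
        neg = [s for s, t in zip(scores, labels) if t == 0]
        return pos, neg
    auc_a, v10_a, v01_a = _components(*split(scores_a))
    auc_b, v10_b, v01_b = _components(*split(scores_b))
    m, n = len(v10_a), len(v01_a)
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:  # identical rankings: no evidence of a difference
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value

# Hypothetical labels and scores from two models on the same eight samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
scores_b = [0.8, 0.9, 0.5, 0.6, 0.7, 0.4, 0.3, 0.2]
auc_a, auc_b, z, p_value = delong_test(labels, scores_a, scores_b)
print(auc_a, auc_b, round(z, 4), round(p_value, 4))
```

In this toy case the AUCs differ (0.9375 versus 0.875), yet the p-value is well above 0.05, which mirrors the situation described above: a numerically higher AUC does not by itself establish a statistically significant improvement.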

5.4. Advantages and Limitations in Our Research

Based on the adaptive sampling strategy, we developed an optimized ensemble learning framework for landslide susceptibility assessment by integrating an SVM, RF, and XGBoost. Experimental results demonstrate its superior performance in predicting landslide susceptibility. To mitigate the redundancy among conditioning factors, a Monte Carlo-based frequency ratio analysis was implemented. To address the issue of sample quality, the adaptive sampling strategy leverages surface deformation information derived from InSAR, which serves as a critical indicator for assessing ground stability. Notably, the strategy also incorporates hotspot analysis, enhancing both the efficiency and quality of non-stable sample selection. A detailed statistical analysis of samples before and after optimization, in terms of both quantity and quality, is presented in Section 5.2. The results confirm the effectiveness of the adaptive sampling strategy. Lu et al. [16] selected negative samples based on environmental similarity and a statistical model. This method requires iterative operations that constrain its applicability to large-scale, high-resolution areas. Under complex geological conditions, the variability of landslide-triggering conditions may result in misclassification or over-removal of non-landslide samples. However, InSAR-derived deformation is less affected by environmental variability. Experimental results indicate that all models achieved accuracies exceeding 0.932, demonstrating the robustness of the proposed sampling strategy. Moreover, a comparison between models with and without InSAR-derived data revealed no significant improvement, as confirmed by the DeLong test for AUC differences. The proposed framework effectively leverages InSAR deformation information while reducing the complexity of conditioning factors. Overall, experimental results indicate that the IME method demonstrates superior performance.

Nevertheless, the proposed method has certain limitations. Although a wide range of landslide-triggering factors was considered, the number of relevant conditioning factors remains limited. Landslide occurrence is influenced by multiple conditioning factors, and the reliability of each factor varies in terms of resolution, sampling method, precision, and so on. Despite the use of advanced interpolation techniques, these differences among the factors may still influence model predictions. Furthermore, given the large extent of the study area, the training samples cannot fully represent the entire region, particularly in zones with complex topography and heterogeneous conditioning factors. While the proposed sampling strategy increases both the quantity and quality of samples, the most reliable samples remain those recorded during field surveys. The parameters of this study are determined using a grid search strategy. However, because of the possibility of local optima, the selected parameters may not be globally optimal, which could affect model accuracy. Parameter settings and reference ranges for eight models are listed in Table 8. Even if the landslide inventory used for training is based on official records, it may still include long-stabilized or inactive landslides. Without sufficient field verification, these misclassified stable samples cannot be removed, potentially affecting the model’s predictive performance. Our proposed model is structurally more complex than a single model, resulting in longer training and inference times, which need optimization. Finally, although it demonstrates superior performance within the selected area, additional validation is required to evaluate the applicability and generalizability across different regions.

6. Conclusions

Landslide susceptibility assessment has demonstrated effectiveness across various scenarios. The generated predictions can support geological hazard identification and analysis of evolution mechanisms. Therefore, the study of landslide susceptibility assessment is of considerable importance.

In this study, we propose an optimized heterogeneous ensemble learning framework for landslide susceptibility assessment, combined with an adaptive sampling strategy. Considering the unreliability of traditional methods, the proposed sampling strategy incorporates InSAR-derived deformation information and performs hotspot analysis using the Getis–Ord $G_{i}$ statistic to refine the stable samples. The stacking method employs a two-layer structure that uses RF, SVM, and XGBoost as base learners. InSAR-derived information, including deformation rate, cumulative deformation, coherence, and visibility, is incorporated into the model to enhance the robustness. In addition, a grid search strategy is utilized to obtain the optimal parameters. The middle and lower reaches of the Jinsha River were selected as the study area. First, a total of 26 conditioning factors were selected. To reduce the factor complexity, apart from collinearity analysis and importance evaluation, a Monte Carlo-based frequency ratio analysis was introduced. Conditioning factor importance is evaluated based on the Gini index, and the frequency ratio analysis reveals the relationships between conditioning factors and geological hazards. The proposed sampling strategy was then employed to generate high-quality samples. Finally, by coupling InSAR-derived information, an improved meta-ensemble (IME) stacking-based heterogeneous framework integrating an SVM, RF, and XGBoost was used to assess the landslide susceptibility. Eight models, namely RF, RF-InSAR, SVM, SVM-InSAR, XGBoost, XGBoost-InSAR, IME, and IME-InSAR, are employed for comparative analysis. Model performance is evaluated using accuracy, precision, recall, and F1-score. Among all models, the proposed IME model achieved the best performance, with a maximum accuracy of 0.976. All models achieve AUC values greater than 0.95, among which the IME model attains the highest AUC value of 0.97772. Despite the formally highest IME metrics, the inclusion of InSAR information did not lead to a statistically significant improvement in the forecast compared to the high-quality basic sampling strategy.
Overall, the proposed algorithm provides support for landslide susceptibility assessment in dynamic environments. The main findings of this study can be summarized as follows:

(1): The adaptive sampling strategy proposed in this study utilizes InSAR-derived deformation information with hotspot analysis to automatically identify potentially non-stable samples. Compared with traditional methods, this strategy improved both the quantity and quality of the training samples. Compared with the pre-optimization state, the variance of stable samples decreased. Under absolute value operation, the variance declined from 5.11 to 4.15. Furthermore, the robustness of the proposed sampling strategy is demonstrated by the fact that all models achieved accuracies exceeding 0.932.
(2): To assess the impact of InSAR-derived deformation information on the predictive performance of models in landslide susceptibility assessments, two experimental settings were designed, one excluding and one including it. Experimental results indicated that InSAR-derived information had a limited effect on the IME models, and the DeLong test showed no statistically significant difference in AUC. From another perspective, this highlights the capability of the hotspot analysis method to identify non-stable samples. In addition, the introduction of the adaptive sampling strategy reduced the complexity of conditioning factors and enhanced the robustness of the predictive models.
(3): Improved heterogeneous ensemble algorithms, which combine diverse classifiers to extract features, demonstrated superior performance. Landslide susceptibility maps generated by eight models were compared to evaluate the performance. The prediction results were further validated against optical imagery and field survey, with Areas S1 and S2 selected as representative examples. In Area S2, all 61 landslide-prone locations were identified within very-high-risk zones. Notably, in Area S2, concentrated disasters have formed a high-susceptibility geohazard zone that warrants significant attention.

Author Contributions

Conceptualization, L.L. and H.C.; methodology, L.L.; software, L.L.; validation, L.L. and Y.G.; formal analysis, L.L.; investigation, L.L. and S.L.; resources, L.L.; data curation, L.L.; writing—original draft preparation, L.L.; writing—review and editing, L.L. and J.Y.; visualization, L.L.; supervision, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors thank the European Space Agency for Sentinel-1 data. The DEM data was freely downloaded from https://www.eorc.jaxa.jp (accessed on 27 April 2026). The land cover data and population density data were freely downloaded from https://www.resdc.cn (accessed on 27 April 2026). The vegetation coverage data, humidity index data, and land surface temperature data were freely downloaded from https://data.tpdc.ac.cn (accessed on 27 April 2026). The NDVI data was freely downloaded from https://nesdc.org.cn (accessed on 27 April 2026). The soil erosion and topographic factor data were freely downloaded from https://www.scidb.cn (accessed on 27 April 2026). The SAR data was freely downloaded from https://search.asf.alaska.edu (accessed on 27 April 2026). The road data, building data, and river data were freely downloaded from https://www.openstreetmap.org (accessed on 27 April 2026). The lithology data and fault data were freely downloaded from https://geocloud.cgs.gov.cn (accessed on 27 April 2026). The geological disaster data was freely downloaded from https://www.gisrs.cn (accessed on 27 April 2026).

Conflicts of Interest

Authors Lu Li, Hongyan Cheng, Yuhua Guo, Shangqiang Liu, and Jianyong Yin were employed by the company Space Star Technology Co., Ltd. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Pourghasemi, H.; Sadhasivam, N.; Amiri, M.; Eskandari, S.; Santosh, M. Landslide susceptibility assessment and mapping using state-of-the art machine learning techniques. Nat. Hazards 2021, 108, 1291–1316.
Hakim, W.L.; Achmad, A.R.; Lee, C.-W. Land Subsidence Susceptibility Mapping in Jakarta Using Functional and Meta-Ensemble Machine Learning Algorithm Based on Time-Series InSAR Data. Remote Sens. 2020, 12, 3627.
Wang, X.; Wang, Y.; Lin, Q.; Yang, X. Identifying human activities and rainfall impacts in landslide: A case study from southwestern China. Nat. Hazards 2025, 121, 23121–23144.
Zhou, C.; Gan, L.; Cao, Y.; Wang, Y.; Segoni, S.; Shi, X.; Motagh, M.; Singh, R.P. Landslide susceptibility assessment of the Wanzhou district: Merging landslide susceptibility modelling (LSM) with InSAR-derived ground deformation map. Int. J. Appl. Earth Obs. Geoinf. 2025, 136, 104365.
Rahmati, O.; Falah, F.; Naghibi, S.A.; Biggs, T.; Soltani, M.; Deo, R.C.; Cerdà, A.; Mohammadi, F.; Bui, D.T. Land subsidence modelling using tree-based machine learning algorithms. Sci. Total Environ. 2019, 672, 239–252.
Devara, M.; Tiwari, A.; Dwivedi, R. Landslide susceptibility mapping using MT-InSAR and AHP enabled GIS-based multicriteria decision analysis. Geomat. Nat. Hazards Risk 2021, 12, 675–693.
Topali, Z.K.; Ozcan, A.K.; Gokceoglu, C. Performance Comparison of Landslide Susceptibility Maps Derived from Logistic Regression and Random Forest Models in the Bolaman Basin, Türkiye. Nat. Hazards Rev. 2024, 25, 16.
Liu, Y.; Xu, P.; Cao, C.; Zhang, W.; Han, B.; Zhao, M. A quick method of early landslide identification based on dynamic susceptibility analysis using M-SVM method: A case study. Bull. Eng. Geol. Environ. 2023, 82, 454.
Liu, M.; Xu, B.; Li, Z.; Mao, W.; Zhu, Y.; Hou, J.; Liu, W. Landslide Susceptibility Zoning in Yunnan Province Based on SBAS-InSAR Technology and a Random Forest Model. Remote Sens. 2023, 15, 2864.
Xie, Z.; Chen, G.; Meng, X.; Zhang, Y.; Qiao, L.; Tan, L. A comparative study of landslide susceptibility mapping using weight of evidence, logistic regression and support vector machine and evaluated by SBAS-InSAR monitoring: Zhouqu to Wudu segment in Bailong River Basin, China. Environ. Earth Sci. 2017, 76, 313.
Chen, W.; Xie, X.; Peng, J.; Shahabi, H.; Hong, H.; Bui, D.T.; Duan, Z.; Li, S.; Zhu, A.-X. GIS-based landslide susceptibility evaluation using a novel hybrid integration approach of bivariate statistical based random forest method. CATENA 2018, 164, 135–149.
Kavzoglu, T.; Colkesen, I.; Sahin, E.K. Machine Learning Techniques in Landslide Susceptibility Mapping: A Survey and a Case Study. In Landslides: Theory, Practice and Modelling; Pradhan, S., Vishal, V., Singh, T., Eds.; 2019; Volume 50, pp. 283–301.
Ouyang, S.; Chen, W.; Liu, H.; Li, Y.; Xu, Z. A novel landslide susceptibility prediction framework based on contrastive loss. GISci. Remote Sens. 2024, 61, 2306740.
Huang, F.; Cao, Z.; Guo, J.; Jiang, S.-H.; Li, S.; Guo, Z. Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. CATENA 2020, 191, 104580.
Azarafza, M.; Azarafza, M.; Akgün, H.; Atkinson, P.M.; Derakhshani, R. Deep learning-based landslide susceptibility mapping. Sci. Rep. 2021, 11, 24112.
Lu, Y.; Xu, H.; Wang, C.; Yan, G.; Huo, Z.; Peng, Z.; Liu, B.; Xu, C. A Novel Strategy Coupling Optimised Sampling with Heterogeneous Ensemble Machine-Learning to Predict Landslide Susceptibility. Remote Sens. 2024, 16, 3663.
Zhang, A.; Zhao, X.-W.; Zhao, X.-Y.; Zheng, X.-Z.; Zeng, M.; Huang, X.; Wu, P.; Jiang, T.; Wang, S.-C.; He, J.; et al. Comparative study of different machine learning models in landslide susceptibility assessment: A case study of Conghua District, Guangzhou, China. China Geol. 2024, 7, 104–115.
Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; Pham, B.T.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
Chen, Y.; Yu, S.; Tao, Q.; Liu, G.; Wang, L.; Wang, F. Accuracy Verification and Correction of D-InSAR and SBAS-InSAR in Monitoring Mining Surface Subsidence. Remote Sens. 2021, 13, 4365.
Zhu, C.; Wang, C.; Zhang, B.; Qin, X.; Shan, X. Differential Interferometric Synthetic Aperture Radar data for more accurate earthquake catalogs. Remote Sens. Environ. 2021, 266, 112690.
Goldstein, R.M.; Zebker, H.A.; Werner, C.L. Satellite radar interferometry: Two-dimensional phase unwrapping. Radio Sci. 1988, 23, 713–720.
Przyłucka, M.; Herrera, G.; Graniczny, M.; Colombo, D.; Béjar-Pizarro, M. Combination of Conventional and Advanced DInSAR to Monitor Very Fast Mining Subsidence with TerraSAR-X Data: Bytom City (Poland). Remote Sens. 2015, 7, 5300–5328.
Jaboyedoff, M.; Oppikofer, T.; Abell, A.; Derron, M.-H.; Loye, A.; Metzger, R.; Pedrazzini, A. Use of LIDAR in landslide investigations: A review. Nat. Hazards 2010, 61, 5–28.
Ma, P.; Zheng, Y.; Zhang, Z.; Wu, Z.; Yu, C. Building risk monitoring and prediction using integrated multi-temporal InSAR and numerical modeling techniques. Int. J. Appl. Earth Obs. Geoinf. 2022, 114, 103076.
Xu, Q.; Li, W.; Liu, J.; Wang, X. A geographical similarity-based sampling method of non-fire point data for spatial prediction of forest fires. For. Ecosyst. 2023, 10, 100104.
Dou, H.; He, J.; Huang, S.; Jian, W.; Guo, C. Influences of non-landslide sample selection strategies on landslide susceptibility mapping by machine learning. Geomat. Nat. Hazards Risk 2023, 14, 2285719.
Hong, H.; Wang, D.; Zhu, A.-X.; Wang, Y. Landslide susceptibility mapping based on the reliability of landslide and nonlandslide sample. Expert Syst. Appl. 2024, 243, 122933.
Huang, F.; Yin, K.; Huang, J.; Gui, L.; Wang, P. Landslide susceptibility mapping based on self-organizing-map network and extreme learning machine. Eng. Geol. 2017, 223, 11–22.
Oh, H.-J.; Lee, S.; Hong, S.-M. Landslide Susceptibility Assessment Using Frequency Ratio Technique with Iterative Random Sampling. J. Sens. 2017, 2017, 3730913.
Han, Y.; Li, T.; Dai, K.; Lu, Z.; Yuan, X.; Shi, X.; Liu, C.; Wen, N.; Zhang, X. Revealing the Land Subsidence Deceleration in Beijing (China) by Gaofen-3 Time Series Interferometry. Remote Sens. 2023, 15, 3665.
Ji, Y.; Zhang, X.; Li, T.; Fan, H.; Xu, Y.; Li, P.; Tian, Z. Mining Deformation Monitoring Based on Lutan-1 Monostatic and Bistatic Data. Remote Sens. 2023, 15, 5668.
Zhang, Y.; Jiao, Y.-Y.; He, L.-L.; Tan, F.; Zhu, H.-M.; Wei, H.-L.; Zhang, Q.-B. Susceptibility mapping and risk assessment of urban sinkholes based on grey system theory. Tunn. Undergr. Space Technol. 2024, 152, 105893.
Lu, P.; Casagli, N.; Catani, F.; Tofani, V. Persistent scatterers interferometry hotspot and cluster analysis (PSI-HCA) for detection of extremely slow-moving landslides. Int. J. Remote Sens. 2012, 33, 466–489.
Zhu, J.; Baise, L.G.; Thompson, E.M. An Updated Geospatial Liquefaction Model for Global Application. Bull. Seismol. Soc. Am. 2017, 17, 1365–1385.
Gharechaee, H.; Samani, A.N.; Sigaroodi, S.K.; Baloochiyan, A.; Moosavi, M.S.; Hubbart, J.A.; Sadeghi, S.M.M. Land Subsidence Susceptibility Mapping Using Interferometric Synthetic Aperture Radar (InSAR) and Machine Learning Models in a Semiarid Region of Iran. Land 2023, 12, 843.
Liu, H.; Luo, Y.; Feng, W.; Wang, Y.; Ma, H.; Hu, P. Site response of ancient landslides to initial impoundment of Baihetan Reservoir (China) based on ambient noise investigation. Soil Dyn. Earthq. Eng. 2023, 164, 107590. [Google Scholar] [CrossRef]
Cheng, Z.; Liu, S.; Fan, X.; Shi, A.; Yin, K. Deformation behavior and triggering mechanism of the Tuandigou landslide around the reservoir area of Baihetan hydropower station. Landslides 2023, 20, 1679–1689. [Google Scholar] [CrossRef]
Ni, W.; Zhao, L.; Zhang, L.; Xing, K.; Dou, J. Coupling Progressive Deep Learning with the AdaBoost Framework for Landslide Displacement Rate Prediction in the Baihetan Dam Reservoir, China. Remote Sens. 2023, 15, 2296. [Google Scholar] [CrossRef]
Chen, X.; Li, R.; Hu, B.; Yin, Y.; Yang, J.; Jiang, S.; Qin, P.; Huang, B. Deformation response and mechanical analysis of the Wangjiashan landslide in Baihetan Hydropower Station, China, during initial impoundment. Bull. Eng. Geol. Environ. 2023, 82, 344. [Google Scholar] [CrossRef]
Gabriel, A.K.; Goldstein, R.M.; Zebker, H.A. Mapping small elevation changes over large areas: Differential radar interferometry. J. Geophys. Res. Solid Earth 1989, 94, 9183–9191. [Google Scholar] [CrossRef]
Abidin, H.Z.; Andreas, H.; Gumilar, I.; Sidiq, T.P.; Fukuda, Y. Land subsidence in coastal city of Semarang (Indonesia): Characteristics, impacts and causes. Geomat. Nat. Hazards Risk 2013, 4, 226–240. [Google Scholar] [CrossRef]
Ding, Q.; Chen, W.; Hong, H. Application of frequency ratio, weights of evidence and evidential belief function models in landslide susceptibility mapping. Geocarto. Int. 2017, 32, 619–639. [Google Scholar] [CrossRef]
Guo, C.; Montgomery, D.R.; Zhang, Y.; Wang, K.; Yang, Z. Quantitative assessment of landslide susceptibility along the Xianshuihe fault zone, Tibetan Plateau, China. Geomorphology 2015, 248, 93–110. [Google Scholar] [CrossRef]
Duan, X.; Bai, Z.; Rong, L.; Li, Y.; Ding, J.; Tao, Y.; Li, J.; Li, J.; Wang, W. Investigation method for regional soil erosion based on the Chinese Soil Loss Equation and high-resolution spatial data: Case study on the mountainous Yunnan Province, China. CATENA 2020, 184, 104237. [Google Scholar] [CrossRef]
Notti, D.; Davalillo, J.C.; Herrera, G.; Mora, O. Assessment of the performance of X-band satellite radar data for landslide mapping and monitoring: Upper Tena Valley case study. Nat. Hazards Earth Syst. Sci. 2010, 10, 1865–1875. [Google Scholar] [CrossRef]
Ferretti, A.; Prati, C.; Rocca, F. Nonlinear subsidence rate estimation using permanent scatterers in differential SAR interferometry. IEEE Trans. Geosci. Remote Sens. 2000, 38, 2202–2212. [Google Scholar] [CrossRef]
Berardino, P.; Fornaro, G.; Lanari, R.; Sansosti, E. A new algorithm for surface deformation monitoring based on small baseline differential SAR interferograms. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2375–2383. [Google Scholar] [CrossRef]
Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Pradhan, B.; Chen, W.; Khosravi, K.; Panahi, M.; Bin Ahmad, B.; Saro, L. Land Subsidence Susceptibility Mapping in South Korea Using Machine Learning Algorithms. Sensors 2018, 18, 2464. [Google Scholar] [CrossRef]
Mohammady, M.; Pourghasemi, H.R.; Amiri, M. Assessment of land subsidence susceptibility in Semnan plain (Iran): A comparison of support vector machine and weights of evidence data mining algorithms. Nat. Hazards 2019, 99, 951–971. [Google Scholar] [CrossRef]
Band, S.S.; Janizadeh, S.; Chandra Pal, S.; Saha, A.; Chakrabortty, R.; Shokri, M.; Mosavi, A. Novel ensemble approach of deep learning neural network (DLNN) model and particle swarm optimization (PSO) algorithm for prediction of gully erosion susceptibility. Sensors 2020, 20, 5609. [Google Scholar] [CrossRef]
Zhao, F.; Miao, F.; Wu, Y.; Xiong, Y.; Gong, S.; Sun, D. Land subsidence susceptibility mapping in urban settlements using time-series PS-InSAR and random forest model. Gondwana Res. 2024, 125, 406–424. [Google Scholar] [CrossRef]
Zhang, L.; Arabameri, A.; Santosh, M.; Pal, S.C. Land subsidence susceptibility mapping: Comparative assessment of the efficacy of the five models. Environ. Sci. Pollut. Res. 2023, 30, 77830–77849. [Google Scholar] [CrossRef]
Arabameri, A.; Saha, S.; Roy, J.; Chen, W.; Blaschke, T.; Tien Bui, D. Landslide susceptibility evaluation and management using different machine learning methods in the Gallicash River Watershed, Iran. Remote Sens. 2020, 12, 475. [Google Scholar] [CrossRef]
Jenks, G. The Data Model Concept in Statistical Mapping. Int. Yearb. Cartogr. 1967, 7, 186–190. Available online: https://api.semanticscholar.org/CorpusID:215850874 (accessed on 27 April 2026).
Dai, H.; Zhang, H.; Dai, H.; Wang, C.; Tang, W.; Zou, L.; Tang, Y. Landslide Identification and Gradation Method Based on Statistical Analysis and Spatial Cluster Analysis. Remote Sens. 2022, 14, 4504. [Google Scholar] [CrossRef]
Xia, Y.; Wang, Y. InSAR- and PIM-Based Inclined Goaf Determination for Illegal Mining Detection. Remote Sens. 2020, 12, 3884. [Google Scholar] [CrossRef]
Cui, S.; Qiu, H.; Wang, S.; Wang, Y. Two-stage stacking heterogeneous ensemble learning method for gasoline octane number loss prediction. Appl. Soft. Comput. 2021, 113, 107989. [Google Scholar] [CrossRef]

Figure 1. Overview of the study area. (a) Location. (b) Proportions of geological disaster types. (c) Topography.

Figure 2. Distribution of all conditioning factors of the study area. (a) elevation; (b) aspect; (c) slope; (d) contour; (e) mean curvature; (f) plan curvature; (g) profile curvature; (h) distance to road; (i) distance to building; (j) population density; (k) land cover; (l) NDVI; (m) lithology; (n) distance to fault; (o) vegetation coverage; (p) soil erosion; (q) topographic factor; (r) VPD; (s) AVP; (t) distance to Jinsha River; (u) distance to river network; (v) land surface temperature; (w) visibility; (x) coherence; (y) deformation rate; (z) cumulative deformation.

Figure 3. Distribution of the spatial–temporal baselines.

Figure 4. Workflow of InSAR landslide susceptibility mapping for the IME model.

Figure 5. The relationship between the number of iterations and running time.

Figure 6. Detailed workflow of the adaptive sampling strategy.

Figure 7. Structure of the stacking-based improved meta-ensemble (IME) model.

Figure 8. The correlation of all conditioning factors.

Figure 9. The relationship between the occurrence rate of geological disasters and conditioning factors. (a) elevation; (b) aspect; (c) slope; (d) contour; (e) mean curvature; (f) plan curvature; (g) profile curvature; (h) distance to road; (i) distance to building; (j) population density; (k) land cover; (l) NDVI; (m) lithology; (n) distance to fault; (o) vegetation coverage; (p) soil erosion; (q) topographic factor; (r) VPD; (s) AVP; (t) distance to Jinsha River; (u) distance to river network; (v) land surface temperature; (w) visibility; (x) coherence; (y) deformation rate; (z) cumulative deformation.

Figure 10. Susceptibility map of geological disasters generated by the IME-InSAR model. Areas S1 and S2 are selected as representative examples.

Figure 11. Landslide susceptibility map and optical imagery of Area S1. (a) Landslide susceptibility map predicted by the IME-InSAR model. (b) Optical imagery.

Figure 12. Landslide susceptibility map and optical imagery of Area S2. (a) Landslide susceptibility map predicted by the IME-InSAR model. (b) Optical imagery.

Figure 13. Importance ranking of all conditioning factors.

Figure 14. Distribution of stable and non-stable samples in the study area. (a) Distribution map before optimization. (b) Distribution map after optimization.

Figure 15. Landslide susceptibility maps generated by eight models. (a) RF; (b) RF-InSAR; (c) SVM; (d) SVM-InSAR; (e) XGBoost; (f) XGBoost-InSAR; (g) IME; and (h) IME-InSAR.

Figure 16. (a) ROC curves of eight landslide susceptibility models. (b) The enlarged ROC curves of eight landslide susceptibility models at SS1.

Table 1. The statistics of the conditioning factors covering the study area.

No.	Category	Number	Conditioning Factors
1	Topographic data	7	Elevation, aspect, slope, contour, mean curvature, plan curvature, and profile curvature
2	Anthropogenic data	4	Distance to road, distance to building, population density, and land cover
3	Geological data	4	NDVI, lithology, distance to fault, and vegetation coverage
4	Hydro-meteorological data	7	Soil erosion, topographic factor, humidity index VPD, humidity index AVP, distance to Jinsha River, distance to river network, and land surface temperature
5	InSAR-derived data	4	Visibility, coherence, deformation rate, and cumulative deformation
6	Geological hazard data	1	Geological disaster

Table 2. The source and resolution of the conditioning factors.

No.	Factors	Attribute	Resolution (m)	Sources	Data-Derived
1	Elevation	Raster	30	ALOS Global Digital Surface Model (https://www.eorc.jaxa.jp (accessed on 27 April 2026))	Aspect, slope, contour, mean curvature, plan curvature, and profile curvature
2	Land cover	Raster	250	Resource and Environmental Science Data Platform (https://www.resdc.cn (accessed on 27 April 2026))	\
3	Population density	Raster	1000	Resource and Environmental Science Data Platform (https://www.resdc.cn (accessed on 27 April 2026))	\
4	Vegetation coverage	Raster	250	Institute of Tibetan Plateau Research Chinese Academy of Sciences (https://data.tpdc.ac.cn (accessed on 27 April 2026))	\
5	Humidity index, land surface temperature	Raster	1000	Institute of Tibetan Plateau Research Chinese Academy of Sciences (https://data.tpdc.ac.cn (accessed on 27 April 2026))	\
6	NDVI	Raster	30	Chinese Academy of Science Discipline Data Center for Ecosystem (https://nesdc.org.cn (accessed on 27 April 2026))	\
7	Soil erosion, topographic factor	Raster	30	Science Data Bank (https://www.scidb.cn (accessed on 27 April 2026))	\
8	SAR data	Raster	13.99, 2.33 (Az, Rg)	European Space Agency (https://search.asf.alaska.edu (accessed on 27 April 2026))	Visibility, coherence, deformation rate, and cumulative deformation
9	Road, building, and river	Vector	\	OpenStreetMap (https://www.openstreetmap.org (accessed on 27 April 2026))	Distance to road, distance to building, distance to Jinsha River, and distance to river network
10	Lithology, fault	Vector	\	China Geological Survey GeoCloud (https://geocloud.cgs.gov.cn (accessed on 27 April 2026))	Distance to fault
11	Geological disaster	Vector	\	Geographic Remote Sensing Ecological Network Platform (https://www.gisrs.cn (accessed on 27 April 2026))	\

Table 3. The statistics of the TOL and VIF of conditioning factors.

No.	Conditioning Factor	TOL	VIF	No.	Conditioning Factor	TOL	VIF
1	Elevation	0.107	9.365	14	Distance to fault	0.701	1.426
2	Aspect	0.941	1.063	15	Vegetation coverage	0.617	1.621
3	Slope	0.328	3.046	16	VPD	0.24	4.166
4	Mean curvature	0.391	2.56	17	AVP	0.112	8.924
5	Plan curvature	0.501	1.997	18	Distance to Jinsha River	0.636	1.573
6	Profile curvature	0.571	1.752	19	Distance to river network	0.557	1.794
7	Contour	0.987	1.013	20	Soil erosion	0.794	1.259
8	Distance to road	0.687	1.456	21	Land surface temperature	0.177	5.65
9	Distance to building	0.692	1.445	22	Topographic factor	0.348	2.871
10	Population density	0.928	1.078	23	Coherence	0.423	2.365
11	Land cover	0.584	1.714	24	Cumulative deformation	0.224	4.46
12	NDVI	0.593	1.688	25	Deformation rate	0.221	4.522
13	Lithology	0.854	1.172	26	Visibility	0.899	1.112
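
The two statistics in Table 3 are reciprocals of each other (VIF = 1/TOL, with TOL = 1 − R² from regressing one factor on the others). The following is a minimal Python sketch of that relationship, using a few rows copied from Table 3. The helper names and the screening thresholds (TOL > 0.1 and VIF < 10, a common rule of thumb) are assumptions made for illustration, not criteria confirmed by this table.

```python
# TOL/VIF pairs copied from Table 3 (values are rounded in the table).
TABLE3 = {
    "Elevation": (0.107, 9.365),
    "AVP": (0.112, 8.924),
    "Land surface temperature": (0.177, 5.65),
    "Contour": (0.987, 1.013),
}


def vif_from_tol(tol):
    """VIF is the reciprocal of tolerance (TOL = 1 - R^2)."""
    return 1.0 / tol


def is_consistent(tol, vif, rel_tol=0.01):
    """True when 1/TOL matches the reported VIF up to rounding error."""
    return abs(vif_from_tol(tol) - vif) / vif < rel_tol


def passes_screening(tol, vif, tol_min=0.1, vif_max=10.0):
    """Assumed rule of thumb: keep a factor when TOL > 0.1 and VIF < 10."""
    return tol > tol_min and vif < vif_max


for name, (tol, vif) in TABLE3.items():
    print(f"{name}: 1/TOL = {vif_from_tol(tol):.3f}, reported VIF = {vif}, "
          f"consistent = {is_consistent(tol, vif)}, "
          f"kept = {passes_screening(tol, vif)}")
```

Under these assumed thresholds, elevation and AVP sit closest to the cut-off while still passing it.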

Table 4. Statistics of the importance of all conditioning factors.

No.	Conditioning Factor	Importance	No.	Conditioning Factor	Importance	No.	Conditioning Factor	Importance
1	Distance to road	0.10794	10	Vegetation coverage	0.04247	19	NDVI	0.01427
2	Elevation	0.09598	11	Coherence	0.04158	20	Cumulative deformation	0.01386
3	AVP	0.08506	12	Lithology	0.0284	21	Soil erosion	0.00987
4	Land surface temperature	0.07717	13	Slope	0.02487	22	Profile curvature	0.00933
5	VPD	0.07089	14	Visibility	0.02416	23	Mean curvature	0.00849
6	Distance to building	0.06998	15	Aspect	0.02292	24	Plan curvature	0.00781
7	Distance to Jinsha River	0.0635	16	Population density	0.01971	25	Land cover	0.00442
8	Distance to river network	0.06083	17	Deformation rate	0.01895	26	Contour	0.00089
9	Distance to fault	0.05885	18	Topographic factor	0.0178

Table 5. Statistics of the samples’ deformation rates before and after optimization.

	Type	Stable Samples		Risk Samples		Residual Samples
	Absolute Operation	No	Yes	No	Yes	No	Yes
Before Optimization	Quantity	3,264,074		857,650		506,713
	Mean	−2.46	3.80	−2.25	3.56	−2.01	3.43
	Std	5.11	4.21	4.82	3.95	4.79	3.90
After Optimization	Quantity	2,478,265		2,143,226		6946
	Mean	−1.32	2.93	−3.59	4.63	−1.57	2.20
	Std	4.15	3.22	5.64	4.82	2.68	2.18
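
The "Absolute Operation" columns in Table 5 appear to distinguish statistics computed on signed deformation rates from statistics computed on their magnitudes. A small Python sketch of that distinction follows. The sample values are hypothetical and are not taken from the study data, the function name is illustrative, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def rate_stats(rates, absolute=False):
    """Return (mean, std) of deformation rates, optionally computed on |rate|."""
    values = [abs(r) for r in rates] if absolute else list(rates)
    return mean(values), pstdev(values)


# Hypothetical deformation rates with mixed signs (illustrative only).
rates = [-6.0, -2.0, 1.0, 3.0]

signed = rate_stats(rates)                    # opposite signs partly cancel
magnitude = rate_stats(rates, absolute=True)  # overall motion regardless of sign

print("signed   mean/std:", signed)
print("absolute mean/std:", magnitude)
```

With mixed-sign rates, the signed mean is pulled towards zero while the absolute mean reflects how strongly the samples move, which is the quantity of interest when separating stable from non-stable samples.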

Table 6. Comparison of the performance of eight models.

Models	Accuracy	Precision	Recall	F1-Score
RF	0.949	0.970	0.994	0.971
RF-InSAR	0.953	0.972	0.994	0.973
SVM	0.932	0.911	0.961	0.935
SVM-InSAR	0.938	0.938	0.939	0.939
XGBoost	0.964	0.941	0.991	0.965
XGBoost-InSAR	0.970	0.949	0.994	0.971
IME	0.974	0.963	0.986	0.974
IME-InSAR	0.976	0.964	0.990	0.977
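
The four metrics in Table 6 follow from the confusion matrix of a binary classifier. The sketch below, in Python, shows the standard definitions. The confusion-matrix counts are hypothetical and the function names are illustrative. The last lines recompute the F1-score from the precision and recall reported for IME-InSAR.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
m = classification_metrics(tp=90, fp=10, tn=85, fn=15)
print({k: round(v, 3) for k, v in m.items()})

# Precision and recall reported for IME-InSAR in Table 6.
ime_insar_f1 = f1_from(0.964, 0.990)
print(round(ime_insar_f1, 3))
```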

Table 7. Comparison of AUC values for the eight models.

Models	AUC Value	Models	AUC Value
RF	0.97723	RF-InSAR	0.97719
SVM	0.95654	SVM-InSAR	0.96424
XGBoost	0.97655	XGBoost-InSAR	0.97668
IME	0.97772	IME-InSAR	0.97738
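
The AUC values in Table 7 can be read as the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen stable sample. A minimal Python sketch of that rank-based definition is given below. The labels and scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed; it is not the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = landslide, 0 = stable) and predicted scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(roc_auc(labels, scores))
```

Because the measure depends only on the ranking of scores, models whose AUC values differ in the fourth decimal place, as several in Table 7 do, order the samples almost identically.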

Table 8. Parameter settings and reference ranges for eight models.

Models	Hyperparameter	Setting (first model)	Setting (InSAR model)	Reference Range
RF / RF-InSAR	n_estimators	45	49	40–50
RF / RF-InSAR	max_depth	36	25	20–40
SVM / SVM-InSAR	C	5	5	5–10
SVM / SVM-InSAR	gamma	21	23	20–25
XGBoost / XGBoost-InSAR	learning_rate	0.1	0.1	0.1
XGBoost / XGBoost-InSAR	n_estimators	41	43	40–50
XGBoost / XGBoost-InSAR	max_depth	36	25	10–20
IME / IME-InSAR	Above parameters	Same as above	Same as above	Same as above
