1. Introduction
Landslides occur worldwide and are among the most destructive natural disasters [1]. Climate change and intensifying human activity are expected to increase the frequency of landslides, posing a serious threat to infrastructure and human life and hindering social development [2]. Identifying early-stage landslides in complex mountainous areas, analyzing them, and evaluating landslide sensitivity can give local governments a foundation for managing landslide risk zones and can guide land-use planning [3,4].
Traditional landslide identification and cataloguing rely primarily on field surveys, which are difficult to carry out in vast and topographically complex regions [5]. Since the 1970s, researchers have combined remote-sensing imagery with ground survey data for the manual visual interpretation of landslides [6]. However, optical remote sensing has limitations for landslide identification: image quality is easily degraded by weather, particularly cloud cover, and landslides that involve only minor deformation are hard to detect. Various other remote-sensing data sources have since emerged, including radar, synthetic aperture radar (SAR), interferometric SAR (InSAR), satellite stereo imagery, high-resolution images, drone imagery, and light detection and ranging (LiDAR) [7,8,9]. InSAR is a microwave remote-sensing technology that has developed rapidly in recent years. Compared with traditional methods, it offers wide coverage, high resolution, all-day detection, and high monitoring accuracy. These advantages compensate for the shortcomings of traditional methods of recognizing and monitoring landslides in mountainous areas, especially in places that ground-based monitoring cannot easily reach [10]. Currently, the mainstream InSAR-based methods for landslide identification include differential InSAR (D-InSAR), permanent-scatterer InSAR (PS-InSAR), and small baseline subset InSAR (SBAS-InSAR) [11,12,13]. Among them, SBAS-InSAR effectively alleviates the decorrelation and atmospheric effects caused by excessively long spatial baselines in D-InSAR. It also improves the temporal sampling frequency, so that slope deformation can be measured more accurately and the safety state of slopes revealed [14]. Compared with PS-InSAR, the deformation maps obtained by SBAS-InSAR are spatially more continuous, which gives the method a significant advantage in monitoring landslides in mountainous areas [15]. In recent years, many scholars have used SBAS-InSAR in landslide-monitoring research aimed at the early identification of landslides, with remarkable results [16,17]. Although InSAR is widely used in landslide research, challenges remain in mountainous areas. Dense vegetation and steep terrain can cause decorrelation, while geometric distortions and atmospheric disturbances complicate the analysis. Relying on data from a single orbit may also lead to landslides being misidentified. These issues underline the need for more comprehensive monitoring methods, such as combining high-resolution imagery with multi-orbit data, to improve landslide detection accuracy.
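To make the small-baseline idea concrete, the following minimal Python sketch selects interferogram pairs whose temporal and perpendicular baselines both stay below chosen thresholds. The acquisition dates, baseline values, thresholds, and the function name are hypothetical illustrations and are not the data or parameters used in this study.

```python
from datetime import date
from itertools import combinations

# Hypothetical acquisitions: (acquisition date, perpendicular baseline in
# metres relative to an arbitrary reference scene). Illustrative values only.
acquisitions = [
    (date(2021, 1, 1), 0.0),
    (date(2021, 1, 13), 45.0),
    (date(2021, 1, 25), -80.0),
    (date(2021, 2, 6), 160.0),
    (date(2021, 3, 14), 20.0),
]


def select_small_baseline_pairs(scenes, max_days, max_perp_m):
    """Return index pairs (i, j) whose temporal and perpendicular
    baselines are both within the given thresholds."""
    pairs = []
    for (i, (d1, b1)), (j, (d2, b2)) in combinations(enumerate(scenes), 2):
        temporal = abs((d2 - d1).days)   # temporal baseline in days
        perpendicular = abs(b2 - b1)     # spatial baseline in metres
        if temporal <= max_days and perpendicular <= max_perp_m:
            pairs.append((i, j))
    return pairs


# Assumed thresholds (hypothetical): 36 days and 150 m.
pairs = select_small_baseline_pairs(acquisitions, max_days=36, max_perp_m=150.0)
```

In this toy network every scene remains connected to the others through at least one short-baseline pair, which is the property a small-baseline inversion relies on to recover a continuous deformation time series.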
Landslide sensitivity evaluation assesses the probability of landslide occurrence. The effectiveness of landslide sensitivity modeling depends not only on the quality of the algorithms used but also on the screening of disaster-inducing factors, the handling of positive and negative landslide samples, and the treatment of missing values, noise, and erroneous data [18]. Currently, the selection of disaster-inducing factors relies mainly on expert experience, but a uniform indicator system may not be fully applicable across different geological contexts [19]. Because the geodetector can identify spatial differentiation and help explain the mechanisms of influencing factors, it has gradually been applied to factor identification for geological disasters, with significant results [20]. In landslide sensitivity modeling, common machine-learning methods include logistic regression (LR), support vector machines (SVM), random forest (RF), gradient boosting decision trees (GBDT), and CatBoost [21]. Deep-learning models, such as Transformer, long short-term memory networks (LSTM), and convolutional neural networks (CNN), are also gradually being applied in this field [22]. Using Piedmont, Italy, as an example, Taalab demonstrated that RF can generate highly accurate landslide susceptibility maps for large heterogeneous areas without multiple evaluations [23]. Gu introduced a semi-supervised learning method for screening non-landslide samples and showed that the method works best when combined with CatBoost [24]. Akgun, evaluating results with the area under the curve (AUC), concluded that LR was the most accurate model [25]. In addition to single classifiers, many scholars have used stacking and deep-learning models to handle complex data and accurately predict landslide-sensitive areas [26,27,28]. Although many landslide sensitivity evaluation models exist, their practical performance is still affected by many factors. Hence, demonstration studies of suitable models remain essential for areas with specific geological characteristics.
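As a brief illustration of the AUC criterion mentioned above, the sketch below computes AUC through its rank interpretation: the fraction of (landslide, non-landslide) sample pairs in which the landslide sample receives the higher predicted score. The labels and scores are made-up values, and the function name is an assumption for illustration only.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (landslide, non-landslide) pairs in which
    the landslide sample gets the higher score; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = landslide, 0 = non-landslide) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = auc_score(labels, scores)
```

A value of 1.0 means every landslide sample is ranked above every non-landslide sample, while 0.5 corresponds to a ranking no better than chance.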
The purpose of this paper is to explore an integrated method for the identification, analysis, and sensitivity evaluation of landslide hazards with small samples in complex mountainous areas. Taking Lamping County as the study area, this study combines data from two orbital tracks, applies SBAS-InSAR together with high-resolution optical imagery to interpret landslide hazards, and analyzes disaster-inducing factors with the geodetector (GD). A GD–machine-learning model is then constructed for a demonstration study of small-sample landslide sensitivity evaluation, providing guidance and reference for landslide research in similar geographical environments.
5. Discussion
5.1. Landslide Hazard Identification in Complex Mountainous Areas
The joint method used in this paper effectively overcomes the limitations of optical imagery, which is susceptible to weather and has difficulty detecting landslides with minor deformation, and reduces the identification errors that arise from single-orbit radar satellite data in high mountain canyon areas. Through UAV aerial photography and on-site validation, the study found that the landslide hazards in Cheyiping are more serious than had previously been recognized and that many residents live in this vulnerable area. Therefore, early identification of landslides in high mountain canyon areas is of great practical significance for disaster prevention and mitigation and for the protection of people’s lives and property.
However, this study still has some limitations. First, the C-band radar data used here has limited penetration in densely vegetated areas, which makes landslide hazard identification less effective in some regions. To improve the accuracy of InSAR monitoring, future studies will consider incorporating L-band radar data, such as LuTan data, which penetrates vegetation better than C-band radar and can more effectively address landslide identification under complex surface conditions. Second, during the field validation of landslide hazards, we selected only a few representative areas for on-site investigation, so the validation may not cover all types of landslide risk within the region. Future studies will therefore expand the scope of field validation in high mountains, deep valleys, and areas with complex geological conditions, and will consider using high-resolution remote-sensing technologies such as LiDAR for more precise terrain and deformation monitoring. This will further improve the accuracy of landslide hazard identification.
5.2. Screening and Risk Zoning of Landslide Disaster-Inducing Factors
The selection of landslide predisposing factors is usually based on the combined effects of natural conditions and human activities on landslide occurrence. Currently, many scholars rely only on expert experience to choose indicator factors for sensitivity studies, without examining whether each factor contributes sufficiently to the evaluation or how the factor risk zones should be divided. This reliance on expert experience limits the ability to discriminate landslide-inducing factors in a specific area. This study takes advantage of the geodetector method, which not only identifies the significant disaster-inducing factors for landslides from large datasets but also distinguishes high-risk factor intervals. With the factor detector, we found that DEM, rainfall, distance from roads, and engineering geological rock formation were the most significant factors for landslide occurrence. In contrast, factors such as curvature and surface roughness did not pass the significance test and should therefore be excluded.
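The factor detector ranks factors by the geodetector q-statistic, q = 1 - SSW/SST, where SSW is the sum of within-stratum squared deviations of the response and SST is its total sum of squared deviations. The following minimal Python sketch computes q for one stratified factor. The stratum labels and landslide indicator values are hypothetical, and the accompanying significance test is not reproduced here.

```python
from collections import defaultdict


def q_statistic(strata, values):
    """Geodetector q-statistic: 1 - (within-stratum sum of squares) /
    (total sum of squares). Values near 1 mean the factor's strata
    explain most of the spatial variation in the response."""
    n = len(values)
    mean = sum(values) / n
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        raise ValueError("response has no variance; q is undefined")
    groups = defaultdict(list)
    for stratum, value in zip(strata, values):
        groups[stratum].append(value)
    within_ss = 0.0
    for members in groups.values():
        stratum_mean = sum(members) / len(members)
        within_ss += sum((v - stratum_mean) ** 2 for v in members)
    return 1.0 - within_ss / total_ss


# Hypothetical grid cells: elevation class and landslide indicator (1 = landslide).
elevation_class = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
landslide = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
q_elevation = q_statistic(elevation_class, landslide)
```

In this toy example the mid-elevation class contains most of the landslide cells, so stratifying by elevation explains a substantial share of the variation in landslide occurrence.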
DEM (elevation) primarily affects the probability of landslide occurrence indirectly, through terrain, slope, climate, and vegetation conditions. Areas within certain elevation ranges are typically characterized by steeper slopes, a larger downslope component of gravity, and a higher likelihood of instability in the rock and soil masses. Therefore, in mountainous areas, landslide occurrence is typically closely related to specific elevation conditions. Rainfall increases the infiltration of water into slopes, which reduces the friction coefficient along weak zones, thereby weakening their shear strength and promoting slope failure. Additionally, the distance from main roads is often used as an indicator of human engineering activity. Roads built on slopes disrupt the support structure at the base of the slope, and as the terrain changes and support is lost, cracks may form and expand. When moisture further infiltrates the slope, it can eventually lead to instability. Differences in the properties of the engineering geological rock formation, such as rock composition, hardness, and degree of weathering and fracturing, determine the development characteristics of landslides.
Although the factor detector ranks the above factors as the most important, this does not mean that the role of other important factors, such as slope, is neglected. Landslides are often caused by the interaction of multiple factors, and the interaction detector results show that the combined effect of slope, DEM, and other factors can have higher explanatory power than the factors considered individually. It is therefore reasonable to conclude from the geodetector results that landslides are more sensitive to the significant disaster-inducing factors identified above. Finally, through the risk detector, we were also able to determine the risk intervals of each disaster-inducing factor for landslides. This analysis provides important insights for the identification and risk assessment of different types of landslides and can help guide the formulation of landslide mitigation and prevention measures.
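The interaction detector can be illustrated in the same spirit. In the minimal sketch below, the q-statistic of two factors overlaid (each cell labelled by its pair of classes) is compared with the q-statistics of the individual factors, following the usual geodetector interaction categories. The slope and elevation classes and the landslide indicator are hypothetical, and the function names are chosen for illustration only.

```python
from collections import defaultdict


def q_statistic(strata, values):
    """Geodetector q: 1 - within-stratum sum of squares / total sum of squares."""
    mean = sum(values) / len(values)
    total_ss = sum((v - mean) ** 2 for v in values)
    if total_ss == 0:
        raise ValueError("response has no variance; q is undefined")
    groups = defaultdict(list)
    for stratum, value in zip(strata, values):
        groups[stratum].append(value)
    within_ss = 0.0
    for members in groups.values():
        m = sum(members) / len(members)
        within_ss += sum((v - m) ** 2 for v in members)
    return 1.0 - within_ss / total_ss


def interaction_detector(x1, x2, y, tol=1e-9):
    """Compare q of the overlaid factors with the single-factor q values."""
    q1 = q_statistic(x1, y)
    q2 = q_statistic(x2, y)
    q12 = q_statistic(list(zip(x1, x2)), y)  # overlay: one stratum per class pair
    low, high, total = min(q1, q2), max(q1, q2), q1 + q2
    if q12 < low - tol:
        kind = "nonlinear weakening"
    elif q12 < high - tol:
        kind = "single-factor nonlinear weakening"
    elif abs(q12 - total) <= tol:
        kind = "independent"
    elif q12 > total:
        kind = "nonlinear enhancement"
    else:
        kind = "bi-factor enhancement"
    return q1, q2, q12, kind


# Hypothetical cells: slope class, elevation class, landslide indicator.
slope = ["gentle", "gentle", "steep", "steep", "gentle", "gentle", "steep", "steep"]
elevation = ["low", "low", "low", "low", "high", "high", "high", "high"]
landslide = [0, 0, 0, 1, 0, 0, 1, 1]
q_slope, q_elev, q_both, kind = interaction_detector(slope, elevation, landslide)
```

In this toy dataset elevation alone explains little, yet overlaying it with slope explains more than the two single-factor values added together, which is the pattern usually labelled nonlinear enhancement.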
5.3. Demonstration Study on the Sensitivity Evaluation of Small Samples of Landslides
This study selected 13 disaster-inducing factors based on the geodetector and used the RF, GBDT, CatBoost, LR, and stacking algorithms for landslide sensitivity evaluation. For sample selection, we applied the K-means clustering algorithm to optimize the non-landslide samples, which supports a sound experimental design. For model optimization, Bayesian optimization was used for hyperparameter tuning to minimize the impact of human intervention on model performance. The experimental results show that all four models performed well, with the GD-CatBoost model achieving the highest accuracy.
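The text does not spell out how K-means was used to optimize the non-landslide samples. The sketch below therefore assumes one possible reading: candidate non-landslide cells are clustered in factor space, and the cells closest to each cluster centre are kept as representative negative samples. The two normalized features, the candidate values, the deterministic farthest-point initialization, and all function names are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization
    (a simplification; randomized k-means++ is more common in practice)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignment, centroids


def pick_representatives(points, assignment, centroids, per_cluster):
    """Keep the per_cluster points closest to each cluster centre."""
    chosen = []
    for c, centre in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == c]
        members.sort(key=lambda p: sq_dist(p, centre))
        chosen.extend(members[:per_cluster])
    return chosen


# Hypothetical candidate non-landslide cells: (normalized elevation, normalized slope).
candidates = [
    (0.10, 0.10), (0.12, 0.08), (0.08, 0.12), (0.11, 0.11),
    (0.90, 0.90), (0.88, 0.92), (0.92, 0.88), (0.89, 0.89),
]
assignment, centroids = kmeans(candidates, k=2)
negatives = pick_representatives(candidates, assignment, centroids, per_cluster=2)
```

Drawing the same number of samples from each cluster keeps the negative set spread across distinct environmental settings rather than concentrated in the most common one. Other uses of K-means for this purpose are equally plausible.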
However, the study area is relatively limited, with only 245 landslide samples. We conducted an extensive selection and comparison of learning algorithms. An initial attempt to use the Tab-Transformer deep-learning model for landslide sensitivity evaluation performed poorly, mainly because of the limited number of landslide samples. Although deep-learning methods have shown promising results in landslide sensitivity evaluation, this study indicates that, under small-sample conditions, the GD-CatBoost model is an excellent classification tool that can effectively distinguish and identify areas potentially sensitive to landslide occurrence.
In future research, we expect to enlarge the landslide sample through in-depth studies of large landslide areas and to incorporate deep-learning models (Transformer, CNN-LSTM, etc.), in order to improve on the evaluation precision and accuracy achieved under small-sample conditions.