Landslide Susceptibility Evaluation Based on a Coupled Informative–Logistic Regression Model—Shuangbai County as an Example

Shuangbai County, located in Yunnan Province, Southwest China, possesses a complex and diverse geological environment and experiences frequent landslide disasters. Because the county is a significant area for disaster prevention and control, assessing its landslide susceptibility is crucial for effective geological disaster prevention, urban planning, and development. This research focuses on eleven influencing factors, including elevation, slope, slope direction, rainfall, NDVI, and distance from faults, selected as evaluation indexes. The assessment model is constructed using the information quantity method and the information quantity-logistic regression coupling method to analyze the landslide susceptibility in Shuangbai County. The entire region's landslide susceptibility is classified into four categories: not likely to occur, low susceptibility, medium susceptibility, and high susceptibility. The accuracy and reasonableness of the models are tested and compared. The results indicate that the coupled information-logistic regression model (80.0% accuracy) outperforms the single information model (74.2% accuracy). Moreover, the density of disaster points in the high-susceptibility area of the coupled model is higher, making it more reasonable. Thus, this model can serve as a valuable tool for evaluating regional landslide susceptibility in Shuangbai County and as a basis for disaster


Introduction
A landslide, which is a geological hazard, refers to the downward movement of rock and soil on a slope due to gravity and external forces. Studies have shown that developing nations are more vulnerable to such hazards compared to developed ones [1]. China, being the most populous developing nation, faces considerable landslide risks due to its mountainous topography, complex geological conditions, and frequent seismic activity [2]. Landslides in China account for a significant proportion of all geological hazards, characterized by their widespread occurrence, large scale, uneven distribution, and severe impact [3].
To optimally utilize land resources and strategically manage geological disasters, it is crucial to investigate, analyze, evaluate, and predict landslides in specific regions. Creating landslide susceptibility maps, used for early urban infrastructure planning, is an essential proactive strategy for mitigating geological disasters [4,5]. A comprehensive evaluation of landslides includes assessing susceptibility, hazard, vulnerability, and sensitivity, where susceptibility indicates the likelihood of landslides occurring. Various assessment models employ both qualitative and quantitative analyses. Local and international researchers have utilized qualitative analyses, such as hierarchical analysis, heavily relying on expert opinion and subjectivity [6][7][8][9][10]. Quantitative analyses, on the other hand, focus on numerical data interpretation, using statistical models like the information quantity method, frequency ratio, logistic regression, random forest, and artificial neural network to quantify landslide zoning [11][12][13][14][15][16]. Previous single-model evaluations have some limitations. The informativeness method can reflect the degree of influence of factor grading on landslides but cannot judge the relative magnitude of each influencing factor's impact on landslide occurrence [17]. The logistic regression model can reflect the contribution of each factor to landslides well but cannot account for the influence of different factor gradings. Combining the informativeness model with the logistic regression model can address their respective shortcomings, reflecting both the contribution of influencing factors to landslides and the magnitude of the influence of different factor conditions on landslides. Consequently, this paper selects the coupled model of informativeness and logistic regression to assess landslide susceptibility in Shuangbai County. 
This approach is novel, reduces human subjectivity, improves modeling effectiveness and credibility, and allows for understanding of the influence of each factor on landslides, both globally and locally. The model significantly enhances the logistic regression model by calculating the value of each factor's informativeness as a selection of logistic regression features, providing more interpretable results compared to methods like random forest and neural networks. The findings have important theoretical and practical implications for guiding landslide disaster prevention and control efforts.

Overview of the Study Area
Shuangbai County, nestled within the core of Central Yunnan Province, borders the Miaolun Mountains to the east and the watershed of the Jinsha River and Red River system to the south. It falls under the jurisdiction of the Chuxiong Autonomous Prefecture of Yunnan Province and serves as a confluence for the cities of Chuxiong, Yuxi, and Pu'er. The geographical coordinates for the county extend from 101°03′ E to 102°02′ E and from 24°13′ N to 24°55′ N (Figure 1). The area is characterized by a northern subtropical-highland monsoon climate, marked by its hot and rainy periods and a clear distinction between wet and dry seasons. All river systems within Shuangbai County are part of the Lishe River region in the upper portion of the Red River Basin. This network includes 35 large and small streams and rivers, with the more significant tributaries being the Malong, Lushui, and Shadian Rivers. Positioned on the western edge of Yunnan's "mountain" front arc and the central segment of the Qinghai, Tibet, Yunnan, and Myanmar "outlaw" tectonic structures, the county covers part of the Nanhua-Chuxiong-Qujiang Fault. These challenging geological conditions contribute to a vulnerable geological environment prone to a myriad of geological disasters. The study and understanding of these dynamics hold considerable theoretical value and practical relevance for local disaster prevention and mitigation efforts.

Research Methodology
The information quantity encapsulates both the quality and quantity of data acquired about landslides. The information quantity model (IM) employs the total count of values at the time of a geological hazard's manifestation to depict its susceptibility. The occurrence of landslides is subject to numerous factors, and the magnitude and roles of these factors can vary across different geological environments; hence, there is an optimal combination of factors [18]. Utilizing probability theory, information theory, and engineering geological analogy, the information quantity serves to express the relative probability of landslide occurrence under the compounded effects of various elements. That is to say, a higher information quantity corresponds to an increased likelihood of a landslide occurrence [19,20]. The equation to compute the amount of information is as follows [21][22][23]:
$$I = \ln\frac{N_i/N}{S_i/S} \tag{1}$$
where $N_i$ represents the number of landslides occurring in each classification of the impact factor; $N$ denotes the total number of landslides in the study area; $S_i$ represents the number of graded rasters for each influence factor; and $S$ denotes the total number of rasters in the study area. When $I > 0$, conditions are favorable for landslide occurrence; when $I < 0$, conditions are unfavorable for landslide occurrence.
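The information value of a single factor class can be sketched in a few lines of Python. This is a minimal illustration rather than the workflow used in the study: the function name `information_value`, the handling of classes without landslides, and the raster counts in the example are all assumptions made for demonstration.

```python
import math

def information_value(n_i, n_total, s_i, s_total):
    """Information value I = ln((N_i / N) / (S_i / S)) for one factor class."""
    if n_i == 0:
        # ln(0) is undefined; -inf is an assumed convention for classes
        # that contain no recorded landslides.
        return float("-inf")
    return math.log((n_i / n_total) / (s_i / s_total))

# Hypothetical class: 47 of 148 landslides fall on 25% of the rasters.
iv = information_value(n_i=47, n_total=148, s_i=250_000, s_total=1_000_000)
print(round(iv, 4))  # positive, so the class favors landslide occurrence
```

A class whose share of landslides equals its share of rasters yields a value of zero, which is the boundary between favorable and unfavorable conditions described above.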

Logistic Regression Model
Logistic regression (LR) can be conceptualized as a linear regression model encapsulated within a sigmoid function. In this analysis model, the chosen factor operates as the independent variable, while the occurrence of a landslide (designated as 1 for presence and 0 for absence) acts as the dependent variable. Often employed in regression analysis for binary dependent variables, logistic regression represents a non-linear categorical statistical methodology. This model can be articulated using the subsequent expressions [24][25][26]:
$$p = \frac{1}{1 + e^{-Y}} \tag{2}$$
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \tag{3}$$
where $\beta_0$ is the constant of logistic regression, while $\beta_1, \beta_2, \ldots, \beta_n$ represent the logistic regression coefficients corresponding to each factor. $p$ symbolizes the probability of landslide occurrence, and $x_n$ represents the $n$th factor.

Informativeness-Logistic Regression Model
The information quantity model effectively reflects the influence of each factor's grading on landslides, but it does not easily reveal the contribution rate of each influencing factor. The logistic regression model, on the other hand, reflects the overall contribution of each factor to landslides but does not account for the influence of the different gradings within a factor. The IM-LR coupled model combines the strengths of both models, considering both the contribution of each factor and the influence of its grading on landslides. To create non-landslide points for comparison, the same number of points is randomly generated using ArcGIS 10.8. The data of landslide points and non-landslide points are imported into SPSS software, with the information amount as the independent variable and the presence of a landslide point as the dependent variable (1 for yes, 0 for no). The constant term and regression coefficients of the factors are calculated and then substituted into Equations (2) and (3) to obtain the value of p.
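The coupling step can be sketched as follows. This is only an illustration under stated assumptions: the study fits the regression in SPSS, whereas the sketch swaps in a plain batch gradient descent written in standard-library Python; the two-factor synthetic information values, the learning rate, and the names `fit_logistic` and `sigmoid` are hypothetical.

```python
import math
import random

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def fit_logistic(rows, labels, rate=0.5, epochs=2000):
    """Estimate the constant and coefficients by batch gradient descent on the log-loss."""
    n, k = len(rows), len(rows[0])
    beta0, betas = 0.0, [0.0] * k
    for _ in range(epochs):
        grad0, grads = 0.0, [0.0] * k
        for row, label in zip(rows, labels):
            error = sigmoid(beta0 + sum(b * v for b, v in zip(betas, row))) - label
            grad0 += error
            for j, v in enumerate(row):
                grads[j] += error * v
        beta0 -= rate * grad0 / n
        betas = [b - rate * g / n for b, g in zip(betas, grads)]
    return beta0, betas

random.seed(0)
# Synthetic information values for two factors (not the study's data):
# landslide points tend to sit in classes with positive information values.
landslide = [[random.gauss(0.3, 0.5), random.gauss(0.3, 0.5)] for _ in range(100)]
stable = [[random.gauss(-0.3, 0.5), random.gauss(-0.3, 0.5)] for _ in range(100)]
rows = landslide + stable
labels = [1] * 100 + [0] * 100  # 1 = landslide point, 0 = non-landslide point

beta0, betas = fit_logistic(rows, labels)
predicted = [sigmoid(beta0 + sum(b * v for b, v in zip(betas, r))) >= 0.5 for r in rows]
train_accuracy = sum(p == bool(t) for p, t in zip(predicted, labels)) / len(labels)
print(beta0, betas, train_accuracy)
```

Because the independent variables are information values rather than raw factor measurements, each fitted coefficient weights a whole factor while the grading effect is already carried by the information value itself.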

Data Sources
The disaster points, totaling 148 landslides, were obtained from the "Geo-Remote Sensing Ecological Network Scientific Data Registration and Publication System" (www.gisrs.cn, accessed on 1 April 2023). The distribution of these landslides is illustrated in Figure 2. In this region, the landslides are predominantly situated near faults with fractured rock bodies and are also found along water systems and roads. As for the non-hazardous points, they were randomly generated at 1 km intervals using ArcGIS.
In this study, 11 influential factors have been chosen as evaluation metrics for assessing landslide susceptibility in Shuangbai County. The data sources for each respective factor are outlined below (Table 1):

Evaluation Module
An evaluation unit, which can be either regular or irregular, represents the smallest spatial element used for assessing susceptibility to geological hazards. Typical types of cells utilized in landslide hazard susceptibility zoning include raster cells, sub-basin cells, and slope cells [27]. In this study, in alignment with previous landslide hazard research and the unique conditions of the area, we opted for raster cells as the evaluation units. To maintain consistency in the number and size of the raster cells across all influencing factors, the raster size is computed according to the following empirical formula [28,29]:
$$G_S = 7.49 + 0.0006S - 2.0 \times 10^{-9} S^2 + 2.9 \times 10^{-15} S^3 \tag{4}$$
where $G_S$ represents the empirical raster size, while $S$ signifies the denominator of the scale in the topographic map used to generate the DEM. Following this calculation, an evaluation unit size of 30 m × 30 m was selected. Furthermore, the coordinate system for all layers was standardized to WGS_1984_UTM_Zone_47N, in accordance with the longitude range of the study area.

Analysis of Impact Factors
Following the determination of the 11 impact factors, we introduced the concept of landslide-relative point density (LRPD). This measure can ascertain the spatial relationship between different gradations of evaluation factors and the geological hazards. The magnitude of the relative point density reflects the importance of evaluation factors at all levels in relation to landslide hazard occurrences; a larger value indicates a higher susceptibility of the graded range to landslide occurrences. The calculation formula is as follows:
$$\mathrm{LRPD} = \frac{m_i/M}{n_i/N} \tag{5}$$
where $m_i$ represents the number of landslides within a specific grading, while $M$ signifies the total number of landslides in the study area. Similarly, $n_i$ indicates the area of a particular classification, and $N$ denotes the total area of the study area.
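As a worked illustration of the ratio, the sketch below takes the figures reported later in this section for the 0-1 km fault-distance class (LRPD of 1.25916 and 31.76% of landslides) and rearranges the definition to recover the share of the study area that class would occupy. The helper names are hypothetical, and the implied area share is a back-calculation, not a value stated in the paper.

```python
def lrpd(m_i, m_total, n_i, n_total):
    """LRPD = (m_i / M) / (n_i / N): landslide share divided by area share."""
    return (m_i / m_total) / (n_i / n_total)

def implied_area_share(lrpd_value, landslide_share):
    """Rearranged definition: area share = landslide share / LRPD."""
    return landslide_share / lrpd_value

# Reported for the 0-1 km fault-distance class: LRPD 1.25916, 31.76% of landslides.
area_share = implied_area_share(1.25916, 0.3176)
print(round(area_share, 4))  # fraction of the study area implied by the reported values

# Round trip: feeding the shares back into the definition reproduces the LRPD.
check = lrpd(m_i=0.3176, m_total=1.0, n_i=area_share, n_total=1.0)
print(round(check, 5))
```

An LRPD above 1 therefore means that a class holds a larger share of landslides than of area, which is why such classes are read as more susceptible.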
(1) Topography
Elevation: Landslides generally demonstrate a notable pattern in their distribution with elevation [30]. Factors such as vegetation type, sunlight exposure, vegetation cover, degree of weathering, human engineering activities, and rainfall all have certain correlations with elevation. In this study, the natural breakpoint method is employed to divide elevation into six levels, as illustrated in Table 2 and Figure 3a. Elevation within the range of 1659-1914 m registers the highest LRPD, indicating a higher susceptibility for landslide occurrences, whereas the range of 2284-3017 m, with the smallest LRPD, shows lesser susceptibility.
Slope: The slope of the surface unit plays a significant role in landslide formation, with a certain degree of slope being a crucial factor [31]. In this research, ArcGIS is used to perform slope analysis, and the slopes are classified into six classes using ten-degree intervals, as shown in Table 2 and Figure 3b. Landslide occurrences require a specific slope that is not excessively steep; a moderate slope ranging from 10° to 20° is more conducive to landslide formation.
Slope Aspect: The slope aspect can impact solar radiation, precipitation, and vegetation cover [32]. Given that the study area is located in the northern hemisphere, the south-facing slopes are sunlit, while the north-facing ones are shaded. In this study, the slope direction is analyzed by ArcGIS, and the slope aspect is divided into intervals of 45°. As shown in Table 2 and Figure 3c, LRPD values of 1.192151, 1.08671, and 1.368685 are reported among the east, northeast, north, and northwest slope aspects, suggesting that shaded slopes, which receive less light and therefore carry poorer vegetation, are less well stabilized, which increases the susceptibility to landslide hazards in this area.
Variations in plane curvature and profile curvature can also significantly impact landslides. In this study, curvature analysis is conducted using ArcGIS. Both plane curvature and profile curvature are divided into six categories, as presented in Table 2, and illustrated in Figure 3j,k. The profile curvature demonstrates the largest LRPD is in the range of 38.0-51.9, with a value of 1.271001, indicating a substantial influence on landslide occurrences. Similarly, the plane curvature showcases its largest LRPD within the range of 0-4.6, with a value of 1.314082, reflecting its considerable impact on landslides.
(2) Hydrological environment
Proximity to Water Systems: Rivers can influence the stress conditions and physical-mechanical properties of slopes. The rivers in Shuangbai County exhibit significant variations in water level, swift currents, and substantial volume changes. These are predominantly mountainous rivers, which persistently erode slopes, thereby accelerating landslide incidents. This study creates a multi-ring buffer zone through ArcGIS analysis, divided into six tiers, as illustrated in Table 2 and Figure 3i. The LRPD in the 0-300 m range is 1.296010, with the LRPD decreasing as the distance increases. Consequently, the closer the proximity to the water system, the greater the likelihood of a landslide occurrence.
Average Annual Rainfall: Surface-accumulated atmospheric precipitation either infiltrates the ground or enters the slope via surface runoff fissures, thereby facilitating lubrication, erosion, and dissolution [33]. Using the natural breakpoint method through ArcGIS analysis, this study divides the annual rainfall into five levels, as shown in Table 2 and Figure 3g. Landslides tend to occur within the range of 628-721 mm of average annual rainfall, constituting 67.57% of incidents.
(3) Geological structure and ground cover
Fault Proximity: The distance from a fault line is a significant contributing factor, with greater fragmentation of the geotechnical body closer to the fault. As presented in Table 2 and Figure 3h, this paper classifies the distance from the fault into six categories at 1 km intervals through ArcGIS for multi-ring buffer analysis. Landslides are most prone to occur within the 0-1 km range, exhibiting an LRPD of 1.25916 and constituting 31.76% of landslides.
Normalized Difference Vegetation Index (NDVI): Vegetation exerts a bidirectional effect on slope stability, and landslides can take place across various levels of vegetation cover. The formula is as follows:
$$\mathrm{NDVI} = \frac{NIR - R}{NIR + R} \tag{6}$$
The NDVI was calculated using ENVI, where $NIR$ represents the near-infrared band and $R$ denotes the red band. In this paper, the natural breakpoint method was employed to classify NDVI into five categories, as shown in Table 2 and Figure 3f. The maximum LRPD value occurs when the NDVI is within the 0.58-0.65 range, a condition conducive to landslide occurrence.
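The band ratio itself is simple to express in code. The following is a minimal per-pixel sketch, not the ENVI workflow used in the study; the reflectance values and the zero-denominator convention are assumptions chosen for illustration.

```python
def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R) for one pixel, with R taken as the red band."""
    total = nir + red
    if total == 0:
        # Assumed convention for pixels with no signal in either band.
        return 0.0
    return (nir - red) / total

# Hypothetical surface reflectances for a moderately vegetated pixel.
value = ndvi(nir=0.45, red=0.10)
print(round(value, 3))
```

With these illustrative reflectances the pixel lands inside the 0.58-0.65 class that the text identifies as having the maximum LRPD.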

(4) Stratigraphic lithology
Stratigraphic Lithology: The physical and chemical properties of stratigraphic rocks vary, influencing their resistance to weathering and thus impacting landslide formation [34,35]. In general, hard and harder rocks are less likely to experience landslides than soft rocks, loose rocks, and harder rocks interspersed with soft rocks. As depicted in Table 2 and Figure 3d, in this paper, the geological map is vectorized using ArcGIS, and the stratigraphic lithology is classified into four categories: weak sandwich, softer rock, harder rock, and loose rock. The softer rock demonstrates the highest LRPD value of 1.773527, suggesting that this lithology is conducive to the occurrence of landslides.
(5) Human engineering activities
Proximity to Roads: Human engineering activities substantially impact landslide susceptibility by disrupting the existing balance [36,37]. The proximity to roads can, to a certain extent, reflect the intensity of human activities. In this paper, a multi-ring buffer is created in ArcGIS and divided into seven levels at 500 m intervals, as shown in Table 2 and Figure 3e. It is observed that the largest LRPD is associated with areas within 500 m of a road. This proximity is conducive to landslide development, primarily because engineering activities, such as slope cutting and improper loading, disrupt the slope's balance. Generally, areas farther from roads are associated with fewer human activities, thereby leading to fewer landslides.

Correlation Analysis of Impact Factors
The selection of influencing factors holds paramount importance when assessing landslide hazard susceptibility. When executing the susceptibility analysis through each model, the factors are typically assumed to be mutually independent and non-correlated. This assumption necessitates performing a correlation analysis on these factors to prevent model overfitting. The Pearson correlation coefficient (PCC) provides a measure of the linear correlation between two continuous numerical variables [38]. In the context of evaluating landslide susceptibility, it is generally understood that a correlation coefficient within the range of 0 ≤ |PCC| ≤ 0.3 signifies no correlation between factors. As indicated in Table 3, no linear correlation exists among the evaluated factors. Suppose the sample dataset for two variables $A$ and $B$ comprises the pairs $(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)$. In such a case, the PCC is calculated using the following equation [39,40]:
$$PCC = \frac{\sum_{i=1}^{n}(a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_{i=1}^{n}(a_i - \bar{a})^2}\,\sqrt{\sum_{i=1}^{n}(b_i - \bar{b})^2}} \tag{7}$$
where $a_i$ and $b_i$ correspond to the values of the variables $A$ and $B$, respectively, and $\bar{a}$ and $\bar{b}$ are the average values of $A$ and $B$, respectively.
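The screening rule can be illustrated with a small standard-library sketch. The sample values for the two factors below are hypothetical and are not drawn from Table 3; the 0.3 threshold is the one quoted in the text, and the function names are illustrative.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def treated_as_uncorrelated(r, threshold=0.3):
    """Apply the |PCC| <= 0.3 rule used for factor screening."""
    return abs(r) <= threshold

# Hypothetical samples of two factors at six points (not the study's data).
slope_deg = [5, 12, 18, 25, 33, 41]
rainfall_mm = [700, 650, 720, 640, 690, 705]
r = pearson(slope_deg, rainfall_mm)
print(round(r, 3), treated_as_uncorrelated(r))
```

In practice the coefficient would be computed for every pair of the 11 factors, and any pair exceeding the threshold would prompt dropping or merging one of the two factors before modeling.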

Information Volume Model Evaluation
Following the correlation analysis among the variables, 11 index factors were determined for the construction of the landslide susceptibility assessment index system. The corresponding information value for each factor state was subsequently calculated in line with Equation (1). Utilizing the raster calculator, the information values of each factor were superimposed, assigning the total information to each raster. A higher information value signifies a higher likelihood of a landslide occurrence, hence providing a measure of landslide susceptibility. By deploying the natural breakpoint method in ArcGIS, landslide-prone areas were categorized into four groups: less-prone, low-prone, medium-prone, and high-prone, as depicted in Figure 4. This resulted in an information-model-based landslide susceptibility zoning map. The proportion of the total area occupied by each category was found to be: less prone (9.01%), low susceptibility (22.46%), medium susceptibility (37.38%), and high susceptibility (31.15%). High-susceptibility areas were predominantly located in Tuodian Township, Anlongbao Township, the central region of Ejia Township, Fabiao Township, and Dazhuang Township. Conversely, areas of low susceptibility were primarily situated in the western parts of Ejia Township, and the southern regions of Damaidi Township and Ainishan Township.

Informative-Logistic Regression Model Evaluation
Prior to the execution of logistic regression analysis, multicollinearity analysis is conducted for each factor. The analysis shows that all tolerance values exceed 0.1 and all variance inflation factors remain below 10. Consequently, it is determined that all 11 factors can be incorporated into the predictive model. The hybrid informativeness-logistic regression model utilizes the computed informativeness value as the indicator value and undertakes binomial logistic regression analysis employing SPSS software. In this analysis, the informativeness value of each evaluation factor is considered an independent variable, while the potential occurrence of a landslide is treated as the dependent variable. The results of this regression analysis are presented in Table 4. Each factor displays a significance level below 0.05, thereby satisfying the required conditions. The coefficients derived from the analysis are then incorporated into Equations (2) and (3), culminating in the derivation of Equation (8):
$$p = \frac{1}{1 + e^{-Y}}, \quad Y = 0.053 + 0.791x_1 + 1.170x_2 + 1.183x_3 + 0.730x_4 + 0.456x_5 + 0.217x_6 + 0.836x_7 + 0.999x_8 + 1.254x_9 + 0.909x_{10} - 0.122x_{11} \tag{8}$$
The variables $x_1$-$x_{11}$ represent the information values associated with slope direction, NDVI, distance from the water system, plane curvature, profile curvature, stratigraphic lithology, distance from the road, slope, distance from the fault, elevation, and average annual rainfall, respectively. Using the natural breakpoint method in ArcGIS, the resulting probabilities are categorized into four grades: less-prone area, low-prone area, medium-prone area, and high-prone area, as depicted in Figure 5. This classification enables the creation of a landslide susceptibility zoning map based on the IM-LR model. The distribution of these areas is as follows: not-likely-to-occur area (9.44%), low-susceptible area (22.00%), medium-susceptible area (36.52%), and high-susceptible area (32.03%).
The high-susceptible area is primarily concentrated in the central part of Ejia Township, the eastern part of Anlongbao Township, Fabiao Township, the southern part of Dazhuang Township, and Tuodian Township. Conversely, the not-likely-to-occur and low-susceptible areas are concentrated in the western part of Ejia Township, the southern part of Anlongbao Township, Dutian Township, and the southern part of Damaidi Township. From the regression coefficients, it can be observed that the importance of each influencing factor on landslides in the study area is, in descending order: distance to fault, distance from water system, NDVI, slope, elevation, distance from road, slope direction, plane curvature, profile curvature, stratigraphic lithology, and average annual rainfall.
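The fitted model can be applied to a raster cell as in the sketch below, which uses the constant and coefficients quoted for Equation (8). Two points are assumptions rather than statements from the source: the coefficient 0.909 is attached to $x_{10}$ (elevation), and the probability is written as $p = 1/(1 + e^{-Y})$. The function and variable names are illustrative.

```python
import math

FACTORS = [
    "slope direction", "NDVI", "distance from water system", "plane curvature",
    "profile curvature", "stratigraphic lithology", "distance from road", "slope",
    "distance from fault", "elevation", "average annual rainfall",
]
# Coefficients for x1..x11 as quoted for Equation (8); 0.909 is assumed to belong to x10.
COEFFS = [0.791, 1.170, 1.183, 0.730, 0.456, 0.217, 0.836, 0.999, 1.254, 0.909, -0.122]
INTERCEPT = 0.053

def susceptibility(info_values):
    """Probability p for one cell given its 11 information values."""
    if len(info_values) != len(COEFFS):
        raise ValueError("expected one information value per factor")
    y = INTERCEPT + sum(c * x for c, x in zip(COEFFS, info_values))
    return 1.0 / (1.0 + math.exp(-y))

# A cell whose classes all carry zero information sits just above p = 0.5.
baseline = susceptibility([0.0] * 11)
print(round(baseline, 4))

# Ranking factors by coefficient reproduces the importance order given in the text.
ranked = [name for _, name in sorted(zip(COEFFS, FACTORS), reverse=True)]
print(ranked)
```

Sorting by coefficient is only meaningful here because every input is an information value on a common scale; with raw factor measurements the coefficients would not be directly comparable.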

Comparative Analysis of Evaluation Model Accuracy Assessments
The accuracy of a model is reflected by the proximity of the ROC curve to the upper left corner [41]. The model's predictive accuracy is characterized by the size of the AUC value, which represents the area enclosed by the ROC curve and the x-axis. A higher AUC value, closer to 1, indicates a greater prediction accuracy for the tested model. Figure 6 illustrates that the IM model has an AUC value of 0.742, while the IM-LR coupled model has an AUC value of 0.800. Both models exhibit high diagnostic values, but the IM-LR model demonstrates significant improvement compared to the IM model. These findings suggest that the IM-LR model is more suitable for evaluating landslide susceptibility in the study area.
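One way to see what the AUC measures is its pairwise interpretation: it equals the fraction of (landslide, non-landslide) pairs in which the landslide point receives the higher score, with ties counted as one half. The sketch below uses that interpretation; the scores are hypothetical and the source does not state which software produced the ROC curves.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical model outputs for four landslide and four non-landslide points.
landslide_scores = [0.9, 0.7, 0.6, 0.4]
stable_scores = [0.8, 0.5, 0.3, 0.2]
print(auc_from_scores(landslide_scores, stable_scores))  # 12 of 16 pairs -> 0.75
```

Read this way, an AUC of 0.800 means that a randomly chosen landslide point outscores a randomly chosen non-landslide point about four times out of five.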

The confusion matrix assesses model accuracy by quantifying the discrepancies between the actual and model-predicted results. The IM model's confusion matrix is displayed in Table 5, while the IM-LR model's confusion matrix is shown in Table 6. Precision and accuracy, both derived from the confusion matrix, serve as criteria for evaluating the model; higher precision and accuracy values indicate more accurate model evaluation results. The formulas are as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{9}$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{10}$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
The accuracy (68.2%) and precision (70.3%) of the IM model, calculated from Tables 5 and 6, are lower than the accuracy (72.3%) and precision (73.6%) of the IM-LR model. Therefore, the IM-LR model is more accurate.
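The two confusion-matrix criteria can be computed as in the short sketch below. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the entries of Tables 5 and 6, and the TP/FP/FN/TN naming is the conventional one rather than notation confirmed by the source.

```python
def precision(tp, fp):
    """Share of predicted landslide points that are actual landslide points."""
    return tp / (tp + fp)

def accuracy(tp, tn, fp, fn):
    """Share of all points, landslide or not, that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion matrix for 100 sample points.
tp, fp, fn, tn = 40, 10, 20, 30
print(precision(tp, fp))         # 40 / 50
print(accuracy(tp, tn, fp, fn))  # 70 / 100
```

Precision ignores missed landslides (false negatives), so it is usually read together with accuracy, as the paper does, rather than on its own.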
The density of landslide occurrences provides a valuable measure of the ratio of landslides to susceptibility-class area, serving as a crucial indicator of model performance [42]. Table 7 presents the landslide point densities for the susceptibility classes in the IM model: less-prone (0.006), low-prone (0.010), medium-prone (0.037), and high-prone (0.068) areas. The results of the information quantity model align with reasonable expectations, as the densities of landslide points increase across susceptibility classes. Similarly, the IM-LR coupled model exhibits increasing landslide point densities for the less-prone (0.005), low-prone (0.012), medium-prone (0.030), and high-prone (0.075) areas, so the prediction outcomes of the coupled model are also reasonable. Notably, the density of landslide points in the high-prone area is higher in the information quantity-logistic regression coupled model (0.075) than in the IM model (0.068). This suggests that the IM-LR model concentrates more of the recorded landslides in its highest susceptibility class, more closely approximating the actual distribution of landslide points. Consequently, the landslide prediction accuracy is enhanced, which holds significant implications for guiding landslide disaster prevention and control efforts in Shuangbai County.
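The reasonableness check described above amounts to testing that the densities rise from the lowest to the highest susceptibility class. A minimal sketch using the densities quoted in the text follows; the dictionary layout and function name are illustrative choices.

```python
# Landslide point densities per class (less, low, medium, high), as quoted in the text.
DENSITY = {
    "IM": [0.006, 0.010, 0.037, 0.068],
    "IM-LR": [0.005, 0.012, 0.030, 0.075],
}

def strictly_increasing(values):
    """True when each class has a higher density than the class before it."""
    return all(a < b for a, b in zip(values, values[1:]))

for model, values in DENSITY.items():
    print(model, strictly_increasing(values), values[-1])

# Difference between the two models in the high-prone class.
high_class_gain = DENSITY["IM-LR"][-1] - DENSITY["IM"][-1]
print(round(high_class_gain, 3))
```

The comparison also shows that the coupled model's advantage lies in the high-prone class; in the medium-prone class its quoted density is lower than that of the IM model.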

Discussion
The susceptibility analysis divides the study area into four types of zones: not susceptible, low susceptibility, medium susceptibility, and high susceptibility. In areas of high landslide susceptibility, appropriate land use planning is essential to prevent and mitigate the consequences of disasters. This study provides spatial data on the distribution of landslide-prone areas, and the zoning maps produced can be used by the authorities to guide the adoption of policies and strategies aimed at mitigating disasters. Detailed work on the vulnerability of exposed elements in the central part of Ejia Township, the eastern part of Anlongbao Township, Fabiao Township, the southern part of Dazhuang Township, and Tuodian Township will help in proposing appropriate protective measures. The state of landslide disaster prevention and control in these regions is grim, and there is a need to focus on strengthening monitoring and prevention. Prevention and control work should be based on "group monitoring and prevention + relocation and avoidance", and monitoring needs to combine dynamic monitoring with key monitoring.

Conclusions
This paper presents a case study conducted in Shuangbai County, Yunnan Province, focusing on landslide susceptibility evaluation using the information quantity model (IM) and the information quantity-logistic regression model (IM-LR). The accuracy and applicability of both models were examined, leading to the following conclusions:
(1) An index system comprising slope direction, NDVI, distance from water systems, plane curvature, profile curvature, stratigraphic lithology, distance from roads, slope, distance from faults, elevation, and average annual rainfall was established to construct the landslide susceptibility evaluation models. The IM model and the IM-LR coupled model were utilized to assess susceptibility and derive susceptibility zones. The density of landslide points within each zone was calculated, revealing that the point density increases from the less-susceptible through the low-susceptible and medium-susceptible to the high-susceptible areas for both models, which supports the reasonableness of the models. Notably, the IM-LR coupled model exhibits a higher point density in the high-susceptibility zone (0.075) than the IM model (0.068), indicating that the susceptibility zoning of the IM-LR coupled model better approximates reality.
(2) Analysis of the susceptibility zoning map indicates that the high-susceptibility zone is predominantly distributed in the central areas of Ejia Township, the eastern part of Anlongbao Township, Fabiao Township, the southern part of Dazhuang Township, and regions near rivers and roads in Tuodian Township. These areas feature lower elevation, less vegetation cover, and increased human engineering activities, contributing to unstable slopes. In contrast, the less-susceptible and low-susceptibility zones are primarily located in the western part of Ejia Township, the southern part of Anlongbao Township, Dutian Township, and the southern part of Damaidi Township. These areas have higher elevation, limited human engineering activities, and more stable slopes.
(3) The AUC values of the two models were obtained from the ROC curves. The AUC value of the IM-LR coupled model, 0.800, was larger than that of the IM model, 0.742. Additionally, the accuracy and precision of the IM model were lower than those of the IM-LR model, as obtained from the confusion matrix. These results indicate that the IM-LR coupled model performs better in evaluating the susceptibility of the study area and can provide a more accurate landslide susceptibility zoning for Shuangbai County. This model can serve as a scientific basis for the planning of related departments and a reference for researchers evaluating susceptibility in similar areas.
(4) Five factors, namely distance to fault, distance from water system, NDVI, slope, and elevation, contribute significantly to landslide susceptibility in Shuangbai County. Landslides in this region are mainly distributed near roads, water systems, and faults, where rocks are relatively fractured and slope stability is poor; the probability of landslides decreases with increasing distance from these features. Most landslides occur in areas of soft rock, loose rock, and softer rock. Additionally, landslides are more likely to occur on shady slopes with medium vegetation cover and significant changes in surface undulation within the elevation range of 1659 m to 1914 m, where the slope is approximately 20°. These findings serve as a valuable reference for geotechnical workers and practitioners in understanding the development characteristics of landslides in the study area.
Author Contributions: All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by S.T., H.W., J.Z. and J.X. The first draft of the manuscript was written by H.W., and all authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.