Geological Disaster Risk Assessment Under Extreme Precipitation Conditions in the Ili River Basin

Li, Xinxu; Liu, Jinghui; Zhang, Zhiyong; Yuan, Xushan; Li, Yanmin; Wang, Zixuan

doi:10.3390/ijgi14090346


by Xinxu Li ¹, Jinghui Liu ¹,*, Zhiyong Zhang ², Xushan Yuan ¹, Yanmin Li ¹ and Zixuan Wang ¹

¹ School of Emergency Technology and Management, Institute of Disaster Prevention, Sanhe 065201, China

² Center for Integrated Monitoring and Early Warning of Natural Disasters, Department of Emergency Management of Xinjiang Uygur Autonomous Region, Urumqi 830000, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2025, 14(9), 346; https://doi.org/10.3390/ijgi14090346

Submission received: 13 June 2025 / Revised: 22 August 2025 / Accepted: 29 August 2025 / Published: 7 September 2025

(This article belongs to the Special Issue Advances in Remote Sensing and GIS for Natural Hazards Monitoring and Management)


Abstract

Geological Disasters (Geo-disasters) are common in the Ili River Basin, with extreme precipitation being a major triggering factor. As the frequency and intensity of these events increase, the associated risks also rise. This study proposes a hazard assessment framework that integrates extreme precipitation recurrence periods with Geo-disaster susceptibility. Furthermore, based on a comprehensive risk assessment model encompassing hazard, exposure, vulnerability, and disaster mitigation capacity, the study evaluates Geo-disaster risk in the Ili River Basin under extreme precipitation conditions. Hazard levels are assessed by integrating geo-disaster susceptibility with recurrence periods of extreme precipitation, resulting in hazard and risk maps under various conditions. The susceptibility indicator system is refined using K-means clustering, the certainty factor (CF) model, and Pearson correlation to reduce redundancy. Key findings include: (a) Geo-disasters are influenced by a combination of factors. High-susceptibility areas are typically found in moderately sloped terrain (8.5–17.64°) at elevations between 1412 m and 2234 m, especially on east- and southeast-facing slopes. Lithology, soil, hydrology, fault proximity, and the topographic wetness index (TWI) are the primary influences, while high NDVI values reduce susceptibility. (b) The hazard pattern varies with the recurrence period of extreme precipitation. Shorter periods lead to broader high-hazard zones, while longer periods concentrate hazards, particularly in Yining City. (c) Exposure is higher in the east, vulnerability aligns with transportation networks, and disaster mitigation capacity is stronger in the north, particularly in Yining. (d) Low-risk areas are found in valleys and flat terrains, while medium to high-risk zones concentrate in southeastern Zhaosu, Tekes, and Gongliu counties. Some economically active regions require special attention due to their high exposure and vulnerability.

Keywords:

Ili river basin; geological disaster risk; extreme precipitation conditions; CF model; machine learning

1. Introduction

Geo-disasters (landslides, debris flows, and collapses) are among the most destructive and widespread natural disasters globally, posing severe threats to human lives and property, particularly in mountainous regions and geologically fragile areas [1,2,3,4]. Precipitation is a critical triggering factor for Geo-disasters, with extreme precipitation events significantly increasing the likelihood of landslides, debris flows, and collapses [5,6]. With the intensification of climate change, both the frequency and intensity of extreme precipitation events are rising, further amplifying the risk of precipitation-induced Geo-disasters [7] and underscoring the growing importance of related research. Existing studies have primarily focused on threshold analyses of extreme precipitation triggering Geo-disasters and on associated risk assessment methods. However, assessing the risk of Geo-disasters under extreme precipitation conditions remains a significant challenge. Therefore, it is necessary to systematically review existing methodologies and analyze their respective advantages and limitations.

The role of extreme precipitation in triggering Geo-disasters is primarily reflected in the infiltration of water into the soil, which increases soil saturation, elevates pore water pressure, and reduces shear strength, ultimately resulting in landslides and collapses [8,9]. In recent years, researchers have employed physical experiments and numerical simulations to reveal the key mechanisms by which precipitation contributes to the formation of Geo-disasters. Therefore, this study integrates case studies to illustrate different risk assessment approaches for Geo-disasters under extreme precipitation conditions, aiming to enhance the understanding of the triggering characteristics of extreme precipitation.

The precipitation threshold model was first proposed by Caine N [10] in the 1980s, with its core concept being the identification of critical precipitation levels associated with Geo-disaster occurrences based on historical data. The model introduced the concepts of precipitation duration (D) and precipitation intensity (I), formulating an intensity-duration (ID) function in a power-law form using precipitation data from historical disaster events. The ID function delineates the threshold at which extreme precipitation is likely to trigger geological hazards, providing a foundational framework for subsequent studies. Most later research has been centered around the ID function. For example, Saha S and Berap B [11] employed the ID function in combination with antecedent rainfall methods to establish the relationship between landslide occurrence and rainfall, and subsequently determined the best-fit distribution of rainfall data for the four westernmost districts of the Garhwal Himalaya using goodness-of-fit tests. However, threshold models are generally grounded in empirical statistics, which makes it challenging to accurately capture the spatial heterogeneity of geological environments, thereby limiting their regional applicability.

In recent years, an increasing number of studies have incorporated precipitation as a key factor within geological hazard risk assessment frameworks. Researchers have utilized indicators such as the maximum annual daily precipitation, N-year recurrence period precipitation, and the rainfall erosivity index (R-Factor) to construct risk assessment models. R.Z. Abidin et al. [12] applied the Universal Soil Loss Equation (USLE), integrating rainfall erosivity, soil erodibility, slope length, and slope steepness, to predict landslide susceptibility in Malaysia, with model performance validated in Fraser’s Hill and Genting Highlands. Liu et al. [13] analyzed long-term daily precipitation data from the Ili River Basin (1981–2024), calculated the frequency of different-intensity precipitation events, and combined CMIP6 model projections to construct extreme precipitation intensity indicators, providing quantitative evidence for regional Geo-disaster risk assessment. Although these traditional risk assessment methods can effectively quantify the contribution of precipitation to Geo-disasters, they lack characterization of multi-factor interactions and cannot dynamically simulate precipitation-induced disaster processes.

To overcome the limitations of traditional methods, integrated risk assessment approaches that combine multi-source data such as remote sensing and meteorological information have recently become a research focus [14]. These mainly include statistical methods, machine learning methods, physics-based methods, and disaster chain-based assessment frameworks.

Statistical methods are among the most commonly used approaches in Geo-disaster risk assessment, often integrating multiple statistical approaches to produce accurate and objective risk maps. M.E. Kincey et al. [15] combined fuzzy overlay methods with topography, landslide inventories, and population/building data to construct nationwide susceptibility and exposure models for rainfall-induced landslides in Nepal, resulting in comprehensive landslide risk maps. Mosaffaie J. et al. [16] combined landslide susceptibility with relative vulnerability, using fuzzy gamma operators and ten landslide-influencing factors to evaluate and validate landslide risk in the Alamut watershed. Statistical methods are relatively straightforward and provide intuitive results, but they struggle to capture nonlinear relationships and interactions among influencing factors, and some weighting methods remain subject to subjective bias.

With the development of machine learning, these methods have been increasingly applied to Geo-disaster risk assessment. Zhao Z et al. [17] combined SBAS-InSAR technology with high-resolution optical imagery for landslide identification, analyzed influencing factors using geographical detectors, and applied RF, GBDT, CatBoost, LR, and Stacking models to assess landslide susceptibility, achieving coupled modeling with small sample regions. Peng and Wu [18] constructed landslide susceptibility prediction models using multi-layer perceptron (MLP) regression and 3D convolutional neural networks based on 453 rainfall-induced landslide events, integrating precipitation thresholds to provide daily landslide risk warnings. Machine learning methods can effectively overcome limitations of statistical methods in handling complex nonlinear relationships, but they have low interpretability and require high-quality data and computational resources, making them sensitive to small samples or missing data.

Physics-based methods are also commonly used in Geo-disaster risk assessment. Ortiz-Giraldo L et al. [19] employed SLIDE, RAMMS-DF, and Iber 2D hydraulic models to simulate rainfall-induced shallow landslides and debris flows affecting river channels, providing physics-based support for landslide–debris flow risk assessment. Wu Y et al. [20] analyzed the Zhangjiawan landslide in Xining, Qinghai Province, through field surveys and geological analysis, summarizing the formation and evolution mechanisms of landslides and assessing slope stability using limit equilibrium and numerical simulation methods, offering theoretical support for landslide susceptibility. Physics-based methods have a solid theoretical foundation and strong interpretability; however, their complexity, parameter sensitivity, and computational demands limit their applicability over large areas.

Major disasters are rarely isolated events; instead, they are often linked through cascading triggers such as human activities or natural disasters. Heavy rainfall, flooding, or earthquakes can induce geological hazards, which in turn may lead to secondary disasters like barrier lakes. This phenomenon is referred to as a “disaster chain,” and quantitative analysis of disaster chain relationships remains a key research focus. Ke K et al. [21] evaluated earthquake-induced landslide disaster chain risk by calculating factor sensitivities using deterministic factor methods, employing sensitivity values as input for SVM classification, and training the SVM to assess regional Geo-disaster vulnerability. Li C et al. [22] embedded a “rainfall–landslide–flash flood” disaster chain into the CAESAR-Lisflood landscape evolution model, constructing a coupled simulation framework to predict landslide susceptibility and disaster occurrence under extreme rainfall in earthquake-prone areas. Yu Ze et al. [23] analyzed the landslide–debris flow disaster chain in Hanping Village, Shaanxi Province, in October 2021, revealing movement processes and causal mechanisms through field surveys, UAV photogrammetry, satellite remote sensing interpretation, and SBAS-InSAR analysis. Wang W et al. [24] constructed a vulnerability assessment model for rainfall–landslide disaster chains in the Guangdong–Hong Kong–Macau Greater Bay Area, evaluating sensitivity, exposure, and adaptive capacity using CNN–OPGD–AHP, sequence relationship–TOPSIS, and entropy–TOPSIS methods, highlighting interactions and synergies that single-disaster models fail to capture. Disaster chain-based assessments can effectively reflect multi-hazard interactions and synergies while considering vulnerability and exposure, but model complexity, multi-source data requirements, and difficulties in computation and validation constrain their development.

The Ili River Basin, situated in the heart of Central Asia, is a typical mountain–valley transitional region in northwestern China, characterized by complex topography and geological structures. Influenced by the surrounding Tianshan Mountains and the interaction between westerly circulation and topographic uplift, precipitation in the region exhibits highly uneven spatial–temporal distribution, with frequent extreme precipitation events. The combination of complex geological conditions and extreme precipitation leads to recurrent landslides, debris flows, and collapses, showing pronounced spatial heterogeneity. Despite the substantial socio-economic impacts of Geo-disasters in the Ili River Basin, systematic studies under extreme precipitation conditions remain limited. To address this research gap, and building upon the Third Xinjiang Comprehensive Scientific Survey project, this study selects the Ili River Basin as the study area.

In summary, this study represents extreme precipitation recurrence periods as the frequency of extreme precipitation and Geo-disaster susceptibility as Geo-disaster intensity, and, for the first time, integrates these two aspects to construct a Geo-disaster hazard framework under extreme precipitation conditions. Compared with traditional threshold- or single-indicator-based approaches, this framework more comprehensively captures the coupling effects between precipitation-triggering mechanisms and geological conditions while revealing spatial variation patterns of hazard under different extreme precipitation conditions. Based on this framework, a systematic risk assessment for the Ili River Basin is conducted using a generalized approach encompassing hazard, exposure, vulnerability, and disaster mitigation capacity, providing scientific evidence and methodological guidance for disaster prevention and resource allocation in the region.

2. Study Area and Materials

2.1. Study Area

The Ili River is an inland river in Central Asia, crossing the international boundary between China and Kazakhstan. It has a total length of 1236 km, of which 442 km lie within China, with a drainage area of 56,000 km². Administratively, the Ili River Basin mainly includes ten counties and cities in the Ili Kazakh Autonomous Prefecture of Xinjiang Uygur Autonomous Region (excluding Kuitun City), parts of Jing County in the Bayingolin Mongol Autonomous Prefecture, and the provincial-level city of Kokdala (Figure 1). According to statistics, by 2024, the Ili River Basin had a resident population of 2.9536 million and a GDP of 349.172 billion CNY. By February 2022, infrastructure such as roads had reached a total of 14,413.95 km of prefecture-level roads, 2288.4 km of national and provincial highways, 12,125.5 km of rural roads, and 608.91 km of expressways (including first-class), placing the total prefecture-level road length among the top five in Xinjiang Uygur Autonomous Region. Xinjiang has a total of 4357 recorded geological hazard sites, 1034 of which are located in the Ili River Basin, accounting for 24% of the total.

Within the Ili River Basin, mountainous areas dominate over plains, and windward slopes receive more rainfall than leeward slopes, resulting in significant spatial variability in precipitation. Rainfall in the river valley plains ranges from 200 to 500 mm, while mountainous areas have experienced maxima exceeding 1000 mm. The basin exhibits a typical “three mountains enclosing two basins” topography, with the three mountains being the Borokonu Mountains in the north, the Koguchin Mountains in the center, and the Nalati Mountains in the south, and the two basins comprising the Ili River Valley and the Zhaosu–Tekes Basin. This unique terrain results in pronounced topographic relief and complex geological conditions, making geological hazards among the most frequent natural disasters in the Ili River Basin, with extreme precipitation serving as the primary meteorological trigger.

Geo-disasters triggered by extreme precipitation in the Ili River Basin occur frequently and have widespread impacts. For example, from 12 to 17 May 2015, the Ili River Valley experienced continuous heavy rainfall, causing debris flows of varying severity in 13 townships and four state-owned farms across Xinyuan County, Zhaosu County, and Chabuchar County. In Xinyuan County in particular, 11 townships and four state-owned farms experienced multiple debris flows and landslides triggered by sustained heavy rainfall, resulting in severe local impacts. Similarly, from 31 July to 1 August 2016, widespread continuous heavy rainfall in the Ili River Valley led to an extreme rainstorm in Gongliu Kuurdining, with a recorded rainfall of 123.2 mm, which triggered debris flows along County Road X737 from Longkou Tunnel to Qiapuqihai and the Tarim section.

2.2. Data Source

Table 1 lists the data sources; their resolution, temporal coverage, preprocessing, and quality control are described below. The DEM dataset (100 m, static) was used to derive multiple topographic indicators, including elevation, slope, aspect, plan curvature, profile curvature, STI, TWI, SPI, and TRI, all processed in ArcGIS using surface analysis and terrain calculation tools. NDVI data (1000 m, 2020) were computed from satellite imagery to obtain yearly averages representing overall vegetation status. Distance-based indicators, including distance to river, distance to fault, distance to road, and distance to disaster points, were calculated from vector datasets using the buffer tools in ArcGIS. Density-based indicators such as road density, building height density, POI density, density of transportation points, and density of medical points were obtained through kernel density analysis applied to the respective vector datasets. Socioeconomic variables, including the density of enterprises above scale, hospital beds, welfare institution beds, fixed telephone subscribers, and urban and rural residents’ deposits, were aggregated at the county/city level using original vector datasets.

To ensure spatial and temporal consistency and facilitate subsequent calculations and analyses, all datasets in this study were first unified within the study area, with raster data resampled to a 30 m spatial resolution and projected to CGCS2000, Gauss-Krüger 3-degree zone with central meridian at 84° E. Temporal resolution was harmonized by using annual averages for NDVI and the latest available year for vector and socioeconomic datasets.

All datasets were further subjected to spatial standardization and quality control to ensure the reliability of the analyses. Topographic and geological datasets (DEM, Landform Index, Lithology Index, Soil Index) were obtained from authoritative sources and processed with void filling, classification harmonization, and coordinate correction to ensure spatial accuracy. Remote sensing and vegetation datasets (NDVI, Land Type) were derived from atmospherically and geometrically corrected satellite imagery; NDVI was calculated as the annual mean to represent overall vegetation conditions, with clouds, haze, and anomalous pixels removed; land use data were reclassified as necessary to maintain a consistent classification system. Vector-based infrastructure datasets (Distance to River, Distance to Fault, Distance to Road, Road Density, Building Height Density, POI Density, Density of Transportation, Density of Medical Points) underwent topological checks and coordinate correction, with kernel density or buffering used to generate indicators and ensure positional accuracy. Socioeconomic datasets (Population Density, GDP, Density of Enterprises above Scale, Hospital Beds, Welfare Institution Beds, Fixed telephone subscribers, Urban and Rural Residents’ Deposits) were obtained from statistical yearbooks, with missing values supplemented using weighted estimates from neighboring administrative units, and spatialization preserving total amounts. Historical Disaster Point Distribution Data were deduplicated and coordinate-corrected to ensure positional precision.

3. Methods

The research on Geo-disaster risk assessment in the Ili River basin under extreme precipitation conditions is composed of three parts: data collection, hazard assessment, and risk assessment (Figure 2). (a) The relevant data for the study, including precipitation data, topographic data, and human activity data, are collected and transformed into measurable indicators; (b) The hazard assessment under extreme precipitation conditions is based on the relationship between geological susceptibility and extreme precipitation recurrence periods. The hazard is classified using K-means clustering, quantified by the Certainty Factor method and followed by testing for multi-collinearity of the indicators using Pearson’s correlation; (c) The exposure, vulnerability and disaster mitigation capacity are assessed, and the results are combined with the hazard assessment through a weighted aggregation to determine the Geo-disaster risk under the extreme precipitation conditions.

3.1. Risk Assessment

3.1.1. Risk Assessment Model

The risk assessment model is based on the Generalized Risk Assessment Model proposed by Shi Peijun [27], which is designed to evaluate the overall risk of the disaster system. Based on the evaluation of hazard, exposure, vulnerability, and disaster mitigation capacity, it performs a comprehensive assessment of disaster risk. The relationship can be described by the following equation:

F(H, E, V, C) = H \cdot w_{H} + E \cdot w_{E} + V \cdot w_{V} + (1 - C) \cdot w_{C}

(1)

where H, E, V, and C respectively represent hazard under extreme precipitation conditions, exposure, vulnerability, and disaster mitigation capacity, and w_{H}, w_{E}, w_{V}, and w_{C} denote the corresponding weights.
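As an illustration of Equation (1), the following is a minimal sketch of the weighted aggregation for a single grid cell. It assumes that all four components are already scaled to [0, 1] and that the four weights sum to 1; neither assumption is stated in the text, and the function name, weights, and cell values are hypothetical.

```python
def risk_score(h, e, v, c, weights):
    """Weighted aggregation in the form of Eq. (1).

    h, e, v, c: hazard, exposure, vulnerability, mitigation capacity,
    assumed here to be scaled to [0, 1].
    weights: the four weights in the same order (assumed to sum to 1).
    """
    w_h, w_e, w_v, w_c = weights
    if abs(w_h + w_e + w_v + w_c - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    # Mitigation capacity lowers risk, so it enters as (1 - C).
    return h * w_h + e * w_e + v * w_v + (1.0 - c) * w_c


# Hypothetical weights and cell values, for illustration only.
example_weights = (0.4, 0.2, 0.2, 0.2)
example_risk = risk_score(h=0.8, e=0.5, v=0.6, c=0.3, weights=example_weights)
```

Because capacity enters as (1 - C), raising C while holding the other components fixed can only lower the score, which matches the role of mitigation capacity in the model.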

3.1.2. Entropy Weight-Random Forest Method (EW-RFM)

The entropy weighting method, based on the principle of information entropy, is an objective weighting method that uses the obtained objective data to determine the weights of indicators [28]. Due to its simplicity and effectiveness, it is widely used by scholars. According to the basic theory of the entropy weighting method, the greater the numerical difference between the data indicators, the larger the degree of dispersion of the indicator, indicating a greater impact on the comprehensive evaluation result, and thus a higher corresponding weight. The formula is as follows:

E_{i} = - \frac{1}{\ln (n)} \sum_{j = 1}^{n} P_{i j} \ln (P_{i j})

(2)

where P_{i j} represents the contribution value of the i-th data at the j-th level, n is the total number of levels, and E_{i} is the entropy value of the i-th data.
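The entropy in Equation (2) can be sketched in a few lines of Python. The conversion from entropy to weight is not given in the text; the sketch below assumes the commonly used form in which the weight is proportional to 1 - E_i, and it assumes that P_ij is obtained by normalizing each row to sum to 1. The function name and input layout are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy (Eq. 2) and assumed entropy weights.

    matrix[i][j]: non-negative value of indicator i at level j.
    Returns (entropies, weights).
    """
    n = len(matrix[0])
    entropies = []
    for row in matrix:
        total = sum(row)
        p = [x / total for x in row]  # assumed normalization to P_ij
        # 0 * ln(0) is treated as 0, the usual convention.
        s = sum(p_ij * math.log(p_ij) for p_ij in p if p_ij > 0)
        entropies.append(-s / math.log(n))
    # Assumed step (not in the text): weight proportional to 1 - E_i.
    divergence = [1.0 - e for e in entropies]
    total_div = sum(divergence)
    weights = [d / total_div for d in divergence]
    return entropies, weights


# Hypothetical indicator values across four levels.
demo_entropies, demo_weights = entropy_weights([[1, 1, 1, 1], [7, 1, 1, 1]])
```

An evenly spread indicator has entropy close to 1 and therefore receives almost no weight, while a concentrated one receives more, which is the behaviour described above. If every indicator is perfectly uniform, the normalization divides by zero, so a real implementation would need to guard that case.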

The entropy weighting method calculates the entropy value of data to represent its weight and is one of the most commonly used objective weighting methods. However, it has the limitation of being unable to accurately express the relationships between data. Random Forest (RF), as a classical machine learning algorithm [29], can effectively learn the relationships between data during the training process and determine the contribution of feature data to the target data. Therefore, we consider combining the entropy weighting method with the RF method to construct an EW-RFM weighting model that optimizes the entropy weighting method’s weights, allowing for the consideration of both the entropy value of the data itself and the relationships between the data. The specific workflow is shown in Figure 3.

RF training was performed using five-fold cross-validation, with each fold split into training and test sets at a 4:1 ratio. Leveraging the inherent characteristics of Random Forest, overfitting risk was mitigated. After training, the feature importance results from each fold were averaged and normalized to enhance the robustness of the optimized weights.

3.2. Hazard Assessment

3.2.1. Hazard Assessment Model

Hazard is one of the most significant components influencing the risk assessment of Geo-disasters. The assessment of hazard typically focuses on two aspects: spatial probability and temporal probability [30]. Spatial probability refers to the susceptibility under the influence of specific factors such as topography and geological conditions. Temporal probability reflects the occurrence frequency of the triggering factors of Geo-disasters. Extreme precipitation, the main triggering factor of Geo-disasters, is characterized here by its recurrence period, which expresses how frequently it occurs. Therefore, hazard is composed of susceptibility and extreme precipitation frequency. The formula is as follows:

H = S \cdot P_{f}

(3)

where H represents the hazard, S represents the susceptibility, and P_{f} represents the frequency of extreme precipitation (f denotes the 10-year, 20-year, or 50-year recurrence period).
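On gridded data, Equation (3) amounts to a cell-by-cell product of two layers. The sketch below assumes both layers share the same shape and are scaled to [0, 1]; how the precipitation-frequency layer is derived for each recurrence period is not reproduced here, and all values and names are hypothetical.

```python
def hazard_grid(susceptibility, precip_frequency):
    """Cell-wise product H = S * P_f for two equally shaped grids."""
    if len(susceptibility) != len(precip_frequency):
        raise ValueError("grids must have the same number of rows")
    hazard = []
    for s_row, p_row in zip(susceptibility, precip_frequency):
        if len(s_row) != len(p_row):
            raise ValueError("grids must have the same number of columns")
        hazard.append([s * p for s, p in zip(s_row, p_row)])
    return hazard


# Hypothetical 2 x 2 layers; one frequency layer per recurrence period.
s_layer = [[0.2, 0.8], [0.5, 1.0]]
p_layers = {10: [[0.5, 0.5], [1.0, 0.0]], 50: [[0.1, 0.9], [0.3, 0.2]]}
hazard_by_period = {f: hazard_grid(s_layer, p) for f, p in p_layers.items()}
```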

3.2.2. Susceptibility Assessment

A. Susceptibility Indicator Analysis

Susceptibility assessment is one of the most fundamental evaluation studies in Geo-disaster related research. Its essence is to study the spatial probability of Geo-disasters occurring under different topographic, geological, and other influencing factors. Based on the hypothesis that “Geo-disaster influencing factors are closely related to the occurrence of Geo-disasters,” it is assumed that Geo-disasters are likely to occur in the future under similar influencing factors that caused past disasters [31]. Therefore, by learning the influencing factors during past Geo-disaster events, it is possible to predict the spatial characteristics of future Geo-disasters. The influencing factors for Geo-disasters are divided into four categories: topographic conditions, geological environment conditions, vegetation conditions, and hydro-geological conditions. Each category includes one or more subordinate indicators. Based on regional characteristics and the current state of research, this study identifies 15 indicators under the four categories of influencing factors. The indicator data are input into ArcGIS and resampled to a unified spatial scale (30 m × 30 m), forming a spatial information database of Geo-disaster influencing factors (Table 2; Figure 4).

Selecting susceptibility indicators from the four perspectives of topographic conditions, geological environment conditions, vegetation conditions, and hydro-geological conditions allows for a comprehensive consideration of the natural conditions that contribute to the occurrence of Geo-disasters. Among these, some indicators related to landform conditions and hydro-geological conditions need to be calculated and processed as follows:

(a). Landform Index, Lithology Index, Soil Index:

I = n / N

(4)

where I represents the Landform Index, Lithology Index, and Soil Index, n is the number of points for a specific type of Geo-disaster, and N is the total number of disaster points in the study area.

(b). SPI, TWI, STI, TRI:

\mathrm{SPI} = A_{s} \tan \beta

(5)

\mathrm{TWI} = \ln \frac{A_{s}}{\tan \beta}

(6)

\mathrm{STI} = \frac{A_{s} \tan \beta}{E}

(7)

\mathrm{TRI} = |E_{i} - E_{i - 1}|

(8)

where A_{s} represents the catchment area, \beta represents the slope, E represents the elevation value, and |E_{i} - E_{i - 1}| represents the elevation difference between adjacent points.
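Equations (5) to (8) can be evaluated per cell as in the following sketch. It assumes the slope is supplied in degrees and that the catchment area and elevations are already available for the cell; the function name and sample values are hypothetical. Flat cells (slope of 0) make the TWI undefined, so the sketch rejects them rather than guessing a convention.

```python
import math


def terrain_indices(a_s, slope_deg, elev, elev_prev):
    """SPI, TWI, STI and TRI in the forms of Eqs. (5)-(8).

    a_s: catchment area; slope_deg: slope in degrees (assumed unit);
    elev, elev_prev: elevations of the cell and an adjacent point.
    """
    tan_b = math.tan(math.radians(slope_deg))
    if tan_b <= 0:
        raise ValueError("TWI is undefined for flat cells (tan(beta) = 0)")
    return {
        "SPI": a_s * tan_b,
        "TWI": math.log(a_s / tan_b),
        "STI": a_s * tan_b / elev,
        "TRI": abs(elev - elev_prev),
    }


# Hypothetical cell: 45 degree slope, so tan(beta) is approximately 1.
cell_indices = terrain_indices(a_s=100.0, slope_deg=45.0, elev=2000.0, elev_prev=1990.0)
```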

After constructing the spatial information database of Geo-disaster influencing factors, to explore the influence of different indicator levels on Geo-disasters, the optimal segmentation points are obtained using K-means clustering, dividing each indicator into five levels. Then, the Certainty Factor Model is applied to quantify each level, resulting in the confidence associated with each indicator level and its correlation with Geo-disasters. To address the issue of multicollinearity between indicators, Pearson’s correlation coefficient is used to represent the correlation between indicators. Indicators with multicollinearity are filtered out, resulting in an optimal indicator system with low redundancy.

(a): K-means Clustering

In recent years, with the rise of data classification methods based on mathematical theories such as operations research and multivariate analysis, these methods have been increasingly applied to geographical data classification. Cluster analysis is a method aimed at dividing a dataset into multiple groups or categories, such that data points within the same group are similar to each other, while data points in different groups are significantly different. Among them, the K-means clustering method is one of the most commonly used clustering methods, first proposed by MacQueen J [32]. Its core idea is to iteratively partition the dataset into K categories, minimizing the sum of the squared distances from each data point in a category to the center (mean) of that category.

K-means offers advantages such as high computational efficiency, simplicity of implementation, and ease of interpretation when handling continuous indicators and small- to medium-sized geographical datasets. It has also been widely applied by numerous researchers in the field of Geo-disaster susceptibility assessment [33,34]. In this study, K-means is applied to classify continuous indicators such as elevation, slope, and NDVI, enabling stable and efficient grading operations.
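For a single continuous indicator, the grading step can be sketched as one-dimensional K-means (Lloyd's algorithm) with K = 5, where the break points between levels are the midpoints of neighbouring cluster centers. The initialization at evenly spaced quantiles is an assumption made here for determinism; the text does not specify it, and the function names and sample values are hypothetical.

```python
import bisect


def kmeans_1d(values, k=5, iters=100):
    """Lloyd's algorithm on one indicator; returns (centers, breaks)."""
    data = sorted(values)
    # Assumed initialization: centers at evenly spaced quantiles.
    centers = [data[int((i + 0.5) * len(data) / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    centers = sorted(centers)
    # In one dimension, neighbouring clusters meet at the midpoint.
    breaks = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    return centers, breaks


def level_of(x, breaks):
    """Map a value to its level, numbered from 1."""
    return bisect.bisect_right(breaks, x) + 1


# Hypothetical indicator values forming five well-separated groups.
sample = [1, 2, 3, 11, 12, 13, 21, 22, 23, 31, 32, 33, 41, 42, 43]
sample_centers, sample_breaks = kmeans_1d(sample)
```

Applying `level_of` to every cell of an indicator raster would then yield the five-level map that is passed on to the CF step.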

(b): Certainty Factor Model (CF)

The Certainty Factor (CF) model was first proposed by Shortliffe and Buchanan [35] and is a method that uses certainty factors to handle uncertainty. In the field of disaster assessment, it can be used to analyze the sensitivity of various influencing factors of disaster events, revealing the contribution of each factor to the occurrence of the disaster event. The CF value lies in the range [−1, 1], where −1 indicates that the influencing factor is definitively unrelated to the disaster event, 1 signifies a definite correlation between the influencing factor and the disaster event, and 0 represents uncertainty regarding the correlation. The formula is as follows:

CF = \begin{cases} \frac{P_{a} - P_{s}}{P_{a} (1 - P_{s})}, & P_{a} \geq P_{s} \\ \frac{P_{a} - P_{s}}{P_{s} (1 - P_{a})}, & P_{a} < P_{s} \end{cases}

(9)

where P_{a} represents the prior probability of a certain level of an influencing factor, which in this study is expressed as the ratio of the number of disaster points within that factor level to the proportion of the land area it occupies, and P_{s} represents the overall prior probability within the study area, which is calculated as the ratio of the total number of disaster points in the study area to the proportion of the total land area.

The CF model can quantitatively characterize the contribution of a single influencing factor to disaster occurrence under conditions of uncertainty, providing stable results even when sample sizes are limited or data contain significant noise. It also offers clear interpretability for the quantified contributions and has been widely applied in the quantification of Geo-disaster influencing factors [36,37]. In Geo-disaster studies, it helps preliminarily screen and evaluate the importance of indicators, providing a basis for subsequent modeling.
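Equation (9) translates directly into a small function. The sketch below takes the two probabilities as inputs and assumes 0 < P_s < 1 and 0 <= P_a < 1; how those probabilities are estimated from disaster points and areas follows the description above and is not reproduced here. The function name and sample values are hypothetical.

```python
def certainty_factor(p_a, p_s):
    """Piecewise CF of Eq. (9); assumes 0 < p_s < 1 and 0 <= p_a < 1."""
    if not (0.0 < p_s < 1.0) or not (0.0 <= p_a < 1.0):
        raise ValueError("probabilities outside the assumed range")
    if p_a >= p_s:
        # Level is at least as disaster-prone as the study area overall.
        return (p_a - p_s) / (p_a * (1.0 - p_s))
    # Level is less disaster-prone than the study area overall.
    return (p_a - p_s) / (p_s * (1.0 - p_a))


# Hypothetical probabilities for three levels of one indicator.
cf_high = certainty_factor(0.2, 0.1)
cf_neutral = certainty_factor(0.1, 0.1)
cf_none = certainty_factor(0.0, 0.1)
```

A level with no recorded disaster points gives CF = −1, a level matching the basin-wide probability gives 0, and values approach 1 as the level's probability grows relative to the basin-wide one.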

(c): Pearson’s Correlation Coefficient

Pearson’s correlation coefficient, introduced by the British statistician Karl Pearson [38], is one of the most commonly used correlation coefficients. It quantifies the degree of linear relationship between two variables, with values in the range [−1, 1]; a value of −1 indicates perfect negative correlation, a value of 1 indicates perfect positive correlation, and a value of 0 indicates no linear correlation. In this study, Pearson’s correlation coefficient is used to characterize the multicollinearity among the influencing factors of Geo-disasters, thereby avoiding model instability and inaccuracies. The formula is as follows:

r = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} \sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}

(10)

where n represents the sample size, x_{i} and y_{i} are the i-th observed values of variables X and Y, and \bar{x} and \bar{y} denote the sample means of variables X and Y, respectively.
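As an illustration of Equation (10) and of how it might be used to screen indicator pairs, the sketch below computes r and lists pairs whose absolute correlation exceeds a threshold. The function names, the indicator values, and the threshold passed in the example are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences (Equation (10))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (math.sqrt(ss_x) * math.sqrt(ss_y))


def highly_correlated_pairs(columns, threshold):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical indicator samples.
indicators = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [1, 3, 2, 4]}
print(highly_correlated_pairs(indicators, threshold=0.9))
```

Pairs returned by the screening step are candidates for removing one of the two indicators.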

B. Susceptibility Assessment Model

In this study, four machine learning models, namely Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), and XGBoost, were initially selected for susceptibility assessment. All four models are suitable for small- to medium-sized datasets, exhibit high training efficiency, and have relatively low sample size requirements. They also represent four typical modeling paradigms: RF is a representative ensemble model with strong robustness and low sensitivity to feature dimensionality; SVM is a margin-based classifier that generally performs well on small-sample, high-dimensional data; LR is a linear classification model with low computational cost and easy interpretability; and XGBoost is an efficient gradient-boosted tree model capable of capturing complex nonlinear interactions among features. This selection balances the limited sample size and computational resources against coverage of multiple paradigms, allowing for a comprehensive evaluation of each algorithm’s suitability for Geo-disaster datasets and providing multi-perspective insights into Geo-disaster complexity.

The sample characteristics differ among the three Geo-disaster types—landslides, debris flows, and collapses. Discrete variables such as lithology, landform, soil type, and land use type are influenced by the spatial distribution of disaster points, resulting in certain differences across datasets. Therefore, model selection was conducted separately for each disaster type. Model hyperparameters were further optimized using grid search combined with cross-validation to obtain the best-performing model for each dataset.

Figure 5 presents the flowchart of Geo-disaster susceptibility assessment, with the specific process outlined as follows:

(a) Dataset Construction: Based on the number of disaster points, non-disaster points were randomly sampled within the study area at a 1:1 ratio. The “Extract Multi Values to Points” tool was used to extract indicator values for both disaster and non-disaster points, forming training datasets for landslides, debris flows, and collapses.

(b) Cross-Validation for Optimal Parameters: The three geological hazard datasets were combined with four machine learning models (Random Forest, Support Vector Machine, Logistic Regression, and XGBoost) for five-fold cross-validation, obtaining the optimal parameters for each hazard type in each model.

(c) Model Accuracy Evaluation: The models were tested for accuracy using their respective optimal parameters. Performance metrics such as AUC, Accuracy, Precision, Recall, and F1 Score were used to validate model accuracy, and the model with the highest accuracy was selected for prediction.

(d) Geo-disaster Susceptibility Prediction: The optimal model for each hazard type was used to generate susceptibility maps for landslides, debris flows, and collapses. These were then integrated based on the proportion of hazard points to produce a Geo-disaster susceptibility map.
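The evaluation metrics named in step (c) can be sketched as follows. This is an illustrative computation on hypothetical labels and scores, not the authors' evaluation code; the 0.5 classification threshold and the function names are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, Precision, Recall, and F1 Score for binary labels (1 = disaster point)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
print(classification_metrics(y_true, y_pred), roc_auc(y_true, scores))
```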

(a): Random Forest (RF)

Random Forest (RF) is an ensemble learning algorithm based on decision trees, proposed by Breiman [39]. It constructs multiple decision trees using random sampling and feature subset selection and combines their results using strategies like majority voting or averaging to solve regression or classification problems (Figure 6). Each decision tree represents a “weak classifier,” and RF improves the accuracy and robustness of the model by combining multiple “weak classifiers” into a “strong classifier.”

RF generates sample subsets using the Bagging method, and a decision tree is built for each subset; the ensemble of these trees forms the forest. Bagging, short for bootstrap aggregating, randomly draws n samples with replacement from the full dataset. The selected samples are referred to as in-bag (IOB) samples, while the samples that are not selected are called out-of-bag (OOB) samples. The IOB samples are used for model training, while the OOB samples are used for model validation.
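The in-bag and out-of-bag split for a single tree can be sketched as below. The function name `bootstrap_split` and the fixed seed are illustrative assumptions; a full Random Forest repeats this draw once per tree.

```python
import random


def bootstrap_split(n_samples, seed=0):
    """Draw n_samples indices with replacement; return (in-bag, out-of-bag) index lists."""
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    selected = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in selected]
    return in_bag, out_of_bag


in_bag, oob = bootstrap_split(1000)
# For large n, roughly 1/e (about 37%) of the samples are expected to be out-of-bag.
print(len(set(in_bag)), len(oob))
```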

(b): Support Vector Machine (SVM)

Support Vector Machine (SVM) is a binary classification machine learning algorithm that follows the principle of structural risk minimization. It classifies data by mapping low-dimensional data into a high-dimensional space and finding the hyperplane that maximizes the margin between classes [40,41]. In Figure 7a, the circles and triangles represent two different classes of sample data, and H is the classification hyperplane. H1 and H2 are lines parallel to H that pass through the closest sample points to H, known as support vectors. The distance between the support vectors and the classification hyperplane is called the margin. The goal of the optimal classification hyperplane is to correctly classify the sample data while maximizing the margin. When the sample data is nonlinear, SVM solves the nonlinear classification problem by introducing kernel functions. Figure 7b demonstrates the working process of SVM using kernel functions. SVM uses the kernel function to map the sample data into a higher-dimensional space, where linear classification can then be performed on the transformed data.

(c): Logistic Regression (LR)

Logistic Regression (LR) is a generalized linear regression model initially used for classification problems in biostatistics and epidemiology. With the development and improvement of the method, it is now widely applied in the field of Geo-disaster susceptibility assessment. The formula is as follows:

P = \frac{1}{1 + e^{- (θ_{0} + θ_{1} x_{1} + θ_{2} x_{2} + \dots + θ_{n} x_{n})}}

(11)

where P is the probability of occurrence, n is the number of features, x_{i} is the i-th feature, θ_{0} is the constant term, and θ_{i} is the regression coefficient of the i-th feature.

LR uses the sigmoid function to transform the output of linear regression into a probability value, mapping it to the (0, 1) range. The goal is to find a set of optimal regression parameters that minimize the difference between the predicted values and the actual values.
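Equation (11) can be evaluated with a few lines of code. The sketch below is illustrative only: the function name and the coefficient values are hypothetical, and the two-branch form is simply a numerically stable way of writing the same sigmoid.

```python
import math


def predict_probability(theta0, thetas, xs):
    """Occurrence probability P from Equation (11) for one sample."""
    z = theta0 + sum(t * x for t, x in zip(thetas, xs))
    # Both branches equal 1 / (1 + exp(-z)); splitting avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients and feature values.
print(predict_probability(-1.0, [0.5], [4.0]))
```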

(d): Extreme Gradient Boosting (XGBoost)

XGBoost is an efficient implementation of Gradient Boosted Decision Trees (GBDT), proposed by Chen and Guestrin [42]. It solves classification or regression problems by incrementally building a series of weak learners (usually CART trees) and combining their results. XGBoost improves GBDT in the following ways:

(a) Weighted objective function optimization: It introduces a second-order Taylor expansion to approximate the loss function and controls model complexity through a regularization term, thereby improving generalization ability.

(b) Shrinkage (learning rate): By scaling the predictions of each tree, the model is progressively optimized to avoid overfitting.

(c) Column sampling: In the feature selection phase, a random subset of features is selected for modeling, further reducing the risk of over-fitting and improving computational efficiency.

(d) Parallelization and distributed computing: It optimizes the process of finding split points to enable parallel training and supports distributed deployment to handle large-scale datasets.

Compared with traditional GBDT methods, XGBoost has significant improvements in model efficiency, scalability, and predictive performance. It is widely applied across various fields due to its fast computation speed and high prediction accuracy.

3.2.3. Extreme Precipitation Recurrence Period Model

The Generalized Extreme Value (GEV) Distribution is a type of probability distribution that combines three extreme value distributions: Gumbel Distribution, Frechet Distribution, and Weibull Distribution. Due to its inclusion of these three distribution types, the GEV Distribution has good applicability in many practical problems, particularly in describing the statistical properties of extreme events such as extreme precipitation events. Therefore, this study fits the extreme precipitation dataset of daily observations from 10 stations in the Ili River Basin during 1981–2024 using the GEV distribution, aiming to accurately estimate the frequency and intensity of extreme precipitation events. Its Probability Density Function (PDF) and Cumulative Distribution Function (CDF) are expressed as follows:

f(x; μ, σ, ξ) = \frac{1}{σ} \left(1 + ξ \frac{x - μ}{σ}\right)^{- \frac{1}{ξ} - 1} \exp\left(- \left(1 + ξ \frac{x - μ}{σ}\right)^{- \frac{1}{ξ}}\right)

(12)

F(x; μ, σ, ξ) = \exp\left(- \left(1 + ξ \frac{x - μ}{σ}\right)^{- \frac{1}{ξ}}\right)

(13)

where x represents the observed data, μ is the location parameter, which determines the position of the distribution, σ is the scale parameter, which determines the width of the distribution, and ξ is the shape parameter, which determines the tail behavior of the distribution.

When ξ = 0 (taken as the limit ξ → 0 of Equations (12) and (13)), the GEV distribution becomes the Gumbel distribution; when ξ > 0, the GEV distribution becomes the Frechet distribution (heavy-tailed distribution); and when ξ < 0, the GEV distribution becomes the Weibull distribution (bounded tail distribution).

The most commonly used method for estimating the three parameters (μ, σ, ξ) of the GEV distribution is Maximum Likelihood Estimation (MLE). For a given sample x_{1}, x_{2}, …, x_{n}, the optimal parameter estimates of the GEV distribution are obtained by maximizing the following likelihood function:

L (μ, σ, ξ) = \prod_{i = 1}^{n} f (x_{i}; μ, σ, ξ)

(14)

The definition of extreme precipitation is a crucial factor for distinguishing extreme precipitation events from general precipitation conditions, as it determines the length and characteristics of the recurrence period. In this study, the annual maximum series is selected as the extreme precipitation dataset, which is consistent with Extreme Value Theory (EVT) and satisfies the practical requirements of regional risk assessment.

To examine the distribution characteristics of the annual maximum series, the Kolmogorov–Smirnov (K–S) goodness-of-fit test is employed. This non-parametric statistical method compares the empirical distribution function of the sample with a hypothesized theoretical distribution, calculating the K–S statistic and the corresponding p-value to evaluate the fit. A smaller K–S statistic indicates closer agreement between the sample and the theoretical distribution, while the p-value tests the significance of the difference. A p-value greater than 0.05 indicates that the null hypothesis cannot be rejected, meaning the sample can be considered consistent with the theoretical distribution. The K–S test for the annual maximum series yields a p-value of 0.92 (>0.05) and a K–S statistic of 0.026, indicating that the series is well fitted by the GEV distribution.
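The K–S statistic itself is the largest gap between the empirical distribution function and the fitted CDF. The sketch below computes that statistic against the GEV CDF of Equation (13); it does not compute the p-value, and the function names and parameter values are assumptions for illustration.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (Equation (13)); xi = 0 is the Gumbel limit."""
    if xi == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def ks_statistic(sample, cdf):
    """Maximum distance between the empirical distribution function and a given CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical annual maxima (mm) and hypothetical fitted parameters.
annual_maxima = [18.2, 21.5, 24.9, 27.3, 30.1, 35.6, 41.0]
print(ks_statistic(annual_maxima, lambda x: gev_cdf(x, 24.0, 6.0, 0.1)))
```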

To further assess the reliability of the fit, uncertainty analysis of the GEV distribution parameters is conducted. Based on the standard errors of the parameters, 95% confidence intervals for the shape, location, and scale parameters are estimated (Table 3). The narrow confidence intervals indicate stable parameter estimates and reliable fit results, effectively characterizing the probability distribution features of extreme precipitation events in the study area.

After fitting the GEV distribution to the extreme value sample set from each meteorological station, the extreme precipitation values for the 10-year, 20-year, and 50-year recurrence periods are further calculated by combining the corresponding CDF. The formula is as follows:

T = \frac{1}{1 - F (x_{t})}

(15)

where T represents the recurrence period of the extreme precipitation amount x_{t}, and F (x_{t}) is the CDF value of the extreme precipitation amount x_{t}.
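Combining Equations (13) and (15) gives F(x_{t}) = 1 − 1/T, which can be inverted for the precipitation amount at a given recurrence period: x_{t} = μ + (σ/ξ)[(−ln(1 − 1/T))^{−ξ} − 1] for ξ ≠ 0, and x_{t} = μ − σ ln(−ln(1 − 1/T)) in the Gumbel limit. The sketch below implements this inversion; the function names and the parameter values are hypothetical and are not the fitted values from Table 3.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (Equation (13)); xi = 0 is the Gumbel limit."""
    if xi == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def return_level(T, mu, sigma, xi):
    """Precipitation amount x_t whose recurrence period is T years (T > 1)."""
    if T <= 1:
        raise ValueError("T must be greater than 1")
    y = -math.log(1.0 - 1.0 / T)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


def return_period(x, mu, sigma, xi):
    """Recurrence period T of amount x, following Equation (15)."""
    return 1.0 / (1.0 - gev_cdf(x, mu, sigma, xi))


# Hypothetical station parameters (mm).
mu, sigma, xi = 20.0, 5.0, 0.1
for T in (10, 20, 50):
    print(T, return_level(T, mu, sigma, xi))
```

Applying `return_period` to the output of `return_level` recovers T, which is a convenient check on a fitted parameter set.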

3.3. Exposure Assessment Model

Exposure refers to the extent to which elements such as population, economy, and infrastructure are spatially exposed to disaster-causing factors, reflecting their potential impact within the disaster-affected area. Exposure elements can be considered from both social and natural system perspectives. Based on regional characteristics and practical conditions, distance to roads, density of large-scale enterprises, and building height density are selected as social exposure elements: road distribution data are complete and easily obtainable, large-scale enterprises provide a quantifiable measure of property exposure, and building height density reflects the size and volume of built structures. Distance to disaster points and land type are selected as natural exposure elements: distance to disaster points directly indicates areas of high vulnerability within the natural system, while land type differentiates the variation of disaster exposure across different terrains. The gridded data of these indicators are shown in Figure 8.

To standardize the dimensional differences of the indicators, the indicator data are normalized based on their attributes. The indicator weights are determined using the EW-RF weighting method (Table 4). Exposure is calculated using a raster weighting model, as shown in the following equation:

E = \sum_{i = 1}^{n} X_{i} W_{i}

(16)

where E represents exposure, n is the number of indicators, X_{i} is the raster data for the i-th indicator, and W_{i} is the weight of the i-th indicator.
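The raster weighting model of Equation (16) amounts to a cell-by-cell weighted sum of normalized indicator grids, and the same form is used for vulnerability and disaster mitigation capacity in Equations (17) and (18). The sketch below assumes min-max normalization with a direction flag for indicators that raise or lower the score; the normalization rule, grids, and weights are hypothetical and are not the values in Table 4.

```python
def min_max_normalize(grid, positive=True):
    """Scale a grid to [0, 1]; positive=False inverts indicators that lower the score."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    if positive:
        return [[(v - lo) / span for v in row] for row in grid]
    return [[(hi - v) / span for v in row] for row in grid]


def weighted_overlay(grids, weights):
    """Cell-by-cell weighted sum of equally shaped grids (Equation (16))."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [
        [sum(w * g[r][c] for g, w in zip(grids, weights)) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 rasters: an enterprise density and a distance to roads (m).
density = min_max_normalize([[0, 10], [20, 40]], positive=True)
distance = min_max_normalize([[100, 300], [500, 500]], positive=False)
exposure = weighted_overlay([density, distance], [0.6, 0.4])
print(exposure)
```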

3.4. Vulnerability Assessment Model

Vulnerability refers to the propensity of elements in a region to suffer loss when exposed to disaster-causing factors, reflecting the degree to which these elements can withstand impacts and the potential severity of the resulting disaster. Vulnerability is typically assessed from multiple dimensions, including population, economy, and infrastructure. In this study, population density is used to represent population vulnerability, indicating the potential impact on people during a disaster; GDP is used to represent economic vulnerability, reflecting the economic scale and potential loss; and POI density, road network density, and transportation facility density are used to represent infrastructure vulnerability. Specifically, POI density quantifies the concentration of public service facilities, and road network and transportation facility densities reflect the potential impact on infrastructure. The gridded data of these indicators are shown in Figure 9.

To standardize the dimensional differences of the indicators, the indicator data are normalized based on their attributes. The indicator weights are determined using the EW-RF weighting method (Table 5). Vulnerability is calculated using a raster weighting model, as shown in the following equation:

V = \sum_{i = 1}^{n} X_{i} W_{i}

(17)

where V represents vulnerability, n is the number of indicators, X_{i} is the raster data for the i-th indicator, and W_{i} is the weight of the i-th indicator.

3.5. Disaster Mitigation Capacity Assessment Model

Disaster mitigation capacity refers to the ability of a region or system to organize, reduce, and respond to disasters, encompassing pre-disaster, during-disaster, and post-disaster capabilities, with the goal of reducing hazard impacts and enhancing the resilience and recovery capacity of social systems. In this study, five indicators are selected across the three phases—pre-disaster, during-disaster, and post-disaster—as factors influencing disaster mitigation capacity: urban and rural residents’ deposits, number of landline telephones, hospital beds, welfare institution beds, and medical facility density. Among these, residents’ deposits reflect the economic reserves and potential disaster resistance of the population; the number of landline telephones reflects information transmission capacity during disasters; hospital beds and welfare institution beds represent available medical resources and social emergency capacity during disasters; and medical facility density reflects the speed and capability of post-disaster recovery. The gridded data of these indicators are shown in Figure 10.

To standardize the dimensional differences of the indicators, the indicator data are normalized based on their attributes. The indicator weights are determined using the EW-RF weighting method (Table 6). Disaster Mitigation Capacity is calculated using a raster weighting model, as shown in the following equation:

C = \sum_{i = 1}^{n} X_{i} W_{i}

(18)

where C represents disaster mitigation capacity, n is the number of indicators, X_{i} is the raster data for the i-th indicator, and W_{i} is the weight of the i-th indicator.

4. Results

4.1. Susceptibility Indicator Analysis

The K-means clustering method was used to classify the Geo-disaster susceptibility indicators at the grid scale for the Ili River Basin, and the CF model was applied for quantitative processing of the indicators (Figure 11).

As shown in Figure 11, Geo-disasters are more likely to occur at elevations of 1412–2234 m and on slopes of 8.57–17.64°, with east- and southeast-facing slopes being the primary susceptible aspects. These low-gradient natural slopes are mostly composed of loose materials with low foundation shear strength, making them more prone to instability and thus more susceptible to Geo-disasters. High-gradient hills also exhibit high Geo-disaster susceptibility, while Eolian deposits and Haplic Chernozems are the lithology and soil types contributing most to Geo-disaster occurrence. Hydrological conditions and faults also significantly affect the distribution of Geo-disasters: areas within 10 km of rivers and faults are more likely to experience Geo-disasters than those farther away. NDVI reflects vegetation coverage, and the results indicate that higher vegetation coverage significantly reduces the occurrence of Geo-disasters in the study area; regions with moderate or high NDVI levels therefore have a very low probability of Geo-disasters. Among the hydrogeological indicators, lower TWI values are most strongly correlated with Geo-disasters. This is because lower TWI values usually lead to loose soil structure and reduced shear strength, making the area more susceptible to Geo-disasters under precipitation conditions. A lower TRI is associated with the gentle slopes and mountain areas (elevation 1412–2234 m, slope 8.57–17.64°), which are favorable for the accumulation of sediments, thus increasing the likelihood of Geo-disasters. The frequent occurrence of Geo-disasters is also related to SPI and STI. In areas with moderate SPI levels, a certain water flow intensity can cause sediment movement, increasing hazard. Higher STI levels further enhance the role of sediment movement in promoting Geo-disasters.
In conclusion, the occurrence of Geo-disasters is the result of multiple factors working together, with the interaction of different influencing factors determining the susceptibility.

Zhang et al. [43] defined a correlation as high when the absolute value of the correlation coefficient exceeds 0.9 while optimizing the water security rating indicator system. As shown in Figure 12, in the correlation matrix of Geo-disaster susceptibility indicators, NDVI and DFD have the strongest correlation, but this does not reach the threshold for high correlation.

Table 7 presents the optimized weights of susceptibility indicators for landslides, debris flows, and collapses, showing both commonalities and differences among them. The landform index and lithology index consistently exhibit relatively high weights, reaching 0.1482 and 0.1624 for landslides, and even 0.1517 and 0.1193 for collapses, indicating that topographic and geomorphologic conditions are the primary controlling factors for Geo-disaster occurrence. Additionally, STI and TRI also carry notable weights; for instance, in debris flows, STI reaches 0.1341 and TRI 0.0911, suggesting that hydrogeological conditions play a significant role in triggering debris flows, which aligns with practical observations. In contrast, indicators such as NDVI and SPI have relatively low weights—for example, NDVI accounts for only 0.0241 in landslides—indicating that vegetation cover and local runoff conditions have limited explanatory power for overall disaster susceptibility. Overall, geological and geomorphological factors dominate across the three disaster types, while vegetation-related factors exert a relatively minor influence.

In summary, after the Pearson correlation and weight tests, no indicators were excluded from the Geo-disaster susceptibility indicator system, and the original indicator system was retained for susceptibility prediction.

4.2. Hazard Map

The selected Geo-disaster susceptibility indicators were combined into a sample set for model selection. As shown in Figure 13, in the model accuracy test for the landslide sample set, both RF and XGBoost showed similar accuracy and F1 Score, with RF slightly outperforming XGBoost in terms of AUC. Therefore, the RF model was chosen to predict the landslide sample set. In the accuracy test for the collapse sample set, RF slightly outperformed XGBoost in AUC and Precision, while XGBoost significantly led RF in Recall and F1 Score. Thus, XGBoost was selected as the prediction model for the collapse sample set. In the accuracy test for the debris flow sample set, LR showed comparable Precision to SVM but outperformed it in the other four metrics. Therefore, LR was chosen as the prediction model for the debris flow sample set.

The RF, LR, and XGBoost models are used to predict landslide, debris flow, and collapse susceptibility, respectively, resulting in susceptibility maps for each hazard type (Figure 14a–c). These maps are then combined into a weighted composite based on the proportion of landslide, debris flow, and collapse points in the total geological hazard points. The formula is as follows:

S_{G} = S_{L} \cdot w_{L} + S_{D} \cdot w_{D} + S_{C} \cdot w_{C}

(19)

where S_{G}, S_{L}, S_{D}, and S_{C} represent the susceptibility of Geo-disasters, landslides, debris flows, and collapses, respectively, and w_{L}, w_{D}, and w_{C} represent the weights of landslide, debris flow, and collapse points within the Geo-disaster points, with values of 0.8126, 0.0574, and 0.13, respectively.
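For a single grid cell, Equation (19) reduces to simple arithmetic. The sketch below uses the three weights reported above; the function name and the per-cell susceptibility values in the example are hypothetical.

```python
# Weights of landslide, debris flow, and collapse points as reported for Equation (19).
WEIGHTS = {"landslide": 0.8126, "debris_flow": 0.0574, "collapse": 0.13}


def composite_susceptibility(s_l, s_d, s_c, weights=WEIGHTS):
    """Weighted composite Geo-disaster susceptibility of one cell (Equation (19))."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return (
        s_l * weights["landslide"]
        + s_d * weights["debris_flow"]
        + s_c * weights["collapse"]
    )


# Hypothetical susceptibilities of one cell for the three hazard types.
print(composite_susceptibility(0.7, 0.2, 0.5))
```

Because landslide points dominate the inventory, the composite value of a cell stays close to its landslide susceptibility.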

The Geo-disaster susceptibility map was generated as the weighted composite of the landslide, debris flow, and collapse susceptibility maps (Figure 14d). The high-susceptibility areas are primarily concentrated in the low hill slopes of the Ili River Basin. These terrains are characterized by gentle slopes, which facilitate the accumulation of loose materials such as soil and debris from higher elevations, leading to high hazard. In contrast, areas at higher elevations, where the slopes are steeper, experience rapid runoff of loose materials, resulting in a lower probability of Geo-disaster. Additionally, susceptibility is closely related to the distribution of rivers and faults. Along river corridors, surface runoff erosion and enhanced groundwater infiltration lead to a reduction in rock and soil stability. In fault zones, the rock mass is highly fractured due to long-term tectonic activity, and the intense topographic cutting further increases the hazard. In contrast, low-susceptibility areas are found in the flat terrain of the Ili River Valley. This region exhibits significantly different geographic features from high-susceptibility areas, lacking mountainous terrain, steep slopes, and sources of loose materials, resulting in a lower probability of Geo-disaster.

The annual maximum precipitation values are taken as the extreme precipitation samples. The resulting sample set for each meteorological station is fitted using the GEV distribution, and the fitting curves for the different meteorological stations are shown in Figure 15.

After the fitting is completed, the extreme precipitation values for the 10, 20, and 50-year recurrence periods are calculated based on the CDF values for each meteorological station. These values are then interpolated using the Kriging method in ArcGIS to generate the extreme precipitation maps based on the 10, 20, and 50-year recurrence periods in the study area (Figure 16).

By combining the Geo-disaster susceptibility map with the extreme precipitation recurrence period maps for the study area, the hazard maps based on the 10-year, 20-year, and 50-year extreme precipitation recurrence periods are generated (Figure 17). The hazard maps under different extreme precipitation conditions show significant regional distribution differences. Under the 10-year period, the areas with high precipitation intensity closely coincide with high susceptibility areas. Under the combined influence of precipitation intensity and geological susceptibility, high-hazard zones are formed in the eastern part of Nilka County, the central part of Xinyuan County, and the northern part of Zhaosu County. When the recurrence period is 20 years, extreme precipitation mainly concentrates in the regions of Yining City and Yining County, while other areas experience moderate or lower levels of precipitation. Due to the spatial distribution of precipitation, the overall high-hazard area reduces in size, with Yining County turning into a high-hazard area. Under the 50-year recurrence period, extreme precipitation intensity becomes more concentrated, mainly occurring in Yining County. Due to the concentrated distribution of extreme precipitation, the hazard map for the 50-year recurrence period classifies only Yining County as a high-hazard area, indicating that when precipitation intensity reaches a 50-year recurrence interval, Yining County is the highest-hazard area in the study region.

4.3. Exposure Map

The exposure classification map of the study area was generated through the weighted calculation of the exposure model and categorized into five levels using the natural break method (Figure 18). The study shows that the overall exposure level in the region is relatively high, with a spatial distribution pattern of high exposure in the east and low exposure in the west. High-exposure areas are mainly concentrated in the county and town center regions, as well as in grassland terrain. This distribution is closely related to the higher road distances, building height density, and the density of large-scale enterprises in the county and town centers compared to other regions. In contrast, the western part of the study area, being farther from disaster-prone areas, has a relatively low exposure level, particularly in the southwestern part of Zhaosu County and the northwestern part of Khorgas City. Additionally, from the perspective of land type, Geo-disasters occur most frequently in grasslands, so regions dominated by grassland tend to have higher exposure levels.

4.4. Vulnerability Map

The vulnerability classification map of the study area was generated through the weighted calculation of the vulnerability model and categorized into five levels using the natural break method (Figure 19). Overall, the vulnerability of the study area is relatively low, exhibiting a distribution pattern along the road network. Areas with moderate to high vulnerability levels are primarily concentrated in county and town centers. The road network density is the main controlling factor for vulnerability, with its weight accounting for more than 70%, leading to a high correlation between vulnerability distribution and road network density. In addition, population density and GDP density significantly influence the distribution of vulnerability, with high-vulnerability areas concentrated in Khorgas City, the southern part of Huocheng County, and the center of Yining City. These regions are characterized by dense population and economic activities. Similarly, the distribution of transportation infrastructure density and POI density aligns with the aforementioned indicators, further reinforcing the high vulnerability in the county and town center areas.

4.5. Disaster Mitigation Capacity Map

The disaster mitigation capacity classification map of the study area was generated through the weighted calculation of the disaster mitigation capacity model and categorized into five levels using the natural break method (Figure 20). The results show that the overall disaster mitigation capacity level in the region is relatively high, with a spatial distribution pattern of higher capacity in the north and lower capacity in the south, with Yining City having the highest disaster mitigation capacity. Urban and rural residents’ deposits, as an important indicator of disaster preparedness, are slightly lower in the southern regions, such as Zhaosu County and Tekes County, than in the northern counties. The number of landline telephones indicates that the disaster response capacity in Xinyuan County and Huocheng County is at a very high level. Furthermore, hospital beds, welfare institution beds, and medical facility density, as key indicators of post-disaster recovery capacity, exhibit a general trend of higher values in the north and lower values in the south. Yining City, with its higher medical facility density, further highlights its advantage in disaster mitigation capacity, a feature clearly reflected in the disaster mitigation capacity map.

5. Discussion

5.1. Risk Patterns and Evolution Under Different Recurrence Periods

The weights of risk components were obtained using the EWM–RF weighting method (Table 8), and the risk results under extreme precipitation conditions were calculated by applying these weights within the risk assessment model to the hazard, exposure, vulnerability, and disaster mitigation capacity components.

A sensitivity test was conducted by perturbing the component weights by ±5% for each recurrence period. The results indicate that under the 10-year recurrence period condition, the risk distribution is most sensitive to hazard and exposure, with changes in the number of high-risk pixels reaching ±7%. For the 20-year recurrence period, sensitivity decreases, with variations narrowing to ±3–4%, while the 50-year recurrence period condition is overall the most stable, with hazard values hardly affected by perturbations. Overall, the stability of the risk results under these perturbations remains above 90%, demonstrating their reliability.

The risk results were classified into five levels using the natural breaks method, producing a geo-disaster risk map under extreme precipitation conditions (Figure 21) and the corresponding area proportion of each risk level for different extreme precipitation conditions (Figure 22).

The area proportions of the Geo-disaster risk levels in the Ili River Basin change clearly with the recurrence period. Overall, the proportions of areas classified as low risk and below under the 10-, 20-, and 50-year recurrence periods are 56%, 52%, and 50%, respectively, indicating that most areas maintain relatively low and stable risk levels, although there is a slight decrease as the recurrence period increases. In contrast, the proportion of medium-risk areas rises significantly with recurrence period, from 25% for the 10-year recurrence period and 28% for the 20-year recurrence period to 42% for the 50-year recurrence period, reflecting an overall elevation in risk levels under stronger extreme precipitation disturbances. Simultaneously, the proportion of high-risk and above areas gradually decreases, with very high-risk areas dropping from 2% for the 10-year recurrence period to 0.4% for the 50-year recurrence period, demonstrating that the impact of high-intensity precipitation on high-risk zones is locally concentrated.

The spatial distribution of Geo-disaster risk in the Ili River Basin also varies noticeably with recurrence period. Under the 10-year recurrence period, very low-risk areas are mainly distributed in the Ili River Valley and the high mountainous terrain at the northern and southern ends of the basin. These regions either possess stable geological structures or exhibit low exposure and vulnerability, resulting in overall low risk. Medium-risk areas are scattered, mainly on the gentle slopes of low mountains, while high-risk and above areas are relatively concentrated in western Nilka County, southern Xinyuan County, and northern Zhaosu County. Under the 20-year recurrence period, medium- and high-risk zones expand significantly, forming more continuous distributions in northern Huocheng County and parts of Gongliu County, whereas very high-risk areas become locally clustered and shrink slightly in area. Under the 50-year recurrence period, medium-risk zones expand further and high-risk areas become more localized, mainly in northern Yining County and western Nilka County. Low-risk zones remain generally stable, indicating that geological structures and exposure patterns dominate risk stability.

Spatial overlay analysis of risk levels under different recurrence periods reveals the evolution of risk patterns and identifies high-risk hotspots (Figure 23). The Ili River Valley terrain and high mountainous regions at the northern and southern ends of the basin consistently remain low-risk zones under all extreme precipitation conditions: the valley, despite its dense population, economic activities, and infrastructure, lacks the geological and topographic conditions for Geo-disasters, while the high mountains are characterized by low exposure and vulnerability. Conversely, areas such as southeastern Zhaosu County, northern Tekes County, and southeastern Gongliu County show risk levels gradually increasing with precipitation intensity. These regions feature gentle slopes of low mountains, high exposure and vulnerability, and limited disaster mitigation capacity, leading to progressively higher risk under stronger extreme precipitation. Areas consistently classified as high-risk or above across all recurrence periods, accounting for approximately 9% of the study area, are considered high-risk hotspots, including southern Khorgos, Yining City, and southwestern Nilka County. In southern Khorgos, hazard and vulnerability maintain the risk at a high level. Yining City, as the political, economic, and cultural center of Ili Prefecture, exhibits the highest vulnerability and exposure levels; although hazard levels are relatively low and disaster mitigation capacity is strong, it remains a key area for prevention and control. Southwestern Nilka County, influenced by geological conditions and exposure, maintains high-risk status across all recurrence periods. These findings suggest that localized risk management and resource allocation strategies are necessary to effectively reduce Geo-disaster risks under extreme precipitation conditions.
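The overlay logic described above can be illustrated with a short sketch. It assumes that each risk map is an integer raster coded 1 (very low) to 5 (very high); that coding, the function name, and the toy arrays are assumptions for illustration, not the study's data or workflow.

```python
import numpy as np

HIGH = 4  # assumed coding: 1 = very low, 2 = low, 3 = medium, 4 = high, 5 = very high

def overlay(risk_10, risk_20, risk_50):
    """Flag pixels by how their risk level behaves across the three recurrence periods."""
    stack = np.stack([risk_10, risk_20, risk_50])
    hotspot = np.all(stack >= HIGH, axis=0)   # high or above in every scenario
    stable_low = np.all(stack <= 2, axis=0)   # low or below in every scenario
    steps = np.diff(stack, axis=0)
    # Rising: the level never drops between scenarios and ends higher than it started.
    rising = np.all(steps >= 0, axis=0) & (stack[-1] > stack[0])
    return hotspot, stable_low, rising

# Toy 2 x 3 rasters (not study data).
risk_10 = np.array([[1, 2, 4], [3, 5, 2]])
risk_20 = np.array([[1, 3, 4], [3, 4, 2]])
risk_50 = np.array([[2, 3, 5], [4, 4, 1]])

hotspot, stable_low, rising = overlay(risk_10, risk_20, risk_50)
print("hotspot share of area:", hotspot.mean())
```

On real rasters with equal-area pixels, `hotspot.mean()` corresponds to the kind of area share reported in the text for persistent high-risk zones.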

5.2. Comparison with Previous Studies and Regional Characteristics

Comparative analyses indicate that the spatial distribution and recurrence-period evolution patterns identified in this study are consistent with trends reported in previous research. Wang and Hou [44], in their assessment of rainstorm–geohazard disaster chain risks, observed that under varying rainfall conditions, low-risk areas generally remain stable, medium-risk zones expand with increasing rainfall intensity, and high-risk zones tend to cluster spatially. This pattern aligns closely with the risk distribution identified in the Ili River Basin. Liu et al. [45] further demonstrated that geological hazard risk progressively intensifies with rainfall magnitude, with high-precipitation areas typically concentrated locally, resulting in peak hazard levels—consistent with the localized high-risk hotspots revealed in the present study. Similarly, Li et al. [46], investigating flood hazards in the Bohai Rim under rainfalls of different recurrence periods, reported that as the recurrence period increases, medium- and high-risk areas expand, low-risk areas contract, and high-risk zones predominantly occur in densely populated and economically active regions. Collectively, these findings suggest that the relationship between extreme precipitation and hazard evolution exhibits a broadly consistent pattern across different regions and study contexts.

Regarding regional characteristics, Guo et al. [47] analyzed landslide susceptibility in the western mountainous area of Wenzhou and found that landslides predominantly occur on low-elevation, gentle-to-moderate slopes of low hills, with 84% of events triggered by rainfall and approximately 70% occurring during the rainy season, highlighting the critical influence of topography under rainfall-triggered conditions. Consistently, Zhuang et al. [48] conducted detailed investigations of landslide mechanisms in the Ili River Basin. Their study of the Piriqing River No. 2 landslide indicated that low hills with gentle slopes are prone to local instability and overall sliding under snowmelt or rainfall, and that localized zones exert a significant influence on overall slope stability. These studies collectively suggest that high-risk areas are determined not only by rainfall intensity but also by topographic conditions, in agreement with the concentration of high-risk zones on gentle low-hill slopes in the Ili River Basin observed in this study.

Overall, the literature supports our findings: medium- and high-risk areas expand with increasing rainfall intensity or recurrence period, low-risk areas remain largely stable, high-risk zones exhibit localized clustering, and gentle low-hill slopes are highly susceptible to forming high-risk zones under precipitation.

5.3. Implications, Case Example, and Limitations

Although this study provides a systematic assessment of geological hazard risk in the Ili River Basin under extreme rainfall using multi-source data and modeling approaches, the impacts of individual extreme rainfall events cannot be overlooked. For instance, on 7 October 2023, a landslide occurred on the west-facing slope of Shuangbi Gully, Kuan Gou, Xinjiang Jinchuan Mining, resulting in one fatality and direct economic losses of approximately 1.8 million CNY. The site, located in Aoyimanbulake Village, Kalayagaqi Township, Yining County, experienced continuous precipitation from 01:30 to 12:00. The rainfall infiltrated the slope, substantially increased soil saturation, and accelerated slope failure, culminating in the landslide at around 19:10. In this study, Yining County represents a typical area where hazard levels rise with increasing rainfall intensity, and this event is consistent with the assessment results, which supports their agreement with real-world conditions. Furthermore, local seismic influences contributed to the landslide, suggesting that future research could extend the current framework to examine hazard evolution under varying seismic conditions.

Despite providing a systematic evaluation of geological hazard risk under extreme rainfall, some limitations remain. First, the assessment relies on multi-source datasets and is inevitably affected by data accuracy and completeness; future research should incorporate higher-resolution precipitation measurements, high-precision DEMs, and detailed socioeconomic data to enhance result reliability. Second, although the entropy weight–Random Forest optimization method effectively mitigates the limitations of single-method weighting, it still lacks expert knowledge integration; future studies could combine modeling with expert judgment to refine indicator weights and reduce uncertainties.

Overall, the findings of this study offer practical guidance for disaster prevention and mitigation in the Ili River Basin and beyond. In the aftermath of extreme rainfall events, authorities can allocate emergency resources according to rainfall intensity, deploy monitoring and rescue efforts based on risk distribution, and implement localized risk zoning strategies to enhance the scientific rigor and precision of disaster management. Moreover, these results provide valuable insights for regional spatial planning, infrastructure layout, and the siting of major projects, supporting evidence-based decision-making in disaster risk governance under climate change conditions.

6. Conclusions

This study focuses on the Ili River Basin and innovatively integrates extreme precipitation recurrence periods with geological hazard susceptibility to construct a hazard framework under extreme precipitation conditions. Based on the risk assessment model, combined with exposure, vulnerability, and disaster mitigation capacity, a comprehensive Geo-disaster risk assessment was conducted, resulting in risk maps under different extreme precipitation conditions. The key findings are summarized as follows:

(a) Spatial distribution of Geo-disaster hazard: The hazard exhibits significant spatial variation under different extreme precipitation recurrence periods. Under the 10-year recurrence period, areas of high precipitation intensity largely coincide with regions of high susceptibility, forming high-hazard zones represented by eastern Nilka County, central Xinyuan County, and northern Zhaosu County. Under the 20- and 50-year recurrence periods, the spatial distribution of extreme precipitation becomes more concentrated, reducing the overall area of high-hazard zones. Notably, in the 50-year recurrence period, Yining County emerges as the only area classified as very high hazard, indicating its pronounced susceptibility during extreme precipitation events.

(b) Spatial patterns of exposure, vulnerability, and disaster mitigation capacity: Exposure generally shows a pattern of higher values in the east and lower values in the west. High-exposure areas are mainly concentrated in county and township centers as well as grassland regions, closely related to road network density, building density, and infrastructure distribution. Vulnerability is primarily distributed along road networks, with high-vulnerability areas concentrated in southern Khorgos, southern Huocheng County, and central Yining City, due to dense population, active economic activities, and high transportation density. Disaster mitigation capacity exhibits a north-high, south-low pattern. Yining City, as the political, economic, and cultural center, has relatively strong capabilities in disaster preparedness, response, and post-disaster recovery. In contrast, southern regions, such as Zhaosu and Tekes Counties, have lower disaster mitigation capacity, particularly in post-disaster recovery.

(c) Dynamic evolution of Geo-disaster risk: Geo-disaster risk in the Ili River Basin evolves with extreme precipitation intensity and recurrence period. The proportion of areas at low risk or below is 56%, 52%, and 50% under the 10-, 20-, and 50-year recurrence periods, respectively, remaining generally stable with a slight decrease. Medium-risk areas increase markedly with stronger precipitation, rising from 25% (10-year) and 28% (20-year) to 42% (50-year), indicating an overall rise in risk levels. Areas at high risk or above become increasingly localized, with very high-risk areas decreasing from 2% (10-year) to 0.4% (50-year), which indicates the concentrated impact of high-intensity precipitation on high-risk zones.

(d) Spatial concentration and hotspot analysis: Spatial overlay analysis shows that Geo-disaster risk exhibits clear spatial concentration and changes dynamically with precipitation intensity. Risk levels rise gradually with increasing precipitation on the gentle slopes of low mountains, such as southeastern Zhaosu County, northern Tekes County, and southeastern Gongliu County. Areas classified as high-risk or above under all recurrence periods account for approximately 9% of the study area, with southern Khorgos, Yining City, and southwestern Nilka County as representative high-risk hotspots. Southern Khorgos maintains a high risk level because of hazard and vulnerability; Yining City, as the political, economic, and cultural center, has the highest vulnerability and exposure; southwestern Nilka County remains consistently high-risk under varying extreme precipitation conditions because of geological and exposure factors.

In summary, this study reveals the spatial patterns and evolutionary characteristics of Geo-disaster risk under different extreme precipitation conditions in the Ili River Basin, providing a theoretical basis and practical reference for differentiated disaster prevention and mitigation, especially for targeted management of risk hotspots. To further refine risk assessment and enhance model applicability, future research may focus on: (1) incorporating higher-resolution meteorological, topographic, and socio-economic data to improve accuracy; (2) integrating dynamic precipitation scenarios and simulations of single extreme events to better reflect actual disaster processes; (3) extending the framework to other disaster types, such as earthquakes and floods, to explore model applicability under multiple hazard conditions. These directions will enhance the generality and foresight of risk assessment, offering more scientific support for regional disaster prevention, mitigation, and planning decisions.

Author Contributions

Xinxu Li wrote and analyzed the manuscript; Jinghui Liu and Zhiyong Zhang edited and revised the manuscript; Xushan Yuan, Yanmin Li, and Zixuan Wang assisted with data collection and analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Third Xinjiang Scientific Expedition Program (Grant No. 2022xjkk0600). No funding was received to assist with the publication of this manuscript from any other funding agency.

Data Availability Statement

No datasets were generated or analyzed during the current study.

Conflicts of Interest

The authors declare no competing interests.

References

1. Froude, M.J.; Petley, D.N. Global fatal landslide occurrence from 2004 to 2016. Nat. Hazards Earth Syst. Sci. 2018, 18, 2161–2181.
2. Petley, D.; Alcantara-Ayala, I.; Goudie, A.S. Landslide hazards. In Geomorphological Hazards and Disaster Prevention; Cambridge University Press: New York, NY, USA, 2010; pp. 63–74.
3. Klose, M.; Maurischat, P.; Damm, B. Landslide impacts in Germany: A historical and socioeconomic perspective. Landslides 2016, 13, 183–199.
4. Aditian, A.; Kubota, T.; Shinohara, Y. Comparison of GIS-based landslide susceptibility models using frequency ratio, logistic regression, and artificial neural network in a tertiary region of Ambon, Indonesia. Geomorphology 2018, 318, 101–111.
5. Segoni, S.; Tofani, V.; Lagomarsino, D.; Moretti, S. Landslide susceptibility of the Prato–Pistoia–Lucca provinces, Tuscany, Italy. J. Maps 2016, 12, 401–406.
6. Araújo, J.R.; Ramos, A.M.; Soares, P.M.; Melo, R.; Oliveira, S.C.; Trigo, R.M. Impact of extreme rainfall events on landslide activity in Portugal under climate change scenarios. Landslides 2022, 19, 2279–2293.
7. Kharin, V.V.; Zwiers, F.W.; Zhang, X.; Hegerl, G.C. Changes in temperature and precipitation extremes in the IPCC ensemble of global coupled model simulations. J. Clim. 2007, 20, 1419–1444.
8. Gerscovich, D.M.S.; Vargas, E.A., Jr.; De Campos, T.M.P. On the evaluation of unsaturated flow in a natural slope in Rio de Janeiro, Brazil. Eng. Geol. 2006, 88, 23–40.
9. Zhu, J.H.; Anderson, S.A. Determination of shear strength of Hawaiian residual soil subjected to rainfall-induced landslides. Géotechnique 1998, 48, 73–82.
10. Caine, N. The Rainfall Intensity-Duration Control of Shallow Landslides and Debris Flows. Geogr. Ann. Ser. A. Phys. Geogr. 1980, 62, 23–27.
11. Saha, S.; Bera, B. Rainfall threshold for prediction of shallow landslides in the Garhwal Himalaya, India. Geosyst. Geoenviron. 2024, 3, 100285.
12. Abidin, R.Z.; Abdullah, J.; Kasim, M.Z.M.; Yusof, M.F.; Krishnan, K.K.; Mahamud, M.A. Forecasting Landslides with Regards to Rainfall Erosivity and Soil Erodibility: A Case Study at Malaysian Highlands. PaperASIA 2025, 41, 435–440.
13. Liu, J.; Yuan, X.; Li, Y.; Li, X. Spatiotemporal Analysis of Extreme Precipitation in the Ili River Basin Based on CMIP6. Arid. Land Geogr. 2025, 48, 1329–1341.
14. Sujatha, E.R.; Sudharsan, J.S. Landslide susceptibility mapping methods—A review. In Landslide: Susceptibility, Risk Assessment and Sustainability: Application of Geostatistical and Geospatial Modeling; Springer Nature: Berlin/Heidelberg, Germany, 2024; pp. 87–102.
15. Kincey, M.E.; Rosser, N.J.; Swirad, Z.M.; Robinson, T.R.; Shrestha, R.; Pujara, D.S.; Basyal, G.K.; Densmore, A.L.; Arrell, K.; Oven, K.J.; et al. National-scale rainfall-triggered landslide susceptibility and exposure in Nepal. Earth’s Future 2024, 12, e2023EF004102.
16. Mosaffaie, J.; Salehpour Jam, A.; Sarfaraz, F. Landslide risk assessment based on susceptibility and vulnerability. Environ. Dev. Sustain. 2024, 26, 9285–9303.
17. Zhao, Z.; Li, Z.; Lv, P.; Zhao, F.; Niu, L. The Study on Landslide Hazards Based on Multi-Source Data and GMLCM Approach. Remote Sens. 2025, 17, 1634.
18. Peng, B.; Wu, X. Optimizing Rainfall-Triggered Landslide Thresholds to Warning Daily Landslide Hazard in Three Gorges Reservoir Area. Nat. Hazards Earth Syst. Sci. Discuss. 2024, 24, 3991–4013.
19. Ortiz-Giraldo, L.; Botero, B.A.; Vega, J. An integral assessment of landslide dams generated by the occurrence of rainfall-induced landslide and debris flow hazard chain. Front. Earth Sci. 2023, 11, 1157881.
20. Wu, Y.; Dong, Y.; Wei, Z.; Dong, J.; Peng, L.; Yan, P.; Ma, W. Genetic mechanisms and a stability evaluation of large landslides in Zhangjiawan, Qinghai Province. Front. Earth Sci. 2023, 11, 1140030.
21. Ke, K.; Zhang, Y.; Zhang, J.; Chen, Y.; Wu, C.; Nie, Z.; Wu, J. Risk assessment of earthquake–landslide hazard chain based on CF-SVM and Newmark model—Using Changbai Mountain as an example. Land 2023, 12, 696.
22. Li, C.; Wang, M.; Chen, F.; Coulthard, T.J.; Wang, L. Integrating the SLIDE model within CAESAR-Lisflood: Modeling the ‘rainfall-landslide-flash flood’ disaster chain mechanism under landscape evolution in a mountainous area. Catena 2023, 227, 107124.
23. Yu, Z.; Zhan, J.; Yao, Z.; Peng, J. Characteristics and mechanism of a catastrophic landslide-debris flow disaster chain triggered by extreme rainfall in Shaanxi, China. Nat. Hazards 2024, 120, 7597–7626.
24. Wang, W.; Song, Y.; Huang, L.; Shi, Y.; Zhang, C. Vulnerability assessment of disaster chains: A case study of rainstorm–landslide disaster chains in the Greater Bay Area. Int. J. Disaster Risk Reduct. 2025, 119, 105272.
25. Che, Y.; Li, X.; Liu, X.; Wang, Y.; Liao, W.; Zheng, X. Building Height of Asia in 3D-GloBFP [Data Set]; Zenodo: Geneva, Switzerland, 2024.
26. Yang, J.; Huang, X. The 30m Annual Land Cover Datasets and Its Dynamics in China from 1985 to 2023 [Data Set]; Zenodo: Geneva, Switzerland, 2024.
27. Shi, P. Discussion on the theory and practice of disaster research (III). J. Nat. Disasters 2002, 2002, 1–9.
28. Zhu, Y.; Tian, D.; Yan, F. Effectiveness of entropy weight method in decision-making. Math. Probl. Eng. 2020, 2020, 3564835.
29. Bersabe, J.T.; Jun, B.-W. The Machine Learning-Based Mapping of Urban Pluvial Flood Susceptibility in Seoul Integrating Flood Conditioning Factors and Drainage-Related Data. ISPRS Int. J. Geo-Inf. 2025, 14, 57.
30. Rong, G.; Zhang, J.; Li, T.; Fang, W. Risk assessment of disaster chains induced by extreme precipitation: A case study of Shuicheng County, Guizhou Province. J. Catastrophology 2022, 37, 201–210.
31. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91.
32. MacQueen, J.B. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June–18 July 1965 and 27 December 1965–7 January 1966; Volume 1, pp. 281–297.
33. Wu, Y.F.; Fa-you, A.; Yang, C.; Yan, S.; Kang, X. Accuracy Improvement of Different Landslide Susceptibility Evaluation Models through K-Means Clustering: A Case Study on China’s Funing County. Math. Probl. Eng. 2023, 2023, 2913890.
34. Alalade, S.; Seng, C.; Chheun, D.; Choi, Y.; Kim, H.J.; Kim, S.W.; Lee, S.B.; Adhikari, M.D. Landslide susceptibility mapping using frequency ratio, logistic regression and K-means clustering algorithms: A case study in Inje, South Korea. Environ. Earth Sci. 2025, 84, 369.
35. Shortliffe, E.H.; Buchanan, B.G. A model of inexact reasoning in medicine. Math. Biosci. 1975, 23, 351–379.
36. Ma, W.; Dong, J.; Wei, Z.; Peng, L.; Wu, Q.; Wang, X.; Dong, Y.; Wu, Y. Landslide susceptibility assessment using the certainty factor and deep neural network. Front. Earth Sci. 2023, 10, 1091560.
37. Ding, D.; Wu, Y.; Wu, T.; Gong, C. Landslide susceptibility assessment in Tongguan District Anhui China using information value and certainty factor models. Sci. Rep. 2025, 15, 12275.
38. Pearson, K. Notes on the history of correlation. Biometrika 1920, 13, 25–45.
39. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
40. Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
41. Yao, X.; Tham, L.G.; Dai, F.C. Landslide susceptibility mapping based on support vector machine. Geomorphology 2008, 101, 572–582.
42. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
43. Zhang, X.; Wang, X.; Qi, G. Construction of a water security evaluation index system. J. Arid Land Resour. Environ. 2022, 36, 44–53.
44. Wang, Q.; Hou, J. Hazard assessment of rainstorm-geohazard disaster chain based on multiple scenarios. Nat. Hazards 2023, 118, 589–610.
45. Liu, X.; Lyu, H.M.; Shen, S.L. Assessment of geo-disaster risk levels induced by extreme rainfall using integrated FCM-VIKOR approach. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2025, 2025, 1–20.
46. Li, Y.; Zhang, Z.; Gong, S.; Liu, M.; Zhao, Y. Risk assessment of rainstorm disasters under different return periods: A case study of Bohai Rim, China. Ocean. Coast. Manag. 2020, 187, 105107.
47. Guo, Z.; Zhang, Y.; Wang, L.; Guo, Z.; Shi, L.; Zhou, X.; Wang, H.; Cheng, M.; Chen, G. Distribution and temporal-spatial characteristics of geohazards in a rainstorm-prone area of SE China considering the land use. Biogeotechnics 2025, 100184, in press.
48. Zhuang, M.; Gao, W.; Zhao, T.; Hu, R.; Wei, Y.; Shao, H.; Zhu, S. Mechanistic investigation of typical loess landslide disasters in Ili Basin, Xinjiang, China. Sustainability 2021, 13, 635.

Figure 1. Study Area.

Figure 2. Flowchart of Geo-Disaster Risk Assessment.

Figure 3. Flowchart of EW-RFM. V-shaped arrows indicate results obtained from weighted overlay; plus signs represent combined sample sets; arrows indicate the process of obtaining optimized weights.

Figure 4. Geo-disaster Susceptibility Factors: (a). ELE, (b). SLP, (c). ASP, (d). LIT, (e). SOL, (f). LDF, (g). SLC, (h). PLC, (i). NDVI, (j). DTR, (k). DTF, (l). TWI, (m). TRI, (n). STI, (o). SPI.

Figure 5. Flowchart of Susceptibility Assessment.

Figure 6. Random Forest.

Figure 7. Support Vector Machine. (a) Two-dimensional plane showing the data points, decision boundary, and support vectors, (b) Mapping of the two-dimensional input into a three-dimensional feature space, illustrating the effect of the kernel function.

Figure 8. Exposure factors. (a). Land Type, (b). Distance to Road, (c). Building Height Density, (d). Distance to Disaster Point, (e). Enterprises above Scale.

Figure 9. Vulnerability factors. (a). Population Density, (b). Road Density, (c). Transportation Facilities Density, (d). POI Density, (e). GDP.

Figure 10. Disaster Mitigation Capacity factors. (a). Hospital Beds, (b). Density of Medical Points, (c). Fixed telephone subscribers, (d). Welfare Institution Beds, (e). Urban and Rural Residents’ Deposits.

Figure 11. Grading results and Certainty Factor values.

Figure 12. Correlation Heat Map.

Figure 13. Model Performance Evaluation Plot.

Figure 14. Susceptibility Map: (a). Landslide, (b). Collapse, (c). Debris Flow, (d). Geo-disaster.

Figure 15. GEV Fitting Curve.

Figure 16. Extreme Precipitation Recurrence Period Maps: (a). 10-year, (b). 20-year, (c). 50-year.

Figure 17. Hazard Map Based on Recurrence Period: (a). 10-year, (b). 20-year, (c). 50-year.

Figure 18. Exposure Map.

Figure 19. Vulnerability Map.

Figure 20. Disaster Mitigation Capacity Map.

Figure 21. Risk Map Based on Recurrence Period: (a). 10-year, (b). 20-year, (c). 50-year.

Figure 22. Proportion of Risk Area by Level for Different Recurrence Periods.

Figure 23. Risk Spatial Overlay Map.

Table 1. Data Source.

Data	Source	Processed Indicator	Spatial Resolution	Temporal Resolution
Daily precipitation dataset	China Meteorological Administration (CMA)	-	Vector Data	1981–2024
DEM (Digital Elevation Model)	Geo-Spatial Data Cloud	Elevation	100 m	Static
		Slope
		Aspect
		Plan Curvature
		Profile Curvature
		STI (Sediment Transport Index)
		TWI (Topographic Wetness Index)
		SPI (Stream Power Index)
		TRI (Terrain Ruggedness Index)
Landform Hierarchy Data	GLADA Dataset	Landform Index	1000 m	Static
Lithology Data		Lithology Index
Soil Type Data		Soil Index
Vegetation Distribution Data	NASA	NDVI	1000 m	2020
River Distribution Data	National Geographic Information Resource Directory Service System	Distance to River	Vector Data	2020
Fault Distribution Data		Distance to Fault
Road Distribution Data		Distance to Road
Road Distribution Data		Road Density
Building Height Distribution Data [25]	Zenodo Database	Building Height Density	Vector Data	2020
Land Use Data [26]	Zenodo Database	Land Type	30 m	Static
Enterprises Above Designated Size	<China County Statistical Yearbook>	Density of Enterprises above Scale	Vector Data	2020
Hospital Beds		Hospital Beds
Welfare Institution Beds		Welfare Institution Beds
Fixed telephone subscribers		Fixed telephone subscribers
Urban and Rural Residents’ Deposits		Urban and Rural Residents’ Deposits
Disaster Point Distribution Data	Geo-Remote Sensing Ecological Network	Distance to Disaster Points	Vector Data	2020
Population Distribution Data	World Pop Official Website	Population Density	100 m	2020
GDP Distribution Data	CAS (Chinese Academy of Sciences) Resource and Environment Science Data Center	GDP	100 m	2020
Point of Interest Distribution Data	Amap (Gaode Map)	POI Density	Vector Data	2020
		Density of Transportation Points
		Density of Medical Points

Table 2. Susceptibility Indicator System.

Hazard Component	Category	Indicator
Susceptibility	Topographic Conditions	Elevation
		Slope Angle
		Slope Aspect
		Plan Curvature
		Profile Curvature
		Distance to Fault
	Geological Environment Conditions	Landform Index
		Lithology Index
		Soil Index
	Vegetation Condition	NDVI
	Hydro-geological Conditions	Distance to River
		SPI
		TWI
		STI
		TRI

Table 3. Confidence Interval Table.

Parameter	Estimate	95% CI Lower Bound	95% CI Upper Bound
shape	−0.0587	−0.125	0.0073
loc	21.05	20.44	21.73
scale	6.42	5.87	6.92
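To show how fitted GEV parameters translate into recurrence-period precipitation, the sketch below evaluates the standard GEV return-level formula with the point estimates from Table 3. Two assumptions should be noted. First, the table does not state the sign convention of the shape parameter; the code treats it as ξ in the convention where ξ > 0 means a heavy upper tail, and some software uses the opposite sign. Second, the interval printed here varies only the shape parameter across its 95% CI, so it is an illustration rather than a proper confidence interval for the return level. The function name is hypothetical.

```python
import math

def gev_return_level(T, loc, scale, shape):
    """Return level for a T-year recurrence period; shape is xi (xi > 0 = heavy tail)."""
    y = -math.log(1.0 - 1.0 / T)   # reduced variate for non-exceedance probability 1 - 1/T
    if abs(shape) < 1e-12:         # Gumbel limit as xi -> 0
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)

# Point estimates of loc and scale, and the shape estimate with its CI bounds, from Table 3.
LOC, SCALE = 21.05, 6.42
SHAPE, SHAPE_LOW, SHAPE_HIGH = -0.0587, -0.125, 0.0073

for T in (10, 20, 50):
    central = gev_return_level(T, LOC, SCALE, SHAPE)
    low = gev_return_level(T, LOC, SCALE, SHAPE_LOW)
    high = gev_return_level(T, LOC, SCALE, SHAPE_HIGH)
    print(f"T = {T:2d} yr: {central:.1f} (shape varied over its CI: {low:.1f} to {high:.1f})")
```

If the fitted shape follows the opposite sign convention, passing `-SHAPE` instead gives the corresponding return levels.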

Table 4. Exposure Indicator System.

Risk Component	Indicator	Indicator Attribute	Weight
Exposure	Land Type	+	0.3760
	Distance to Road	-	0.1137
	Building Height Density	+	0.2274
	Distance to Disaster Point	-	0.1481
	Enterprises above Scale	+	0.1347
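The following sketch shows one common way to turn an indicator table like Table 4 into a composite index: min-max normalization that is reversed for negative ("-") indicators, followed by a weighted sum. This is an assumption-laden illustration rather than the authors' procedure. The normalization scheme, the dictionary keys, and the function names are hypothetical; Land Type is assumed to have been converted to a numeric score beforehand; and the rasters are synthetic. Only the weights and the +/- attributes come from Table 4.

```python
import numpy as np

# (weight, direction) per indicator, from Table 4; +1 raises exposure, -1 lowers it.
EXPOSURE = {
    "land_type": (0.3760, +1),
    "distance_to_road": (0.1137, -1),
    "building_height_density": (0.2274, +1),
    "distance_to_disaster_point": (0.1481, -1),
    "enterprises_above_scale": (0.1347, +1),
}

def normalize(layer, direction):
    """Min-max scale to [0, 1]; reverse the scale for negative indicators."""
    layer = np.asarray(layer, dtype=float)
    span = layer.max() - layer.min()
    if span == 0:
        return np.zeros_like(layer)
    scaled = (layer - layer.min()) / span
    return scaled if direction > 0 else 1.0 - scaled

def exposure_index(layers):
    """Weighted sum of normalized indicators; weights are rescaled to sum to exactly 1."""
    total_w = sum(w for w, _ in EXPOSURE.values())
    return sum(w / total_w * normalize(layers[name], d)
               for name, (w, d) in EXPOSURE.items())

rng = np.random.default_rng(1)
layers = {name: rng.random((50, 50)) * 100 for name in EXPOSURE}  # synthetic stand-in rasters
index = exposure_index(layers)
print(index.min(), index.max())
```

The weights are rescaled inside `exposure_index` because the rounded values in Table 4 sum to 0.9999 rather than exactly 1.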

Table 5. Vulnerability Indicator System.

Risk Component	Indicator	Indicator Attribute	Weight
Vulnerability	Population Density	+	0.0436
	Road Density	+	0.7735
	Transportation Facilities Density	+	0.0586
	POI Density	+	0.0448
	GDP	+	0.0795

Table 6. Disaster Mitigation Capacity Indicator System.

Risk Component	Indicator	Indicator Attribute	Weight
Disaster Mitigation Capacity	Hospital Beds	+	0.1904
	Density of Medical Points	+	0.3154
	Fixed telephone subscribers	+	0.1484
	Welfare Institution Beds	+	0.1726
	Urban and Rural Residents’ Deposits	+	0.1733

Table 7. Susceptibility Indicator Weights.

Indicator	Landslide	Debris Flow	Collapse
Elevation	0.0416	0.0495	0.0477
Slope Angle	0.0497	0.0591	0.0569
Slope Aspect	0.0372	0.0442	0.0426
Plan Curvature	0.0342	0.0407	0.0392
Profile Curvature	0.0641	0.0763	0.0735
Landform Index	0.1482	0.127	0.1517
Lithology Index	0.1624	0.1222	0.1193
Soil Index	0.112	0.0644	0.0675
NDVI	0.0241	0.0287	0.0276
Distance to Fault	0.03	0.0357	0.0344
Distance to River	0.0305	0.0363	0.0349
SPI	0.0251	0.0299	0.0288
TWI	0.0514	0.0611	0.0589
STI	0.1128	0.1341	0.1292
TRI	0.0766	0.0911	0.0877

Table 8. Risk Component Weights.

Recurrence Period	Hazard	Exposure	Vulnerability	Disaster Mitigation Capacity
10-year	0.2583	0.1922	0.3558	0.1937
20-year	0.2575	0.1924	0.3562	0.194
50-year	0.2701	0.1891	0.3501	0.1907

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

