Article

Slope Geological Hazard Risk Assessment Using Bayesian-Optimized Random Forest: A Case Study of Linxiang City, China

1
Geohazards Survey and Monitor Institute of Hunan Province, Changsha 410004, China
2
Hunan Geological Disaster Monitoring, Early Warning and Emergency Rescue Engineering Technology Research Center, Changsha 410004, China
3
School of Geosciences and Info-Physics, Central South University, Changsha 410083, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(3), 1309; https://doi.org/10.3390/app16031309
Submission received: 23 December 2025 / Revised: 19 January 2026 / Accepted: 26 January 2026 / Published: 28 January 2026

Abstract

In order to meet the urgent needs of refined geological disaster risk assessment at a county scale, and in view of the shortcomings of existing methods in the aspects of sample dependence, rainfall time-varying differences, and vulnerability quantification, this study takes Linxiang City as an example, integrates multi-source data such as geology, geography, meteorology, remote sensing, and field survey, and explores practical methods. A random forest (RF) model was implemented for geological hazard susceptibility mapping, and its hyper-parameters were tuned using Bayesian optimization. Based on a statistical analysis of the frequency of historical disaster events, a risk classification of rainfall in the flood season and non-flood season was evaluated. A vulnerability simplification method based on the value and exposure of disaster-bearing bodies was proposed. Finally, rapid risk assessment was achieved by matrix superposition. The results showed that the model had high accuracy (AUC = 0.903). The use of field survey risk types effectively enhanced the susceptibility sample set and verified the accuracy of risk assessment. The risk factor in the flood season and non-flood season was significantly different, and the very-high- and high-risk areas in the flood season were mainly distributed in the shallow metamorphic rock mountainous area in the east of Yanglousi Town and the granite residual soil area in the south of Zhanqiao Town, the latter of which was highly consistent with the field survey results. This study demonstrated value in terms of sample enhancement, model optimization, consideration of time-varying rainfall, and vulnerability simplification. The evaluation results can provide direct support for the construction of a “point–area dual control” system for geological disasters in Linxiang City, and the methodological framework can also provide a practical reference for risk evaluation in other counties.

1. Introduction

The construction of the county-level dual-control system of “hazard points + risk zones” places higher demands on the refinement level of geological hazard risk identification and assessment. There is an urgent need for a refined, practical, and operable slope geological hazard risk assessment method suitable for the county scale. Since 2019, Hunan Province has conducted 1:10,000 geological hazard investigations and risk assessment works in 89 counties, cities, and districts, acquiring extensive data on hazard distribution and predisposing conditions based on residential slope units. How to efficiently transform these massive field survey data into practical risk assessment products remains a key challenge.
In regional geological hazard susceptibility assessment, domestic and international scholars have researched various methods. The Analytic Hierarchy Process (AHP) deconstructs complex problems and calculates factor weights for evaluation [1]; Logistic Regression (LR) analyzes the relationship between risk factors and hazard occurrence based on probability statistics [2]; Neural Networks (NNs) predict by learning the mapping relationship between factors and hazards [3]. Machine learning methods, especially Random Forest (RF), have shown advantages in geological hazard susceptibility assessment in recent years due to their ability to handle complex nonlinear relationships. However, these methods generally rely on a large number of reliable hazard samples, limiting assessment accuracy in areas with sparse hazard records.
Geological hazard assessment typically overlays triggering factors like rainfall onto susceptibility. Existing methods often use regional average rainfall, calculating extreme rainfall for different return periods (e.g., 10-year, 20-year, 50-year, 100-year) based on historical hazard frequency to predict hazard under different scenarios [4,5,6]. Although Fu et al. attempted a monthly-scale assessment [5], overall, these methods neglect the spatial heterogeneity of rainfall, and the temporal scale (e.g., monthly averages) may be insufficient to finely characterize the dynamic triggering effects of rainfall. While distinguishing between flood season and non-flood season rainfall patterns for assessment is necessary, more refined spatiotemporal characterization is still needed in practical applications.
Research on geological hazard vulnerability assessment mostly focuses on single hazards and often models the relationship between the physical characteristics of the hazard and the response of elements at risk [7]. Regional-scale vulnerability assessment remains relatively underdeveloped and is often based on proxy indicators such as population density and building distribution [8]. On the one hand, such methods require detailed data on elements at risk. On the other hand, relying on a single indicator (e.g., population density) makes it difficult to accurately quantify the comprehensive vulnerability of grid cells, and the calculation procedures are often ambiguous. There is an urgent need to explore simplified vulnerability assessment pathways based on key human activity characteristics.
In summary, the main challenges currently faced by county-scale slope geological hazard risk assessment are (1) How to improve the robustness and accuracy of susceptibility models when hazard samples are limited (especially in low-record areas); (2) How to finely characterize the spatiotemporal differences of rainfall triggering factors (beyond regional averages and monthly scales), in particular, distinguishing flood season and non-flood season patterns; and (3) How to construct a comprehensive vulnerability quantification method with moderate data requirements, clear calculations, and the ability to reflect regional characteristics. Addressing these challenges, this study uses Linxiang City, Hunan Province, as the study area, aiming to explore an integrated county-scale slope geological hazard risk assessment workflow incorporating GIS technology, machine learning optimization algorithms, and a simplified vulnerability model.

2. Methods and Data

2.1. Technical Framework

Specific tasks include (1) using field survey data and an RF model with hyper-parameters tuned via Bayesian optimization for susceptibility assessment, and exploring sample enhancement strategies to address sample limitations; (2) combining the spatiotemporal distribution characteristics of rainfall in the area to conduct hazard assessment under two typical scenarios (flood season and non-flood season), thereby characterizing the dynamic risk induced by rainfall; (3) constructing a simplified regional vulnerability assessment model based on building sensitivity (as a key indicator of human activity); and (4) integrating the susceptibility, hazard, and vulnerability assessment results to complete risk calculation and zoning [9]. The innovation of this study lies in proposing and implementing an integrated county-scale risk assessment framework that combines a Bayesian-optimized RF model, rainfall scenarios that distinguish the flood and non-flood seasons, and a simplified vulnerability model based on human activity sensitivity (buildings), with the aim of improving risk mapping accuracy and guiding risk management (Figure 1).

2.2. Methods

2.2.1. Random Forest Principle

The random forest model can be used in the evaluation of geological disaster susceptibility in different countries and regions [10,11,12,13,14,15]. The RF algorithm enhances diversity among classification models by utilizing multiple distinct subsets of training samples, thereby improving the model’s predictive capability [16]. Compared with other machine learning algorithms, RF features high accuracy, few parameters, and stable performance, and has achieved good results in remote sensing image classification and change detection. Its basic principle involves using bootstrap resampling technology to repeatedly and randomly extract a certain number of samples with replacement from the original training sample set to generate new training sample sets. Then, based on the selected new training samples, multiple completely independent decision tree classifiers are generated. Combining these classifiers forms the RF model. For prediction data, the classification result is determined by the final vote count of these decision tree classifiers. It is essentially an improved algorithm based on the decision tree model.
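The bootstrap-and-vote principle described above can be illustrated with a minimal sketch. It is not the model used in this study: the one-feature threshold rule `train_stump` is a hypothetical stand-in for a full decision tree, the toy data are invented, and the random feature selection at each split that a real RF performs is omitted.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items with replacement (one bootstrap replicate)."""
    return [rng.choice(samples) for _ in samples]


def train_stump(samples):
    """Hypothetical stand-in for a decision tree: a one-feature threshold rule.

    Each sample is (feature_value, label). The rule predicts 1 above the
    midpoint between the two class means, and 0 otherwise.
    """
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:
        # Degenerate replicate: only one class was drawn, so predict it.
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


def forest_predict(trees, x):
    """Majority vote over the predictions of the individual trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy data (illustrative only): low feature values are class 0, high are class 1.
rng = random.Random(0)
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
trees = [train_stump(bootstrap_sample(data, rng)) for _ in range(25)]
```

Calling `forest_predict(trees, 0.95)` returns the class chosen by most of the 25 rules. Using an odd number of trees avoids tied votes in a two-class problem.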

2.2.2. Sample Construction

The positive samples in this study consist of two parts: 106 historical landslides and 291 unstable slope units confirmed through field geological surveys. No hazard event has yet occurred in these slope units, but they were identified as highly susceptible areas with significant predisposing conditions and potential geological hazard risk through detailed investigation of topography, geomorphology, rock–soil mass structure, deformation signs, hydrological conditions, and human activities, combined with preliminary stability analysis and other professional judgments. Following similar research strategies, they serve as effective supplementary positive samples. Negative samples (n = 397) were randomly selected from 2829 field-confirmed low-risk slope units, ensuring a balanced number of positive and negative samples. Finally, all samples were divided into a 70% training set and a 30% test set.
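A minimal sketch of the balanced sampling and 70/30 split described above, using only the counts given in the text. The identifiers, the random seed, and the use of a simple unstratified random split are assumptions; the paper does not state how the split was drawn.

```python
import random


def build_samples(positive_ids, negative_pool_ids, train_fraction=0.7, seed=42):
    """Balance positives with randomly drawn negatives, then split train/test."""
    rng = random.Random(seed)
    # Draw as many negatives as there are positives, without replacement.
    negatives = rng.sample(negative_pool_ids, len(positive_ids))
    labelled = [(i, 1) for i in positive_ids] + [(i, 0) for i in negatives]
    rng.shuffle(labelled)
    n_train = round(train_fraction * len(labelled))
    return labelled[:n_train], labelled[n_train:]


# 106 historical landslides + 291 unstable slope units = 397 positives.
positives = [f"pos_{i}" for i in range(106 + 291)]
# 2829 field-confirmed low-risk slope units form the negative pool.
negative_pool = [f"low_{i}" for i in range(2829)]

train_set, test_set = build_samples(positives, negative_pool)
```

With 794 samples in total, a 70% share rounds to 556 training samples and leaves 238 for testing.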

2.2.3. RF Model Optimization

In machine learning, model accuracy depends not only on the learning algorithm but also on hyper-parameters and feature (factor) selection. Therefore, each model needs optimization, including hyper-parameter optimization and feature selection. Currently, a commonly used method for hyper-parameter optimization is the grid search method [17]. To improve accuracy and speed and obtain the optimal value in a shorter time, this study adopted the Bayesian optimization algorithm to determine the optimal hyper-parameter values [18,19]. After Bayesian optimization, the key hyper-parameters for the Random Forest susceptibility model “max_depth”, “max_features”, “min_samples_split”, and “n_estimators” were 20, 0.3, 10, and 100, respectively.

2.3. Data Sources

Linxiang City is located in northeastern Hunan Province, China, along the south bank of the Yangtze River, with geographic coordinates ranging from 113°18′04″ E to 113°44′56″ E and from 29°11′48″ N to 29°51′02″ N, covering an administrative area of approximately 1760 km². The elevation within the domain ranges between 50 and 300 m. Strata from various geological historical periods are exposed. Granite bodies are widely distributed in the southeast, with shallow metamorphic rock strata being the most widespread. The Indosinian–Yanshanian period shaped and finalized the tectonic framework of the area. According to the results of the 1:10,000 geological hazard investigation and risk assessment project in Linxiang City, Hunan Province, there are 106 geological hazard sites in Linxiang City (Figure 2), mainly landslides (89 sites, 84%) and debris flows (17 sites, 16%). There are 291 unstable slopes and 14,246 high and steep slopes. A landslide is defined, in strict geological terms, as a hazard event where a mass of soil or rock on a slope moves downslope as a whole or in separate parts along one or more weak surfaces (or zones) under the influence of gravity. This movement is typically triggered by factors such as intense rainfall, river erosion, groundwater activity, or seismic events. An unstable slope, as identified in this study, refers to a slope unit that shows clear signs of potential failure but has not yet undergone a complete collapse. These units were identified through a detailed, slope-by-slope manual investigation across the entire study area. The identification was based on specific field indicators, such as a steep frontal free face (usually exceeding 60°), the presence of tensile cracks, and signs of surface bulging or water seepage, all of which suggest an ongoing accumulation of deformation and a heightened risk of instability.
The data for this study were provided by the Third Surveying and Mapping Institute of Hunan Province, the Hunan Provincial Natural Resources Affairs Center, and the Linxiang City Natural Resources Bureau, as well as collected through field investigations by the research team. Data types and formats are shown in Table 1. All field validation data points (samples) for model training and validation have accurate geographical locations.

3. Results

3.1. Susceptibility Assessment

3.1.1. Factor Preliminary Selection and Frequency Ratio Analysis

Based on the developmental patterns of slope geological hazards in Linxiang City and previous research [20,21], this study selected five primary indicators for susceptibility assessment: topography and geomorphology, basic geology, hydrogeology, human engineering activities, and surface vegetation. By analyzing the frequency ratio between hazards and various causal factors, 18 representative secondary factors were identified.
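The paper does not spell out its frequency ratio formula. The sketch below assumes the commonly used definition, in which the share of hazards falling in a factor class is divided by the share of the study area occupied by that class, so that values above 1 indicate a class that is more hazard-prone than average. The class labels and counts are illustrative only and are not taken from the study.

```python
def frequency_ratio(hazards_per_class, cells_per_class):
    """Frequency ratio per class: (hazard share) / (area share).

    hazards_per_class: dict class -> number of hazard points in the class
    cells_per_class:   dict class -> number of grid cells in the class
    """
    total_hazards = sum(hazards_per_class.values())
    total_cells = sum(cells_per_class.values())
    ratios = {}
    for cls, cells in cells_per_class.items():
        hazard_share = hazards_per_class.get(cls, 0) / total_hazards
        area_share = cells / total_cells
        ratios[cls] = hazard_share / area_share
    return ratios


# Illustrative counts only (not from the paper).
hazards = {"class_a": 10, "class_b": 30, "class_c": 60}
cells = {"class_a": 5000, "class_b": 3000, "class_c": 2000}
fr = frequency_ratio(hazards, cells)
```

In this toy case `class_c` holds 60% of the hazards on 20% of the area, giving a frequency ratio of about 3, while `class_a` holds 10% of the hazards on 50% of the area, giving about 0.2.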
Topography and geomorphology included six factors: elevation, slope, aspect, curvature, surface roughness, and Terrain Ruggedness Index (TRI). Statistical analysis showed that hazards occurred most frequently within the 450–550 m elevation range, with the frequency decreasing at higher altitudes. The frequency ratio exceeded 1.5 for slopes between 60 and 65°. Hazards were concentrated on south- and southwest-facing slopes, with the highest frequency ratio on south-facing aspects. The curvature interval of 1.2–3.4 was more prone to hazards. A frequency ratio greater than 1.5 was observed in the ground roughness range of 1.87–2.15. Hazards were most frequent in areas with a relief amplitude of 30–60 m, particularly within the 40–60 m interval.
Basic geology comprised five factors: rock engineering, lithology, slope structure, soil thickness, and distance to faults. Statistics indicated that the southern argillaceous rock and shale areas are prone to landslides, whereas the central hard rock formations remain stable. Low hilly areas with shallow metamorphic rocks and high hilly areas with granite—characterized by weaker lithology, developed joints, and strong weathering—exhibit higher susceptibility to landslides and debris flows compared to clastic rock areas. Slopes with massive rock structures and consequent slopes showed frequent hazard occurrences. The frequency ratio exceeded 1.5 (indicating higher susceptibility) at distances of 3750–4750 m from fault zones, and dropped below 0.5 (lower susceptibility) beyond 4750 m. Hazard frequency was also higher in areas with a loose-layer thickness of 0–1 m.
Hydrogeology incorporated three factors: topographic wetness index (TWI), stream power index (SPI), and distance to rivers. Analysis revealed that the TWI interval of −8.1 to −5.7 had a frequency ratio greater than 1.4 (more prone). The SPI range of 2.7–10.9 showed a frequency ratio above 2.0 (highly prone). Distance to rivers reflects the influence of flow convergence and anti-scouring capacity on slope stability.
Human engineering activities included three factors: cut-slope strength, distance to roads, and land cover. Statistics demonstrated that greater artificial cut-slope strength, especially where cut slopes exceed 25°, leads to a more pronounced concentration of hazards. Distance to roads indirectly reflects the intensity of human activity. Forest land and residential areas showed higher hazard frequencies compared to other land use types, with elevated risk in zones of high development intensity. Specifically, cut-slope strength was derived from the kernel density of engineered slopes created by housing construction, road building, and quarrying, reflecting the degree of human-induced slope disturbance.
Surface vegetation comprised one factor, the normalized difference vegetation index (NDVI). The NDVI interval of 0.78–0.86 had a frequency ratio greater than 1.6 (prone), indicating that high vegetation cover may interact with other factors to increase risk in some cases.

3.1.2. Factor Classification

A 10 m × 10 m grid was used as the basic unit for landslide susceptibility assessment. Discrete causal factors were classified according to their attributes. Continuous causal factors were divided into eight levels using the natural breaks method. The initial susceptibility ratings for the causal factors are shown in Figure 3.

3.1.3. Factor Correlation and Contribution Analysis

Before susceptibility modeling, a correlation analysis of the causal factors is necessary to eliminate strongly correlated factors, reduce model redundancy, and improve prediction accuracy. The Pearson correlation coefficient was used to analyze correlations among the causal factors (Figure 4). To keep the correlation coefficient between any two retained factors below 0.4, the following factors were eliminated: slope angle, lithology, relief amplitude, artificial slope cutting intensity, and loose layer thickness. In addition, the information gain method was used to measure the contribution of the remaining causal factors to landslide development (Figure 5), and the factors with low contribution, NDVI and curvature, were eliminated [22]. Finally, the remaining 11 factors constituted the susceptibility evaluation indicator system.
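The correlation screening can be sketched as follows, using the 0.4 threshold given in the text. The paper does not say how it chose which factor of a correlated pair to drop, so the sketch assumes a user-supplied priority order: factors are considered in that order and kept only if they are not strongly correlated with any factor already kept. The factor values are toy numbers, not study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def screen_factors(factors, priority, threshold=0.4):
    """Keep factors in priority order; drop any with |r| >= threshold
    against a factor that has already been kept."""
    kept = []
    for name in priority:
        if all(abs(pearson(factors[name], factors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy factor values at six sample locations (illustrative only).
factors = {
    "elevation": [1, 2, 3, 4, 5, 6],
    "slope": [2, 4, 6, 8, 10, 13],   # nearly collinear with elevation
    "twi": [3, 1, 4, 1, 5, 2],       # weakly correlated with elevation
}
kept = screen_factors(factors, priority=["elevation", "slope", "twi"])
```

Here `slope` is dropped because it is almost perfectly correlated with `elevation`, while `twi` is retained.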

3.1.4. Susceptibility Results

Applying the susceptibility model to the entire area yielded the slope geological hazard susceptibility index for the whole county, which was classified into four levels using the natural breaks method: Very high, High, Medium, and Low susceptibility (Figure 6). The results show that the areas of Very high, High, Medium, and Low susceptibility are 235.54 km², 273.43 km², 287.93 km², and 921.76 km², accounting for 13.7%, 15.9%, 16.7%, and 53.7% of the total study area, respectively. The Very high and High susceptibility areas are mainly distributed in Yanglousi Town and Zhanqiao Town.
The Receiver Operating Characteristic (ROC) curve was used to evaluate the accuracy of the susceptibility model on the test set (Figure 7). The area under the curve (AUC) is 0.903, indicating that the model can be effectively used for slope geological hazard susceptibility zoning in the study area.
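For readers unfamiliar with the metric, the AUC can be computed directly from its rank interpretation: it is the probability that a randomly chosen positive sample receives a higher predicted score than a randomly chosen negative sample, with ties counted as one half. The sketch below uses invented labels and scores, not the study's test set.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data: one positive (score 0.4) is ranked below one negative (0.5).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

Of the nine positive/negative pairs in this toy example, eight are ranked correctly, so the AUC is 8/9.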

3.2. Hazard Assessment

3.2.1. Hazard Assessment Method

This study obtained monthly rainfall data from 2018 to 2020. According to the meteorological station division, April–September in the area is defined as the flood season, and January–March and October–December as the non-flood season. The monthly average rainfall from April to September was taken as the flood season monthly average rainfall (Figure 8a), and the monthly average rainfall from January to March and October to December was taken as the non-flood season monthly average rainfall (Figure 8b).
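The seasonal averaging described above can be sketched for a single grid cell as follows. The month assignment (April to September as flood season) follows the text; the record layout and the rainfall values are assumptions made for illustration.

```python
FLOOD_MONTHS = {4, 5, 6, 7, 8, 9}  # April to September, as defined in the text


def seasonal_monthly_means(records):
    """Return (flood-season mean, non-flood-season mean) of monthly rainfall.

    records: iterable of (year, month, rainfall_mm) for one grid cell.
    """
    flood = [r for _, m, r in records if m in FLOOD_MONTHS]
    non_flood = [r for _, m, r in records if m not in FLOOD_MONTHS]
    return sum(flood) / len(flood), sum(non_flood) / len(non_flood)


# Illustrative records for 2018-2020: 200 mm in flood months, 50 mm otherwise.
records = [
    (year, month, 200.0 if month in FLOOD_MONTHS else 50.0)
    for year in (2018, 2019, 2020)
    for month in range(1, 13)
]
flood_mean, non_flood_mean = seasonal_monthly_means(records)
```

Applying the same function to every cell of the rainfall grid would yield two rasters analogous to those shown in Figure 8a,b.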
Rainfall is one of the important factors causing slope geological hazards. All landslides in the area occurred during the flood season. Based on the occurrence time of historical hazard points and the corresponding location’s monthly average rainfall, the hazard frequency was statistically analyzed (Figure 9), and the flood season rainfall-induced effect degree was divided into five levels.
No historical hazards occurred in the non-flood season within the area. The non-flood season rainfall-induced effect degree was divided into three levels using the equal interval method (Table 2). Using the equal interval method for classification is a conservative estimation based on rainfall variation, aiming to identify areas with relatively high rainfall and relatively greater potential risk (although absolute risk is much lower than in the flood season). The classification results primarily reflect the spatial relative differences in risk, not the absolute probability. Thus, rainfall-induced hazard intensity classification maps for the flood season and non-flood season were obtained (Figure 10).
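A minimal sketch of equal interval classification into three levels, as used for the non-flood season. The rainfall range in the example is hypothetical; the actual class boundaries are those reported in Table 2.

```python
def equal_interval_class(value, vmin, vmax, n_classes=3):
    """Return the 1-based class index of value under equal interval breaks.

    The range [vmin, vmax] is split into n_classes bins of equal width.
    Values outside the range are clamped to the lowest or highest class.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    width = (vmax - vmin) / n_classes
    index = int((value - vmin) / width)
    return min(max(index, 0), n_classes - 1) + 1


# Hypothetical non-flood-season monthly rainfall range of 30-90 mm,
# giving bins of 30-50, 50-70, and 70-90 mm.
levels = [equal_interval_class(v, 30.0, 90.0) for v in (30.0, 49.9, 50.0, 89.0, 90.0)]
```

Because the breaks depend only on the minimum and maximum, the resulting levels express relative spatial differences rather than absolute probability, which matches the interpretation given in the text.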
Geological hazard assessment typically considers rainfall-induced effects based on susceptibility assessment, often calculated using the matrix method [23]. This study adopted the hazard discrimination matrix (Figure 11), where four susceptibility levels combined with five rainfall-induced effect levels generate four hazard levels. Different numbers in the matrix represent different hazard levels, with 1–4 indicating Low, Medium, High, and Very high, respectively.

3.2.2. Hazard Assessment Results

The slope geological hazard assessment results for the study area show (Figure 12) that the flood season slope geological hazard is divided into four levels: Very high area covers 101.63 km², High area 201.46 km², Medium area 579.05 km², and Low area 836.51 km², accounting for 5.9%, 11.7%, 33.7%, and 48.7% of the total study area, respectively. The Very high and High areas are mainly distributed in Yanglousi Town and Zhanqiao Town. The non-flood season slope geological hazard is divided into three levels: High area covers 6.50 km², Medium area 578.38 km², and Low area 1133.78 km², accounting for 0.4%, 33.6%, and 66.0% of the total study area, respectively. The High area is mainly distributed in Zhanqiao Town.

3.3. Risk Assessment

3.3.1. Vulnerability Assessment Method

Based on the value and exposure of elements at risk, the geological hazard vulnerability of the entire city was divided into four levels: Very high, High, Medium, and Low. The urban built-up core areas and buffer zones around important infrastructure (schools, hospitals, transportation hubs) were classified as Very high vulnerability zones, including urban buildings in non-plain areas and their 50 m buffer zones; general urban built-up areas and dense villages were classified as High vulnerability zones, including rural houses in non-plain areas and their 25 m buffer zones; scattered villages, areas along major transportation routes, and important farmland were classified as Medium vulnerability zones, including main roads and their 5 m buffer zones, and houses in plain areas and their 5 m buffer zones; forest land, grassland, wasteland, water bodies, and other uninhabited or very low population density areas were classified as Low vulnerability zones. Using high-resolution remote sensing images, residential area and land use data from 1:2000-scale topographic maps, and field verification data, buffer zones and spatial analysis were performed on the ArcGIS 10.5 platform for vulnerability classification.
The primary elements at risk considered in this assessment were residential buildings, key public infrastructure (schools, hospitals), and major transportation routes. Their vulnerability was determined by integrating two key factors: (1) the spatial proximity to potential slope hazards (i.e., the distance to identified unstable slopes or historical landslide bodies), which directly influences the exposure level; and (2) the inherent structural characteristics and density of the buildings. For instance, densely clustered villages with predominantly masonry structures were assigned higher vulnerability values within a given hazard proximity compared to sparse, reinforced-concrete buildings. The final vulnerability zoning presented in Figure 13 results from the overlay and weighted synthesis of these exposure and susceptibility criteria.
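The buffer rules described above can be sketched as a simple distance test. The buffer distances (50 m, 25 m, 5 m) come from the text; everything else is an assumption made for illustration: features are reduced to points rather than building footprints or road lines, the category names are hypothetical, and where buffers overlap the highest level is taken to win. The study itself performed this step with buffer and overlay tools in ArcGIS.

```python
import math

# (level, feature category, buffer distance in metres), checked from the
# highest level downward. Category names are hypothetical.
RULES = [
    ("very_high", "urban_building_nonplain", 50.0),
    ("high", "rural_house_nonplain", 25.0),
    ("medium", "main_road", 5.0),
    ("medium", "house_plain", 5.0),
]


def vulnerability_level(cell_xy, features):
    """Classify a grid cell by its distance to the nearest relevant feature.

    cell_xy:  (x, y) of the cell centre in projected metres
    features: dict category -> list of (x, y) feature locations
    """
    for level, category, buffer_m in RULES:
        for fx, fy in features.get(category, []):
            if math.hypot(cell_xy[0] - fx, cell_xy[1] - fy) <= buffer_m:
                return level
    return "low"  # forest, grassland, water bodies, and other uninhabited land


# Illustrative feature locations only.
features = {
    "urban_building_nonplain": [(0.0, 0.0)],
    "rural_house_nonplain": [(100.0, 0.0)],
    "main_road": [(200.0, 0.0)],
}
```

For example, a cell at `(30.0, 0.0)` lies within 50 m of the urban building and is classed as `very_high`, while a cell at `(80.0, 0.0)` is outside that buffer but within 25 m of the rural house and is classed as `high`.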

3.3.2. Risk Assessment Method

Risk is jointly determined by the hazard level of geological hazards and the vulnerability of elements at risk. According to risk assessment criteria, the risk calculation formula is:
R = L × S
where R is the risk value, L is the hazard level degree, and S is the vulnerability magnitude of the elements at risk.
For areas with frequent human activities and high vulnerability of elements at risk, this study considers the slope geological hazard level to be equivalent to the risk level in that area. For areas with infrequent human activities and low vulnerability, this study considers that the risk level can be downgraded by one level relative to the hazard level. Therefore, a risk matrix was used (Figure 14), where different numbers represent different risk degree levels, with 1–4 indicating Low, Medium, High, and Very high, respectively. To facilitate effective risk management, the slope geological hazard risk assessment results calculated based on grid cells were converted into geological hazard risk values for natural slope units.
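The two rules stated above can be sketched as a small lookup function. The text gives the rule only for high-vulnerability and low-vulnerability areas, so the cut-off used here is an assumption: vulnerability levels 3 and 4 (High, Very high) keep the hazard level, and levels 1 and 2 (Low, Medium) are downgraded by one level, never below Low. The actual matrix used in the study is the one shown in Figure 14 and may differ.

```python
LEVEL_NAMES = {1: "Low", 2: "Medium", 3: "High", 4: "Very high"}


def risk_level(hazard, vulnerability):
    """Combine hazard and vulnerability levels (1-4) into a risk level (1-4).

    Assumed rule: high vulnerability keeps the hazard level; lower
    vulnerability downgrades it by one level, with a floor of 1 (Low).
    """
    for value in (hazard, vulnerability):
        if value not in LEVEL_NAMES:
            raise ValueError("levels must be integers from 1 to 4")
    if vulnerability >= 3:
        return hazard
    return max(1, hazard - 1)


# Full 4 x 4 matrix under the assumed rule, indexed [hazard][vulnerability].
risk_matrix = {
    h: {v: risk_level(h, v) for v in LEVEL_NAMES} for h in LEVEL_NAMES
}
```

Under this assumption a Very high hazard cell with Low vulnerability becomes High risk, and a Low hazard cell stays Low regardless of vulnerability.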

3.3.3. Risk Assessment Results

Statistics show that flood season slope geological hazard risk is divided into four levels (Figure 15a): Very-high-risk area (14.55 km²), High-risk area (120.79 km²), Medium-risk area (271.39 km²), and Low-risk area (1309.16 km²), accounting for 0.8%, 7.0%, 15.8%, and 76.4% of the total study area, respectively. The Very-high- and High-risk areas are mainly distributed in the shallow metamorphic rock mountainous area in the east of Yanglousi Town and the granite residual soil area in the south of Zhanqiao Town. Non-flood season slope geological hazard risk is divided into two levels (Figure 15b): Medium-risk area (106.22 km²) and Low-risk area (1609.68 km²), accounting for 6.2% and 93.8% of the total study area, respectively. The Medium-risk area is mainly distributed in Zhanqiao Town, Yanglousi Town, and Taolin Town.

4. Discussion

4.1. Model Performance and Sample Strategy

The Bayesian-optimized RF model developed in this paper achieved an AUC of 0.903 on the test set, outperforming recent similar county-scale studies in Hunan Province, such as the Blending ensemble model in Yiyang City (AUC = 0.878) [24] (AUC = 0.835), the Pearson + RF model in Anhua County [25] (AUC = 0.821), and the GBDT + LR hybrid model in Zhangjiajie City (AUC = 0.817) [26] (AUC = 0.839). The high model accuracy mainly stems from three aspects. The first aspect is the sampling strategy: field-identified medium- and high-risk units were used to supplement the positive samples (which significantly increases sample size and enhances representation of potential risk areas), and low-risk units were strictly used as the negative sample pool. This effectively alleviates the limited generalization ability caused by relying solely on historical hazard points (small sample size, uneven coverage), although a potential limitation is that the strategy depends on survey accuracy and expert judgment. The second aspect is that a detailed screening of conditioning factors was conducted to reduce redundant information and input noise. The third aspect is the application of the Bayesian optimization algorithm for the automatic tuning of key RF hyper-parameters, which fully exploited the model’s potential.
Although the sample classification (unstable/low-risk units) of this study is based on expert experience and has certain rationality, subjective judgment may introduce classification bias. To minimize the potential uncertainty introduced by sample classification, we carefully separated the training and validation sets during model evaluation. Specifically, only the 106 well-documented historical landslide samples were used for accuracy validation, while the 291 expert-identified unstable slope units were included solely in the training phase. This approach aims to enhance the model’s ability to recognize potential geological hazards while ensuring the reliability of the validation process. Future work needs to combine more objective dynamic data such as time-series InSAR monitoring and geotechnical parameters to build a training sample set with smaller deviation.
Compared with previous studies within the province, the combination of a high-quality sampling strategy, Bayesian-optimization-based hyper-parameter tuning, and conditioning-factor screening contributes to a significant improvement in model prediction accuracy, providing reliable support for precise geological hazard susceptibility zoning in the study area. To further verify the adaptability and generalization ability of the model, its core methodological framework has been preliminarily applied in multiple study areas in Hunan Province with different geological backgrounds (such as the red layer detrital rock basin in Yuanling County and the granite residual slope soil distribution area in Ningxiang City’s hilly and valley areas in Pingjiang County), and has achieved good risk identification results. This preliminarily indicates that the “point–area dual control” method system constructed in this paper has a certain robustness and can provide a reference for areas with similar conditions.

4.2. Data Accuracy

Although the monthly average rainfall data (500 m grid) used in this study can effectively distinguish seasonal patterns, they cannot readily capture the short-term heavy rainfall events that trigger landslides, and may therefore smooth out the immediate driving effect of extreme rainfall peaks on disasters. At the same time, the scale difference between the rainfall data and other high-resolution data (such as the 10 m terrain data), together with the uncertainty that may be introduced during resampling, limits the ability of the model to identify local high-risk micro-areas at the slope scale. Future research needs to integrate rainfall observation or prediction data with higher spatiotemporal resolution, and pay attention to the intensity–duration–frequency relationship, so as to improve the characterization of rainfall-induced disasters.

4.3. Key Factors and Screening

Among the 11 factors included in the evaluation system, the engineering geological rock group comprehensively characterizes the strength properties of rock and soil masses (integrating lithology information); the slope structure controls potential instability modes; elevation is associated with geomorphological and hydrological conditions; distance to river/road network reflects the disturbance intensity of human engineering activities; and other factors characterize key environmental elements such as micro-topography, hydrology, and structure. Slope angle and lithology were eliminated due to significant redundancy (correlation coefficient ≥ 0.4) with other factors. Their core information is effectively represented: the engineering geological rock group covers the key attributes of lithology, and ground roughness reflects the trend of slope variation. The information-gain analysis further showed that the importance of slope angle and lithology was significantly lower than core factors like engineering geological rock group and slope structure, supporting the rationality of elimination and ensuring a concise and efficient model.

4.4. Rainfall Hazard Assessment

To guide risk management efforts over a defined period (e.g., 3–5 years) during both flood and non-flood seasons, the mapping outputs must be practically instructive. Using monthly rainfall data aligns well with this objective, as it provides a suitable temporal scale for seasonal risk planning. In contrast, short-term rainfall data, which can vary significantly over periods of several days even within the flood season, would necessitate frequent dynamic updates of hazard maps, an approach that is less feasible for regional risk management. Therefore, the triggering factors in this study were determined based on the regional variability of monthly average rainfall. Conducting slope geological hazard assessment by distinguishing between the flood season and non-flood season significantly improves the spatiotemporal specificity and practicality of the assessment. The results show that flood season rainfall is a key triggering factor for geological hazards and should be the focus of annual prevention and control: flood season hazard was divided into four levels (Very high, High, Medium, and Low), with the Very high and High zones (17.6% of the study area) concentrated in Yanglousi Town and Zhanqiao Town. Non-flood season hazard is significantly lower and was divided into three levels (High, Medium, and Low) using the equal interval method, with the High area proportion falling sharply to 0.4% (mainly concentrated in Zhanqiao Town). This significant difference highlights the central importance of flood season prevention and control. It should be noted that the non-flood season equal interval classification method, while simple, may be conservative in the absence of hazard event verification and has limitations in assessing risks from non-typical triggering mechanisms. Future work could incorporate soil moisture content and early warning models for optimization.
Rainfall also varies monthly within the flood season; future attempts could conduct monthly hazard assessments to provide more refined timeframes for prevention and control.
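As a rough check on the frequency-based rainfall grading in Table 2, the sketch below divides the accumulated hazard counts by their total and assigns a flood season grade to a monthly rainfall value. It assumes that the probability ranges in Table 2 correspond to cumulative hazard shares, which the text does not state explicitly. The function names and the treatment of values that fall exactly on a class limit are hypothetical.

```python
import bisect

# Accumulated historical hazard counts per grade, as listed in Table 2.
CUMULATIVE_COUNTS = {"I": 5, "II": 26, "III": 80, "IV": 101, "V": 106}

# Flood season class limits in mm of monthly average rainfall, from Table 2.
FLOOD_LIMITS_MM = [184.9, 192.7, 196.8, 203.2]
GRADES = ["I", "II", "III", "IV", "V"]

def cumulative_share(counts):
    """Share of all recorded hazards accumulated up to each grade."""
    total = max(counts.values())
    return {grade: round(n / total, 3) for grade, n in counts.items()}

def flood_season_grade(rain_mm):
    """Grade for a monthly average rainfall value.

    Assumption: a value exactly on a class limit falls into the higher grade.
    """
    return GRADES[bisect.bisect_right(FLOOD_LIMITS_MM, rain_mm)]

shares = cumulative_share(CUMULATIVE_COUNTS)
print(shares)
print(flood_season_grade(190.0), flood_season_grade(210.0))
```

The shares come out at roughly 0.047, 0.245, 0.755, 0.953, and 1, which sit close to the probability limits of 0.05, 0.25, 0.75, 0.95, and 1 used for the grades.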

4.5. Vulnerability Assessment

Treating vulnerability mainly as a relatively static spatial exposure configuration (e.g., land use and building density) is methodologically effective, but it does not fully reflect how vulnerability evolves over time. Future research should incorporate more time-sensitive, process-based vulnerability factors. For example, real-time or seasonal social–economic activity data (such as day and night population flows and economic activity intensity) could be integrated, the functional status of key infrastructure during rainfall (such as road capacity and drainage system load) could be evaluated, and changes in community disaster response capacity across rainfall seasons could be examined, so as to build a “hazard-triggered” dynamic vulnerability assessment framework that better matches real risk scenarios.

4.6. Risk Assessment Result Validation

Based on the field results of the 1:10,000 geological hazard investigation project in Linxiang City, Hunan Province, a total of 3120 residential slope units were delineated in the area (2829 low-risk, 285 medium-risk, and 6 high-risk units). Because the differing methods produce an order-of-magnitude difference between the field evaluation (High-risk proportion 0.2%) and the indoor evaluation (High-risk zone proportion 7.0%), the risk levels were collapsed into two categories, Low risk and Medium/High risk, for a binary classification accuracy analysis. All hazard points in the area occurred during the flood season, and the field surveys already accounted for rainfall triggering, so the field judgments are closer to the flood season scenario. Therefore, the indoor evaluation results for the flood season scenario were compared with the field survey results (Table 3).
The comparison showed that, among the 2829 field-identified low-risk slope units, the indoor evaluation classified 1193 as Low risk and 1636 as Medium/High risk, yielding an accuracy of 42.17%. Among the 291 field-identified unstable slopes, the indoor evaluation classified 280 as Medium/High risk and only 11 as Low risk, with a high accuracy of 96.22%. The relatively low accuracy for low-risk units is mainly attributable to inconsistent grading systems: the field survey adopted a three-level scheme (Low/Medium/High), whereas the indoor evaluation used a four-level scheme (Low/Medium/High/Very high), causing the field-defined “Low-risk” category to cover a broader range than the indoor “Low-risk” category.
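The per-class figures quoted above can be recomputed from the four counts with a short sketch. The function and key names are hypothetical; the counts are those given in the text. Overall accuracy is taken here as the share of correctly classified units among all 3120 units, which is an assumption about the convention.

```python
def binary_accuracy(low_as_low, low_as_mh, mh_as_low, mh_as_mh):
    """Per-class and pooled agreement (in %) between field and indoor classes.

    Arguments are counts of field-surveyed units, grouped by field class
    (low / medium-high) and by the class assigned in the indoor evaluation.
    """
    low_total = low_as_low + low_as_mh
    mh_total = mh_as_low + mh_as_mh
    return {
        "low": 100 * low_as_low / low_total,
        "medium_high": 100 * mh_as_mh / mh_total,
        # Assumed convention: correct units divided by all units.
        "overall": 100 * (low_as_low + mh_as_mh) / (low_total + mh_total),
    }

# Counts as reported in the comparison above.
result = binary_accuracy(low_as_low=1193, low_as_mh=1636, mh_as_low=11, mh_as_mh=280)
print({name: round(value, 2) for name, value in result.items()})
```

Computed this way, the two per-class values match the 42.17% and 96.22% quoted above. The pooled value comes out at about 47.2%, marginally lower than the 47.56% printed in Table 3; the sketch does not attempt to explain that difference.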
The four risk levels (very high, high, medium, and low) identified in this study aim to provide an actionable decision-making basis for geological hazard risk management in Linxiang City. Specifically, very-high-risk areas should trigger the highest level of response, including immediate implementation of engineering governance measures (such as support and drainage), strict prohibition of new projects, deployment of real-time automated monitoring networks, and the development of detailed emergency plans and mandatory evacuation plans. High-risk areas should be the focus of monitoring and intervention, with priority given to deploying professional monitoring equipment, conducting detailed surveys to design targeted engineering measures, and strictly limiting construction activities in national spatial planning. Medium-risk areas can be included in the routine group monitoring and prevention system, strengthened with regular inspections, and treated as an important consideration in land use approval. Low-risk areas mainly serve as regional background references for optimizing the layout of emergency resources and the planning of evacuation routes. This clear shift from “risk identification” to “graded action” can improve the efficiency of disaster prevention resource allocation and the prioritization of management actions, so that the research results serve the refined risk management practices of local governments.

4.7. Method Advantages

This paper builds a complete research chain from susceptibility identification through time-variant hazard analysis and graded vulnerability assessment to final risk determination, covering the entire risk assessment process in a systematic way. In terms of practicality, Bayesian optimization improved the performance of the Random Forest model, the seasonal variation of rainfall was taken into account, the proposed simplified vulnerability assessment method proved feasible, and the model accuracy meets the requirements of practical application. In terms of generalizability, the core framework is transferable to other regions, provided that the selection of influencing factors and the rainfall classification standards are adapted to local conditions.
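The final step of this chain, combining graded layers through a classification matrix, can be sketched as a simple table lookup. The matrix below is purely hypothetical and is not the matrix shown in Figure 14; the names `LEVELS`, `HYPOTHETICAL_RISK_MATRIX`, and `risk_level` are likewise introduced only for illustration.

```python
LEVELS = ["Low", "Medium", "High", "Very high"]

# Hypothetical matrix: rows are hazard levels, columns are vulnerability
# levels, both ordered as in LEVELS. It is NOT the matrix used in the study.
HYPOTHETICAL_RISK_MATRIX = [
    ["Low",    "Low",    "Medium",    "Medium"],
    ["Low",    "Medium", "Medium",    "High"],
    ["Medium", "Medium", "High",      "Very high"],
    ["Medium", "High",   "Very high", "Very high"],
]

def risk_level(hazard, vulnerability, matrix=HYPOTHETICAL_RISK_MATRIX):
    """Look up the risk level for one map unit from its two graded inputs."""
    return matrix[LEVELS.index(hazard)][LEVELS.index(vulnerability)]

# Each map unit carries a (hazard, vulnerability) pair; values are made up.
units = [("Low", "Low"), ("High", "Very high"), ("Medium", "High")]
risks = [risk_level(h, v) for h, v in units]
print(risks)
```

Because the lookup is independent for every unit, the same function could be applied cell by cell to rasterized hazard and vulnerability maps, and the flood and non-flood season scenarios differ only in the hazard layer supplied.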

4.8. Limitations

While the proposed model demonstrates practical utility in Linxiang City, certain limitations should be acknowledged. First, the non-flood season geological hazard assessment lacks sufficient data support and mechanism analysis, making the evaluation basis relatively weak. Second, the vulnerability assessment adopted a simplified classification treatment and failed to further quantify specific economic losses, limiting its refinement. Third, the model prediction performance is highly dependent on the quality and representativeness of the sample data; data bias may affect the reliability of the results. Fourth, the “hazard points + risk zones” framework, though effective locally, may require certain adjustments when applied to regions with distinctly different geological or social–economic conditions. Additionally, the current model primarily focuses on rainfall-induced factors and has not yet considered the effects of other important triggers such as earthquakes. Future research could focus on exploring more refined (e.g., monthly scale) temporal dynamic analysis; attempting to couple physical mechanism models to deepen mechanistic understanding; integrating real-time monitoring data to improve early warning timeliness; and expanding the model to cover multiple triggering factors, thereby enhancing the comprehensiveness and accuracy of the assessment.

5. Conclusions

This study focuses on slope geological hazards in Linxiang City and conducts a county-scale (1:10,000) risk assessment based on susceptibility, hazard, and vulnerability, with risk derived through their integration. A county-level geo-hazard risk assessment framework was developed that integrates an RF model with hyper-parameters tuned by Bayesian optimization, considers two seasonal rainfall scenarios (flood and non-flood seasons), and adopts a simplified four-level vulnerability classification. By supplementing and refining the hazard-related sample set through field surveys and incorporating Bayesian optimization, the susceptibility model achieved strong predictive performance (AUC = 0.903) compared with similar studies within the province. The risk assessment results indicate that risk during the flood season is substantially greater than during the non-flood season. Very-high- and high-risk areas are primarily concentrated in Yanglousi Town in the east (shallow metamorphic rock area) and Zhanqiao Town in the south (granite residual soil area), which aligns with field investigation findings. The proposed method supports refined geological hazard risk management in Linxiang City through “point–area dual control” and can serve as a reference for other counties with similar conditions.

Author Contributions

Conceptualization, C.W., Z.Q., L.X. and T.X.; methodology, C.W., T.X. and M.M.; writing—original draft preparation, C.W., Z.Q. and X.L.; writing—review and editing, C.W., T.X., R.P. and L.X.; visualization, Z.Q., R.P. and M.M.; funding acquisition, C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Science and Technology Program of Geological Bureau of Hunan Province (No. HNGSTP202463), the Natural Science Foundation of Hunan Province, China (No. 2025JJ80406), the Natural Resources Research (Standard) Post Subsidy Project of the Hunan Provincial Department of Natural Resources (No. HBZ20240127), and the Open Fund (No. hndzgczx202404) of Hunan Geological Disaster Monitoring, Early Warning and Emergency Rescue Engineering Technology Research Center.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets presented in this article are not readily available due to institutional and/or third-party data-use restrictions. Requests to access the datasets should be directed to the corresponding author at zuohui@csu.edu.cn.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Panchal, S.; Shrivastava, A.K. Landslide hazard assessment using analytic hierarchy process (AHP): A case study of National Highway 5 in India. Ain Shams Eng. J. 2022, 13, 101626. [Google Scholar] [CrossRef]
  2. He, F.; Cai, F. Feature Selection and Comparison of Logistic Regression and Random Forest for Stability Assessment of Landslide Dams. Landslides 2025, 1–13. [Google Scholar] [CrossRef]
  3. Liu, X.D.; Xiao, T.; Zhang, S.H.; Sun, P.H.; Liu, L.L.; Peng, Z.W. Comparative study of sampling strategies for machine learning-based landslide susceptibility assessment. Stoch. Environ. Res. Risk Assess. 2024, 38, 4935–4957. [Google Scholar] [CrossRef]
  4. Huang, F.; Liu, K.; Li, Z.; Zhou, X.; Zeng, Z.; Li, W.; Huang, J.; Catani, F.; Chang, Z. Single landslide risk assessment considering rainfall-induced landslide hazard and the vulnerability of disaster-bearing body. Geol. J. 2024, 59, 2549–2565. [Google Scholar] [CrossRef]
  5. Fu, Z.; Li, D.Q.; Wang, S.; Zhang, L.; Du, W. Causes of episodic movement of the Baijiabao landslide based on multiple-time scale analysis. Landslides 2024, 21, 1069–1082. [Google Scholar] [CrossRef]
  6. Deng, P.; Bing, J.; Wang, L.; Li, L. Extreme rainfall frequency distribution and its flood risk assessment in the Upper Hanjiang River Basin. Nat. Hazards 2025, 121, 13173–13192. [Google Scholar] [CrossRef]
  7. Strouth, A.; McDougall, S. Individual risk evaluation for landslides: Key details. Landslides 2022, 19, 977–991. [Google Scholar] [CrossRef]
  8. Sun, Y.; Li, Y.; Ma, R.; Gao, C.; Wu, Y. Mapping urban socio-economic vulnerability related to heat risk: A grid-based assessment framework by combing the geospatial big data. Urban Clim. 2022, 43, 101169. [Google Scholar] [CrossRef]
  9. Mosaffaie, J.; Salehpour Jam, A.; Sarfaraz, F. Landslide risk assessment based on susceptibility and vulnerability. Environ. Dev. Sustain. 2024, 26, 9285–9303. [Google Scholar] [CrossRef]
  10. Trigila, A.; Iadanza, C.; Esposito, C.; Scarascia-Mugnozza, G. Comparison of logistic regression and random forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy). Geomorphology 2015, 249, 119–136. [Google Scholar] [CrossRef]
  11. Taalab, K.; Cheng, T.; Zhang, Y. Mapping landslide susceptibility and types using Random Forest. Big Earth Data 2018, 2, 159–178. [Google Scholar] [CrossRef]
  12. Nhu, V.-H.; Shirzadi, A.; Shahabi, H.; Chen, W.; Clague, J.J.; Geertsema, M.; Jaafari, A.; Avand, M.; Miraki, S.; Talebpour Asl, D.; et al. Shallow landslide susceptibility mapping by random forest base classifier and its ensembles in a semi-arid region of Iran. Forests 2020, 11, 421. [Google Scholar] [CrossRef]
  13. Akinci, H.; Kilicoglu, C.; Dogan, S. Random forest-based landslide susceptibility mapping in coastal regions of Artvin, Turkey. ISPRS Int. J. Geo-Inf. 2020, 9, 553. [Google Scholar] [CrossRef]
  14. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of support vector machine, random forest, and genetic algorithm optimized random forest models in groundwater potential mapping. Water Resour. Manag. 2017, 31, 2761–2775. [Google Scholar] [CrossRef]
  15. Park, S.; Kim, J. Landslide susceptibility mapping based on random forest and boosted regression tree models, and a comparison of their performance. Appl. Sci. 2019, 9, 942. [Google Scholar] [CrossRef]
  16. Rigatti, S.J. Random forest. J. Insur. Med. 2017, 47, 31–39. [Google Scholar] [CrossRef]
  17. Kanwar, M.; Pokharel, B.; Lim, S. A new random forest method for landslide susceptibility mapping using hyperparameter optimization and grid search techniques. Int. J. Environ. Sci. Technol. 2025, 22, 10635–10650. [Google Scholar] [CrossRef]
  18. Lei, X.; Liu, J.; Du, Y.; Liu, L.; Yuan, Y.; Zhao, X.; Wu, X. Optimizing machine learning and bagging-based hybrid models for landslide susceptibility mapping: A case study in Chenggu County, China. Sci. Rep. 2025, 15, 44211. [Google Scholar] [CrossRef]
  19. Wang, Q.; Luan, S.; Jiang, J.; Chen, Y.; Liu, S. A multi-objective optimization framework for mudflow susceptibility mapping in the Yanshan Mountains: Integrating nondominated sorting genetic algorithm-II, random forest, and gradient boosting decision trees. Phys. Fluids 2025, 37, 76613. [Google Scholar] [CrossRef]
  20. Xiao, T.; Huang, W.; Wang, L.; Yang, B.; Qin, Z.; Liu, X.; Xiao, Y. Uncertainty-aware ensemble learning and dynamic threshold optimization for landslide susceptibility mapping. Comput. Geosci. 2025, 206, 106042. [Google Scholar] [CrossRef]
  21. Tan, C.; Feng, Z. Mapping forest fire risk zones using machine learning algorithms in Hunan province, China. Sustainability 2023, 15, 6292. [Google Scholar] [CrossRef]
  22. Zeng, T.; Jin, B.; Glade, T.; Xie, Y.; Li, Y.; Zhu, Y.; Yin, K. Assessing the imperative of conditioning factor grading in machine learning-based landslide susceptibility modeling: A critical inquiry. Catena 2024, 236, 107732. [Google Scholar] [CrossRef]
  23. Wu, S.; Wang, H.; Zhang, J.; Qin, H. Hybrid method for rainfall-induced regional landslide susceptibility mapping. Stoch. Environ. Res. Risk Assess. 2024, 38, 4193–4208. [Google Scholar] [CrossRef]
  24. Hou, C.; Liu, H.; Wang, X.; Hu, J.; Tang, Y.; Yao, X. Landslide Susceptibility Analysis Based on Dataset Construction of Landslides in Yiyang Using GIS and Machine Learning. Appl. Sci. 2025, 15, 5597. [Google Scholar] [CrossRef]
  25. Liu, L.L.; Zhang, Y.L.; Xiao, T.; Yang, C. A frequency ratio–based sampling strategy for landslide susceptibility assessment. Bull. Eng. Geol. Environ. 2022, 81, 360. [Google Scholar] [CrossRef]
  26. Huan, Y.; Song, L.; Khan, U.; Zhang, B. Stacking ensemble of machine learning methods for landslide susceptibility mapping in Zhangjiajie City, Hunan Province, China. Environ. Earth Sci. 2023, 82, 35. [Google Scholar] [CrossRef]
Figure 1. Methodological framework.
Figure 2. Landslides and unstable slopes in Linxiang City. (a) Spatial distribution of landslides and unstable slopes. (b) Location of Linxiang City in Hunan Province. (c) Landslide. (d) Unstable slope.
Figure 3. The initial susceptibility ratings for the causal factors. (a) Elevation. (b) Slope. (c) Aspect. (d) Rock engineering. (e) Surface roughness. (f) TRI. (g) Slope structure. (h) Distance to fault. (i) Soil thickness. (j) SPI. (k) Distance to river. (l) Distance to road. (m) Curvature. (n) Lithology. (o) TWI. (p) Cut-slope strength. (q) Land cover. (r) NDVI.
Figure 4. Heat map of Pearson’s correlation coefficient.
Figure 5. Information-gain-based contributions of causal factors.
Figure 6. Geological hazard susceptibility map. (a) Susceptibility map classified into four levels using the natural breaks method. (b) Enlarged view of a selected high-susceptibility area. (c) Field example for panel (b). (d) Enlarged view of another selected high-susceptibility area. (e) Field example for panel (d).
Figure 7. Receiver operating characteristic curve. The dotted line represents the performance of the random classifier, i.e., the diagonal (y = x).
Figure 8. Monthly average rainfall in Linxiang City (2018–2020). (a) Flood season monthly average rainfall. (b) Non-flood season monthly average rainfall.
Figure 9. Relationship between historical hazard frequency and monthly average rainfall.
Figure 10. Rainfall-induced hazard intensity classification. (a) Flood season. (b) Non-flood season.
Figure 11. Hazard classification matrix.
Figure 12. Geological hazard map. (a) Flood season. (b) Non-flood season.
Figure 13. Vulnerability rating in Linxiang City. (a) Vulnerability map. (b) Example of very high vulnerability. (c) Example of high vulnerability. (d) Example of medium vulnerability. (e) Example of low vulnerability.
Figure 14. Risk classification matrix.
Figure 15. Geological hazard risk Map. (a) Flood season. (b) Non-flood season.
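Table 1. Data sources.

| Number | Type | Name | Format | Precision | Source |
|---|---|---|---|---|---|
| 1 | Geographic data | DEM | TIF | 10 × 10 m | The Third Surveying and Mapping Institute of Hunan Province |
| 2 | Geographic data | River network system | SHP | 1:10,000 | The Third Surveying and Mapping Institute of Hunan Province |
| 3 | Geographic data | Transportation network | SHP | 1:10,000 | The Third Surveying and Mapping Institute of Hunan Province |
| 4 | Geographic data | Residential area | SHP | 1:10,000 | The Third Surveying and Mapping Institute of Hunan Province |
| 5 | Geological data | Lithology of strata | SHP | 1:50,000 | Hunan Provincial Natural Resources Affairs Center |
| 6 | Geological data | Engineering geological rock formation surface | SHP | 1:50,000 | Hunan Provincial Natural Resources Affairs Center |
| 7 | Geological data | Fault boundary line | SHP | 1:50,000 | Hunan Provincial Natural Resources Affairs Center |
| 8 | Remote sensing data | Normalized vegetation index | TIF | 10 × 10 m | Geospatial Data Cloud |
| 9 | Survey data | Geological hazard investigation points/areas | SHP | 1:10,000 | Field investigation and collection by the research team |
| 10 | Survey data | Slope unit surface | SHP | 1:10,000 | Field investigation and collection by the research team |
| 11 | Survey data | Cut slope building survey point | SHP | 1:10,000 | Field investigation and collection by the research team |
| 12 | Survey data | Investigation points for rock and soil structure | SHP | 1:10,000 | Field investigation and collection by the research team |
| 13 | Meteorological data | Monthly Rainfall Distribution Map | TIF | 500 × 500 m | National Qinghai Tibet Plateau Science Data Center |
| 14 | Other data | Present situation of land use | SHP | 1:10,000 | Linxiang Natural Resources Bureau |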
Table 2. Comparison of rainfall-induced effect classification and hazard frequency.

| Grading | The Possibility of Inducing Disasters | Probability Range | Accumulated Number of Historical Disasters | Monthly Average Rainfall Classification During Flood Season | Monthly Average Rainfall Classification During Non-Flood Season |
|---|---|---|---|---|---|
| I | very slim | 0–0.05 | 5 | <184.9 mm | <90 mm |
| II | relatively low | 0.05–0.25 | 26 | 184.9~192.7 mm | 90~100 mm |
| III | relatively high | 0.5–0.75 | 80 | 192.7~196.8 mm | <100 mm |
| IV | high | 0.75–0.95 | 101 | 196.8~203.2 mm | |
| V | very high | 0.95–1 | 106 | >203.2 mm | |
Table 3. Analysis of the accuracy of slope risk assessment.

| Field Survey | Indoor Evaluation: Low Risk | Indoor Evaluation: Medium/High Risk | Total | Accuracy | Overall Accuracy |
|---|---|---|---|---|---|
| Low risk | 1193 | 1636 | 2829 | 42.17% | 47.56% |
| Medium/High risk | 11 | 280 | 291 | 96.22% | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, C.; Qin, Z.; Xiao, T.; Xiang, L.; Peng, R.; Mi, M.; Liu, X. Slope Geological Hazard Risk Assessment Using Bayesian-Optimized Random Forest: A Case Study of Linxiang City, China. Appl. Sci. 2026, 16, 1309. https://doi.org/10.3390/app16031309
