Article

Analysis of Optimal Buffer Distance for Linear Hazard Factors in Landslide Susceptibility Prediction

1 School of Earth Science and Engineering, Hohai University, 8 Focheng West Road, Nanjing 211100, China
2 School of Naval Architecture and Ocean Engineering, Jiangsu Maritime Institute, Nanjing 211199, China
3 Shandong Gold Group Penglai Mining Co., Ltd., Yantai 265621, China
4 School of Geography Science and Geomatics Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
* Author to whom correspondence should be addressed.
Sustainability 2023, 15(13), 10180; https://doi.org/10.3390/su151310180
Submission received: 20 April 2023 / Revised: 15 June 2023 / Accepted: 23 June 2023 / Published: 27 June 2023

Abstract

Linear hazard-causing factors are key environmental elements in landslide susceptibility prediction, and the buffer distance chosen for them strongly influences the accuracy of machine-learning-based landslide susceptibility prediction. Geographic information systems (GISs) are widely used to analyze the correlation between linear hazard-causing factors and landslides; the most common approach is a statistical model that relates distance from the linear factor to landslide counts through buffer analysis and overlay analysis. However, model building raises an open question: which buffer distance yields statistical results that appropriately reflect the correlation between the linear hazard-causing factors and landslides? To address this question, Pearson's method was used to establish statistical models of landslide density versus distance from linear hazard-causing factors under different single-ring buffer distances, using 12 environmental factors (such as elevation, topographic relief, and distance from the water system and roads) in Ruijin City, Jiangxi Province, and the most correlated combination of single-ring buffer distances was obtained; random forest (RF) machine learning models were then used to predict landslide susceptibility. Finally, the Kappa coefficient and the distribution characteristics of the susceptibility index were used to investigate the modeling behavior. The results indicate that the prediction accuracy of the most correlated single-ring buffer distance combination reaches 96.65%, with an error rate of 4.2% for non-landslide points and 11.3% for landslide points. This accuracy is higher than that obtained when all factors share the same single-ring buffer distance, confirming the reasonableness of using correlation to select the buffer distance of linear hazard-causing factors.

1. Introduction

Landslides are among the main geological disasters in China and pose a serious threat to the socio-economic development of mountainous areas [1,2]. In areas with a high incidence of geological disasters, predicting landslide susceptibility and analyzing the probability and spatial distribution of landslides provide important guidance for landslide prediction and early warning, land use planning, urban construction, and rural development [3,4,5]. The complexity of landslides and the diversity of hazard-causing factors have made landslide susceptibility prediction an active and difficult research topic in China and abroad [6]. In addition to traditional geodetic methods and GNSS technology [7,8], landslides are also monitored using remote sensing technology [9,10]. With the rapid development of artificial intelligence, more and more researchers are integrating traditional engineering geological analogy methods with computer technologies, such as GIS and machine learning models, to explore the likelihood of landslides under the influence of environmental factors and thereby comprehensively assess landslide susceptibility [11,12]. Machine-learning-based evaluation of landslide susceptibility trains and fits a model on historical landslide data and the environmental factors related to landslide occurrence, and then predicts landslide susceptibility in other regions [13]. Although research on susceptibility modeling has progressed rapidly, open problems remain in the treatment of landslide-related environmental factors. Analyzing the suitability of these environmental factors plays a significant role in improving overall modeling performance [14].
Research on landslides as complex systems currently relies mainly on GIS: various environmental elements are normalized and converted into quantitative data, and statistical, inferential, deterministic, and other models are then applied to delineate the areas within a region that are relatively susceptible to landslides [15]. For example, Yin Kunlong zoned the landslide susceptibility of an area based on past and present landslide occurrences and a landslide risk evaluation model [16]. Using GIS, Cao Hongyang and Ma Jinhui determined how strongly faults influence landslide development in an area by statistically analyzing the relationship between the number of landslides and the perpendicular distance of landslides from faults [17,18]. These studies share a common statistical model: the buffer analysis and overlay analysis functions of GIS spatial analysis are used to overlay the buffer zone (or part of it) generated by extending outward from the fault margin with the geographic coordinates of the landslides. The width of each buffer zone equals the buffer distance used to generate it, but there is no definite standard for choosing that buffer distance. Among the many environmental factors, linear hazard factors, such as lithologic and geological boundaries, rivers, and roads, affect landslide occurrence to varying degrees. For example, slope cutting during road construction creates a free face, leading to instability at the slope toe [19]. Existing research shows that linear environmental factors, such as water systems, highways, and faults, are usually represented by the distance from them obtained through buffer analysis [20].
In the machine learning models that have emerged in recent years, linear hazard factors are processed by buffer analysis in GIS, and each factor is overlaid with the other factors to form the input variables [21]. The single-ring buffer distances that different scholars use for linear hazard factors range from tens to hundreds of meters; they rely mainly on experience and expert knowledge and therefore carry subjective uncertainty [22]. In particular, buffer distance settings often do not fully consider the differences in the degree and range of influence of different linear hazard factors, which is a main factor affecting the accuracy of landslide susceptibility prediction models [23,24]. Ensemble learning, a widely used class of algorithms in recent years, can improve accuracy and generalization by combining multiple weak classifiers into a single strong classifier. Random forest (RF) is the most representative bagging-based ensemble learning algorithm. It builds multiple decision trees using random sampling and makes the final prediction by majority voting. Compared with other machine learning methods, such as support vector machines and artificial neural networks, random forests achieve high accuracy, and the randomness they introduce makes them less prone to overfitting and able to handle high-dimensional data. Therefore, to improve the accuracy of the regional landslide susceptibility prediction model, this paper combines landslide environmental factors with the random forest model, analyzes the suitability of linear environmental factors, and couples the statistical analysis model with ensemble learning to establish the landslide prediction model.
Ruijin is known as the “Red Capital” of China. Landslides in the region are mainly small. After artificial slope cutting for residential foundations, highways, and water conservancy facilities, the loose deposits (soil) or fractured rock on natural slopes (mainly phyllitic slate and rocks with down-slope bedding or fracture surfaces) lose their support and balance and form a newly exposed slope face, where slope instability is easily induced by heavy rainfall. Therefore, Ruijin City is selected as the study area, and the environmental factors in the region are extracted using remote sensing and geographic information technology and analyzed with a combination of statistical analysis models and machine learning methods. Based on Pearson correlation analysis, the correlation between the distance from linear hazard-causing factors and the landslide density in the corresponding buffer zones is explored under different single-ring buffer distances to determine the optimal buffer distance for each linear hazard-causing factor. Differentiated buffers are then set up and an RF landslide susceptibility prediction model is established, so as to achieve accurate quantitative evaluation of landslide susceptibility and provide a basis and guidance for future disaster prevention and control and town planning in the area.

2. Susceptibility Modeling Methods

2.1. Overall Modeling Process

This paper focuses on lithologic geological boundaries, roads, and rivers as the objects of analysis and evaluates the impact of each factor on landslides in different distance classes using the relationship between landslide density and distance from the linear factor. The overall modeling process is as follows: (1) Obtain the basic landslide data for the study area, and analyze and select 12 basic environmental factors that are strongly correlated with landslide occurrence as the original factor combination. (2) Process the factors with the frequency ratio method to obtain frequency ratio (FR) values; assign a value of 1 to the landslide grid cells; randomly select from the non-landslide area a number of grid cells equal to the number of landslide grid cells as non-landslide samples and assign them a value of 0; then randomly divide the samples into training and test sets at a ratio of 7:3. (3) Construct a machine learning model using the widely used RF model. (4) Analyze the distribution of the susceptibility index and the main landslide controlling factors to assess the predictive performance of the RF model.
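Step (2) of this process can be illustrated with a minimal sketch. It assumes grid cells are represented by plain identifiers; the function name `build_samples`, the `seed` parameter, and the example cell identifiers are hypothetical and are not taken from the study, which carried out this step in GIS and Matlab.

```python
import random

def build_samples(landslide_cells, candidate_non_landslide_cells,
                  train_ratio=0.7, seed=0):
    """Label cells (1 = landslide, 0 = non-landslide) and split them 7:3.

    Hypothetical helper: draws as many non-landslide cells as there are
    landslide cells, so the two classes are balanced.
    """
    rng = random.Random(seed)
    negatives = rng.sample(candidate_non_landslide_cells, len(landslide_cells))
    samples = [(cell, 1) for cell in landslide_cells]
    samples += [(cell, 0) for cell in negatives]
    rng.shuffle(samples)  # random assignment to training and test sets
    n_train = round(train_ratio * len(samples))
    return samples[:n_train], samples[n_train:]

# Example with made-up cell identifiers: 370 landslide cells, 2000 candidates.
train, test = build_samples(list(range(370)), list(range(1000, 3000)))
print(len(train), len(test))
```

With 370 landslide cells the sample set holds 740 cells, which a 7:3 split divides into 518 training and 222 test cells.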

2.2. Frequency Ratio

The relationship between landslides and basic environmental factors is often non-linear, and the frequency ratio method can quantitatively reflect the relationship between each environmental factor and landslide susceptibility [25]. The natural breakpoint method is used to divide each basic environmental factor into six attribute intervals, a grid layer is constructed for each factor based on the classification intervals, and the number of grid cells in each interval is counted automatically [26]. The frequency ratio is calculated as shown in Equation (1). The impact of an environmental factor on landslide susceptibility is reflected by the magnitude of the FR value: FR > 1 indicates that the attribute interval is conducive to landslide occurrence, and the larger the value, the greater its contribution; FR < 1 indicates that the attribute interval is not conducive to landslide occurrence.
$$FR = \frac{X_i / X}{Y_i / Y} \qquad (1)$$
where $X_i$ is the number of landslide grid cells within the attribute interval of a given factor; $X$ is the total number of landslide grid cells in the study area; $Y_i$ is the number of grid cells within the attribute interval of that factor; and $Y$ is the total number of grid cells in the study area.
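As an illustration of Equation (1), the following sketch computes FR values for every attribute interval of a single factor. The function name and the example counts are hypothetical; the sketch assumes every interval contains at least one grid cell, which holds for intervals produced by natural breaks.

```python
def frequency_ratio(landslide_cells_per_interval, total_cells_per_interval):
    """Return FR = (X_i / X) / (Y_i / Y) for each attribute interval."""
    X = sum(landslide_cells_per_interval)   # all landslide cells
    Y = sum(total_cells_per_interval)       # all cells in the study area
    return [
        (x_i / X) / (y_i / Y)
        for x_i, y_i in zip(landslide_cells_per_interval,
                            total_cells_per_interval)
    ]

# Made-up example with two intervals of equal area:
# 30 of the 40 landslide cells fall in the first interval.
print(frequency_ratio([30, 10], [100, 100]))
```

In this made-up example the first interval holds 75% of the landslide cells but only 50% of the area, so its FR exceeds 1, while the second interval has an FR below 1.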

2.3. Random Forest (RF)

RF is an ensemble learning method that uses bagging to generate multiple mutually independent training sets and builds a classification and regression tree (CART) on each of them for prediction. The result is determined by the highest vote count or by the average score [27]. The main idea is that the combined judgment of multiple classifiers is superior to that of a single classifier. Using bagging, n training sets are drawn at random, each covering about 2/3 of the total samples, and a CART is built for each training set. At each internal node, m factors (m ≤ the total number of factors) are randomly selected for splitting, and the trees are grown without pruning, resulting in n independent random decision trees [28]. The results of the n decision trees are then combined, and the class with the most votes, or the average value, is taken as the result. The roughly 1/3 of the data not drawn in each random sampling are called out-of-bag (OOB) data. These data are used for internal error estimation to obtain the OOB error of each tree, and the OOB errors of all trees are averaged to obtain the OOB error of the RF.
The OOB error is an unbiased estimate that approximates the error obtained by cross-validation, and the generalization error of RF is bounded above as follows [29]:
$$p \le \frac{\bar{\rho}\,(1 - s^2)}{s^2}$$
where $p$ is the generalization error of RF; $\bar{\rho}$ is the mean correlation between CART trees; and $s$ is the average strength of the decision trees.
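To see how the bound behaves, the short sketch below evaluates the right-hand side for given values of the mean correlation and strength. The function name and the example values are illustrative assumptions, not quantities reported in the study.

```python
def rf_generalization_bound(mean_correlation, strength):
    """Upper bound rho_bar * (1 - s^2) / s^2 on the RF generalization error."""
    if strength <= 0:
        raise ValueError("strength must be positive for the bound to apply")
    return mean_correlation * (1 - strength ** 2) / strength ** 2

# Illustrative values only: weaker correlation between trees or higher
# strength of individual trees both tighten the bound.
print(rf_generalization_bound(0.2, 0.8))
print(rf_generalization_bound(0.1, 0.8))
```

The bound shrinks when the trees are less correlated with one another or individually stronger, which is the motivation for random sampling of both samples and factors.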

2.4. Uncertainty Evaluation

2.4.1. Kappa Coefficient

The Kappa coefficient can be used to measure the classification accuracy of the evaluation model, and its calculation is based on the confusion matrix [30]. The calculation formula for Kappa coefficient is:
$$k = \frac{p_0 - p_e}{1 - p_e}$$
where $p_0$ is the number of correctly classified samples, summed over all categories, divided by the total number of samples, i.e., the overall classification accuracy. Assuming that the actual numbers of samples in the $m$ categories are $a_1, a_2, \ldots, a_m$, the predicted numbers of samples in those categories are $b_1, b_2, \ldots, b_m$, and the total number of samples is $n$, then $p_e = (a_1 b_1 + a_2 b_2 + \cdots + a_m b_m)/n^2$. The Kappa coefficient usually ranges from 0 to 1, and the larger the value, the higher the accuracy of the evaluation model.
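The Kappa calculation can be sketched directly from a confusion matrix. The layout below (rows are actual classes, columns are predicted classes) and the example counts are assumptions made for illustration.

```python
def kappa_coefficient(confusion):
    """Kappa from a square confusion matrix.

    confusion[i][j] = number of samples of actual class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    p0 = sum(confusion[i][i] for i in range(k)) / n          # overall accuracy
    actual = [sum(confusion[i]) for i in range(k)]           # a_1 ... a_m
    predicted = [sum(confusion[i][j] for i in range(k))      # b_1 ... b_m
                 for j in range(k)]
    pe = sum(a * b for a, b in zip(actual, predicted)) / n ** 2
    return (p0 - pe) / (1 - pe)

# Made-up two-class example: rows = actual, columns = predicted.
print(kappa_coefficient([[45, 5], [10, 40]]))
```

For this made-up matrix the overall accuracy is 0.85 and the chance agreement is 0.5, so the Kappa coefficient is 0.7.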

2.4.2. Distribution Characteristics of Landslide Susceptibility Index

The two important characteristics of the index distribution are its central tendency and its dispersion. Central tendency refers to the concentration of the data around a certain value or range and is commonly expressed by the mean; dispersion represents how widely the data are spread and is often described by the standard deviation. A smaller mean and a larger standard deviation of the landslide susceptibility index indicate that more landslides are predicted using less data, i.e., that the uncertainty of the model is low.
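Both statistics are simple to compute. The sketch below uses Python's standard `statistics` module and the population standard deviation, which is an assumption, since the source does not state which form of the standard deviation was used; the index values are made up.

```python
from statistics import mean, pstdev

def index_distribution(susceptibility_index):
    """Return (mean, population standard deviation) of the index values."""
    return mean(susceptibility_index), pstdev(susceptibility_index)

# Made-up index values in [0, 1]: a polarized distribution with a wide spread.
m, s = index_distribution([0.1, 0.1, 0.9, 0.9])
print(m, s)
```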

3. Data Source

3.1. Overview of the Research Area

Ruijin City is located in the middle of Ganzhou City, Jiangxi Province, in the Ningyu depression and Wuyi uplift belt on the southwest side of the Wuyi Mountains, an area characterized by strong structural deformation, frequent magmatic activity, and strong fault activity. The territory belongs to the Gongjiang River system, and its main rivers include the Meijiang, Mianjiang, and Jiubao Rivers (Figure 1). Transportation relies mainly on highways: national highways are supplemented by crisscrossing county, township (town), and village highways, forming a “three vertical and four horizontal” highway network centered on the urban area. However, because roads in the area are built adjacent to mountains and rivers, and especially because of road reconstruction and expansion, the slopes on both sides of the roads have become unstable due to man-made slope cutting, which has caused multiple engineering geological disasters, such as collapses and landslides. Serious geological hazards also exist in some road sections [31].

3.2. Indicator Selection and Data Source

Accurate evaluation of landslide disasters requires the correct selection of environmental factors. Considering the environmental geological characteristics of Ruijin City and the occurrence patterns of its landslides, twelve environmental factors covering geology, topography, vegetation coverage, and hydrology were selected using GIS technology and remote sensing images, including four linear hazard factors, such as lithology, geological boundaries, rivers, and roads. The basic data come mainly from 1:50,000 geological maps, Landsat 4-5 TM remote sensing images (Geospatial Data Cloud http://www.gscloud.cn/ accessed on 19 April 2023), ASTER GDEM data with a spatial resolution of 30 m (Geospatial Data Cloud http://www.gscloud.cn/ accessed on 19 April 2023), soil type structure data of Jiangxi Province (China Soil Database http://vdb3.soil.csdb.cn/ accessed on 19 April 2023), and precipitation data from meteorological stations in Jiangxi Province. High-resolution remote sensing images from Google Earth serve as an important complementary source of historical landslide data and basic geographic data, such as roads and rivers.
According to the 1:50,000 geological disaster survey data for Ruijin City, a total of 370 landslides occurred in the study area from 1970 to 2013, as shown in Figure 1. In an RF classification problem, the selection of stable non-landslide points is also very important [32]. Using Google Earth, the same number of stable non-landslide points as landslide points was selected from flat, low-slope areas, such as cities, farmland, and water bodies; together with the historical landslide samples, these form the sample set for the landslide susceptibility prediction models. Seventy percent of the data were used as the training set to build the model, and the remaining 30% were used as the validation set to verify its accuracy.

3.3. Environment Factors

Landslide environmental factors refer to the internal factors that control the occurrence of landslides, mainly including basic geological and hydrological conditions, topography, and other factors [33,34]; the external inducing factors mainly include earthquake action, human activity, and surface cover. The twelve landslide environmental factors shown in Table 1 were selected, and the natural breakpoint method in ArcGIS was used to divide each factor into six interval levels. After processing with the frequency ratio method, the FR values of each attribute interval were obtained, as shown in Table 1.
(1) Topographical factor
Using ArcGIS and the DEM, topographic and geomorphic factors, such as elevation, aspect, slope, plane curvature, section curvature, and topographic relief, were extracted. As shown in Table 1 and Figure 2, elevation and FR value are approximately inversely related: when the elevation is less than 335 m, the FR value is greater than 1. Differences between aspects are mainly manifested as differences in temperature and surface vegetation between sunny and shady slopes; the FR value is greater than 1 for aspects from 67.5° to 245.5°. Slope measures the steepness of the surface; as shown in Table 1, the FR value is greater than 1 when the slope is between 8.8° and 17.9°. Plane curvature describes the horizontal terrain characteristics and is obtained by extracting the slope of the aspect. In Ruijin City, the plane curvature range with an FR value greater than 1 is 0 to 37.710. Section curvature can be regarded as the slope of the slope. When the section curvature is between 2.029 and 8.949, the FR value is higher than 1, which is conducive to the development of landslides. Topographic relief describes the topographic and geomorphic characteristics from a macro perspective. When its value is between 12.420 and 22.969 m, FR > 1, indicating favorable conditions for the development of landslides.
(2) Surface cover factor
These include the normalized difference vegetation index (NDVI) and the normalized difference built-up index (NDBI), which characterize landslide development conditions through vegetation coverage and building coverage, respectively. As shown in Table 1 and Figure 2, landslides are more likely to occur when NDVI is between 0.018 and 0.025 or between 0.033 and 0.098, and when NDBI is between −0.318 and −0.219.
(3) Hydrological condition factor
These include distance from the water system, water system density, the modified normalized difference water index (MNDWI), and the topographic wetness index. As shown in Table 1 and Figure 2, MNDWI is commonly used to reflect surface water information. When its value is between 0.217 and 0.276, the FR value is greater than 1. The impact of water systems on landslide development can be expressed by the distance from the water system. According to the buffer analysis, about 89.2% of the landslide grid cells lie within 0–300 m of the water system, indicating that proximity to the water system within this range promotes the development of landslides.
(4) Basic geological factors
These include lithology, distance from highway, etc. As shown in Table 1 and Figure 2, the rocks at the landslide sites are mostly metamorphic and clastic rocks. In the machine learning model, the distance from the highway is selected as an environmental factor for modeling.
(5) Human activity impact factors
These include distance from highway, highway density, etc. As shown in Table 1 and Figure 2, the buffer analysis shows that 83.73% of landslide units lie within 450 m of a highway. Artificial slope cutting has a significant impact on the development of landslides in Ruijin City: during highway construction, the slope toe often has to be cut by excavation.

4. Buffer Distance Analysis Based on Pearson Model

4.1. Landslide Density

Drawing on the selection and analysis of linear hazard factors in previous landslide susceptibility assessments, and on the actual landslide situation in Ruijin City, this study selected lithologic geological boundaries [35], roads, and rivers as the objects of analysis. The relationship between landslide density and distance from the linear factor was used to evaluate the impact of each factor on landslides in different distance classes. The higher the landslide density, the greater the likelihood of landslides occurring within that distance class.
As shown in Figure 3, landslide susceptibility is closely related to the distance from linear factors. The closer a location is to a linear factor, the more easily landslides occur, and this is especially evident for lithologic geological boundaries. The reason is that contact zones between different strata are prone to develop unstable surfaces, which, when triggered by multiple factors, lead to sliding along the contact surface. When the buffer boundary is more than 300 m from the geological boundary or river factor, or more than 120 m from the road factor, the landslide density is lowest and the influence on landslides is small (Figure 3).

4.2. Pearson Model Establishment and Analysis

In the process of establishing a landslide susceptibility prediction model, the main treatment of linear disaster factors is to establish a multi-ring buffer zone [36]. The elements of a multi-ring buffer include a buffer band and a single-ring buffer distance, as shown in Figure 4.
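A multi-ring buffer can be thought of as binning each landslide by its distance to the linear factor. The sketch below assumes that the distance of every landslide to the factor is already known and that the area of each ring is supplied separately, since ring areas depend on the geometry of the linear factor; all function names and example values are hypothetical.

```python
def outer_boundaries(ring_width_m, n_rings):
    """Outer boundary distance (m) of each ring: width, 2 * width, ..."""
    return [ring_width_m * (k + 1) for k in range(n_rings)]

def count_per_ring(distances_m, ring_width_m, n_rings):
    """Count landslides per ring; ring k covers [k * width, (k + 1) * width)."""
    counts = [0] * n_rings
    for d in distances_m:
        k = int(d // ring_width_m)
        if 0 <= k < n_rings:          # landslides beyond the last ring are ignored
            counts[k] += 1
    return counts

def density_per_ring(counts, ring_areas_m2):
    """Landslide density (landslides per square meter) in each ring."""
    return [c / a for c, a in zip(counts, ring_areas_m2)]

# Made-up distances (m) for five landslides, 30 m rings, eight rings.
counts = count_per_ring([5, 29, 30, 61, 250], ring_width_m=30, n_rings=8)
print(outer_boundaries(30, 8))
print(counts)
```

A landslide exactly on a ring boundary is assigned to the outer ring here; that convention is a choice of this sketch and may differ from the GIS overlay used in the study.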
Pearson correlation coefficients are widely used to analyze the correlation between variables; the coefficient is the covariance $\mathrm{cov}(X, Y)$ between two variables divided by the product $\sigma_X \sigma_Y$ of their respective standard deviations:
$$P(x, y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \sum_{i=1}^{N} (y_i - \bar{y})^2 \right]^{1/2}}$$
where $P(x, y)$ is the correlation coefficient between the variables to be analyzed; $x_i$ is the linear factor distance (m) of the outer boundary of each buffer zone in the multi-ring buffer; $\bar{x}$ is the mean linear factor distance (m) of the outer boundaries of the buffer zones; $y_i$ is the density of landslides within the buffer zone (pcs/m$^2$); $\bar{y}$ is the mean density of landslides within the buffer zones (pcs/m$^2$); and $N$ is the total number of buffer zones. The coefficient $P(x, y)$ takes values from −1.0 to 1.0, with larger absolute values indicating stronger correlations.
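The coefficient, and the selection of the single-ring buffer distance with the largest absolute correlation, can be sketched as follows. The helper `most_correlated_width` and all numeric values are illustrative assumptions; the study performed this calculation in Matlab.

```python
import math

def pearson(x, y):
    """Pearson correlation between ring outer distances x and densities y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

def most_correlated_width(r_by_width):
    """Pick the ring width whose coefficient has the largest absolute value."""
    return max(r_by_width, key=lambda width: abs(r_by_width[width]))

# Made-up densities that fall linearly with distance give r = -1.
print(pearson([30, 60, 90, 120], [8, 6, 4, 2]))
# Made-up coefficients for two candidate ring widths (m).
print(most_correlated_width({30: -0.83, 80: -0.50}))
```

The sign of the coefficient is expected to be negative when density falls with distance, which is why the comparison uses the absolute value.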
Previous studies on landslide susceptibility prediction have used buffer distances of 50–500 m for linear hazard-causing factors [37,38]. To better reflect the influence of the linear hazard-causing factors on landslides, the minimum single-ring buffer distance is set to 30 m, with a total of 10 rings to cover the influence range of the linear factor; the maximum single-ring buffer distance is set to 150 m, beyond which the influence of the linear hazard-causing factor on landslides is not well reflected, which would bias the prediction. In this study, eight-ring buffer zones with single-ring buffer distances of 30 m, 80 m, 100 m, and 150 m were created for each of the linear hazard-causing factors: lithologic geological boundaries, roads, and rivers. Pearson correlation models relating linear factor distance to landslide density within the corresponding buffer zone were built in Matlab for each single-ring buffer distance. The strength of the correlation between landslide density and linear factor distance under the different single-ring buffer distances was then compared using the absolute value of the Pearson correlation coefficient.
As shown in Figure 5, the correlation between the distance from the linear hazard-causing factor and the landslide density in the corresponding buffer zone is strongest when the single-ring buffer distances of the lithologic geological boundary, river, and road factors are 50 m, 30 m, and 30 m, respectively, with coefficients of 0.776, 0.838, and 0.834. Due to the large influence of the road on the landslide, setting a larger single-ring buffer zone is optimal. The impact of rivers and roads on landslides is limited in scope; for road construction in particular, it is very small and is mainly reflected in the areas of cut-slope instability on both sides of the road. Overall, the correlation between distance from the lithologic geological boundary and landslide density was not extremely strong. The field survey found that landslides mainly occurred at the junction between the Quaternary and other strata and were mainly influenced by human engineering activities, such as road construction and house building.

5. Optimal Distance Verification

The RF model is an ensemble classifier consisting of multiple decision trees, and the final result of the model is determined by the votes of all decision trees. The training samples used for each decision tree are obtained by bootstrap sampling, i.e., by randomly drawing, with replacement, the same number of samples as in the original training set. Suppose the original training set contains $N$ training samples; the probability that a given sample is never drawn is $(1 - 1/N)^N$. When $N$ is large enough, $(1 - 1/N)^N$ converges to $1/e \approx 0.368$, which indicates that nearly 37% of the original samples do not appear in a given bootstrap sample. These samples are called out-of-bag data, and the metric computed on them to estimate model performance is called the out-of-bag error. In contrast to cross-validation, the out-of-bag error is estimated internally and is unbiased; it fluctuates at first and then gradually decreases and converges to a stable value as the number of trees increases. The out-of-bag error helps in understanding the classification accuracy of the model and how to improve it.
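The 37% figure can be checked numerically. The sketch below compares the closed-form probability with a simulated bootstrap draw; the function names and the fixed random seed are choices made for this illustration.

```python
import math
import random

def oob_probability(n):
    """Probability that a given sample is never drawn in n draws with replacement."""
    return (1 - 1 / n) ** n

def simulated_oob_fraction(n, seed=0):
    """Fraction of n samples left out of one simulated bootstrap sample."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}   # indices drawn at least once
    return 1 - len(drawn) / n

print(oob_probability(10_000), math.exp(-1))
print(simulated_oob_fraction(10_000))
```

Both values should lie close to 1/e for a large training set, so each tree sees roughly 63% of the distinct samples and is evaluated on the rest.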
This study focuses on selecting the optimal single-ring buffer distance for linear hazard-causing factors based on buffer analysis. The RF out-of-bag error and confusion-matrix-based accuracy metrics are used to verify the reasonableness of the single-ring buffer distance setting; the RF modeling process and the processing of the other factors are not detailed here. The lithologic geological boundary, road, and river factors with single-ring buffer distances of 50 m, 30 m, and 30 m, respectively (the most correlated combination), were combined with the nine other environmental factors to form the set of landslide-causing factors used to establish the RF model. For comparison, combinations in which all linear hazard-causing factors share the same single-ring buffer distance were also modeled with RF. Model performance is best only under optimized parameters, so choosing the optimal factor combination is key to constructing the random forest model, and this choice is based mainly on the calculated out-of-bag (OOB) error. In this study, the out-of-bag error under the different buffer distances was computed by loop iteration in Matlab; the smaller the out-of-bag error, the higher the prediction accuracy of the corresponding model. Figure 6 shows how the out-of-bag error of the RF landslide susceptibility prediction models changes with the number of decision trees for linear hazard-causing factors at different single-ring buffer distances. The model built on the most correlated single-ring buffer distance combination shows the lowest out-of-bag error as the number of decision trees increases, and its accuracy is the highest.
The optimal number of decision trees is selected by iterating the out-of-bag error (OOB error) with different numbers of decision trees, and the model performance is evaluated by combining the confusion matrix. The out-of-bag error results approximate the k-fold cross-validation that requires extensive computation. The ratio of the number of all misclassifications to the total number of samples was calculated as the OOB error of the random forest. The lower the OOB error, the better the accuracy of the model classification. In this study, Matlab was used to iterate and calculate from 15 to 500 decision trees to obtain the OOB error for different numbers of decision trees. When the number of decision trees is small, the OOB error is large, so the number of decision trees is increased continuously while the error of each curve is calculated and observed. When the number of decision trees increases to a certain degree, the OOB error and the curves basically remain stable, which is expressed as a small fluctuation of the OOB error within a certain range. Thus, in this study, the optimal number of decision trees is 300.
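One way to make "basically remain stable" concrete is to look for the first tree count after which the OOB error varies by less than a tolerance over a few consecutive settings. This criterion, the `window` and `tol` parameters, and the error values below are all assumptions of this sketch; the study selected the tree count by inspecting the OOB error curves.

```python
def first_stable_tree_count(tree_counts, oob_errors, window=3, tol=0.005):
    """Return the first tree count that starts a run of `window` OOB errors
    whose spread (max - min) is at most `tol`; None if no such run exists."""
    for i in range(len(oob_errors) - window + 1):
        segment = oob_errors[i:i + window]
        if max(segment) - min(segment) <= tol:
            return tree_counts[i]
    return None

# Made-up OOB errors for increasing numbers of trees.
tree_counts = [15, 50, 100, 200, 300, 400, 500]
oob_errors = [0.20, 0.15, 0.12, 0.10, 0.085, 0.084, 0.083]
print(first_stable_tree_count(tree_counts, oob_errors))
```

The result depends on the chosen tolerance and window, so such a rule supports, rather than replaces, visual inspection of the curves.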
Among the confusion-matrix-based metrics, the precision is the proportion of points predicted as landslides that are actually landslides, the recall is the proportion of landslide samples that the model identifies, the Kappa coefficient reflects the reliability of the model, and the accuracy is the overall proportion of samples that the RF model classifies correctly. For the validation set of the most correlated single-loop buffer distance combination model, the precision, recall, Kappa coefficient, and accuracy are 96.65%, 88.67%, 83.17%, and 91.58%, respectively (Table 2 lists the last three). With 300 decision trees, the final error rate for non-landslide points in the test sample was 4.2%, and the error rate for landslide points was 11.3%: 328 of the 370 landslide point samples were correctly predicted. The overall accuracy is higher than that of the models that use the same single-loop buffer distance for all linear factors, which indicates that the model is well calibrated. The reasonableness of the most relevant buffer is thus further corroborated from the perspective of landslide susceptibility prediction accuracy.
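As a hedged illustration of how these four metrics follow from a binary confusion matrix, the sketch below uses standard definitions (Cohen's Kappa for the Kappa coefficient). Only the landslide counts (328 correct out of 370) come from the text; the non-landslide counts `fp` and `tn` are hypothetical placeholders, so the printed values are not expected to match Table 2.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Precision, recall, accuracy, and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)   # predicted landslides that are real landslides
    recall = tp / (tp + fn)      # real landslides that the model finds
    accuracy = (tp + tn) / n     # all correct predictions
    # Agreement expected by chance, from the row and column marginals.
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "kappa": kappa}


# tp and fn follow the text (328 of 370 landslide points predicted correctly).
# fp and tn are HYPOTHETICAL: the text does not give the non-landslide counts.
metrics = confusion_metrics(tp=328, fp=16, fn=42, tn=354)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The recall computed here, 328/370, corresponds to the 11.3% landslide error rate quoted in the text; the other three values depend on the assumed non-landslide counts.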

6. Model Predictions

Based on the above analysis, the number of decision trees is set to 300, and the most relevant single-loop buffer distance combination is joined with the nine other environmental factors to form the landslide-causing factor set. The RF model built on this set is then used to predict susceptibility over the whole study area. The predicted susceptibility probability was divided into five classes (very low, low, moderate, high, and very high susceptibility) by the natural breaks method to obtain the susceptibility map (Figure 7). From the model prediction results, the number and proportion of raster cells, the number and proportion of landslide points, and the frequency ratio in each susceptibility class were counted to obtain Table 3. From Table 3, it can be concluded that 43.2% of the landslide sites occurred in the high- and very-high-susceptibility areas, and 63.8% occurred in areas at or above the moderate susceptibility level. As the susceptibility class rises from very low to very high, the corresponding frequency ratio increases from 0.248 to 3.076, so the landslide hazard grows with the susceptibility level. The very-low- and low-susceptibility zones together cover about 36.28% of the study area, which indicates that the landslide hazard is high in most of the study area.
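The frequency ratio used here can be sketched as the share of landslide points in a class divided by the share of raster cells in that class. The short Python example below is an assumption about how such a ratio is typically computed, not the authors' code; it takes the grid and landslide counts per class from Table 3, and the recomputed values may differ slightly from the tabulated ones.

```python
def frequency_ratios(classes):
    """classes maps a class name to (raster cell count, landslide count)."""
    total_cells = sum(cells for cells, _ in classes.values())
    total_slides = sum(slides for _, slides in classes.values())
    return {
        name: (slides / total_slides) / (cells / total_cells)
        for name, (cells, slides) in classes.items()
    }


# Counts per susceptibility class as listed in Table 3.
table3_counts = {
    "Very high": (587_270, 243),
    "High": (600_890, 61),
    "Moderate": (565_391, 33),
    "Low": (591_490, 20),
    "Very low": (406_548, 13),
}

fr = frequency_ratios(table3_counts)
for name, value in fr.items():
    print(f"{name}: {value:.3f}")
```

A ratio above 1 means the class holds more landslides than its area share alone would suggest, which is why the ratio is expected to rise with the susceptibility class.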

7. Discussion

Taking Ruijin City as the study area, 12 environmental factors, such as elevation, topographic relief, and distance from the water system and roads, were selected, and a Pearson correlation statistical model of landslide density against the distance to linear disaster-causing factors was established at different single-loop buffer distances to obtain the most relevant combination of single-loop buffer distances. The approach offers a way to study the correlation between linear disaster-causing factors and landslides and gives practical guidance for choosing the buffer distance.
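The selection idea can be sketched in a few lines of Python: for each candidate single-loop buffer distance, correlate the landslide density in successive rings with the ring distance, then keep the distance with the largest absolute Pearson coefficient. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, each ring is represented by its outer edge, and the density values are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant series: treat as uncorrelated
    return cov / (sd_x * sd_y)


def most_relevant_width(density_by_width):
    """Return the ring width with the largest |r|, plus all scores.

    density_by_width maps a ring width (m) to the landslide densities
    observed in ring 1, ring 2, ... moving away from the linear factor.
    """
    scores = {}
    for width, densities in density_by_width.items():
        # Assumption: each ring is represented by the distance of its outer edge.
        distances = [width * (k + 1) for k in range(len(densities))]
        scores[width] = abs(pearson_r(distances, densities))
    best = max(scores, key=scores.get)
    return best, scores


# HYPOTHETICAL densities, invented only to exercise the functions.
hypothetical_density = {
    30: [0.9, 0.5, 0.7, 0.3, 0.4],
    50: [1.0, 0.8, 0.6, 0.4, 0.2],
    100: [0.6, 0.7, 0.3, 0.5],
}
best_width, scores = most_relevant_width(hypothetical_density)
print(best_width, {w: round(s, 3) for w, s in scores.items()})
```

Running the same comparison separately for each linear factor would give one preferred distance per factor, which is how a mixed combination such as 50 m, 30 m, and 30 m can arise.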
(1) For regional landslide studies, the sensitivity analysis of influencing factors (environmental factors and trigger factors) is one of the main difficulties. Moreover, different landslide susceptibility evaluation models differ in their sensitivity to the evaluation factors. The importance ranking of the 12 environmental factors selected in this paper is therefore limited and cannot yet reflect objectively and realistically how strongly each factor influences landslides. How to identify the evaluation factors that play a decisive role in landslide evaluation needs further research.
(2) The number of decision trees is selected by iteratively calculating the out-of-bag error for different numbers of decision trees, and the model parameters are evaluated with the confusion matrix to obtain the optimal RF parameters. Because the RF model is an ensemble method with built-in randomness, it is less affected by disturbances in the data, classifies with high accuracy, and effectively prevents overfitting, which are advantages for landslide susceptibility modeling [39]. Several studies of machine learning models for landslide susceptibility prediction show that RF achieves higher prediction accuracy than other models, such as logistic regression, SVM, and conventional artificial neural networks, and is better suited to landslide susceptibility mapping [40,41,42,43]. The findings of this paper are consistent with the existing literature.
(3) Slope is an important factor influencing landslides and one of the key parameters for landslide susceptibility evaluation. In this paper, the non-landslide points are distributed in the flat areas of Ruijin City, which ensures that the selected non-landslide points are stable and supports good prediction accuracy. However, the largest problem of the model is that it overemphasizes the role of slope: slope ranks at the top of the factor importance ranking of the random forest model [44,45], which leads to large very-high- and high-susceptibility zones in the prediction results and to weak identification of stable areas with steep slopes. The next step of the study will focus on the selection of non-landslide sites.
(4) This study took the Ruijin area as the study area, and the experiments verified that the proposed RF-based landslide susceptibility evaluation method gives better results. However, experiments and analyses at larger spatial scales and in study areas with other geological structures (e.g., fault structures) and hydrogeological conditions are needed to further verify the generalizability of the proposed method, which is the focus of the authors’ subsequent research.

8. Conclusions

Using Ruijin City as the study area, Pearson correlation was used to analyze the correlation between landslide density and the distance to linear hazard-causing factors under different single-loop buffer distances, and an RF model was developed to verify that the most relevant combination of single-loop buffer distances for the linear hazard-causing factors is reasonable for landslide susceptibility prediction.
(1) There are 370 landslide hazards and historical landslide sites in Ruijin. The landslides are mainly small to medium in size, shallow to medium in depth, and developed in soil or accumulated deposits, while large or giant, deep, and rock landslides are less common.
(2) In the Ruijin area, the correlation between landslide density and the distance to linear disaster-causing factors is greatest when the single-loop buffer distances of the lithological geological boundary, road, and river factors are 50 m, 30 m, and 30 m, respectively, which reflects the different ranges over which these factors influence landslides. In other areas, the differences in the influence ranges of the linear factors should likewise be fully considered when building landslide susceptibility prediction models.
(3) The landslide susceptibility prediction model built with the most correlated single-loop buffer distance combination of linear hazard-causing factors has the lowest out-of-bag error curve, and its validation set accuracy metrics are overall higher than those of the models built with the same single-loop buffer distance for all linear hazard-causing factors. The most correlated buffer approach is therefore justified in terms of both model accuracy and prediction accuracy.
(4) The RF model effectively reflects the spatial distribution pattern of landslide susceptibility. The very-low- and low-susceptibility areas are mainly distributed in the north and west of Ruijin City, while the very-high- and high-susceptibility areas are mainly concentrated in the central zone. According to the results of the susceptibility evaluation, corresponding disaster mitigation and prevention measures should be arranged for these areas.

Author Contributions

Conceptualization, L.F.; methodology, J.Y. and Q.W.; formal analysis, L.F. and Y.X.; investigation, Q.W. and Y.X.; resources, Y.X. and L.F.; writing—original draft preparation, L.F.; writing—review and editing, L.F., J.Y. and Y.X.; supervision, Y.X.; funding acquisition, J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China (2018YFC1508603).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors sincerely thank the National Service Center for Specialty Environmental Observation Station for providing relevant data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Huang, F.; Zhang, J.; Zhou, C.; Wang, Y.; Huang, J.; Zhu, L. A deep learning algorithm using a fully connected sparse autoencoder neural network for landslide susceptibility prediction. Landslides 2020, 17, 217–229.
  2. Huang, F.; Cao, Z.; Jiang, S.H.; Zhou, C.; Guo, Z. Landslide susceptibility prediction based on a semi-supervised multiple-layer perceptron model. Landslides 2020, 17, 2919–2930.
  3. Xing, Y.; Yue, J.; Chen, C.; Cai, D.; Hu, J.; Xiang, Y. Prediction interval estimation of landslide displacement using adaptive chicken swarm optimization-tuned support vector machines. Appl. Intell. 2021, 51, 8466–8483.
  4. Huang, F.; Cao, Z.; Guo, J.; Jiang, S.H.; Li, S.; Guo, Z. Comparisons of heuristic, general statistical and machine learning models for landslide susceptibility prediction and mapping. Catena 2020, 191, 104580.
  5. Xing, Y.; Yue, J.; Chen, C.; Qin, Y.; Hu, J. A hybrid prediction model of landslide displacement with risk-averse adaptation. Comput. Geosci. 2020, 141, 104527.
  6. Huang, F.; Chen, J.; Liu, W.; Huang, J.; Hong, H.; Chen, W. Regional rainfall-induced landslide hazard warning based on landslide susceptibility mapping and a critical rainfall threshold. Geomorphology 2022, 408, 108236.
  7. Zhang, Q.; Bai, Z.W.; Huang, G.W.; Du, Y.; Wang, D. Review of GNSS landslide monitoring and early warning. Acta Geod. Cartogr. Sin. 2022, 51, 1985–2000.
  8. Deng, L.Z.; Yuan, H.Y.; Zhang, M.Z.; Chen, J.G. Research progress of landslide deformation monitoring and early warning technology. J. Tsinghua Univ. 2023, 63, 849–864.
  9. Zhan, Z.; Shi, W.; Zhang, M.; Liu, Z.; Peng, L.; Yu, Y.; Sun, Y. Landslide Trail Extraction Using Fire Extinguishing Model. Remote Sens. 2022, 14, 308.
  10. Shao, P.; Shi, W.; Liu, Z.; Dong, T. Unsupervised change detection using fuzzy topology-based majority voting. Remote Sens. 2021, 13, 3171.
  11. Tehrani, F.S.; Calvello, M.; Liu, Z.; Zhang, L.; Lacasse, S. Machine learning and landslide studies: Recent advances and applications. Nat. Hazards 2022, 114, 1197–1245.
  12. Chang, Z.; Du, Z.; Zhang, F.; Huang, F.; Chen, J.; Li, W.; Guo, Z. Landslide Susceptibility Prediction Based on Remote Sensing Images and GIS: Comparisons of Supervised and Unsupervised Machine Learning Models. Remote Sens. 2020, 12, 502.
  13. Jiang, S.H.; Huang, J.; Huang, F.; Yang, J.; Yao, C.; Zhou, C.B. Modelling of spatial variability of soil undrained shear strength by conditional random fields for slope reliability analysis. Appl. Math. Modell. 2018, 63, 374–389.
  14. Chang, Z.; Catani, F.; Huang, F.; Liu, G.; Meena, S.R.; Huang, J.; Zhou, C. Landslide susceptibility prediction using slope unit-based machine learning models considering the heterogeneity of conditioning factors. J. Rock Mech. Geotech. Eng. 2022, 15, 1127–1143.
  15. Li, J.; Zhou, C.H. Analysis of grid size selection in raster-based GIS landslide risk evaluation method. J. Remote Sens. 2003, 7, 86–92.
  16. Yin, K.L.; Zhu, L.F. Study on spatial zoning of landslide hazards and GIS application. Geol. Geol. 2001, 8, 279–284.
  17. Cao, H.Y.; Chen, J.H.; Gao, Y.L. Analysis of the control role of fault structure on landslide hazard development based on GIS: Take Yucheng District of Ya’an City as an example. Earth Environ. 2012, 40, 595–598.
  18. Ma, J.; Nian, Y.; Cai, D. Factors of regional landslide risk and correlation between landslide and geology structure in Lanzhou area. J. Nat. Disasters 2006, 15, 14–17.
  19. Xing, Y.; Yue, J.; Guo, Z.; Chen, Y.; Hu, J.; Trave, A. Large-scale landslide susceptibility mapping using an integrated machine learning model: A case study in the Lvliang mountains of China. Front. Earth Sci. 2021, 9, 722491.
  20. Akinci, H.; Kilicoglu, C.; Dogan, S. Random forest-based landslide susceptibility mapping in coastal regions of Artvin, Turkey. ISPRS Int. J. Geo-Inf. 2020, 9, 553.
  21. Xing, Y.; Yue, J.; Chen, C. Interval estimation of landslide displacement prediction based on time series decomposition and long short-term memory network. IEEE Access 2019, 8, 3187–3196.
  22. Huang, F.; Yan, J.; Fan, X.; Yao, C.; Huang, J.; Chen, W.; Hong, H. Uncertainty pattern in landslide susceptibility prediction modelling: Effects of different landslide boundaries and spatial shape expressions. Geosci. Front. 2022, 13, 101317.
  23. Lucchese, L.V.; de Oliveira, G.G.; Pedrollo, O.C. Investigation of the influence of nonoccurrence sampling on landslide susceptibility assessment using Artificial Neural Networks. Catena 2021, 198, 105067.
  24. Yunus, A.P.; Fan, X.; Subramanian, S.S.; Jie, D.; Xu, Q. Unraveling the drivers of intensified landslide regimes in Western Ghats, India. Sci. Total Environ. 2021, 770, 145357.
  25. Yan, F.; Zhang, Q.; Ye, S.; Ren, B. A novel hybrid approach for landslide susceptibility mapping integrating analytical hierarchy process and normalized frequency ratio methods with the cloud model. Geomorphology 2019, 327, 170–187.
  26. Chen, L.; Guo, H.; Gong, P.; Yang, Y.; Zuo, Z.; Gu, M. Landslide susceptibility assessment using weights-of-evidence model and cluster analysis along the highways in the Hubei section of the Three Gorges Reservoir Area. Comput. Geosci. 2021, 156, 104899.
  27. Chen, W.; Xie, X.; Wang, J.; Pradhan, B.; Hong, H.; Bui, D.T.; Duan, Z.; Ma, J. A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. Catena 2017, 151, 147–160.
  28. Tanyu, B.F.; Abbaspour, A.; Alimohammadlou, Y.; Tecuci, G. Landslide susceptibility analyses using Random Forest, C4.5, and C5.0 with balanced and unbalanced datasets. Catena 2021, 203, 105355.
  29. Krkač, M.; Špoljarić, D.; Bernat, S.; Arbanas, S.M. Method for prediction of landslide movements based on random forests. Landslides 2017, 14, 947–960.
  30. Talukdar, S.; Singha, P.; Mahato, S.; Pal, S.; Liou, Y.A.; Rahman, A. Land-use land-cover classification by machine learning classifiers for satellite observations: A review. Remote Sens. 2020, 12, 1135.
  31. Goetz, J.N.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling. Comput. Geosci. 2015, 81, 1–11.
  32. Dou, J.; Yunus, A.P.; Bui, D.T.; Merghadi, A.; Sahana, M.; Zhu, Z.; Chen, C.W.; Khosravi, K.; Yang, Y.; Pham, B.T. Assessment of advanced random forest and decision tree algorithms for modeling rainfall-induced landslide susceptibility in the Izu-Oshima Volcanic Island, Japan. Sci. Total Environ. 2019, 662, 332–346.
  33. Sun, D.; Wen, H.; Wang, D.; Xu, J. A random forest model of landslide susceptibility mapping based on hyperparameter optimization using Bayes algorithm. Geomorphology 2020, 362, 107201.
  34. Lin, W.; Yin, K.; Wang, N.; Xu, Y.; Guo, Z.; Li, Y. Landslide hazard assessment of rainfall-induced landslide based on the CF-SINMAP model: A case study from Wuling Mountain in Hunan Province, China. Nat. Hazards 2021, 106, 679–700.
  35. Wang, W.; He, Z.; Han, Z.; Li, Y.; Dou, J.; Huang, J. Mapping the susceptibility to landslides based on the deep belief network: A case study in Sichuan Province, China. Nat. Hazards 2020, 103, 3239–3261.
  36. Cui, W.; Liu, G.; Yu, F.; Zhang, X.; Sun, C. Analysis of buffer distance size selection in buffer-based GIS fault and landslide correlation analysis. J. Jiamusi Univ. Nat. Sci. Ed. 2020, 38, 4–8.
  37. Hadmoko, D.S.; Lavigne, F.; Samodra, G. Application of a semiquantitative and GIS-based statistical model to landslide susceptibility zonation in Kayangan Catchment, Java, Indonesia. Nat. Hazards 2017, 87, 437–468.
  38. Kumar, D.; Thakur, M.; Dubey, C.S.; Shukla, D.P. Landslide susceptibility mapping & prediction using support vector machine for Mandakini River Basin, Garhwal Himalaya, India. Geomorphology 2017, 295, 115–125.
  39. Wu, R.Z.; Hu, X.D.; Mei, H.B.; He, J.Y.; Yang, J.Y. Spatial susceptibility assessment of landslides based on random forest: A case study from Hubei section in the Three Gorges Reservoir Area. Earth Sci. 2021, 46, 321–330.
  40. Achour, Y.; Pourghasemi, H.R. How do machine learning techniques help in increasing accuracy of landslide susceptibility maps? Geosci. Front. 2020, 11, 871–883.
  41. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtra, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
  42. Shahri, A.A.; Spross, J.; Johansson, F.; Larsson, S. Landslide susceptibility hazard map in southwest Sweden using artificial neural network. Catena 2019, 183, 104225.
  43. Huang, F.; Chen, J.; Tang, Z.; Fan, X.; Huang, J.; Zhou, C.; Chang, Z. Uncertainties of landslide susceptibility prediction under different spatial resolutions and different proportions of training and testing datasets. Chin. J. Rock Mech. Eng. 2021, 40, 1155–1169.
  44. Choi, J.; Oh, H.J.; Won, J.S.; Lee, S. Validation of an artificial neural network model for landslide susceptibility mapping. Environ. Earth Sci. 2011, 60, 473–483.
  45. Zhou, X.; Wu, W.; Lin, Z.; Zhang, G.; Chen, R.; Song, Y.; Wang, Z.; Lang, T.; Qin, Y.; Ou, P.; et al. Zonation of landslide susceptibility in Ruijin, Jiangxi, China. Int. J. Environ. Res. Public Health 2021, 18, 5906.
Figure 1. Location and geological structure map of the study area.
Figure 2. Basic environmental factors of landslide. (a) Elevation; (b) slope; (c) Aspect; (d) Plane curvature; (e) Profile curvature; (f) Lithology; (g) Distance to river; (h) Topographic relief; (i) NDVI; (j) NDBI; (k) MNDWI; (l) Distance to roads.
Figure 3. Relationship between landslide distribution and linear hazard factors. (a) The relationship between landslide density and geological boundary distance; (b) The relationship between landslide density and road distance; (c) The relationship between landslide density and river distance.
Figure 4. Principle of linear factor buffer generation.
Figure 5. Absolute value of Pearson correlation index versus single-loop buffer distance. (a) Relationship between the absolute value of Pearson’s correlation index and the buffer distance of the single ring of the geological boundary; (b) Relationship between the absolute value of Pearson’s correlation index and the buffer distance of a single loop of the river; (c) Relationship between the absolute value of Pearson correlation index and road single-loop buffer distance.
Figure 6. The curves of random forest out-of-bag error with number of decision trees.
Figure 7. Landslide susceptibility map produced by the RF model.
Table 1. Frequency ratios of index factors.
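| Environmental factor | Values | Number of grids in the whole area | Grid scale (%) | Landslide grids | Landslide grid scale (%) | FR |
|---|---|---|---|---|---|---|
| Elevation (m) | 139.7~250.9 | 836,745 | 30.419 | 173 | 46.757 | 1.152 |
| | 250.9~335.3 | 796,482 | 28.956 | 121 | 32.703 | 1.175 |
| | 335.3~423.5 | 578,691 | 21.038 | 49 | 13.243 | 0.796 |
| | 423.5~538.6 | 332,056 | 12.072 | 20 | 5.405 | 0.650 |
| | 538.6~695.9 | 147,930 | 5.378 | 4 | 1.081 | 0.159 |
| | 695.9~1117.8 | 58,787 | 2.137 | 3 | 0.811 | 0.376 |
| Slope (°) | 0~4.4 | 685,218 | 24.911 | 41 | 11.081 | 0.260 |
| | 4.4~8.8 | 643,535 | 23.395 | 125 | 33.784 | 0.986 |
| | 8.8~13.2 | 608,755 | 22.131 | 113 | 30.541 | 1.276 |
| | 13.2~17.9 | 446,520 | 16.233 | 56 | 15.135 | 1.945 |
| | 17.9~28.7 | 344,703 | 12.532 | 34 | 9.189 | 0.632 |
| | 28.7~51.2 | 21,960 | 0.798 | 1 | 0.270 | 0.398 |
| Aspect | −1 | 155,940 | 5.669 | 27 | 7.297 | 0 |
| | 0~22.5 | 297,924 | 10.831 | 31 | 8.378 | 0.994 |
| | 22.5~67.5 | 354,479 | 12.887 | 62 | 15.757 | 0.954 |
| | 67.5~112.5 | 359,791 | 13.080 | 48 | 12.973 | 1.301 |
| | 112.5~157.5 | 332,830 | 12.099 | 54 | 14.595 | 1.198 |
| | 157.5~202.5 | 332,143 | 12.075 | 42 | 11.351 | 1.160 |
| | 202.5~247.5 | 378,011 | 13.742 | 48 | 12.973 | 1.086 |
| | 247.5~292.5 | 370,195 | 13.458 | 38 | 10.270 | 0.792 |
| | 292.5~337.5 | 169,378 | 6.158 | 20 | 5.405 | 0.716 |
| Profile curvature | 0~2.029 | 884,499 | 32.156 | 98 | 26.486 | 0.596 |
| | 2.029~4.057 | 773,416 | 28.117 | 125 | 33.784 | 1.072 |
| | 4.057~6.324 | 561,551 | 20.415 | 69 | 18.649 | 1.126 |
| | 6.324~8.949 | 324,376 | 11.793 | 54 | 14.595 | 1.213 |
| | 8.949~14.529 | 187,647 | 6.822 | 23 | 6.216 | 0.823 |
| | 14.529~30.428 | 19,202 | 0.698 | 1 | 0.270 | 1.829 |
| Plan curvature | 0~13.422 | 651,677 | 23.691 | 111 | 30.000 | 1.246 |
| | 13.422~24.927 | 625,675 | 22.746 | 99 | 26.757 | 1.417 |
| | 24.927~37.710 | 471,544 | 17.143 | 58 | 15.676 | 1.434 |
| | 37.710~52.091 | 354,666 | 12.894 | 42 | 11.351 | 0.932 |
| | 52.091~67.749 | 301,696 | 10.968 | 24 | 6.486 | 0.657 |
| | 67.749~81.491 | 345,433 | 12.558 | 36 | 9.729 | 0.852 |
| Topographic relief | 0~6.022 | 651,450 | 23.683 | 35 | 9.459 | 0.476 |
| | 6.022~12.420 | 721,236 | 26.220 | 148 | 40.000 | 0.566 |
| | 12.420~18.819 | 641,938 | 23.337 | 104 | 28.108 | 1.163 |
| | 18.819~22.969 | 293,597 | 10.674 | 43 | 11.622 | 2.575 |
| | 22.969~35.379 | 385,799 | 14.026 | 38 | 10.270 | 0.431 |
| | 35.379~95.975 | 72,855 | 2.649 | 3 | 0.811 | 0.367 |
| Lithology | Metamorphic rock | 1,218,584 | 44.301 | 108 | 29.189 | 1.301 |
| | Magmatic rock | 503,748 | 18.314 | 27 | 7.297 | 1.611 |
| | Clastic rock | 899,363 | 32.696 | 19 | 5.135 | 0.209 |
| | Carbonatite | 128,996 | 4.689 | 216 | 58.378 | 0.659 |
| NDVI | −0.054~0.006 | 68,098 | 2.476 | 5 | 1.351 | 0.192 |
| | 0.006~0.018 | 299,115 | 10.874 | 34 | 9.189 | 0.803 |
| | 0.018~0.025 | 580,373 | 21.099 | 56 | 15.135 | 1.009 |
| | 0.025~0.033 | 848,420 | 30.843 | 146 | 39.459 | 0.955 |
| | 0.033~0.042 | 635,132 | 23.089 | 87 | 23.514 | 1.075 |
| | 0.042~0.098 | 315,488 | 11.469 | 42 | 11.351 | 1.141 |
| NDBI | −0.650~−0.389 | 74,963 | 2.725 | 13 | 3.513 | 1.101 |
| | −0.389~−0.318 | 234,632 | 8.529 | 28 | 7.568 | 0.928 |
| | −0.318~−0.267 | 428,674 | 15.584 | 58 | 15.676 | 1.418 |
| | −0.267~−0.219 | 699,581 | 25.433 | 92 | 24.865 | 1.233 |
| | −0.219~−0.173 | 803,445 | 29.209 | 110 | 29.729 | 0.901 |
| | −0.173~−0.050 | 505,331 | 18.371 | 69 | 18.649 | 0.729 |
| MNDWI | −0.035~0.110 | 365,882 | 13.301 | 48 | 12.973 | 1.374 |
| | 0.110~0.164 | 773,621 | 28.125 | 118 | 31.892 | 1.221 |
| | 0.164~0.217 | 772,212 | 28.073 | 94 | 25.405 | 0.952 |
| | 0.217~0.276 | 492,158 | 17.892 | 67 | 18.108 | 1.174 |
| | 0.276~0.352 | 256,718 | 9.333 | 29 | 7.838 | 1.082 |
| | 0.352~0.643 | 86,035 | 3.128 | 13 | 3.514 | 0.708 |
| Distance to river (m) | <150 | 155,212 | 5.642 | 47 | 12.703 | 2.586 |
| | 150~300 | 55,808 | 2.029 | 7 | 1.891 | 1.689 |
| | 300~450 | 279,114 | 10.147 | 118 | 31.892 | 0.672 |
| | >450 | 2,274,116 | 82.674 | 198 | 53.514 | 0.497 |
| Distance to roads (m) | <150 | 265,206 | 31.431 | 112 | 30.270 | 0.963 |
| | 150~300 | 366,479 | 28.134 | 90 | 24.324 | 3.139 |
| | 300~450 | 337,201 | 21.789 | 82 | 22.162 | 1.663 |
| | 450~600 | 599,351 | 12.259 | 45 | 12.162 | 0.558 |
| | 600~800 | 773,872 | 6.052 | 34 | 9.189 | 0.327 |
| | >800 | 408,582 | 0.335 | 7 | 1.891 | 0.127 |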
Table 2. Accuracy of validation set based on confusion matrix.
| Buffer distance (m) | Recall (%) | Kappa coefficient (%) | Accuracy (%) |
|---|---|---|---|
| 30 | 84.05 | 76.45 | 88.20 |
| 50 | 85.63 | 78.40 | 89.18 |
| 80 | 86.79 | 80.36 | 90.16 |
| 100 | 88.24 | 79.12 | 89.61 |
| 150 | 86.79 | 82.48 | 91.22 |
| Most relevant buffer | 88.67 | 83.17 | 91.58 |
Table 3. Frequency ratios of five susceptibility classes assessed with the RF model.
| Class | Total grid number | Proportion (%) | Landslide grid number | Proportion (%) | Landslide density |
|---|---|---|---|---|---|
| Very high | 587,270 | 21.35 | 243 | 75.95 | 3.076 |
| High | 600,890 | 21.85 | 61 | 13.24 | 0.755 |
| Moderate | 565,391 | 20.55 | 33 | 7.03 | 0.434 |
| Low | 591,490 | 21.50 | 20 | 2.70 | 0.249 |
| Very low | 406,548 | 14.78 | 13 | 1.08 | 0.248 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Fang, L.; Wang, Q.; Yue, J.; Xing, Y. Analysis of Optimal Buffer Distance for Linear Hazard Factors in Landslide Susceptibility Prediction. Sustainability 2023, 15, 10180. https://doi.org/10.3390/su151310180
