Debris Flow Susceptibility Mapping Using Machine-Learning Techniques in Shigatse Area, China

Abstract: Debris flows have always been a serious problem in mountain areas. Research on the assessment of debris flow susceptibility (DFS) is useful for preventing and mitigating debris flow risks. The main purpose of this work is to study the DFS in the Shigatse area of Tibet using machine learning methods, after assessing the main triggering factors of debris flows. Remote sensing and geographic information system (GIS) techniques are used to obtain datasets of topography, vegetation, human activities and soil factors for local debris flows. The problem of imbalanced debris flow susceptibility levels in the datasets is addressed by the Borderline-SMOTE method. Five machine learning methods, i.e., back propagation neural network (BPNN), one-dimensional convolutional neural network (1D-CNN), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost), have been used to analyze and fit the relationship between debris flow triggering factors and occurrence, and to evaluate the weight of each triggering factor. The ANOVA and Tukey HSD tests have revealed that the XGBoost model exhibited the best mean accuracy (0.924) on ten-fold cross-validation, and its performance was significantly better than that of the BPNN (0.871), DT (0.816), and RF (0.901). However, the performance of the XGBoost did not significantly differ from that of the 1D-CNN (0.914). This is also the first comparison experiment between the XGBoost and 1D-CNN methods in a DFS study. The DFS maps have been verified by five evaluation metrics: Precision, Recall, F1 score, Accuracy and area under the curve (AUC). Experiments show that the XGBoost achieves the best score, and the factors that have a greater impact on debris flows are aspect, annual average rainfall, profile curvature, and elevation. The proposed approach can provide a reference for mountainous areas such as southwestern Tibet, where the steep terrain means that debris flow sites are not always accessible for field investigation.
Higher resolution does allow the imagery to better describe the terrain where debris flows occur [48] and can potentially improve further analysis. It is important and necessary to use the topographical factors, human activities, vegetation cover, and climatic and soil elements provided by satellite remote sensing to estimate regional debris flow susceptibility. Five machine learning algorithms were used to construct DFS maps in Shigatse. The results confirm that all five methods can be used to analyze the susceptibility of debris flows. In terms of performance, XGBoost ranks first and 1D-CNN second, followed by RF, BPNN, and DT. XGBoost has the best predictive performance, with the highest score on all five evaluation metrics. The ANOVA method and Tukey's HSD test showed that the accuracy of XGBoost is significantly better than those of RF, BPNN, and DT, but not significantly different from that of 1D-CNN. In terms of the time required for prediction, DT takes the least time, the time required for 1D-CNN is moderate and close to that of XGBoost, and RF and BPNN are slower. It is notable that this is the first comparative experiment of XGBoost and 1D-CNN in the study of DFS. The ranking of the models based on "feature importance" indicates that slope aspect, rainfall, profile curvature and elevation have a greater impact on debris flows. The results of this study are significant for local public facility construction and residential property protection. Therefore, the XGBoost method has good prospects for estimating the DFS. Comparing the debris flow susceptibility maps of the five prediction algorithms shows that, in all five models, moderately susceptible areas account for a large proportion; this experiment has not yet explained the reasons for the differences among the maps.


Introduction
Debris flows involve gravity-driven motion of solid-fluid mixtures with abrupt surge fronts, free upper surfaces, variably erodible basal surfaces, and compositions that may change with position and time [1]. They can cause great damage to people's lives and property, public facilities and the ecological environment. Due to the harsh natural environment and deforestation caused by human over-exploitation, Shigatse is a typical area with active debris flows in the Tibet Autonomous Region. Debris flows can cause severe damage because the study area is densely populated. Therefore, mitigating and reducing the disasters caused by debris flows is critical. The study area is part of the Gangdise-Nyainqentanglha Mountains. Its elevation is high in the northern and southern parts, including the southern Tibetan Plateau and the Yarlung Zangbo River basin. The overall terrain of the Shigatse region is complex and diverse, mainly consisting of mountains, wide valleys and lake basins, with a maximum elevation of over 8700 m. The study area belongs to the semi-arid monsoon climate zone of the inland plateau. It features a dry climate, little precipitation, a rainy season coincident with the hot season, and annual average sunshine of 3240 h.
Transportation mainly comprises three main lines passing through the study area: the China-Nepal (Zhongni) Highway, National Road 318, and the Lari (Lhasa-Shigatse) Railway. Geological disasters in the study area are serious, mainly including debris flows, rock collapses, and landslides; among them, debris flow is the most common. A large number of debris flows exist in many parts of the study region. They directly threaten the safety of the three major transportation lines and of residents' lives and properties. According to the collected data and previous studies [31], the debris flows in the study area are mostly caused by heavy rain.


Debris Flow Dataset
Collection and analysis of debris flow event datasets are prerequisites for the DFS assessment. There are 1944 debris flow sites in the study area from 1998 to 2008. Each case includes information obtained from field disaster investigation, such as time, debris flow susceptibility level, and geographic location. The information on debris flows is provided by the Tibet Meteorological Bureau. These events can be viewed through the geological cloud portal [32].

Debris Flow Triggering Factors
It is significant to analyze the environmental characteristics of the debris flow events for the DFS estimation. Due to the complexity of the environment and the various development stages of debris flows, the causes of debris flows are controversial. Researchers have conducted many studies on the relationship between debris flows and triggering factors, such as topography, soil, climate, and human activities. Therefore, we have classified 15 environmental factors into five categories, as shown in Table 1. Topographic factors, which include elevation, slope, aspect, and curvature, are extracted from the Shuttle Radar Topography Mission Digital Elevation Model (SRTM DEM) using the ArcGIS platform [33]. The vegetation coverage is represented by the normalized difference vegetation index (NDVI), calculated from the 2000-2008 MODIS images and averaged to generate the thematic layer of the annual average NDVI. Rainfall data are collected from the Tropical Rainfall Measuring Mission (TRMM) [34]. We use a rainfall dataset (No. 3B42v7) with a time interval of three hours and a spatial resolution of 0.25° during 1998-2008 to construct a thematic layer of annual average precipitation. The land use information layers are provided by the National Earth System Science Data Sharing Infrastructure, National Science & Technology Infrastructure of China (http://www.geodata.cn) [35,36]. In addition, the road vector data provided by OpenStreetMap (OSM) (https://www.openstreetmap.org/#map=11/22.3349/113.76000) are used to calculate the distance from the road. Soil factors are provided by the Resource and Environmental Science Data Center (RESDC) of the Chinese Academy of Sciences.
Higher resolution is conducive to the topographic analysis of a single-gully debris flow, but our research focuses on the use of multiple attribute factors to analyze the disaster susceptibility of the entire study area. Golovko [2] and Ahmed [9] consider that a 30 m resolution Digital Elevation Model (DEM) can be used for the analysis of susceptibility to mountain disasters. Therefore, DEM data with a pixel size of 30 m are used (Figure 2a). The slope angle is a fundamental factor calculated from the DEM data, and its statistical range is wide (0-89°) (Figure 2b). The aspect of the slope is another key factor affecting the DFS, because slope surfaces facing different directions are exposed to wind and rain to different degrees. The aspect thematic layer is reclassified into nine categories: flat (-1), north (337.5-360°, 0-22.5°), north-east (22.5-67.5°), east (67.5-112.5°), south-east (112.5-157.5°), south (157.5-202.5°), south-west (202.5-247.5°), west (247.5-292.5°), and north-west (292.5-337.5°) (Figure 2c). The second derivative of the slope, i.e., the curvature, helps us understand the characteristics of basin runoff and erosion processes. In this study, three curvature functions are used to show the shape of the terrain (Figure 2d-f): the profile curvature, the plan curvature, and the total surface curvature, defined as the curvature in the direction of maximum slope, the curvature of the contour, and the combination of the two curvatures, respectively.
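The nine-category aspect reclassification described above can be sketched in a few lines (a minimal sketch assuming an aspect raster in degrees with -1 marking flat cells, as in the ArcGIS convention; the function name and sample grid are our own illustrations):

```python
import numpy as np

# Aspect reclassification into the nine categories used for Figure 2c.
# Class 0 = flat, 1 = north, 2 = north-east, ..., 8 = north-west.
def reclassify_aspect(aspect_deg):
    aspect = np.asarray(aspect_deg, dtype=float)
    classes = np.zeros(aspect.shape, dtype=int)          # 0 = flat (-1 cells)
    valid = aspect >= 0
    # Shift by 22.5 deg so that north (337.5-360 and 0-22.5) maps to one bin.
    shifted = (aspect[valid] + 22.5) % 360.0
    classes[valid] = (shifted // 45.0).astype(int) + 1   # 1..8, clockwise from north
    return classes

grid = np.array([[-1.0, 10.0, 350.0],
                 [90.0, 180.0, 270.0]])
print(reclassify_aspect(grid))   # flat, north, north / east, south, west
```

The shift-then-divide trick avoids special-casing the wrap-around of the north class across 0°.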
Human activities affect the geographical environment, which in turn influences the occurrence of debris flows. The land cover thematic map shows how human production can change natural land, and 14 land use types including farmland and forest can be identified (Figure 2g). The DFS assessment often takes the distance from the road into account, because road construction and maintenance cause certain change and damage to the local morphology. This variable is calculated by using the Euclidean distance calculation technique in the spatial analysis tool of ArcGIS 10.2 (Figure 2h).
The vegetation coverage is one of the important parameters for evaluating the DFS. NDVI, extracted from remote sensing images, is a commonly used vegetation index for inferring vegetation density, and it is very sensitive to the presence of chlorophyll on vegetation surfaces (Figure 2i). We calculate the NDVI value using the following formula: NDVI = (NIR − RED)/(NIR + RED) (1), where NIR and RED represent the near-infrared and red bands, respectively, i.e., the second and first channels of the MODIS image. The NDVI value ranges between −1 and 1. A negative value indicates that the ground cover is highly reflective to visible light, such as clouds, snow or water; 0 means bare land; and a positive value represents a vegetated area, increasing with vegetation density. Debris flows are usually influenced by changes in humidity caused by rainfall infiltration. Permeability can be expressed by soil type (Figure 2j), soil texture (Figure 2k-m) and soil erosion (Figure 2n). Since the particle distribution determines the shape of the soil water characteristic curve and affects the soil hydraulic characteristics, soil type and texture have a great influence on the DFS. Most of the study area is covered by alpine soil, including grass mat soil, cold soil, and frozen soil. According to statistics, most of the alpine soil is brown and strongly acidic. The alpine soil is mainly composed of silt, sand and fine clayey sand, and has high permeability and low water-retention capacity. Soil erosion is sometimes used as a synonym for soil and water loss, and areas with severe erosion are susceptible to debris flows. The external causes of soil erosion are mainly water, wind, and freeze-thaw. Clearly, fragile soil characteristics accompanied by concentrated rainfall usually result in debris flows.
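The NDVI calculation described earlier in this section can be sketched as follows (a minimal sketch; the sample band values and the epsilon guard against zero denominators are illustrative assumptions):

```python
import numpy as np

# NDVI = (NIR - RED) / (NIR + RED), Equation (1); a small epsilon guards
# against division by zero over zero-reflectance pixels.
def ndvi(nir, red, eps=1e-12):
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

# Vegetated pixels push NDVI toward +1, water/snow toward -1, bare land near 0.
print(ndvi([0.5, 0.1, 0.2], [0.1, 0.5, 0.2]))
```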
Rainfall is the main factor leading to debris flows. The study area is affected by the monsoon climate with rare precipitation and an annual average precipitation of less than 1300 mm (Figure 2o). However, statistical analyses of the geological hazard points occurring in the study area show that heavy rain and continuous rainfall are important external factors leading to geological disasters in the Shigatse area.

Methods
The main purpose of our research is to fit the relationship between the triggering factors and the occurrence of debris flows. The problem can be expressed as multi-class classification: given a set of input quantities, the classification model attempts to label the DFS level for each pixel in the study area. The input quantities to the models are the triggering factors of the debris flow events that were collected by the local Chinese Geological Survey researchers after many years of field investigation. Following the researchers' investigation of the debris flow sites, we obtain the values of the 15 triggering factors at the corresponding positions through the value extraction function of ArcGIS v10.2 software. That is, the input of the model is a one-dimensional vector [x1, x2, ..., x15]. The output value of the model is the DFS level, indicating the occurrence probability of debris flows. The division criteria of regional DFS are based on the detailed survey and specification of landslide, collapse and debris flows by the China Geological Survey, as shown in Table 2.

According to statistics, the number of moderately susceptible units in the study area is much higher than that of the other susceptibility grades (Figure 3). Therefore, before constructing a debris flow assessment model, the Borderline-SMOTE oversampling algorithm is used to eliminate the class imbalance in the dataset. The number of units at each debris flow susceptibility level after oversampling is shown in Figure 3. The original dataset is divided into training and test sets at 75% and 25%, respectively. The training set of debris flow triggering factors is used to learn to fit the actual DFS classification, and the validation set is used for adjusting the model parameters to prevent over-fitting or under-fitting. In this study, five DFS models have been established using DT, BPNN, 1D-CNN, RF, and XGBoost. Among them, DT and BPNN have been the most commonly used machine learning models of the past few decades. The one-dimensional convolutional neural network (1D-CNN) has achieved remarkable results in one-dimensional signal processing, such as fault diagnosis and speech recognition. RF is based on the DT and is a typical ensemble algorithm in machine learning. XGBoost is also a tree-based ensemble model, which builds on the residuals generated by the last iteration. To the best of our knowledge, XGBoost and 1D-CNN have not previously been used for the DFS.
Based on the above introductions, the research framework for the DFS in Shigatse is shown in Figure 4.
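The 75/25 partition of the dataset described above can be sketched as follows (a hypothetical sketch: the random seeds and synthetic arrays are ours, standing in for the 1944 debris flow records with 15 triggering factors each):

```python
import numpy as np

# Random 75/25 split of the (factors, label) dataset. X rows are
# 15-element triggering-factor vectors; y holds DFS levels 0-3.
def split_75_25(X, y, test_frac=0.25, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # shuffle once, then cut
    n_test = int(len(X) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    return X[train], X[test], y[train], y[test]

X = np.random.default_rng(1).random((1944, 15))      # 1944 debris flow sites
y = np.random.default_rng(2).integers(0, 4, 1944)    # four DFS levels
X_tr, X_te, y_tr, y_te = split_75_25(X, y)
print(X_tr.shape, X_te.shape)   # (1458, 15) (486, 15)
```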

Table 2. Division criteria of regional DFS levels.

DFS Level   Frequency of Debris Flow
Very low    No debris flow occurs within 100 years
Low         Debris flow occurs once within 10-100 years
Medium      Debris flow occurs once within 1-10 years
High        1-10 debris flow events occur within a year

In addition, we have also implemented other traditional machine learning algorithms, such as the support vector machine, logistic regression, and naive Bayesian model, but the results were disappointing.
Therefore, these methods are not introduced here. The following is a brief introduction to the data sampling generation algorithm and the five classifiers used in this paper.

Borderline-SMOTE
It is well known that in the model training process, when a certain class in the classification dataset has a very high proportion, classifier performance is seriously affected. The Synthetic Minority Oversampling Technique (SMOTE) has been widely applied to data imbalance problems [37]. It artificially generates vector data to achieve consistency among the categories in the dataset. In our study, most units have moderate susceptibility. In the classification process, the scarcity of the categories with fewer samples (the minority classes) is one of the main causes of over-fitting and inaccuracy. This paper chooses the boundary-based SMOTE algorithm (Borderline-SMOTE) to handle the imbalance of the data. Specifically, the k-nearest neighbor algorithm is used to find the nearest neighbors of each minority-class sample in the training set. The boundary sample set is constructed according to whether the majority class is dominant among those nearest neighbors. The k nearest neighbors of each sample T_i in the boundary sample set are then calculated, and a sample T_j is randomly selected among them. The SMOTE algorithm randomly inserts a feature vector between the selected neighbor sample and the initial vector, as shown in Equation (2): T_new = T_i + rand(0, 1) × (T_j − T_i). Finally, the generated new samples are added to the original sample set.
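The interpolation step of Equation (2) can be sketched as follows (a minimal plain-SMOTE sketch in NumPy; full Borderline-SMOTE additionally restricts T_i to boundary samples, and the sample data here are illustrative):

```python
import numpy as np

# SMOTE interpolation, Equation (2): T_new = T_i + rand(0,1) * (T_j - T_i),
# where T_j is a randomly chosen minority-class neighbour of T_i.
def smote_samples(minority, n_new, k=3, seed=0):
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        # k nearest minority neighbours of T_i (index 0 is T_i itself)
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        j = rng.choice(neighbours)
        gap = rng.random()                       # rand(0, 1)
        new.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(new)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synth = smote_samples(minority, n_new=5)
print(synth.shape)   # (5, 2)
```

Each synthetic sample lies on the line segment between two existing minority samples, so the oversampled class stays inside its original feature-space region.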

Back Propagation Neural Network
Back propagation neural network (BPNN) is a mathematical model that simulates the information processing of the biological nervous system. The BPNN, as the most classic form of artificial neural network, usually has three or more layers, including an input layer, an output layer and hidden layers. The main operations of the BPNN are the forward propagation of the signal and the back propagation of the error. The neurons between adjacent layers are fully connected, while the neurons within each layer are not connected to each other. The network is trained by gradient descent, using gradient search to minimize the cost function of the network. It has strong nonlinear mapping ability and is especially suitable for dealing with the intricate relationship between debris flow triggering factors and the DFS. The general operation of the network is as follows: the input sample leaves the input layer and enters the hidden layer; after being activated by the transfer function (such as ReLU, Sigmoid or Tanh; Tanh is used in this article), it passes to the next hidden layer, and so on until the output layer. The output formula of each layer is: x^l = f(θ) = f(w x^{l−1} + b) (3), where f(θ) represents the transfer function; θ = {w, b} represents the network parameters; w is the weight; and b is the threshold.
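A forward pass through such a network can be sketched as follows (an illustrative sketch: the layer sizes match the 15 triggering factors and four DFS levels, but the random weights stand in for parameters that gradient descent would actually learn):

```python
import numpy as np

# One forward pass of a small BPNN, following x^l = f(w x^{l-1} + b)
# with a Tanh hidden transfer function and a softmax output layer.
def forward(x, layers):
    a = x
    for w, b in layers[:-1]:
        a = np.tanh(w @ a + b)          # hidden layers: Tanh transfer function
    w, b = layers[-1]
    z = w @ a + b
    e = np.exp(z - z.max())             # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
layers = [(rng.normal(size=(8, 15)), np.zeros(8)),   # 15 factors -> 8 hidden
          (rng.normal(size=(4, 8)), np.zeros(4))]    # 8 hidden -> 4 DFS levels
probs = forward(rng.random(15), layers)
print(round(probs.sum(), 6))   # 1.0
```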

One-Dimensional Convolutional Neural Network
As a feedforward neural network, the one-dimensional convolutional neural network (1D-CNN) is inspired by the receptive fields of the mammalian visual cortex. The network perceives local features, integrates them in a high-dimensional space, and finally obtains global information. The basic structure of a convolutional neural network includes an input layer, alternating convolution and pooling layers, and a fully connected layer. The convolutional layer captures the information of local connections in the input features through the convolution kernel and reduces the parameters of the model using the weight-sharing principle. The convolution formula is: x_j^l = f(∑_{i∈M} x_i^{l−1} ∗ k_{ij}^l + b_j^l) (4), where f() represents the transfer function, x_j^l represents the j-th feature map of convolutional layer l, M represents the number of feature maps, x_i^{l−1} represents the i-th feature map of layer l−1, ∗ represents the convolution operation, k_{ij}^l represents a trainable convolution kernel, and b_j^l represents a bias. The shape and number of one-dimensional convolution kernels largely determine the feature-extraction ability of the overall network: the shape of the kernel affects the fineness of feature extraction, while the number of kernels corresponds to the size of the feature layer, affecting the diversity of extracted features and the computational scale of the network. The pooling layer combines multiple adjacent nodes to merge similar features and performs a down-sampling operation on the feature layer extracted by the convolutional layer, thereby reducing the training parameters and preventing over-fitting so as to enhance the generalization ability of the network. At present, the main pooling methods are max pooling, mean pooling, and L2-norm pooling. After the convolutional and pooling layers, the fully connected layer trains the weights and biases of the convolution kernels in the entire convolutional neural network based on the back-propagation principle.
The fully connected layer structure is similar to the BP neural network mentioned in the previous section, which has a hidden layer and uses the Softmax activation function to complete the classification. The structure of the entire network is shown in Figure 5.
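The convolution-plus-pooling stage of Equation (4) can be sketched for a single feature map as follows (an illustrative sketch; the input signal, kernel, and ReLU choice are our own assumptions):

```python
import numpy as np

# One 1D convolution + max-pooling stage (Equation (4) with a single input
# and output feature map): valid convolution with kernel k and bias b,
# ReLU transfer function, then non-overlapping max pooling.
def conv1d_maxpool(x, k, b=0.0, pool=2):
    n = len(x) - len(k) + 1
    conv = np.array([np.dot(x[i:i + len(k)], k) + b for i in range(n)])
    act = np.maximum(conv, 0.0)                      # ReLU transfer function
    n_pool = len(act) // pool
    return act[:n_pool * pool].reshape(n_pool, pool).max(axis=1)

x = np.array([1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0])    # toy 1D input signal
k = np.array([1.0, 0.0, -1.0])                       # simple edge-like kernel
print(conv1d_maxpool(x, k))   # [0. 2.]
```

The pooling step halves the feature length, which is exactly the parameter reduction and down-sampling described above.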


Decision Tree
DT is a common machine learning algorithm with a tree structure, often used to find patterns in data. It aims to establish correct decision rules and determine the relationship between feature attributes and target variables without expert experience. A DT usually contains, from top to bottom, a root node, multiple internal nodes, and a set of leaf nodes. The leaf nodes correspond to the prediction results, and the root node contains all samples. The key to DT learning is selecting the best splitting attributes. At present, the algorithms for constructing DTs mainly include CART, C4.5 and ID3. This study uses the CART algorithm, which has better performance and efficiency [38]. CART uses the Gini coefficient to split on node attributes, building a DT by selecting the attributes that minimize the Gini coefficient after splitting. The Gini index is shown in Equation (5): Gini(t) = 1 − ∑_k p(k|t)², where k is the category, t is the node, and p(k|t) is the proportion of samples of category k at node t. Finally, pruning techniques are used to deal with the over-fitting problem of the model. Upon completing the entire algorithm, we can clearly understand the internal decision-making mechanism and thus obtain a more objective knowledge of debris flow triggering factors.
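Equation (5) can be computed directly from the labels reaching a node, for example (a minimal sketch; the label values are illustrative):

```python
from collections import Counter

# Gini index of Equation (5): Gini(t) = 1 - sum_k p(k|t)^2, computed over
# the class labels reaching node t. CART picks the split that minimises
# the weighted Gini of the resulting child nodes.
def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["medium"] * 4))                           # 0.0  (pure node)
print(gini(["low", "medium", "high", "very_high"]))   # 0.75 (maximally mixed)
```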

Random Forest
As an ensemble classification algorithm, RF aims to improve the flexibility and accuracy of classification trees. In RF, a large number of trees are generated by constructing a base DT on each of multiple bootstrap sample sets of the data. Because the feature attributes of each node are randomly selected, the number of node features is effectively reduced without increasing the bias. Each feature subset is much smaller than the total number of features in the input space, so the individual DTs are decorrelated. Finally, the output of the classification task is predicted by a simple majority vote. An RF constructed from many DTs is greatly improved compared with a single DT, while remaining of comparable complexity to a single basic DT. Therefore, RF is a fairly effective ensemble learning algorithm.
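The final voting step can be sketched in isolation as follows (an illustrative sketch; the vote matrix is hypothetical):

```python
import numpy as np

# The RF voting step: each base DT casts one class vote per sample and the
# forest predicts the majority class (ties broken by the lowest label).
def majority_vote(tree_preds):
    preds = np.asarray(tree_preds)              # shape: (n_trees, n_samples)
    out = []
    for col in preds.T:                         # one column per sample
        vals, counts = np.unique(col, return_counts=True)
        out.append(vals[np.argmax(counts)])
    return np.array(out)

votes = [[0, 1, 2],    # predictions of tree 1 for three samples
         [0, 1, 1],    # tree 2
         [1, 1, 2]]    # tree 3
print(majority_vote(votes))   # [0 1 2]
```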

XGBoost
XGBoost, short for eXtreme Gradient Boosting, is a gradient-boosted ensemble algorithm based on classification trees or regression trees. It works in the same way as gradient boosting but adds features similar to AdaBoost. The algorithm combines multiple DT models iteratively to form a classification model with a better structure and higher precision. It has achieved impressive results in many international data mining contests and has won multiple championships in Kaggle competitions. In the DFS analysis experiment, XGBoost can classify the DFS level according to the environmental characteristics of the Shigatse region and rank the importance of the triggering factors.
XGBoost uses both the first and second derivatives to perform a Taylor expansion of the loss function and adds a regularization term to it. Therefore, while pursuing model accuracy, the model complexity is also well controlled. Finally, the predictive power of the model is trained by minimizing the total loss function [39]. The objective function of the model can be expressed as Equation (6), where i denotes the i-th sample, ŷ_i^(t−1) denotes the prediction of the (t−1)-th model for sample i, f_t(x_i) denotes the newly added t-th model, Ω(f_t) denotes the regularization term, C denotes constant terms, and the outermost L(·) denotes the loss. The optimizer aims to calculate the structure and the leaf scores of the CART tree. XGBoost accelerates existing boosted-tree algorithms through cache-aware prefetching, distributed out-of-core computation, and the AllReduce fault-tolerance tool. It can also be trained on a graphics processing unit for a further large speed boost.
In this work, we import the XGBoost toolkit in Python. The training process controls the construction of the DTs by adjusting five hyper-parameters: the number of iterations, the number of trees generated, the learning rate, the maximum depth of each tree, and the L2 regularization. The Gamma hyper-parameter limits the gain required for a split.

Evaluation Measures
DFS mapping should be able to effectively predict the probability of future debris flows in the study area. To evaluate the predictive power of the machine learning algorithms, five common evaluation methods are used to quantify model performance: Precision, Recall, F1 score, Accuracy, and AUC. Finally, 293 debris flow events are used as the test set.

Precision
In a binary classification task, precision is the ratio of correctly labeled true samples to the total number of samples predicted as true. The formula is as follows. In a multivariate classification task, precision is expressed as a weighted average of the per-class precisions.

Recall
The recall rate is the ratio of correctly labeled true samples to the total number of true samples, expressed as Equation (8). In a multivariate classification task, the recall rate is a weighted average of the per-class recall rates.

F1 Score
The F1 score combines Precision and Recall, with values between 0 and 1 representing the worst and the best, respectively. The relative contributions of precision and recall to the F1 score are equal. The formula is defined as follows: F1 score = 2 × (Precision × Recall) / (Precision + Recall). In a multivariate classification task, the F1 score is a weighted average of the per-class F1 scores.
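The class-weighted averaging described in these three subsections is what `average="weighted"` computes in scikit-learn; the toy 4-class labels below stand in for the four DFS levels and are not the paper's predictions:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy 4-class ground truth and predictions (four DFS levels, 0-3).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 3, 3, 3]

# average="weighted" averages the per-class scores weighted by class
# support, matching the multivariate definitions above.
p  = precision_score(y_true, y_pred, average="weighted")
r  = recall_score(y_true, y_pred, average="weighted")
f1 = f1_score(y_true, y_pred, average="weighted")
print(p, r, f1)
```

Note that the weighted F1 is the support-weighted mean of the per-class F1 scores, not 2PR/(P+R) applied to the already-averaged precision and recall.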

Accuracy
In a multivariate classification task, accuracy represents the ratio of correctly classified samples to the total number of samples.

Area Under the Curve (AUC)
The AUC is defined as the area under the receiver operating characteristic (ROC) curve and accounts for the distribution of each class; it can be used to judge classifier performance. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR). TPR is the ratio of correctly classified positive instances to all positive instances, as in Equation (10); FPR is the ratio of misclassified negative instances to all negative instances, as in Equation (11). The AUC is originally designed to evaluate binary classification, so the multivariate classification task is first converted into multiple binary classifications, the AUC of each category is calculated separately, and the multivariate result is obtained by averaging the per-class AUC values [40].
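This one-vs-rest decomposition and averaging is built into scikit-learn's `roc_auc_score`; the toy 3-class probability matrix below is an assumption for illustration only:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy 3-class probabilities (rows sum to 1). multi_class="ovr" converts
# the task into one binary problem per class and averages the AUCs.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_score = np.array([[0.8, 0.1, 0.1],
                    [0.6, 0.3, 0.1],
                    [0.2, 0.7, 0.1],
                    [0.3, 0.5, 0.2],
                    [0.1, 0.2, 0.7],
                    [0.2, 0.2, 0.6]])
auc = roc_auc_score(y_true, y_score, multi_class="ovr", average="macro")
print(auc)
```

In this toy case every true class receives the highest score, so each per-class ROC is perfect and the averaged AUC is 1.0.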

Cross-Validation
In this paper, the cross-validation method is used for parameter optimization. Specifically, using an error-based evaluation index, the training set is divided into k mutually exclusive subsets of equal size, of which k − 1 subsets are used as the training set and the remaining subset as the validation set. The experiment rotates the held-out subset k times in turn, and the k validation results are averaged. The GridSearchCV module of scikit-learn and the cv function of the XGBoost library are used to optimize the parameters of the decision tree, random forest, and XGBoost models. In the Keras framework, GridSearchCV-based cross-validation is likewise used to search the parameter space and obtain the optimal parameter estimates of the models on the dataset.
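The k-fold rotation described above can be sketched with GridSearchCV as follows; the parameter grid and synthetic data are illustrative assumptions, not the grids behind Table 3:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic 15-feature, 4-class stand-in for the DFS training set.
X, y = make_classification(n_samples=500, n_features=15, n_informative=8,
                           n_classes=4, random_state=0)

# Each candidate parameter combination is scored as the mean accuracy
# over cv=10 rotations of the held-out fold.
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    param_grid={"max_depth": [3, 5, 10],
                                "min_samples_leaf": [1, 5]},
                    cv=10, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The same pattern applies to the RF and Keras models; for XGBoost the library's own `cv` function offers an equivalent k-fold routine.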

Statistical Analysis
In order to compare the performance differences between the models, we conducted a statistical analysis. One-way ANOVA can test whether there is a statistically significant difference among two or more unrelated groups [41]. Model performance is evaluated by the accuracy obtained during model training, so the groups of cross-validation accuracies from the different algorithms are used for the one-way ANOVA. The hypotheses tested are as follows.
H0: The accuracies of all algorithms are not significantly different. H1: There are significant differences between the accuracies of at least two algorithms. The one-way ANOVA yields a p-value, which quantifies the risk incurred in rejecting H0.
These results can only determine whether at least one group of samples differs significantly from the others; they cannot identify which pair of groups differs. Therefore, comparisons between specific groups are usually performed after the one-way ANOVA. The honestly significant difference (HSD) method, developed by Tukey, is favored by researchers as a simple and practical pairwise comparison technique. The main idea of HSD is to evaluate the true significant difference between two means using the studentized range (q) distribution [42], which gives the exact sampling distribution of the largest difference among a set of means drawn from the same population. All pairwise differences are evaluated against this distribution. This paper uses HSD as a post-hoc analysis of the performance indicators from the different algorithms.
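A minimal sketch of the ANOVA step with SciPy is given below; the per-fold accuracies are hypothetical placeholders, not the values in Table 4, and the pairwise follow-up would use a Tukey HSD routine (e.g. statsmodels' `pairwise_tukeyhsd`) on the same groups:

```python
from scipy.stats import f_oneway

# Hypothetical 10-fold CV accuracies for three of the models
# (placeholder numbers, not the paper's results).
xgb = [0.93, 0.92, 0.91, 0.94, 0.92, 0.93, 0.92, 0.91, 0.93, 0.92]
cnn = [0.92, 0.91, 0.90, 0.92, 0.91, 0.92, 0.91, 0.90, 0.92, 0.91]
dt  = [0.82, 0.81, 0.80, 0.83, 0.81, 0.82, 0.80, 0.82, 0.81, 0.81]

# One-way ANOVA over the accuracy groups: a p-value below 0.05
# rejects H0 (no difference among the group means).
stat, p = f_oneway(xgb, cnn, dt)
print(stat, p)
```

With the DT group far below the other two, the F statistic is large and the p-value is far below 0.05, so H0 is rejected and a post-hoc pairwise test is warranted.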
All statistical analyses were completed by using the Statistical Package for Social Sciences (IBM SPSS Statistics for Windows Version 22.0).

Feature Importance
The tree-based machine learning approaches in this study provide a "feature importance" toolkit for ranking the index factors by their contribution to a particular problem [43]. In a basic decision tree, feature attributes are selected for node splitting, and the number of times an attribute is selected measures its importance. For a single decision tree T with J − 1 internal nodes, Equation (12) gives the importance score of each predictor feature x_l.
The feature selected at each node is the one that provides the maximal estimated improvement τ̂_t² in squared-error risk over that of a constant fit to the entire region. The following formula gives the importance computed over the additive ensemble of M trees.
In reality, a frequently used attribute often has a good distinguishing ability. In this study, the importance of the factors affecting the debris flows occurrences is ranked from high to low according to the characteristic attribute of the decision-making process of DFS.
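In scikit-learn this split-based importance is exposed as `feature_importances_`; the sketch below uses synthetic data in which only the first two of five stand-in factors carry signal, so the ranking should recover them:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))                  # 5 stand-in triggering factors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # only features 0 and 1 matter

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# feature_importances_ aggregates, per feature, the impurity decrease of
# every split that used that feature, i.e. the per-tree importance above.
ranking = np.argsort(tree.feature_importances_)[::-1]
print(ranking)
```

XGBoost exposes the same idea through `feature_importances_` and `plot_importance`, which is how the triggering-factor ranking in Figure 9 is obtained.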

Results
In this section, the specific implementation of the five machine learning algorithms is introduced. Using Python as the development language, the BPNN and the 1D-CNN are constructed with the Keras framework, the DT and the RF are implemented through the API provided by the scikit-learn module, and the XGBoost algorithm is implemented with the Python code provided on its official website. The performance of a DFS model depends largely on the choice, adjustment, and optimization of its parameters, so optimizing the model structure and parameters required multiple experiments, with cross-validation used for parameter optimization. After many experiments, the optimal parameters of the five DFS models were obtained, as shown in Table 3.

Performance Metrics Evaluation
Cross-validation produces a list of accuracies, shown in the first row of Table 4. The poorer accuracy values in the second row reflect the performance of the models on the non-SMOTE dataset. To obtain robust verification results, we use the one-way ANOVA method to test whether the methods differ significantly. The ANOVA is applied to the five groups of accuracies, and the results are shown in Table 5. The F value in the table is the ratio of the between-group mean square to the within-group mean square, and the corresponding p-value is obtained from the F distribution. Sig denotes the p-value, which is below 0.05, so we can reject the null hypothesis H0 and conclude that at least two of the models differ significantly. Significant differences for all pairwise comparisons between accuracies are then calculated with the post-hoc Tukey's HSD test, as shown in Table 6. According to these statistics, XGBoost performs best in terms of accuracy and differs significantly (p < 0.005) from BPNN, DT, and RF. There is no significant difference between XGBoost and 1D-CNN, but the mean accuracy of XGBoost is higher than that of 1D-CNN.

DFS Map Construction
In this study, the relationships between the debris flow triggering factors and the DFS levels are fitted by training the BPNN, 1D-CNN, DT, RF, and XGBoost models to predict the susceptibility index of each pixel in the study area, establishing a pixel-based DFS classification map (Figure 6). The result is a raster map with a unique susceptibility index value assigned to each pixel. The susceptibility index values are divided into four categories, 0, 1, 2, and 3, indicating very low, low, medium, and high debris flow levels, respectively.
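A minimal sketch of this map-assembly step is shown below: per-pixel class predictions from any of the trained models are reshaped into a raster of susceptibility levels (the grid size and predictions here are hypothetical stand-ins):

```python
import numpy as np

# Hypothetical per-pixel predictions for a small (rows x cols) grid;
# in the study these would come from model.predict over the factor stack.
rows, cols = 4, 5
pred = np.array([i % 4 for i in range(rows * cols)])  # stand-in classes 0-3

# Reshape the flat prediction vector into the raster DFS map.
dfs_map = pred.reshape(rows, cols)
labels = {0: "very low", 1: "low", 2: "medium", 3: "high"}
print(dfs_map.shape, labels[int(dfs_map[0, 3])])
```

The per-level area percentages reported below follow by counting pixels per class and dividing by the raster size.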
In the debris flow map constructed by the BPNN model, about 2% of the study area is not susceptible to debris flows, and the other 19.8%, 68.1%, and 9.6% correspond to the low, medium, and high DFS levels, respectively, as shown in Figure 6a.
The DFS map predicted by 1D-CNN is shown in Figure 6c. The medium-susceptibility area accounts for the vast majority of the study area; the other 11.12%, 21.02%, and 3.35% correspond to the very low, low, and high DFS levels, respectively.
The DFS map generated by the DT model is shown in Figure 6b. Of the study area, 11.2% is not susceptible to debris flows and 21.8% is seldom affected, while the medium-probability debris flow area accounts for 56.5% and the remaining 10.3% is a high-probability area.
The proportions of the susceptibility levels in the DFS map fitted by the RF model are very low (2.1%), low (34.5%), medium (62.8%), and high (0.27%), as shown in Figure 6d.
Finally, a DFS map based on the XGBoost model is generated, as shown in Figure 6e. The DFS level distribution is very similar to that of the random forest model. Medium susceptibility is the dominant debris flow level, accounting for 52.5%; the second-largest area corresponds to the low susceptibility level at 37.2%. The very low and highly susceptible areas are small, accounting for 6.6% and 3%, respectively (Figure 7).

Model Evaluation
After successfully constructing the DFS model, we evaluated its performance using five traditional evaluation methods, i.e., Recall, Precision, F1 score, Accuracy, and AUC. We also calculated the time required to forecast the entire study area. The results of each model are shown in Table 7. In particular, because the ROC curve corresponding to the AUC index of each model can directly reflect the advantages and disadvantages of the model, it is drawn in Figure 8. As seen from Table 7 and Figure 8, the following findings are listed.
(1) The Recall, Precision, F1 score, Accuracy, and AUC scores of the five algorithms differ considerably; that is, the algorithms show large performance differences on the test set. (2) Despite the large differences in the evaluation index values, the indices follow the same trend: an algorithm that is superior on one evaluation index is superior on the others as well.

Importance of Triggering Factors
After applying the five evaluation methods, it is concluded that the XGBoost model performs the best in DFS mapping, and this model can be used to fit the relationship between the debris flow triggering factors and the susceptibility. However, not all selected triggering factors have good predictive power; different triggering factors contribute differently to the model, so it is necessary to understand the contribution of each one. The practical significance of this research is that, by studying the importance of the characteristics of the different debris flow triggering factors, we can provide suggestions and references for local governments and researchers on site selection for public facilities.
As shown in Figure 9, the ordinate lists the fifteen index factors used to evaluate the DFS, and the abscissa is the ratio of the number of times each feature attribute is used for DT node splitting to the number of times all attributes are used. The aspect is clearly the most important factor affecting the occurrence of debris flows, closely followed by rainfall, with elevation and profile curvature ranked third and fourth, respectively. According to these main characteristics, topographic and climatic factors are the main triggers of debris flows in Shigatse. The lowest-ranked triggering factors, especially the soil factors, have relatively little impact on debris flow events.

Discussion
This study aims to estimate the regional DFS by using five highly representative machine learning models, i.e., BPNN, 1D-CNN, DT, RF, and XGBoost. According to literature, such investigations are rare in Shigatse, particularly based on 1D-CNN and XGBoost.

Model Performance
First, as seen from Table 4, category imbalance caused over-fitting in all of the above machine learning models, resulting in poor performance. The data generated by the interpolation-based oversampling method are close to the original data; in future studies we will adopt better interpolation methods to obtain more reasonable data and use other techniques to alleviate the over-fitting of machine learning on imbalanced data [44].
As shown in Table 7, the five evaluation methods (Recall, Precision, F1 score, Accuracy, and AUC) reveal a performance gap among the algorithms. Ranked from high to low, the five models are XGBoost, 1D-CNN, RF, BPNN, and DT. In addition, the ANOVA and Tukey HSD test results show that XGBoost differs significantly from the RF, BPNN, and DT models, indicating that XGBoost's generalization performance and predictive ability are significantly better than those of RF, BPNN, and DT.
In the early days, BPNN showed excellent performance in a variety of classification tasks. In this research, however, its accuracy only outperforms that of a single DT. XGBoost is better than BPNN not only in accuracy but also in speed, because BPNN has too many parameters to adjust. In particular, XGBoost can generate "feature importance" scores that allow researchers to analyze the data, whereas BPNN is a black-box model for which much research has been devoted to explaining the internal structure. Although XGBoost has not previously been used for debris flow susceptibility analysis, scholars in the field of mountain disaster studies have reached similar conclusions, reporting that a boosting model exceeds the accuracy of BPNN by 8% [18].
RF and XGBoost are ensemble machine learning algorithms based on DTs, and their evaluation scores are higher than that of a single DT. This shows that ensemble algorithms can make up for the limited fitting ability of a single DT. Although both are ensembles, XGBoost's overall performance is better than that of RF. RF relies on the final vote of all DTs, which reduces variance, whereas XGBoost fits the residuals of the previous iteration, which reduces both variance and bias. Performance comparisons between XGBoost and RF are common in many research areas, and XGBoost is usually in the lead [39,45].
Like XGBoost, 1D-CNN has not been used for debris flow susceptibility, and little literature has addressed it in this context. The cross-validation results show that the accuracies of XGBoost and 1D-CNN are not significantly different, but the mean accuracy of XGBoost is higher, and XGBoost also leads on the test set. The main reason is that CNNs capture data such as images, audio, and possibly text quite well by modeling spatio-temporal locality, whereas tree-based models handle tabular data very well. Considering classification performance comprehensively, XGBoost offers the best overall performance, with high classification accuracy, good prediction results, and less computation time. Therefore, the XGBoost method deserves more attention in future evaluations of DFS.

Feature Importance
Based on the selected models used to construct the DFS maps, the following conclusions can be drawn from Figure 4. The DFS in the study area is mainly medium and low, together accounting for more than 50% of the entire study area. The feature attribute scores provided by the tree-based machine learning methods are important for analyzing the causes of debris flows. The results show that the aspect, profile curvature, annual average rainfall, and DEM are the main factors affecting the occurrence of debris flows in the study area. Other triggering factors, such as vegetation cover and human activities, also have a certain impact, while the contribution of the soil factors to the modeling is relatively weak. Guided by the evaluated feature attributes, targeted analysis and investigation of the debris flow triggering factors in the study area can be carried out. The main triggering factors are analyzed below on the basis of historical data statistics.
In the study area, different slope aspects lead to differences in hydrothermal conditions, which in turn affect the distribution of geographical elements such as vegetation, hydrology, soil, and topography. Chinese scholars examining the relationship between vegetation and debris flow erosion have likewise suggested that the slope aspect largely determines the vegetation and soil types [46]. The feature selection results show that the slope aspect has the greatest impact on the distribution of debris flows. According to the statistics of actual debris flow events (Figure 10a), the number of debris flows on the southwest and east slopes is relatively large; the events on the southwest slope are the densest, and this is also where the highly occurrence-prone debris flow events are located.
The curvature of a slope describes its shape, which controls the formation of debris flow events by affecting the gravitational potential energy and water collection conditions. The feature selection results show that the profile curvature is more useful for estimating the DFS than the plane curvature and total curvature. The shape of a slope is usually linear, concave, or convex, indicating the mid-term evolution, the maturity, and the active period of the landscape, respectively. According to the statistics (Figure 10b), the debris flows are mainly concentrated in areas of negative curvature, i.e., where the pixel surface is convex. This statistical result is consistent with the findings of Guo et al. [47] on mountain debris flow events.
The overall elevation of the Shigatse region is very high and its valleys are deep. Statistics on the distribution of debris flow events at different altitudes indicate that they mainly occur between 3600 and 4600 m (Figure 10c), with the high- and medium-susceptibility levels distributed around 4000 m. The reason is that the terrain between 3600 and 4600 m is very steep and densely populated, so human activities have a large impact on it. In addition, this altitude band is mainly eroded by flowing water, loose materials accumulate heavily there, and debris flow events develop readily. Tang et al. [31] reached similar conclusions in their investigation of the study area.
Rainfall is the main triggering factor for debris flows: it promotes mountain debris flows mainly by increasing soil bulk density and reducing cohesion and internal friction. The study area receives sparse rainfall, with an annual average of less than 1350 mm (Figure 10d). However, debris flows correlate strongly with the rainy season in Shigatse: although the annual precipitation is not high, it is concentrated in time, and the heavy rain season is also the season when debris flows occur frequently. According to statistics, debris flows during the flood season account for more than 70% of the total in Shigatse.
Machine learning algorithms can handle large-scale data. In addition, they are more objective than traditional qualitative evaluation methods and can support decision-making without an expert system. However, they have some inherent problems: the data preprocessing workload is large and time-consuming, and the preprocessing results have a great impact on the classifier.

Conclusions
In this study, multi-source satellite data and GIS are used to characterize the formation environment of debris flows in the study area, and these environmental characteristics are then input into the machine learning methods to establish the DFS models. The roles and weights of the triggering factors revealed by the training process are analyzed to further study the main causes of debris flows. From the entire research process described above, four main findings follow: (1) Satellite remote sensing can provide data for regional DFS analysis, especially for mountainous areas such as southwestern Tibet with steep terrain, where sites are not always accessible for investigation. Higher resolution allows images to better describe the terrain where debris flows occur [48] and can potentially improve further analysis. It is important and necessary to use the topographical factors, human activities, vegetation cover, and climatic and soil elements provided by satellite remote sensing to estimate regional debris flow susceptibility. (2) Five machine learning algorithms were used to construct DFS maps of Shigatse. The results confirm that all five methods can be used to analyze the susceptibility of debris flows. By performance, XGBoost ranks first and 1D-CNN second, followed by RF, BPNN, and DT. XGBoost has the best predictive performance, with the highest score on all five evaluation methods. The ANOVA and Tukey's HSD tests showed that the accuracy of XGBoost is significantly better than those of RF, BPNN, and DT, but not significantly different from that of 1D-CNN. In terms of prediction time, DT takes the least; the time required by 1D-CNN is moderate and close to XGBoost; RF and BPNN are slower. Notably, this is the first comparative experiment of XGBoost and 1D-CNN in the study of DFS.
The model ranking based on "feature importance" indicates that the slope aspect, rainfall, profile curvature, and DEM have a greater impact on the debris flows. The results of this study are significant for local public facility construction and residential property protection, and the XGBoost method therefore has good prospects for estimating the DFS. (3) A comparison of the debris flow susceptibility maps of the five prediction algorithms shows that in all five models the moderately susceptible areas account for a large proportion. This experiment has not yet explained the reasons for the differences among the prediction results; the causes will be explored in subsequent studies. Using statistically derived susceptibility as a label in the experiments may have some shortcomings. In follow-up work, we plan to first use a clustering algorithm to identify locations where debris flows are unlikely to occur and use them together with the existing debris flow data for susceptibility classification. As machine learning technology develops, we will strive to further improve DFS model performance by modifying and optimizing the algorithms. (4) Debris flows are common in mountain areas. Five machine learning models were used to analyze the debris flow events in this study. The results show that the XGBoost model has the best predictive performance and can be used to help prevent the casualties and economic losses caused by debris flows. For local land planning and land use, the relevant departments can combine the XGBoost model with satellite remote sensing and GIS spatial data processing to create feature maps and high-precision susceptibility maps, providing guidance and preparation for debris flow prevention and mitigation.