Spatial Predictions of Debris Flow Susceptibility Mapping Using Convolutional Neural Networks in Jilin Province, China

Debris flows are a major geological disaster that can seriously threaten human life and physical infrastructure. The main contribution of this paper is the establishment of two-dimensional convolutional neural network (2D-CNN) models using SAME padding (S-CNN) and VALID padding (V-CNN), and their comparison with support vector machine (SVM) and artificial neural network (ANN) models, to predict the spatial probability of debris flows in Jilin Province, China. First, the dataset is randomly divided into a training set (70%) and a validation set (30%).


Introduction
Debris flows are defined as large-scale movements of a mixture of soil, rock, and water from the top to the bottom of the mountains. Debris flows cause numerous casualties and a great amount of property damage each year [1,2]. In the past few decades, the formation and movement mechanism of debris flows have been the focus of research on geological disasters [3,4]. The occurrence of a debris flow is sudden and destructive and is usually affected by factors such as earthquakes, heavy rainfall, and human activities. Therefore, it is difficult to accurately predict the spatial distribution of debris flows and areas of susceptibility [5].
Debris flow susceptibility refers to the probability of a debris flow occurring in a given area [6]. Debris flow susceptibility mapping (DFSM) is a direct and effective way to reduce the hazards caused by debris flows: it is a basic means of discovering relationships between debris flows and influencing factors and of identifying potential debris flow areas. The combined action of many influencing factors leads to the occurrence of a debris flow. Therefore, using these influencing factors to predict the spatial distribution of debris flow susceptibility is essential for preventing and reducing debris flow damage. At present, various methods are being utilized to produce debris flow susceptibility maps [7,8]. These methods can be divided into three categories: qualitative methods, traditional statistical methods, and machine learning (ML) methods. Qualitative methods mainly include the analytic hierarchy process (AHP) [9][10][11], the fuzzy logic method [12][13][14], etc. These methods rely mainly on the experience of experts, and the evaluation results are strongly subjective. Traditional statistical methods mainly include the weight of evidence [15], frequency ratio (FR) [16], statistical index (SI) [17], and so on. These methods are not sufficient for data mining [18,19]. Therefore, more effective methods are urgently needed to improve the quality of debris-flow susceptibility maps.
With the rapid development of artificial intelligence in recent years, ML methods have been widely employed in regional debris flow prediction, including support vector machine (SVM) [20][21][22], Naïve Bayes (NB) [23], artificial neural network (ANN) [24,25], random forest (RF) [26,27], boosted regression tree (BRT) [28], logistic regression (LR) [29], etc. Compared with traditional statistical methods, ML methods are more suitable for finding the data relationships between debris flows and influencing factors [18]. However, as research has deepened, it has become clear that traditional ML methods classify the debris flow data directly and ignore the relationships existing within the data, which ultimately limits further improvement of the classification accuracy [30]. Therefore, traditional ML methods do not satisfy the needs of debris flow evaluation in Jilin Province, China, where debris flows cause many casualties and high economic losses every year, especially during the rainy season [31,32].
To address this problem, deep learning (DL), an important branch of ML, has been widely embraced by scholars since it emerged because it can solve target problems better than traditional ML methods [33]. DL algorithms mainly include the convolutional neural network (CNN), recurrent neural network (RNN), generative adversarial network (GAN), etc. Among these, the CNN is one of the most popular DL algorithms and has been gradually applied to DFSM [34,35]. Compared with traditional ML algorithms, the CNN structure adds convolution layers and subsampling layers. Hence, a CNN has fewer parameters and more easily mines the latent relationships hidden in the data. Specifically, it has an outstanding ability in image classification because its convolution and subsampling layers can effectively extract the latent features of an image [36,37]. Moreover, two data processing methods can be applied when performing convolution and subsampling operations: SAME padding and VALID padding. SAME padding preserves the integrity of the data during processing. VALID padding, however, discards data at the edge of the matrix for quicker calculation, resulting in an imbalanced use of the edge data and the central data. Wang et al. [30] used VALID padding to process debris flow data and generated a highly accurate debris flow susceptibility map. In this paper, the S-CNN and V-CNN models are established by using SAME padding and VALID padding, respectively, to study the spatial distribution of debris-flow susceptibility in Jilin Province, China.
The purpose of this study is to provide a CNN framework for regional debris flow susceptibility evaluation in Jilin Province, China. The two primary contributions of this paper can be summarized as follows. First, we attempt to select the CNN model with the greater predictive ability to provide valuable information for further research. Second, the S-CNN, V-CNN, SVM, and ANN models are compared for the spatial prediction of debris-flow susceptibility. To measure the models' accuracy, the receiver operating characteristic (ROC) curve, the accuracy (ACC), the F-measure, and the mean absolute error (MAE) are utilized for comparison. More importantly, it is hoped that this research can provide a more solid decision basis for policymakers to reduce the harm caused by debris flows.

Study Area
Jilin Province is located in northeast China, extending between 40°50′ N and 46°19′ N latitude and 121°38′ E and 131°19′ E longitude, with an area of 187,400 km², as shown in Figure 1.
The study area has a temperate continental monsoon climate. The average temperature in winter is below −11 °C. The annual temperature difference is 35 °C to 42 °C, and the daily temperature difference is generally 10 °C to 14 °C. The average annual precipitation is 400 mm to 600 mm, but the seasonal and regional differences are large and the eastern part has the most abundant rainfall. There are many lakes in Jilin Province, and the western plains are mostly natural lakes, such as the Moon Bubble; the eastern mountainous areas are mainly artificial lakes and volcanic lakes, such as the Triangle Dragon Bay.
The geomorphologic difference in Jilin Province is obvious. The terrain is inclined from the southeast to the northwest, showing obvious characteristics of southeast high elevation and northwest low elevation. With the central Big Black Mountain as the boundary, the landforms can be divided into the eastern mountains, the central, and the western plains. The geomorphology includes weathered, eroded, and denuded volcanic landforms with alluvial fans and fluvial river systems. In Jilin Province, the main geology and lithology are shown in Table 1.
All in all, the geological environmental conditions are exceedingly complex, so many influencing factors can cause debris flows to occur in Jilin Province. During the rainy season, debris flows occur more frequently, causing high economic losses every year. Therefore, policymakers urgently need more accurate and reliable debris flow susceptibility maps to provide more reasonable preventive measures for the local area.

Materials and Methods
In this paper, the following four steps were used to complete the spatial prediction of debris-flow susceptibility in the study area.

Data Preparation
The quality of a debris flow inventory map is vital and ultimately affects the evaluation results of the study area [38,39]. Therefore, a quality-assured data source will help to uncover potential relationships between the occurrence of debris flows and influencing factors. In this paper, a debris flow inventory map has been prepared from different sources, including field investigation, remote sensing interpretation, and official data. 868 debris flow locations and 868 non-debris flow locations were randomly selected in the debris flow and non-debris flow areas. The dataset was randomly split into a training set (70%) to train the models and a validation set (30%) to check the accuracy of the models, in which debris flow and non-debris flow locations were equally balanced. Figure 1 shows the specific location distribution of debris flow in Jilin Province, China.
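The balanced 70/30 split described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `balanced_split`, the fixed seed, and the rounding of the 70% cut are hypothetical choices, and the integer ids stand in for the real inventory locations.

```python
import random


def balanced_split(n_per_class=868, train_fraction=0.7, seed=0):
    """Split two equally sized classes into training and validation sets.

    Each class is shuffled and cut separately so that both sets keep the
    same number of debris flow (1) and non-debris flow (0) samples.
    """
    rng = random.Random(seed)
    cut = round(train_fraction * n_per_class)  # rounding rule is an assumption
    train, valid = [], []
    for label in (1, 0):
        ids = list(range(n_per_class))  # placeholder ids for the locations
        rng.shuffle(ids)
        train += [(label, i) for i in ids[:cut]]
        valid += [(label, i) for i in ids[cut:]]
    rng.shuffle(train)
    rng.shuffle(valid)
    return train, valid


train, valid = balanced_split()
print(len(train), len(valid))
```

Splitting per class, rather than shuffling all 1736 samples together, is what keeps the two classes equally represented in both sets.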
In ML algorithms, data can be read directly in text or table format. However, input data need to be converted to matrices before being fed to a CNN. In this paper, thirteen influencing factors were used, so each grid cell had thirteen corresponding values. As is usual in a binary classification problem, 0 represents a non-debris flow location and 1 represents a debris flow location. The influencing factor values of some grid cells are shown in Figure 3. In this paper, a 2D-CNN was utilized to build the models. The conversion process from 1D to 2D is shown in Figure 4. Finally, the CNN data preparation was implemented by using a Python-based graphical user interface (GUI).
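Figure 4 is not reproduced here, so the exact 1D-to-2D conversion cannot be recovered from the text. One plausible scheme, shown purely as an assumption, is to one-hot encode each factor's class index into its own row, giving a 13 × 13 matrix per grid cell. The names `CLASS_COUNTS` and `to_matrix` are hypothetical; the class counts are those listed for the thirteen factors in this paper.

```python
# Number of classes per factor, in the order the factors are listed in the text:
# elevation, slope, aspect, plan curvature, profile curvature, TWI,
# distance to roads, distance to rivers, lithology, population density,
# annual rainfall, topography, vegetation coverage.
CLASS_COUNTS = [4, 4, 9, 3, 5, 5, 6, 6, 4, 4, 4, 4, 4]


def to_matrix(class_indices, width=13):
    """One-hot encode one class index per factor into a len x width matrix.

    This encoding is an assumption, not the documented method of the paper.
    """
    matrix = []
    for index, n_classes in zip(class_indices, CLASS_COUNTS):
        if not 0 <= index < n_classes:
            raise ValueError("class index out of range for this factor")
        row = [0] * width
        row[index] = 1  # mark the class this grid cell falls into
        matrix.append(row)
    return matrix


# Hypothetical grid cell: one class index for each of the thirteen factors.
example = [0, 1, 4, 0, 2, 0, 0, 3, 2, 1, 1, 1, 1]
matrix = to_matrix(example)
```

A 13 × 13 input is at least consistent with the feature-map sizes quoted later for the V-CNN and S-CNN models.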

Influencing Factors
Selecting high-quality influencing factors is a crucial step in producing debris flow susceptibility maps because irrelevant or noisy factors can damage the quality of the models and thus reduce the quality of DFSM. Unfortunately, there is presently no accepted benchmark for the selection of debris flow influencing factors. Therefore, based on previous research and expert experience [40], a total of thirteen influencing factors were selected, including elevation, slope, aspect, plan curvature, profile curvature, topographic wetness index (TWI), distance to roads, distance to rivers, lithology, population density, annual rainfall, topography, and vegetation coverage. The digital elevation model (DEM) was downloaded from Google Maps with a spatial resolution of 100 m. To improve the quality of DFSM, grid cells, which are recognized by the majority of researchers, were used in this paper. The database for influencing factors is shown in Table 2.
Elevation is a critical and widely used factor for DFSM [41][42][43]. Debris flows are also closely correlated with rainfall, vegetation cover, geologic bedrock, soil composition, and river discharge, which vary at different altitudes. Changbai Mountain, which is located in the southeast of the research area, has an altitude of 800-1500 m. The main peak of Changbai Mountain and its surrounding peaks are over 2000 m, and the central plain is 600-800 m. Therefore, elevation was classified into four classes: <600 m, 600-800 m, 800-1500 m, and >1500 m (Figure 5a).
Slope is also a factor frequently used in DFSM [44]. Slope affects the shear strength, and runoff of the slope is closely related to the occurrence of debris flows [45]. In the northwest plain of Jilin Province, the slope is mainly 0-5°, while in the southeast mountainous area, the slope is usually 10°, with a few areas exceeding 20°. In this paper, slope was classified into four classes: 0-5°, 5-10°, 10-20°, and >20° (Figure 5b).
Aspect refers to the downslope direction. Aspect influences the rainfall, duration of sunshine, wind exposure, and weathering (such as frost action) that a slope experiences [46,47]. These factors all play key roles in the occurrence of debris flows. Aspect was classified into nine classes: flat, north, northeast, east, southeast, south, southwest, west, and northwest (Figure 5c).
Plan curvature delineates the morphology of the topography and represents the ability of water divergence or convergence to affect the occurrence of debris flows [48,49]. Plan curvature was divided into three classes: concave, flat, and convex (Figure 5d).
Profile curvature reflects the rate at which the aspect changes and affects the size of the catchment area, which is closely related to debris flow scales [50]. Profile curvature was classified into five classes by the natural breaks method: <−0.31, −0.31 to −0.10, −0.10 to 0.06, 0.06 to 0.28, and 0.28 to 2.10 (Figure 5e).
TWI is also a significant factor for DFSM [51]. The higher the TWI value, the easier it is for soil moisture to reach saturation, and the probability of debris flow occurring is increased. Most debris flows in Jilin Province occurred in areas with TWI less than 5. For the sake of a comprehensive analysis, TWI was classified into five classes using an interval of 2: <5, 5-7, 7-9, 9-11, and >11 (Figure 5f).
Distance to roads refers to the distance from the debris flow locations to a road. In general, the closer a debris flow location is to a road, the more lives and property of pedestrians are threatened. Additionally, debris flows are more likely to occur on slopes that are close to the roads because of human interference and so this is also a very important factor. Considering the reasonableness of the buffer distance in the study area, distance to roads was classified into six classes at 1000 m intervals: <1000 m, 1000-2000 m, 2000-3000 m, 3000-4000 m, 4000-5000 m, and >5000 m (Figure 5g).
Distance to rivers can affect the hydrologic index of the slope and thus the slope stability. Slopes that are closer to rivers are more prone to debris flows because the distance describes the erosion power of streams [52]. Considering the influence of rivers on slope stability, distance to rivers was classified into six classes with intervals of 500 m: <500 m, 500-1000 m, 1000-1500 m, 1500-2000 m, 2000-2500 m, and >2500 m (Figure 5h).
Bedrock is a key factor with respect to the degree of weathering, mineral composition, rock type, and the orientation of discontinuities such as bedding, fractures, and foliations with respect to slope direction (aspect). Rocks with phyllosilicate minerals, such as claystones, shales, mudstones, tuffs, pyroclastic rocks, slate, and phyllite, are particularly prone to mass wasting. Rocks with discontinuities inclined in the same direction as the topographic slope are highly prone to mass wasting. The research area encompasses 370 lithostratigraphic units. Segoni et al. [53] have proposed several classification methods for complex geological settings. In the paper, we attached great importance to the anti-weathering ability of rocks. Debris flows are more likely to occur where the lithology is easily weathered because the weathering crust is easily eroded under the action of external forces. For example, a slope composed of loose rock and soil will lead to debris flows with unpredictable consequences [54,55]. According to the anti-weathering ability of rocks, the lithology was classified into four classes: group 1 (granite, basalt, carbonate rock, gneiss, etc.), group 2 (glutenite, slate, phyllite, schist, etc.), group 3 (claystones, pyroclastic rock, etc.), and group 4 (soil) (Figure 5i).
Population density is a key factor in debris flow occurrence. Human engineering activities can change the shape of slopes, vegetation cover, etc., which could provide sufficient conditions for debris flows [56]. According to the number of people per square kilometer, the population density was classified into four classes: very low (0-10), low (10-100), moderate (100-500), and high (>500) (Figure 5j).
It is well known that most debris flows occur during the rainy season [57]. Because continuous heavy rainfall increases the runoff and seepage flow, the possibility of debris flow occurring is greatly increased. According to the annual rainfall distribution in the study area, annual rainfall was classified into four classes: 0-600 mm, 600-800 mm, 800-1000 mm, and >1000 mm (Figure 5k).
Topography affects the size, formation, and movement of debris flows [58]. Based on existing studies and the topography of Jilin Province, topography was classified into four classes: plain, valley, hills, and mountains (Figure 5l).
The level of vegetation coverage represents the growth status of vegetation in an area. Generally, the better the vegetation growth, the less prone an area is to debris flow. In Jilin Province, vegetation coverage of northwestern plain is less than 20%, while that of Changbai Mountain is more than 80%. The vegetation coverage in the central region is mainly 20% to 50%. Therefore, vegetation coverage was classified into four classes: low (<20%), moderate (20-50%), high (50-80%), and very high (>80%) (Figure 5m).

Evaluation of Influencing Factors
In the paper, thirteen influencing factors were selected with reference to previous research and expert experience. Multicollinearity analysis was used to detect collinearity between the factors, and the frequency ratio (FR) method was utilized to ascertain the spatial relationship between each influencing factor and debris flows [59,60]. In addition, we used the gain ratio (GR) method to quantify the predictive ability of the factors and exclude irrelevant factors.

Multicollinearity Analysis
Collinearity means that the influencing factors are not independent of each other, a situation that can damage a model's quality [61]. Therefore, to improve the quality of DFSM, strongly collinear factors must be discarded. Tolerance and the variance inflation factor (VIF) were used in the paper to quantify the collinearity between the factors. A tolerance value less than 0.1 or a VIF value greater than 10 indicates strong collinearity; when the tolerance value is greater than 0.1 and the VIF value is less than 10, we assume that the collinearity of the factor satisfies the requirements. The formulas for calculating tolerance and VIF are as follows [62]:
$$\mathrm{Tolerance}_j = 1 - R_j^2, \qquad \mathrm{VIF}_j = \frac{1}{\mathrm{Tolerance}_j} = \frac{1}{1 - R_j^2}$$
where $R_j^2$ is the coefficient of determination for the regression of the explanatory variable $j$ on all the other explanatory variables.
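As a small illustration of the screening rule, the sketch below derives tolerance and VIF from a given coefficient of determination and applies the 0.1 and 10 thresholds. The function names and the sample R² values are hypothetical; the regression that produces R² for each factor is not shown.

```python
def tolerance(r_squared):
    """Tolerance of a factor: 1 minus the R^2 of regressing it on the others."""
    return 1.0 - r_squared


def vif(r_squared):
    """Variance inflation factor: the reciprocal of the tolerance."""
    tol = tolerance(r_squared)
    return float("inf") if tol == 0 else 1.0 / tol


def is_strongly_collinear(r_squared, tol_min=0.1, vif_max=10.0):
    """Flag a factor whose tolerance is below 0.1 or whose VIF exceeds 10."""
    return tolerance(r_squared) < tol_min or vif(r_squared) > vif_max


# Hypothetical R^2 values for two factors.
weak = 0.524    # tolerance about 0.476, VIF about 2.10
strong = 0.95   # tolerance about 0.05, VIF about 20

print(is_strongly_collinear(weak), is_strongly_collinear(strong))
```

Because VIF is the reciprocal of tolerance, the two thresholds express the same cut-off, so checking either one is sufficient in practice.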

Frequency Ratio Method
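The FR method is a geospatial assessment tool that can quantify the relationship between the influencing factors and debris flows [63]. This method uses the number of pixels occupied by each class of an influencing factor and the number of those pixels affected by debris flows. The slide ratio and the class ratio are used to calculate the FR value. In the relationship analysis, 1 is deemed to be the average value. When the FR value is greater than 1, it represents a stronger connection between the factor class and debris flows. The formula for calculating the FR value is as follows:
$$\mathrm{FR}_i = \frac{\text{slide ratio}_i}{\text{class ratio}_i} = \frac{N_i^{d} / N^{d}}{N_i / N}$$
where $N_i^{d}$ is the number of debris flow pixels in class $i$, $N^{d}$ is the total number of debris flow pixels, $N_i$ is the number of pixels in class $i$, and $N$ is the total number of pixels in the study area.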
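The ratio of the slide ratio to the class ratio can be computed directly from pixel counts. The following is a minimal sketch; the function name and all counts are hypothetical and are not taken from Table 5.

```python
def frequency_ratio(flows_in_class, flows_total, cells_in_class, cells_total):
    """FR of one factor class: slide ratio divided by class ratio."""
    slide_ratio = flows_in_class / flows_total   # share of debris flows in the class
    class_ratio = cells_in_class / cells_total   # share of the study area in the class
    return slide_ratio / class_ratio


# Hypothetical factor with three classes: (debris flow pixels, class pixels).
classes = {"A": (300, 2_000_000), "B": (500, 3_000_000), "C": (68, 5_000_000)}
flows_total = sum(f for f, _ in classes.values())
cells_total = sum(c for _, c in classes.values())

fr = {
    name: frequency_ratio(f, flows_total, c, cells_total)
    for name, (f, c) in classes.items()
}
for name, value in fr.items():
    print(name, round(value, 3))
```

Classes A and B hold more debris flows than their share of the area would suggest (FR above 1), while class C holds far fewer (FR below 1).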

Gain Ratio Method
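In the study area, the predictive ability of influencing factors is not the same. Some influencing factors may cause errors in the models and reduce the quality of debris-flow susceptibility maps. Therefore, it is necessary to evaluate the importance of influencing factors. In this study, the GR method was introduced to measure the predictive ability of influencing factors [64]. The average merit (AM) derived from the GR method reveals the importance of each influencing factor for debris flow occurrence. The formulas for calculating the AM value of influencing factor $A$ are as follows:
$$\mathrm{Info}(T) = -\sum_{i=1}^{2} p_i \log_2 p_i$$
where $T$ is the training set and $p_i$ is the probability that a sample belongs to class $C_i$ (debris flow, non-debris flow).
The amount of information needed to split $T$ into $m$ parts $T_1, \dots, T_m$ according to factor $A$ and the formulas for $\mathrm{Info}(T, A)$, the split information, and the gain ratio are as follows:
$$\mathrm{Info}(T, A) = \sum_{j=1}^{m} \frac{|T_j|}{|T|}\,\mathrm{Info}(T_j)$$
$$\mathrm{SplitInfo}(T, A) = -\sum_{j=1}^{m} \frac{|T_j|}{|T|} \log_2 \frac{|T_j|}{|T|}$$
$$\mathrm{GainRatio}(T, A) = \frac{\mathrm{Info}(T) - \mathrm{Info}(T, A)}{\mathrm{SplitInfo}(T, A)}$$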
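The gain ratio of a single factor can be sketched as below, using the standard entropy-based definition. The function names and the toy data are hypothetical. How the average merit is averaged (for example, over cross-validation folds) is not stated in the text, so the sketch stops at the gain ratio itself.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gain_ratio(factor_classes, labels):
    """Information gain of a factor divided by its split information."""
    n = len(labels)
    groups = {}
    for cls, label in zip(factor_classes, labels):
        groups.setdefault(cls, []).append(label)
    # Expected entropy of the labels after splitting on the factor.
    info_after = sum(len(g) / n * entropy(g) for g in groups.values())
    # Entropy of the split itself; penalizes factors with many small classes.
    split_info = -sum(len(g) / n * log2(len(g) / n) for g in groups.values())
    if split_info == 0:
        return 0.0  # the factor has a single class and cannot split the data
    return (entropy(labels) - info_after) / split_info


labels = [1, 1, 0, 0]  # 1 = debris flow, 0 = non-debris flow
informative = gain_ratio(["a", "a", "b", "b"], labels)
uninformative = gain_ratio(["a", "b", "a", "b"], labels)
print(informative, uninformative)
```

A factor whose classes separate the two labels perfectly reaches a gain ratio of 1, while a factor whose classes carry no information about the labels scores 0 and would be excluded.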

V-CNN
In the paper, the LeNet-5 structure, one of the earliest proposed CNN structures, was used [65]. The LeNet-5 structure (Figure 6) contains an input layer, two convolution layers, two subsampling layers, two full connection layers, and an output layer. The V-CNN based on the LeNet-5 structure was used in this research. Additionally, the backpropagation (BP) algorithm was utilized to modify the model parameters and increase model robustness. The V-CNN structure is shown in Figure 7.
As we can see from the V-CNN structure, the input matrices become smaller after passing through the first convolution layer. VALID padding is used in this convolution operation, as shown in Figure 8. In this process, sixteen convolution kernels with a size of 3 × 3 move across the matrices and perform the convolution calculation. Sixteen feature maps are obtained after the convolution operation and are fed into the first subsampling layer. VALID padding is also used in the subsampling layer. The specific process used in the subsampling layer is shown in Figure 9. The max pooling operation takes the maximum value in a 2 × 2 region as the output. Because the subsampling kernels move with a stride of two, some data are directly discarded. After the first subsampling operation, sixteen feature maps with a size of 5 × 5 are produced.
Thirty-two convolution kernels with a size of 3 × 3 are then utilized to perform the second convolution operation, after which the matrices shrink to a size of 3 × 3. After the second subsampling operation, the size of the matrices reduces to 1 × 1. Finally, the feature vectors are fed into the two full connection layers, and the probability of debris flow occurring is determined by using the softmax classifier. To facilitate the adjustment of parameters and the establishment of the V-CNN model, a Python-based GUI is used.
The V-CNN model is widely used for object recognition. During the convolution operation, the image edge information has limited use for object recognition. In other words, the image edge information is rarely used but the central information is used many times, which results in an unbalanced use of information. During the subsampling operation, the V-CNN model discards the edge information, which has a limited impact on the main recognition process for the image. In this way, irrelevant data can be quickly discarded to improve the recognition efficiency. It is important to note that if the size of the feature maps is even, the subsampling operation will not discard the edge data, but the convolution operation will still weaken the edge information of the image. There is no doubt that the V-CNN model has the advantages of reduced computation time and rapid object recognition. However, the purpose of this paper is not to identify objects but to search for the deep relationships between debris flow data and the occurrence of debris flows. Therefore, VALID padding is not adequate for processing the data because debris flow data are too important to be discarded.
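The shrinking feature-map sizes under VALID padding follow from simple arithmetic. The sketch below traces them through the two convolution and two subsampling layers. The 13 × 13 input size is an assumption; it is chosen because it reproduces the 5 × 5, 3 × 3, and 1 × 1 sizes quoted in the text. The function names are hypothetical.

```python
def valid_output_size(size, kernel, stride):
    """Output length along one axis under VALID padding (nothing is padded)."""
    return (size - kernel) // stride + 1


def trace_valid(size, layers):
    """Return the feature-map size after each (kernel, stride) layer."""
    sizes = [size]
    for kernel, stride in layers:
        size = valid_output_size(size, kernel, stride)
        sizes.append(size)
    return sizes


# conv 3x3 stride 1, max pool 2x2 stride 2, conv 3x3 stride 1, max pool 2x2 stride 2
LAYERS = [(3, 1), (2, 2), (3, 1), (2, 2)]

sizes = trace_valid(13, LAYERS)  # assumed 13 x 13 input
print(sizes)  # [13, 11, 5, 3, 1]

# Rows left uncovered by the first pooling layer: 11 rows, 5 windows of 2 rows.
discarded_rows = sizes[1] - sizes[2] * 2
print(discarded_rows)  # 1
```

The odd 11 × 11 map entering the first pooling layer is where the last row and column are dropped, which is the data loss the paragraph above describes.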

S-CNN
In contrast to VALID padding, the SAME padding approach is used here for the first time in DFSM. The S-CNN model based on SAME padding is proposed for use in this study, as shown in Figure 10.
As we can see from the S-CNN structure, after the input matrices pass through the first convolution layer, the matrix size does not change. The SAME padding approach is used in this convolution operation, as shown in Figure 11. In the convolution operation, zeros are filled around the input matrices. For example, suppose the input matrices have a size of p × p and the convolution kernel has a size of n × n. By using SAME padding, matrices with a size of (p + n − 1) × (p + n − 1) are processed by the convolution operation. Feature maps with a size of p × p are then created, which means that the size of the matrices before and after the convolution operation is the same. Next, the max pooling layer is used to perform the subsampling operation, and zeros are filled around the feature maps as shown in Figure 12. No data are discarded when SAME padding is used. After the second subsampling layer, the size of the matrices becomes 4 × 4, which means that more feature vectors are obtained. Finally, by using two full connection layers and the softmax classifier, the probability distribution over the two classes is obtained. As with the V-CNN model, the GUI is used to adjust parameters and train the S-CNN model.
The S-CNN model is more suitable for the study of debris flows because this method uses the data in a balanced way and does not discard the edge information. During the convolution operation, to use the edge information of the matrices more efficiently and retain all data, the S-CNN model adds a certain range of blank areas around the matrices, but these areas have no impact on our calculation results. Therefore, the matrix size remains the same before and after each convolution operation, which means that all debris flow data are used equitably and kept intact. In the subsampling operation, a certain range of blank areas is also added at the edge of the matrices so that the edge information is used. These blank areas are still not involved in the calculation, thus increasing the use efficiency of debris flow data. Whether the matrix size is odd or even, the data are not discarded during the subsampling operation. Debris flow data are very valuable, so the benefit of using SAME padding is to fully exploit the value of each piece of data. Therefore, the S-CNN model has great advantages compared with the V-CNN model in mining the relationship between debris flow data and debris flow occurrence.
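The size arithmetic for SAME padding can be traced in the same way. The sketch below uses the output-size and padding rules documented for TensorFlow's SAME mode. The 13 × 13 input is again an assumption, chosen because it yields the 4 × 4 size quoted after the second subsampling layer. The function names are hypothetical.

```python
from math import ceil


def same_output_size(size, stride):
    """Output length along one axis under SAME padding."""
    return ceil(size / stride)


def same_total_padding(size, kernel, stride):
    """Total number of padded rows (or columns) needed along one axis."""
    out = same_output_size(size, stride)
    return max((out - 1) * stride + kernel - size, 0)


def trace_same(size, layers):
    """Return (sizes, paddings) after each (kernel, stride) layer."""
    sizes, paddings = [size], []
    for kernel, stride in layers:
        paddings.append(same_total_padding(size, kernel, stride))
        size = same_output_size(size, stride)
        sizes.append(size)
    return sizes, paddings


# conv 3x3 stride 1, max pool 2x2 stride 2, conv 3x3 stride 1, max pool 2x2 stride 2
LAYERS = [(3, 1), (2, 2), (3, 1), (2, 2)]

sizes, paddings = trace_same(13, LAYERS)  # assumed 13 x 13 input
print(sizes)     # [13, 13, 7, 7, 4]
print(paddings)  # [2, 1, 2, 1]
```

For the first convolution the padded input is 13 + 2 = 15, which matches the (p + n − 1) expression with p = 13 and n = 3. Each pooling layer adds one padded row and column so that the odd-sized maps are fully covered.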

Model Evaluation
Without model evaluation, this research would be meaningless [66]. The ROC curve is commonly used to evaluate models in classification problems. The ROC curve distinguishes the two categories by a diagnostic test based on the distribution of correct and incorrect predictions according to a 2 × 2 contingency table (Table 3) [67]. The ROC curve takes 1 − Specificity (FP rate) and Sensitivity (TP rate) as the abscissa and ordinate, respectively. In this paper, the TP rate represents the proportion of disaster locations that are correctly classified as debris flows, and the FP rate represents the proportion of non-disaster locations that are wrongly classified as debris flows. Additionally, the area under the ROC curve (AUC) is an important index to measure the predictive capability of a model. In general, the closer the AUC value is to 1, the better the model. Some statistical measures were also utilized to assess the predictive ability of the models, namely ACC, F-measure, and MAE. The formulas for ACC and MAE are as follows:
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| p_i - t_i \right|$$
where TP and TN represent the numbers of samples that were correctly classified, FP and FN represent the numbers of samples that were incorrectly classified, $n$ is the number of samples, and $p_i$ and $t_i$ are the predicted and true values, respectively.
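The evaluation indicators can be computed from predictions as sketched below, with hypothetical function names and toy data. Two points are assumptions. First, the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by integrating a plotted curve. Second, the paper's exact F-measure formula is not recoverable from the text, so the standard F1 score is used; it is bounded by 1 and may differ from the definition behind the reported values.

```python
def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 = debris flow."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mae(y_true, y_pred):
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Standard F1 (an assumption; the paper's formula is not shown)."""
    tp, _, fp, fn = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]        # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), mae(y_true, y_pred))
print(f1_score(y_true, y_pred), auc(y_true, scores))
```

With hard 0/1 predictions, MAE equals the misclassification rate, so ACC and MAE sum to 1.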

Multicollinearity Analysis Results
The tolerance and VIF values calculated for the thirteen factors are shown in Table 4. In general, a VIF value greater than 10 or a tolerance value less than 0.1 indicates that the factor has strong collinearity and should be discarded. The results show that vegetation coverage had the minimum tolerance value of 0.476 and the maximum VIF value of 2.099. Therefore, all thirteen influencing factors meet the collinearity requirements.

The Results of FR Method
The FR values are used to represent the correlation between debris flows and influencing factors. The calculation results are shown in Table 5. In general, the results indicate that the corresponding area is more prone to debris flow occurring when the FR value is greater than 1 [69].
The results demonstrated that areas with an elevation lower than 600 m had the highest probability of debris flow occurring in Jilin Province, China. In terms of the slope, the trend of the FR values was convex. Slopes from 5° to 10° had the highest FR value (1.923). Similarly, the trend of FR values for the aspect was convex. In addition, the southeast-facing aspect had the highest FR value of 1.524. The concave class of plan curvature (1.208) and the 0.06 to 0.28 class of profile curvature (2.634) had higher FR values than the other classes. Meanwhile, TWI had the highest FR value of 2.199 between 9 and 11 and the lowest value of 0.677 between 5 and 7.
Additionally, the results showed that areas with a distance to roads of less than 1000 m were prone to debris flow, with the highest FR value (1.353) of all the classes. On the other hand, distances to rivers of between 2000 m and 2500 m (1.975) and greater than 2500 m (0.950) had the highest and lowest probability of debris flow occurring, respectively. From the lithology point of view, group 3 had the highest FR value (5.462). The results also revealed that population density values of 10-100/km² (1.239) and annual rainfall values of 600-800 mm (2.314) had a higher probability of debris flow occurring. For the topography factor, the valley areas obtained the highest FR value (2.370). Among the four classes of vegetation coverage, the moderate class (1.933) and the low class (0.172) had the highest and lowest FR values, respectively.

The Results of GR Method
The results are shown in Figure 13. The selection of influencing factors can be supported by the GR method. Factors with higher AM values are more significant to the models, whereas factors with AM values of 0 cannot contribute to debris flow susceptibility modeling and should be excluded. Lithology had the highest AM value of 0.085, followed by distance to roads (0.083), rainfall (0.077), topography (0.065), population density (0.062), elevation (0.059), profile curvature (0.057), TWI (0.053), plan curvature (0.047), aspect (0.046), vegetation coverage (0.036), distance to rivers (0.024), and slope (0.018). Because all AM values are greater than 0, all influencing factors have sufficient predictive ability to be used in establishing the models.

Production of the Debris Flow Susceptibility Maps
In the paper, grid cells with a size of 100 m × 100 m were utilized to make the debris flow susceptibility maps. We used the LeNet-5 structure to build the S-CNN and V-CNN models with the same training set. All code was implemented in Python under the TensorFlow framework. TensorFlow is open-source software from Google that is widely used for DL algorithms [70,71]. The main development environment was PyCharm 5.0.3, which was used for building the S-CNN and V-CNN models and producing the GUI. The adjustment of hyperparameters was mainly carried out by using the GUI, and the final parameters were determined through a trial-and-error method and experts' previous experience, as shown in Table 6. Next, we used the training set to train the S-CNN model, and the model was applied to the whole research area after it reached a certain accuracy on the validation set. The probability of debris flow in each grid cell was thus obtained. The debris flow susceptibility map was generated in an ArcGIS 10.5 environment by using the point to raster tool. For better visualization, the debris flow susceptibility map was divided into five classes (very low, low, moderate, high, and very high) by the natural breaks method, as shown in Figure 14a. The area percentages of the classes were 69.92% (very low), 4.11% (low), 3.92% (moderate), 5.32% (high), and 16.73% (very high). To facilitate the comparison of susceptibility maps, the same break values of the classes as the S-CNN model were used for all the maps.
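Applying one fixed set of break values to every model's probabilities, and then reporting the area share of each class, can be sketched as follows. The break values and probabilities are hypothetical; the actual natural breaks derived from the S-CNN map are not given in the text, and the natural breaks computation itself is not shown.

```python
from bisect import bisect_right
from collections import Counter

CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]


def classify(probability, breaks):
    """Map a probability to a susceptibility class using ascending breaks."""
    return CLASS_NAMES[bisect_right(breaks, probability)]


def area_percentages(probabilities, breaks):
    """Percentage of equally sized grid cells falling into each class."""
    counts = Counter(classify(p, breaks) for p in probabilities)
    total = len(probabilities)
    return {name: 100.0 * counts[name] / total for name in CLASS_NAMES}


breaks = [0.2, 0.4, 0.6, 0.8]  # hypothetical break values shared by all maps
cells = [0.05, 0.10, 0.15, 0.30, 0.50, 0.70, 0.90, 0.95, 0.99, 0.01]

shares = area_percentages(cells, breaks)
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```

Because every grid cell has the same 100 m × 100 m footprint, the share of cells in a class equals its share of the mapped area.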
The V-CNN model was applied to the study area in the same way, and the resulting debris flow susceptibility map is shown in Figure 14b. The area percentages of the classes were 73.36% (very low), 2.15% (low), 2.01% (moderate), 3.66% (high), and 18.82% (very high).
In the SVM model, the radial basis function (RBF) kernel was applied [72][73][74]. The penalty coefficient C and kernel width γ were selected by the trial-and-error method. In the optimization function, the penalty coefficient C mainly balances the complexity of the model against the misclassification rate and can be understood as a regularization coefficient, while the kernel width γ mainly defines the influence of a single sample on the entire classification hyperplane [73]. Finally, the optimized C and γ were determined to be 1 and 0.1, respectively. The model building process and the production of debris-flow susceptibility maps were mainly implemented in an IBM SPSS Modeler 18.0 and ArcGIS 10.5 environment. The debris flow susceptibility map was also reclassified into five classes (very low, low, moderate, high, and very high), as shown in Figure 14c. The results showed that the very low class had the largest percentage of 68.42%, followed by very high (14.46%), low (6.55%), high (5.65%), and moderate (4.92%).
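To illustrate how γ controls the reach of a single sample, the sketch below evaluates the standard RBF kernel, K(x, y) = exp(−γ‖x − y‖²), with γ = 0.1. The function name and sample vectors are hypothetical, and whether IBM SPSS Modeler parameterizes γ in exactly this form is an assumption.

```python
from math import exp


def rbf_kernel(x, y, gamma=0.1):
    """Standard RBF kernel: similarity decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


origin = [0.0, 0.0]
near = [1.0, 1.0]    # squared distance 2
far = [5.0, 5.0]     # squared distance 50

k_self = rbf_kernel(origin, origin)
k_near = rbf_kernel(origin, near)
k_far = rbf_kernel(origin, far)
print(k_self, k_near, k_far)
```

A larger γ makes the kernel value fall off faster with distance, so each training sample influences a smaller neighborhood of the decision surface.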
In this paper, a multilayer perceptron with two hidden layers was used to build the ANN model [75][76][77]. To ascertain whether the convolutional layers and subsampling layers can mine more information from the data, the two hidden layers used the same number of neurons as the fully connected layers of the CNN models. The logistic sigmoid function was used to determine the probability of a debris flow occurring. The ANN model was constructed from the training set in an IBM SPSS Modeler 18.0 environment, and the debris flow susceptibility map was produced in an ArcGIS 10.5 environment. As before, the map was reclassified into five classes (very low, low, moderate, high, and very high), as shown in Figure 14d. The very low class had the largest percentage (72.35%), followed by very high (12.64%), high (9.26%), moderate (3.12%), and low (2.63%).
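The forward pass of such a network can be sketched as follows. This is an illustration only: it assumes sigmoid activations in the hidden layers as well as at the output (the text states only that a logistic sigmoid yields the probability), and the layer sizes and weights are hypothetical toy values rather than those of the model built in SPSS Modeler.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_probability(features, layers):
    """Forward pass through two hidden layers and a single sigmoid output unit."""
    activation = features
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation[0]


# Hypothetical weights: 3 input factors -> 2 neurons -> 2 neurons -> 1 output.
toy_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.2]),
    ([[1.2, -0.7]], [0.05]),
]
toy_probability = mlp_probability([0.4, 0.9, 0.1], toy_layers)
```

Because the final unit is a sigmoid, the output always lies strictly between 0 and 1 and can be read as a debris flow probability for the grid cell.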

Model Validation
The ROC curves and AUC values of the four models on the training set and validation set are shown in Figures 15 and 16. On the training set, the S-CNN model had the highest AUC value (0.946), followed by the SVM model (0.935), the V-CNN model (0.910), and the ANN model (0.903). On the validation set, the S-CNN model again had the highest AUC value (0.901), followed by the SVM model (0.858), the V-CNN model (0.852), and the ANN model (0.815), which meant that the S-CNN model had the best predictive ability of the four models.
Three other statistical measures, ACC, F-measure, and MAE, were also used to compare the models' accuracy on the validation set. The results are shown in Figure 17. The S-CNN model had the highest ACC value (0.811), the highest F-measure value (1.204), and the lowest MAE value (0.189), which meant that the S-CNN model was better than the other three models. The statistical measures and the ROC curves gave consistent results, which indicated that the S-CNN model can effectively improve the spatial prediction accuracy of DFSM.
Figure 18 shows the percentage of the five classes of the different models in DFSM. The results of the four models were acceptable according to the class distribution. In the very low class, the V-CNN model had the highest proportion (73.36%) and the SVM model had the lowest (68.42%), which indicated that most of the research area was in the very low class. In the low class, the highest value was 6.55% for the SVM model and the lowest was 2.15% for the V-CNN model. In the moderate class, the proportions differed noticeably among the four models: SVM (4.92%), S-CNN (3.92%), ANN (3.12%), and V-CNN (2.01%). In the high class, the ANN model had the highest value (9.26%) and the V-CNN model had the lowest (3.66%).
For the very high class, the V-CNN model had the highest value (18.82%), followed by the S-CNN model (16.73%), the SVM model (14.46%), and the ANN model (12.64%). To compare the susceptibility maps more accurately, the susceptibility values were compared on a pixel-by-pixel basis to reveal spatial patterns in the differences among the maps [78]. The results are shown in Figure 19. In each pair, the model with the higher AUC value was selected as the benchmark, which gave six comparison maps: "S-CNN and SVM", "S-CNN and V-CNN", "S-CNN and ANN", "SVM and V-CNN", "SVM and ANN", and "V-CNN and ANN". The maps showed relevant differences that were not evenly distributed, and some spatial features can be observed in almost every comparison map. The differences were mainly in the southeast of Jilin Province, where 94% of the debris flows occurred. It can be inferred that the greater the difference in AUC values between two models, the more obvious the differences in the corresponding comparison map.
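The validation measures used in this section can be sketched in a few lines. The example below is a minimal illustration with hypothetical labels and probabilities. It assumes a 0.5 threshold for ACC and computes MAE on the thresholded labels; under that assumption MAE equals 1 - ACC, which is consistent with the reported S-CNN values (0.811 and 0.189). The F-measure is left out because its definition is not restated here.

```python
def auc_score(labels, scores):
    """AUC as the chance that a positive cell outranks a negative one (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Share of cells whose thresholded prediction matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def mean_absolute_error(labels, scores, threshold=0.5):
    """MAE between the labels and the thresholded predictions (assumed hard labels)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(abs(p - y) for p, y in zip(predicted, labels)) / len(labels)


# Hypothetical validation labels (1 = debris flow) and model probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]

auc = auc_score(y_true, y_prob)
acc = accuracy(y_true, y_prob)
mae = mean_absolute_error(y_true, y_prob)
```

The pairwise form of the AUC is quadratic in the number of samples, which is fine for a sketch; for large validation sets a rank-based computation would be preferable.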

Discussion
DFSM is one of the most effective approaches for debris flow evaluation and management tasks. Although many methods have been applied to DFSM, there is no benchmark for model selection. In this paper, four models (S-CNN, V-CNN, SVM, ANN) were used to study the spatial distribution of debris-flow susceptibility in Jilin Province, China.
Selecting the influencing factors is essential to ensuring the quality of the models. A multicollinearity test was therefore carried out, and the results showed that all influencing factors were independent and effective (Table 4). The results of the GR method (Figure 13) showed that all influencing factors had sufficient predictive ability for the occurrence of debris flows. In addition, the FR method was introduced to measure the relationships between influencing factors and debris flows (Table 5). The results showed that debris flows were most strongly associated with low-elevation terrain. The reason may be that low-elevation areas have larger slope values in the research area. Slopes from 5° to 10° had the highest FR value (1.923), possibly because this class has less vegetation coverage and sufficient rain. The southeast aspect had the strongest effect on debris flows, perhaps because it is the wettest of the aspects. Regarding plan curvature, concave surfaces had the strongest effect on debris flow occurrence. This result may be closely related to human engineering activities. The relationship between profile curvature and debris flows was roughly linear. Profile curvature between 0.06 and 0.28, the class with the largest catchment area, had the highest impact on debris flow occurrence. TWI represents the ability of soil moisture to reach saturation. In this research, TWI values between 9 and 11 had the highest impact on debris flows. Distances of less than 1000 m from a road had the highest impact on debris flow occurrence, since human activity is stronger closer to roads. Distances of 2000 to 2500 m from a river had the highest FR value (1.975). This may be related to the density of the river networks. Although most debris flows are distributed in the group 1 and group 2 areas, group 3 had the strongest relationship with debris flow occurrence.
This may be closely related to the terrain conditions and the weathering resistance of rocks in the southeast of Jilin Province. Population density between 10 and 100/km² had the highest FR value. It can be inferred from the topography factor that this class (10-100/km²) is located in mountainous areas, so it is closely related to the occurrence of debris flows. In terms of annual rainfall, the 600-800 mm class was most closely associated with debris flows. This may be related to the combined effect of rainfall and soil. In addition, valley areas were the most closely related to debris flows among the four topography classes, mainly because of the dense distribution of rivers and the steep slopes in these areas. The vegetation cover results show that debris flows mostly occurred in areas with more vegetation, because these areas are mostly mountainous and receive plenty of rain.
To determine the contribution of each influencing factor to the distribution of debris-flow susceptibility, we excluded one factor at a time and rebuilt the four models. The results are shown in Figure 20. All influencing factors contributed to the distribution of debris-flow susceptibility. Lithology had the highest contribution value of 19.8% in the S-CNN model, while elevation and distance to rivers had the lowest contribution value of 4.2% in the SVM model. The same factor contributed differently in different models. The difference between the SVM and V-CNN models was small, but the difference between the S-CNN and ANN models was large.
The ROC curve results showed that the accuracy of the S-CNN model was higher than that of the other three models, which indicated that the CNN structure with SAME padding had the highest accuracy in debris flow susceptibility evaluation. The detailed comparisons can be summarized as follows.
First, for the V-CNN and ANN models, the AUC value of the V-CNN model on the validation set was 0.852, which was higher than that of the ANN model (0.815) when both were given the same input data. This indicates that the convolution layers and subsampling layers uncovered more effective information from the data, which made the model more accurate. This comparison showed that the CNN model was indeed applicable and performed better than the ANN model for debris flow susceptibility assessment. Second, comparing the SVM and V-CNN models, we found that the AUC value of the SVM model (0.858) was 0.006 higher than that of the V-CNN model (0.852) on the validation set. The reason was that the convolution layers and subsampling layers used VALID padding to process the data, resulting in the loss of debris flow information (see Figures 8 and 9), which made the V-CNN model less accurate than the SVM model. Therefore, we used a different method, SAME padding, to process the data. On this basis, the S-CNN model can make full use of all the information during the convolution and subsampling operations, unlike the V-CNN model (see Figures 11 and 12). With VALID padding, the matrices became smaller after the convolution operation. The edge information of the matrices was used only once, whereas the information in the middle of the images was used many times, which led to unbalanced use of the information. In contrast, SAME padding added a border of blank cells around the matrices during the convolution operation, so that the data were used more evenly. During the max pooling operation, VALID padding discarded the edge information of the matrices, which resulted in the loss of debris flow data. SAME padding again added a border of blank cells around the matrices, so that all the information could be used.
Therefore, the S-CNN model can obtain a more accurate debris flow susceptibility map than the V-CNN model. Finally, the AUC value of the S-CNN model (0.901) was 0.043 higher than that of the SVM model (0.858) on the validation set. This meant that SAME padding was more suitable than VALID padding for handling the debris flow data, because SAME padding in the CNN structure can make full use of the debris flow data and extract more valuable information than the V-CNN model.
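The effect of the two padding modes on the use of edge cells can be illustrated along one dimension. The sketch below follows TensorFlow's documented output-size conventions for "SAME" and "VALID" padding and counts how many windows touch each input position. The input lengths, window sizes, and strides are hypothetical and are not the layer sizes in Table 6.

```python
import math


def output_size(n, kernel, stride, padding):
    """Output length along one dimension, following TensorFlow's conventions."""
    if padding == "SAME":
        return math.ceil(n / stride)
    if padding == "VALID":
        return math.ceil((n - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


def coverage(n, kernel, stride, padding):
    """Number of windows that touch each of the n input positions."""
    out = output_size(n, kernel, stride, padding)
    if padding == "SAME":
        total_pad = max((out - 1) * stride + kernel - n, 0)
        pad_before = total_pad // 2  # any odd extra cell goes at the end
    else:
        pad_before = 0
    counts = [0] * n
    for window in range(out):
        start = window * stride - pad_before
        for position in range(start, start + kernel):
            if 0 <= position < n:
                counts[position] += 1
    return counts


# Hypothetical sizes: 3-wide convolution (stride 1) and 2-wide pooling (stride 2).
conv_valid = coverage(6, kernel=3, stride=1, padding="VALID")  # [1, 2, 3, 3, 2, 1]
conv_same = coverage(6, kernel=3, stride=1, padding="SAME")    # [2, 3, 3, 3, 3, 2]
pool_valid = coverage(5, kernel=2, stride=2, padding="VALID")  # [1, 1, 1, 1, 0]
pool_same = coverage(5, kernel=2, stride=2, padding="SAME")    # [1, 1, 1, 1, 1]
```

In this toy case, VALID convolution shrinks the output from 6 to 4 and touches the edge cells once, while SAME convolution keeps the length at 6 and touches them twice. VALID pooling never reads the last cell, whereas SAME pooling reads every cell.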
The differences between CNN and traditional ML algorithms (SVM and ANN) can be summarized as follows. First, the format of the input data is different: the input to the CNN is a 4D tensor, whereas the input to the traditional ML algorithms is in text format. Second, the structures of the algorithms are different. The convolution and subsampling layers are unique to CNN: the convolution layers extract different features from the input data, and the subsampling layers compress the extracted features. Third, the means of building the models are different: the CNN models were built with the TensorFlow framework, whereas the traditional ML models were built with SPSS Modeler software.
The strength of this paper is the successful application of 2D-CNN to debris flow susceptibility evaluation and the first use of the S-CNN model to study debris flow susceptibility in Jilin Province. Compared with the traditional ML methods, the S-CNN model used the convolution and subsampling operations to extract more patterns from the data. In contrast to the V-CNN model, the S-CNN model can retain the data and use it completely. The ROC curves and the three statistical measures show that the S-CNN model had the best performance. The shortcoming of this paper is the uncertainty of the CNN model parameters and structure. At present, there is no accepted standard for selecting CNN parameters and structures. Both the parameters and the structure are locally optimal solutions obtained after repeated attempts. This paper used the LeNet-5 structure, and the sizes of the convolution and subsampling layers were derived from experience, so the precision of the CNN model was not necessarily optimal. This also points to a potential new direction for our research, namely, the optimization of CNN parameters.

Conclusions
In debris-flow-prone areas, it is very important to establish a robust, stable, and accurate model for the spatial prediction of debris flow susceptibility. Although many well-performing models have been used in regional research, there is still room to improve model accuracy, which motivated the research in this paper. The following conclusions can be drawn.
1. The AUC value of the V-CNN model (0.852) was higher than that of the ANN model (0.815). The three statistical measures also showed that the V-CNN model had smaller errors. This indicated that the convolution layers and max pooling layers can extract more data patterns than fully connected layers alone. Therefore, CNN models are more suitable for studying debris flow susceptibility than traditional ML methods such as ANN.
2. The CNN models based on VALID padding still have some shortcomings. According to the ROC curves and the statistical measures, the accuracy of the V-CNN model was lower than that of the SVM model. This indicated that processing the data with VALID padding was not the optimal choice. Therefore, SAME padding was selected to process debris flow data in this paper.
3. SAME padding is more suitable for processing debris flow data than VALID padding. The S-CNN model obtained the highest AUC value and the minimum error. The S-CNN model can make full use of the data and extract more valuable information than the other three models.
4. We compared the susceptibility maps on a pixel-by-pixel basis and produced six comparison maps. From the distribution of differences in the comparison maps, we conclude that the map with the highest AUC may not have the best predictive ability in some areas. This may be related to geotechnical and geomorphological differences and to systematic errors.
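The pixel-by-pixel comparison referred to in the last conclusion can be sketched as a cell-wise subtraction of two susceptibility grids, with the higher-AUC model as the benchmark. The grids below are hypothetical toy values, and the helper names are illustrative, not part of the original workflow.

```python
def difference_map(benchmark, other):
    """Cell-by-cell difference (benchmark minus other) of two equally sized grids."""
    return [
        [round(b - o, 6) for b, o in zip(bench_row, other_row)]
        for bench_row, other_row in zip(benchmark, other)
    ]


def largest_difference(diff):
    """Row, column, and value of the cell with the largest absolute difference."""
    cells = [(r, c, v) for r, row in enumerate(diff) for c, v in enumerate(row)]
    return max(cells, key=lambda cell: abs(cell[2]))


# Hypothetical 2 x 3 susceptibility grids from two models.
benchmark_grid = [[0.9, 0.2, 0.1], [0.8, 0.5, 0.3]]
other_grid = [[0.7, 0.2, 0.4], [0.8, 0.1, 0.3]]

diff_grid = difference_map(benchmark_grid, other_grid)
peak_cell = largest_difference(diff_grid)
```

Positive cells mark locations where the benchmark predicts a higher susceptibility than the other model, and negative cells mark the reverse, so the sign pattern shows where the two maps disagree spatially.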