Identification of the Debris Flow Process Types within Catchments of Beijing Mountainous Area

The distinguishable sediment concentration, density, and transport mechanisms characterize the different magnitudes of destruction due to debris flow process (DFP). Identifying the dominating DFP type within a catchment is of paramount importance in determining the efficient delineation and mitigation strategies. However, few studies have focused on the identification of the DFP types (including water-flood, debris-flood, and debris-flow) based on machine learning methods. Therefore, while taking Beijing as the study area, this paper aims to establish an integrated framework for the identification of the DFP types, which consists of an indicator calculation system, imbalance dataset learning (borderline-Synthetic Minority Oversampling Technique (borderline-SMOTE)), and classification model selection (Random Forest (RF), AdaBoost, Gradient Boosting (GBDT)). The classification accuracies of the models were compared and the significance of parameters was then assessed. The results indicate that Random Forest has the highest accuracy (0.752), together with the highest area under the receiver operating characteristic curve (AUROC = 0.73), and the lowest root-mean-square error (RMSE = 0.544). This study confirms that the catchment shape and the relief gradient features benefit the identification of the DFP types. Whereby, the roughness index (RI) and the Relief ratio (Rr) can be used to effectively describe the DFP types. The spatial distribution of the DFP types is analyzed in this paper to provide a reference for diverse practical measures, which are suitable for the particularity of highly destructive catchments.


Introduction
Debris flow is one of the most influential natural disasters in mountainous areas [1,2], and it periodically causes large losses of life and property as well as the destruction of ecosystems and infrastructure [3]. Debris flow, including water-flood, debris-flood, and debris-flow, is a constant threat to mankind and human achievements. The destruction caused by debris flows of different magnitudes is characterized by distinguishable sediment concentrations, densities, and transport mechanisms [4][5][6]. Researchers have paid great attention to susceptibility assessments of debris flow disasters [7][8][9][10]. However, these studies did not address the practical problem that different disasters require different, targeted strategies at the policy level.
Therefore, the identification of the dominating debris flow process (DFP) type within catchments is of paramount importance in determining the accurate and efficient tools that are necessary for delineation and mitigation strategies in the early planning period [11].
Researchers have shown that geomorphic parameters can be used to identify catchment types. Terrain analysis explores the catchment formation mechanisms of different disaster types by revealing the relationship between the river basin size and its contribution to the basin [12][13][14][15]. Melton's ruggedness number has been used to obtain a rapid first approximation of the potential debris flow disaster [16][17][18][19]. Additionally, it has been demonstrated that the Melton ratio, when combined with the catchment length, can be effectively used to differentiate between catchments that are prone to debris flow and debris flood [20]. Discriminant analysis using morphometric variables indicated that the basin area and fan gradient can be used to differentiate debris flow and fluvial fan types based on the process [21]. Other studies have indicated that the standard deviations of the slope gradient and slope aspect are strong predictors for the identification of debris flow [22]. Previous studies have established assessment indicator systems for debris flow with a variety of parameters. However, the contribution of these parameters to DFP identification has not been previously studied. Therefore, this study calculates the catchment parameters to determine the most significant parameters in the identification of DFP types.
Debris flows usually occur by chance; therefore, the recorded hazardous events show an imbalance in the number of different types. Traditional classification models usually improve the model performance by minimizing the classification errors in such a way that the majority class is correctly predicted, whereas samples from the minority class tend to be incorrectly predicted [23]. Examples of the minority class are usually of primary interest, and their correct recognition is more important than the recognition of examples from the other classes. Such a situation often occurs during hazard assessment, where the number of destructive events that require more attention is much smaller than the number of events that are not as devastating. So far, the strategies for dealing with imbalanced datasets can be divided into three categories: under-sampling (BalanceCascade, EasyEnsemble) [24], over-sampling (Synthetic Minority Oversampling Technique (SMOTE), k-nearest neighbor (KNN)) [23], and data cleaning (Tomek links, neighborhood cleaning rule (NCL)) [25]. These methods have shown a great deal of success in domains such as fraudulent telephone calls [26], telecommunications management [27], text classification, and the detection of oil spills from satellite images [28]. For smaller datasets, over-sampling usually performs better because the number of samples is limited. Among the over-sampling learning methods, borderline-SMOTE is an extension that generates synthetic samples while considering the data distribution [29].
Traditionally, the DFP types were identified based on geomorphologic expertise [30]. For quantitative studies, empirical models were used to establish the empirical relationship between the geometric parameters and the DFP types [31,32]. In the early stage, representative models for quantitative prediction, including logistic regression, Bayes discriminant, and neural networks, were widely used [33][34][35]. Recently, machine learning ensembles and hybrid methods have received substantial attention in many fields due to their improved performance when compared with conventional methods [36,37]. Scholars have applied Random Forest (RF) and Support Vector Machines to flood risk assessment [7,38]. Nevertheless, ensemble frameworks for the identification of the DFP types have rarely been explored.
Recently, the prosperity of suburban tourism has increased the attention to the safety and stability of the mountainous area.Beijing, as the political, economic, and cultural center of China, is located between the Yan Mountains and Taihang Mountains.With the climate changing, Beijing mountainous area has repeatedly experienced serious debris flow during the summer in recent years [39][40][41][42].In this condition, accurately making the targeted strategies for the debris flow disasters with various destruction powers becomes a challenge.Nowadays, a lot of work has focused on the debris flow hazard assessment on regional or catchment scales.However, few studies have set their sights on Water 2019, 11, 638 3 of 26 identifying the specific DFP type and deducing the dominating DFP type within catchments in Beijing, which is of vital importance for the prevention and mitigation of disasters.Therefore, using the documented debris flow disaster events and remote sensing images, we herein present a method that is based on morphometric criteria for the assessing of a first approximation of the DFP type within catchments in Beijing mountainous areas.We are supposed to determine the dominant DFP type by analyzing the morphometric parameters that are contingently connected to flowing.
The remainder of this paper establishes an integrated framework that consists of indicator system establishment, imbalanced dataset learning, and classification model selection. The indicators used in this study can be divided into parameters that are related to the catchment shape and parameters that are related to the relief gradient. The imbalanced sample dataset was resampled using borderline-SMOTE. The ensemble learning models RF, AdaBoost, and GBDT were used to identify the DFP types. Finally, we analyzed the spatial distribution of the DFP types to provide environmental management of the Beijing mountainous area with a reference for well-directed measures that are suitable for highly vulnerable regions.

Study Area
The mountainous regions that surround Beijing cover an estimated area of 10,417.5 km², accounting for 62% of the surface terrain, and extend over distances of 160 km from east to west and 176 km from south to north (Figure 1). Five river systems are distributed in the study area, that is, the Daqing, Yongding, Beiyun, Chaobai, and Jiyun Rivers, with more than 100 tributaries. Beijing is located in the semi-arid and semi-humid continental monsoon climate zone with four distinct seasons. The annual mean temperature ranges from 10 °C to 12 °C and the annual mean precipitation ranges from 238 mm to 514 mm. Approximately 75% of the precipitation occurs in the wet season, from June to September. Peak storms occasionally occur in the summer season and usually trigger debris flow. The mountainous area of Beijing has a complex geological structure with complex folds, fractures, and shale joints. The composition of the bedrock and the damage to rock masses induced by tectonics and weathering favor the production of loose eluvial deposits, which are the main sources of the solid material involved in debris flow [43]. Due to the special geographical and climatic conditions, the catchments in Beijing are sensitive to floods, landslides, debris flows, and other natural disasters. Most debris flows in the Beijing mountainous area are distributed in the Western Mountains, Jundu Mountains, and Yan Mountains, which are separated by the Guan Gully and Chao Rivers [44]. Among these debris flow events, the most well-known was associated with the thunderstorm that occurred on 21 July 2012 in the Beijing mountainous area. It caused up to 79 fatalities and losses of over RMB 100 billion.

ASTER-GDEM (Version 2)
The Ministry of Economy, Trade, and Industry (METI) of Japan and the United States National Aeronautics and Space Administration (NASA) jointly announced the release of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model Version 2 (GDEM V2) on 17 October 2011. ASTER-GDEM provides the only high-resolution elevation image dataset that covers the global land surface. The data covering the Beijing mountainous area were downloaded from the geospatial data cloud website (http://www.gscloud.cn/) of the Computer Network Information Center (CNIC). Based on the 30 m spatial resolution DEM data, the small catchments in the study area were extracted using the ArcGIS 10.2 hydrological module [45].

Debris Flow Inventory
The precision of the debris flow inventory greatly affects the reliability of the analysis results [46]. In this study, the debris flow inventory of the Beijing mountainous area was obtained from the list in the document named Debris Flow in Beijing Mountain Area [42]. The latitudes and longitudes were obtained based on Google Earth (http://www.earth.google.com). The spatial locations were then checked using optical images and aerial photographs. As mentioned in the document, Zhong et al. classified the historical events into three types based on the analysis of deposits and the simulation experiment. Finally, 705 debris flow events were classified into three types based on the criteria shown in Table 1. In this study, we determined the dominating DFP type of a catchment according to the debris flow events that occurred in the catchment. We sampled the catchments based on the following two criteria [11]. Firstly, only catchments with at least two historical events were selected. Secondly, only catchments in which 80% of all debris flow events belong to the same type were selected. As a result, we obtained a total of 90 catchment samples, including 13 water-flood (Figure 2a), 44 debris-flood (Figure 2b), and 33 debris-flow (Figure 2c) catchments.

Parameters
Several studies have indicated that local flood-producing processes may be more easily analyzed in typical small-scale catchments than in large-scale ones, in which the regional combination and interplay of controls is more important [48,49]. Therefore, the area of the catchments that were analyzed in this study mainly varies from 3 to 50 km². Parameters related to the catchment shape and relief gradient can be used to model different processes [17,18,20,22]. We selected the circularity ratio (Cr), elongation ratio (Er), drainage density (Dd), and form factor (Ff) to characterize the shape features of the catchments. The roughness index (RI), Melton ratio (Mr), elevation relief ratio (Err), and relief ratio (Rr) were used to characterize the topographic features of the catchments. Table 2 defines these parameters.
Table 2. Morphometric parameters related to the catchment shape and relief gradient.

| Definition | Function | Parameter | Reference |
|---|---|---|---|
| Parameters related to the catchment shape | | | |
| The circularity ratio (Cr) reflects the roundness of a catchment based on the analysis of the relationship between the area and circumference of the catchment [50,51]. | $Cr = \frac{4\pi A}{C^2}$ | A is the catchment area, km²; C is the catchment circumference, km | [50,51] |
| The elongation ratio (Er) is defined as the ratio of the diameter of a circle with the same area as the catchment to the maximum catchment length [52]. | $Er = \frac{D}{Bl}$ | D is the diameter of the circle which has the same area as the catchment, km; Bl is the maximum length of the catchment, km | [52] |
| The drainage density (Dd), the total stream length within the catchment per unit of area, is the simplest and most convenient tool for the characterization of the degree of drainage development [53,54]. | $Dd = \frac{\sum L}{A}$ | $\sum L$ is the total length of the streams, km; A is the catchment area, km²; both are given in units of the same system | [53,54] |
| The form factor (Ff) is defined as the ratio of the catchment area to the square of the catchment length [54]. | $Ff = \frac{A}{Bl^2}$ | A is the catchment area, km²; Bl is the maximum length of the catchment, km | [54] |
| Parameters related to the relief gradient | | | |
| The roughness index (RI) is the ratio of the surface area to its projected area [55]. | $RI = \frac{S_1}{S_2}$ | $S_1$ is the surface area, km²; $S_2$ is the projected area, km² | [55] |
| The Melton ratio (Mr) is an index of the catchment ruggedness equal to the basin relief divided by the square root of the catchment area [56]. | $Mr = \frac{R}{\sqrt{A}}$ | R is the relief of the catchment, km; A is the catchment area, km² | [56] |
| The elevation relief ratio (Err) is the ratio of the difference between the average and minimum elevations of the catchment to the catchment relief [57]. | $Err = \frac{h_{mean} - h_{min}}{h_{max} - h_{min}}$ | $h_{mean}$, $h_{min}$, and $h_{max}$ are the mean, minimum, and maximum elevation of the catchment, km, respectively | [57] |
| The relief ratio (Rr) is the dimensionless height-length ratio [58]. | $Rr = \frac{R}{Bl}$ | R is the relief of the catchment, km; Bl is the maximum length of the catchment, km | [58] |

Parameters Related to the Catchment Shape
Cr is affected by the lithological character of a catchment. The closer the Cr is to 1, the closer the catchment shape is to a circle. The ratio is more influenced by the length, frequency, and gradient of streams of various orders than by the slope conditions and the drainage pattern of the catchment [59]. The areal properties express the planform and dimensions of the catchment. The Cr has been proven to be very promising for the characterization of the sediment dynamics [11]. To facilitate the understanding of this parameter, values of 0.79 and 0.61 are usually used as the thresholds for measuring the approximation to a rectangle or triangle [60].
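The two Cr thresholds can be checked against simple shapes. The sketch below assumes the usual form of the circularity ratio, Cr = 4πA/C², and evaluates it for a circle, a square, and an equilateral triangle; the function name `circularity_ratio` is illustrative and not part of the original study.

```python
import math

def circularity_ratio(area, perimeter):
    """Cr = 4*pi*A / C**2 (assumed standard form; equals 1 for a circle)."""
    return 4.0 * math.pi * area / perimeter ** 2

# Unit-radius circle, unit square, and equilateral triangle with unit side.
cr_circle = circularity_ratio(math.pi, 2.0 * math.pi)
cr_square = circularity_ratio(1.0, 4.0)
cr_triangle = circularity_ratio(math.sqrt(3.0) / 4.0, 3.0)

print(round(cr_circle, 2), round(cr_square, 2), round(cr_triangle, 2))
```

Under this assumption the square gives about 0.79 and the equilateral triangle about 0.60, which is close to the 0.79 and 0.61 thresholds quoted above.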
The Er indicates whether the catchment may be affected by faults and other tectonic activities; a high value of Er also illustrates that the catchment is prone to erosion or accumulation [59]. An Er value that is close to 1 indicates that the catchment shape is more like a circle. An Er between 0.6 and 0.8 indicates that the catchment has strong relief and steep slopes. The higher the Er is, the higher the chances that the catchment has a higher infiltration capacity and lower runoff. In contrast to more circular catchments, the runoff in highly elongated catchments must travel greater distances to reach the catchment outlet. Therefore, a strong fluctuation and high Er are favorable morphometric conditions for the debris-flow process [22].
The Dd difference is widely applied in the characterization of the physiographic age, as proposed by Davis [61,62]. The Dd varies with the rainfall, relief, infiltration capacity of the soil, and initial anti-erosion ability of the terrain. Therefore, catchments with a higher Dd usually have a more fragmented surface and worse water impermeability [51,63].
Horton proposed the Ff to predict the flow intensity of a catchment in a defined area. The Ff has an inverse relationship with the square of the axial length and a direct relationship with the peak discharge [54].

Parameters Related to Relief Gradient
The RI reflects the dispersion and collection ability of rainfall runoff. It indicates the local diversities of the elevation and slope. Moreover, the surface roughness affects the hydraulics of overland flow and sediment transport mechanics by increasing the flow resistance that is associated with microtopographic features [64]. The roughness of the slope is not conducive to the runoff, while it is conducive to flood generation.
The Mr, which is a dimensionless parameter, is used for measuring the roughness and average slope of the catchments [13,56]. It effectively characterizes the geological disaster process type of the river basin [18]. It also reflects the tectonic activity and the sediment transportation ability of the catchments. The Mr of a debris-flow dominated catchment is usually higher than 0.5, with the slope of the catchment being greater than 4° [19]. As mentioned in the study of Welsh et al., 0.3 and 0.6 can be used as the thresholds for the identification of the DFP types [18]. Based on the results from earlier studies, the Mr can be used to distinguish water-flood, debris-flood, and debris-flow processes [31,32].
The Err is one of the indicators of geomorphological dissection [57], which reveals the evolution of the catchment geomorphology [65]. The ratio can be used to characterize the formation and processes of the catchments. The Err is a simplified index of the hypsometric integral. The Err value ranges from 0 to 1, where 1 indicates the most intense erosion of the catchment.
The Rr is equal to the tangent of the angle that is formed by two planes intersecting at the outlet of the catchment, where one represents the horizontal and the other passes through the highest point of the catchment [58]. High Rr values indicate that the catchment tends to be located in hilly regions, while low values imply plains and valleys. As for the stream slope, the inclinations of the ground surface are closely tied to the channel gradient and relief. Field studies showed a high degree of correlation between high relief and fast drainage frequency. During heavy rain, the fast drainage frequency and the steep stream channel slope lead to high discharge over a short duration [66].
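As a minimal sketch of the indicator calculation, the function below computes the eight parameters from basic catchment measurements, following the verbal definitions in Table 2. The function name, argument names, and the example values are hypothetical, the Cr form 4πA/C² is the usual one and is assumed here, and the projected area in RI is assumed to equal the catchment area.

```python
import math

def morphometric_parameters(area, perimeter, basin_length, stream_length,
                            surface_area, h_min, h_mean, h_max):
    """Return the eight indicators; lengths in km, areas in km^2."""
    relief = h_max - h_min
    diameter = 2.0 * math.sqrt(area / math.pi)  # circle with the same area
    return {
        "Cr": 4.0 * math.pi * area / perimeter ** 2,  # assumed standard form
        "Er": diameter / basin_length,
        "Dd": stream_length / area,
        "Ff": area / basin_length ** 2,
        "RI": surface_area / area,        # projected area taken as A (assumption)
        "Mr": relief / math.sqrt(area),
        "Err": (h_mean - h_min) / relief,
        "Rr": relief / basin_length,
    }

# Hypothetical catchment, not taken from the study's dataset.
params = morphometric_parameters(
    area=10.0, perimeter=15.0, basin_length=5.0, stream_length=12.0,
    surface_area=11.0, h_min=0.2, h_mean=0.6, h_max=1.2)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

In practice the inputs would come from the DEM-derived catchment polygons; the sketch only shows how the ratios relate to one another.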

Model and Method
In this study, the framework includes data acquisition and preprocessing, parameter calculation, sample over-sampling, and classification modelling (RF, AdaBoost, GBDT). The root-mean-square error (RMSE), mean absolute error (MAE), accuracy, recall, F1-score, kappa coefficient, and area under the receiver operating characteristic curve (AUROC) were used to measure model performance. By comparing the results, the model with the best performance was selected.
The procedure mainly consists of three parts. Firstly, based on the dataset that was collected from the documentation, the imbalanced dataset was resampled using the borderline-SMOTE model; secondly, classification models were constructed using the training dataset, and the model parameters were tuned to improve the classification accuracy on the testing dataset; finally, the optimal classification model was used to determine the type of the unknown catchments. Figure 3 shows a detailed overview of the modelling procedure.

Imbalanced Learning
In imbalanced datasets, the number of samples of a given class is much higher than that of the other classes. To obtain a higher overall accuracy, most traditional classifiers tend to favor the majority class, which has a large number of samples [67]. In this case, the imbalanced datasets require special attention. Class imbalance learning is a new learning problem that aims to deal with datasets with extremely skewed class distributions. Traditional methods that are used to relieve the restriction of imbalanced data fall into three categories: cost-sensitive learning, over-sampling, and under-sampling. Cost-sensitivity is realized by adding a cost matrix consisting of class misjudgment punishment coefficients to raise the misjudgment cost weight of the default samples [68]. The BalanceCascade approach is an informed under-sampling technique that is used to overcome the information loss of random under-sampling techniques by removing redundant samples [24]. Over-sampling aims at increasing the samples of the minority class until they are equal in number to the majority class by randomly duplicating the minority class samples. The SMOTE, which creates artificial samples for the minority class, has been widely used to cope with the imbalanced ratio, and a good performance has been achieved [23]. The process works as follows.
Let $x_i$ be an instance from the minority class. To create an artificial instance from $x_i$, SMOTE first isolates the k-nearest neighbors of $x_i$ from the minority class. Subsequently, it randomly selects one neighbor and then generates a synthetic example along the imaginary line connecting $x_i$ and the selected neighbor.
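The interpolation step described above can be sketched in a few lines of NumPy. This is an illustrative re-implementation under simple assumptions (Euclidean distance, uniform random position along the segment), not the code used in the study; `smote_sample` and the toy data are hypothetical.

```python
import numpy as np

def smote_sample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)            # a sample is not its own neighbor
    k = min(k, len(X_min) - 1)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    synthetic = np.empty((n_new, X_min.shape[1]))
    for j in range(n_new):
        i = rng.integers(len(X_min))          # pick a minority instance x_i
        nb = X_min[rng.choice(neighbors[i])]  # pick one of its k nearest neighbors
        gap = rng.random()                    # position along the connecting line
        synthetic[j] = X_min[i] + gap * (nb - X_min[i])
    return synthetic

# Toy minority class: the four corners of the unit square.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_sample(X_min, n_new=6, k=2)
print(X_new)
```

Each synthetic point lies on a segment between two existing minority samples, so the new samples stay inside the region already occupied by the minority class.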
However, SMOTE has an inherent weakness: the selected neighbor and the current sample may be in different classes. To address this weakness, researchers presented a modified minority over-sampling method, borderline-SMOTE, in which only the minority examples that are near the borderline are over-sampled using SMOTE [29]. This study includes fewer water-flood samples than debris-flood and debris-flow samples, which could affect the accuracy of the classifiers. To deal with this, we selected the borderline-SMOTE method to preprocess the sample dataset and obtained a final sample dataset that includes 44 samples for each class.
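One way to make "near the borderline" concrete is the rule commonly associated with borderline-SMOTE: a minority sample is treated as borderline when at least half, but not all, of its m nearest neighbors in the whole training set belong to other classes. The sketch below assumes that rule; `danger_mask`, the value of m, and the toy data are illustrative and are not taken from the study.

```python
import numpy as np

def danger_mask(X, y, minority_label, m=5):
    """Flag minority samples whose m nearest neighbors are mostly,
    but not entirely, from other classes (assumed borderline rule)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)             # exclude the sample itself
    nn = np.argsort(dist, axis=1)[:, :m]       # indices of the m nearest neighbors
    n_other = (y[nn] != minority_label).sum(axis=1)
    is_minority = y == minority_label
    return is_minority & (n_other >= m / 2) & (n_other < m)

# Toy 1-D data: minority class 0 on the left, majority class 1 on the right.
X = np.array([0.0, 0.1, 0.2, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
mask = danger_mask(X, y, minority_label=0, m=3)
print(mask)
```

Only the flagged samples would then be passed to the SMOTE interpolation step, so that synthetic samples are concentrated where the classes meet.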

Model Training
Ensemble methods use multiple learning algorithms to improve the predictive performance of the constituent learning algorithms [69]. Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble only consists of a concrete finite set of alternative models, which typically allows for much more flexible structures. Based on the combination strategy, ensemble methods can be divided into two categories: averaging and boosting methods. The driving principle of the averaging methods is to independently build several estimators and subsequently average their predictions. On average, the combined estimator is usually better than any single base estimator, because its variance is reduced (e.g., bagging methods, forests of randomized trees). In contrast, the base estimators of the boosting methods are built sequentially, and each one tries to reduce the bias of the combined estimator. The motivation is to combine several weak models to produce a powerful ensemble, for example, AdaBoost and Gradient Boosting.

Random Forest
Random Forest (RF) is a combination of tree predictors, such that each tree depends on the values of an independently sampled random vector, with the same distribution for all trees in the forest. The bootstrap resampling method is used to extract multiple samples from the original data. A classification tree is constructed for each bootstrap sample, the predictions of all trees are combined, and the final result is obtained by voting [70]. The basic idea of RF is to combine multiple weak classifiers to form a strong classifier. These weak classifiers, which play a complementary role, reduce the impact of a single classifier error to improve the classification accuracy and stability. Randomness in the RF is the result of two randomization processes: firstly, a bootstrap sample is taken from the learning set for each tree; and secondly, a subset of the explanatory variables is randomly selected at each node. RF, as a natural nonlinear modeling tool, effectively solves multivariate predictions and is therefore applied in many fields [71][72][73]. Furthermore, the RF model has achieved a good performance in flooding disaster assessment and risk analysis [7,74,75].
In this study, the RF model for the DFP type identification was implemented with the Python programming language. We used bootstrap sampling to extract k samples from the original training set, and the size of each sample was the same as that of the original training set; k decision tree models were established for the k samples to obtain k classification results. The final classification of each record is determined by voting over the k classification results:

$$H(x) = \arg\max_{Y} \sum_{i=1}^{k} I\left(h_i(x) = Y\right)$$

where $H(x)$ represents the combined classification model, $h_i$ is a single decision tree classification model, $Y$ is the output variable, and $I(\cdot)$ represents the indicator function.
The number of features that are randomly chosen at each node is a key parameter of the RF, which may affect the stability of the model. The sensitivity of other parameters, such as the number of trees in the forest and the size of each tree (i.e., the minimum number of samples for splits), has also been studied [76][77][78]. These RF parameters can be tuned by means of resampling techniques, such as bootstrap or cross-validation. In this study, the number of features for the best split was set to the square root of the total feature number (sqrt), the number of trees was 60, and the minimum number of samples to split was set to 2. Additionally, we adopted the balanced mode to automatically adjust the weights for each class.
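A configuration along these lines could look as follows in scikit-learn. This is a sketch rather than the study's code: the data are synthetic stand-ins generated with `make_classification`, the 70/30 train/test split and the fixed `random_state` are assumptions, and only the four hyperparameters named above come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the resampled dataset: 132 samples (3 x 44), 8 indicators.
X, y = make_classification(n_samples=132, n_features=8, n_informative=5,
                           n_redundant=0, n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)  # split ratio is an assumption

rf = RandomForestClassifier(
    n_estimators=60,          # number of trees
    max_features="sqrt",      # features considered at each split
    min_samples_split=2,      # minimum number of samples required to split a node
    class_weight="balanced",  # weights inversely proportional to class frequency
    random_state=0,
)
rf.fit(X_train, y_train)

names = ["Cr", "Er", "Dd", "Ff", "RI", "Mr", "Err", "Rr"]
importances = dict(zip(names, rf.feature_importances_))
print(rf.score(X_test, y_test), importances)
```

The `feature_importances_` attribute gives one way to rank the indicators after fitting; on synthetic data the ranking carries no geomorphic meaning.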

AdaBoost
AdaBoost is an ensemble machine learning technique that was initiated by Freund and Schapire [79]. As a boosting method, AdaBoost is designed to sequentially build a series of classifiers on reweighted samples, where the sample weights are adjusted according to the error of previous predictions [80]. At a specific training stage, the learning weights of the samples with higher prediction errors from previous models are increased, while the learning weights of the samples with lower prediction errors are decreased. As the iterations proceed, samples that are difficult to predict receive more attention, which lowers the global prediction error. The final model is a linear combination of these base estimators, in which better classifiers receive higher coefficients, and vice versa. A classification and regression tree (CART) is used as the base estimator, which allows the feature importance to be estimated after model fitting.
AdaBoost is sensitive to noisy data and outliers. In some cases, it can be less susceptible to the overfitting problem than other learning algorithms. Individual learners can be weak; so long as the performance of each one is slightly better than random guessing, the final model can converge to a strong learner.
In this study, two user-configurable parameters were used for the AdaBoost training procedure, that is, the learning rate for every tree, which was set to 0.05, and the number of boosting stages, which was set to 100.
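In scikit-learn, these two settings map onto `AdaBoostClassifier` roughly as sketched below. The data are synthetic placeholders and the `random_state` is an assumption; the base estimator is left at the library default (a depth-1 decision tree), which may differ from the tree used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic stand-in data: 132 samples, 8 indicators, 3 classes.
X, y = make_classification(n_samples=132, n_features=8, n_informative=5,
                           n_redundant=0, n_classes=3, random_state=0)

ada = AdaBoostClassifier(
    n_estimators=100,    # maximum number of boosting stages
    learning_rate=0.05,  # shrinks the contribution of each tree
    random_state=0,
)
ada.fit(X, y)
print(ada.score(X, y), ada.feature_importances_)
```

A small learning rate combined with a larger number of stages is a common trade-off: each tree contributes less, so more stages are needed to reach the same fit.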

Gradient Boosting
Gradient Boosting (GBDT) is an integrated learning algorithm that consists of gradient boosting and decision trees, and it automatically searches nonlinear interplay by decision-tree learning with minimal error [81]. The GBDT is a supervised machine learning algorithm and it comprises a family of powerful machine-learning techniques that have yielded promising results in a wide range of practical applications [82]. The GBDT is a type of additive model that performs classifications by combining decisions from a sequence of base classification tree models [83]. The GBDT uses a model ensemble technique, called gradient boosting, which iteratively builds a model while improving the performance of the previous iteration model.
The name "Gradient Boosting" originates from the association of this method with gradient descent optimization [83], which is commonly used to solve classification problems by finding a local minimum of the loss function.
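Similarly, let $g_t(x)$ be the classification model obtained at iteration $t$, $L[y_i, g(x_i)]$ be the loss function, and $N$ be the number of observations. At each gradient boosting iteration, the algorithm determines a classification tree $f_t$, which moves $g_t$ in the negative gradient direction $-\partial L/\partial g$ by a step-size of $\eta$. Hence, $f_t$ is chosen to be

$$f_t = \arg\min_{f} \sum_{i=1}^{N} \left[ -\frac{\partial L[y_i, g_t(x_i)]}{\partial g_t(x_i)} - f(x_i) \right]^2,$$

and the algorithm sets

$$g_{t+1}(x) = g_t(x) + \eta f_t(x).$$

For classification problems with the sum-squared loss function, $L[y_i, g(x_i)] = [y_i - g(x_i)]^2$,

$$-\frac{\partial L[y_i, g_t(x_i)]}{\partial g_t(x_i)} = 2\left[y_i - g_t(x_i)\right].$$

Therefore, $f_t$ can be written as follows:

$$f_t = \arg\min_{f} \sum_{i=1}^{N} \left[ 2\left(y_i - g_t(x_i)\right) - f(x_i) \right]^2.$$

In this study, some of the parameters of GBDT were set in advance. The learning rate for every tree was set to 0.1, the number of boosting stages to perform was set to 60, the depth for every tree was set to 6, the loss function was set as deviance, and 80% of the samples were used for fitting the individual base learners.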
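The settings listed above correspond to scikit-learn's `GradientBoostingClassifier` approximately as in the sketch below. The data are synthetic placeholders and the `random_state` is an assumption. The `loss` argument is deliberately left at its default, because the default is the deviance (logistic) loss and the accepted name of that option differs between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data: 132 samples, 8 indicators, 3 classes.
X, y = make_classification(n_samples=132, n_features=8, n_informative=5,
                           n_redundant=0, n_classes=3, random_state=0)

gbdt = GradientBoostingClassifier(
    learning_rate=0.1,  # step-size (eta) applied to each tree
    n_estimators=60,    # number of boosting stages
    max_depth=6,        # depth of every tree
    subsample=0.8,      # fraction of samples used to fit each base learner
    random_state=0,     # default loss is the deviance (log-loss)
)
gbdt.fit(X, y)
print(gbdt.score(X, y), gbdt.feature_importances_)
```

With `subsample` below 1.0 each stage is fitted on a random subset of the training data, which is often called stochastic gradient boosting.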

Model Validation
The goodness of fit for the classification model was evaluated while using a set of quantitative criteria, including the RMSE, MAE, recall (sensitivity), accuracy, F1-score, kappa coefficient, and AUROC.
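The RMSE and MAE are often used for the validation of models, and are defined as follows,
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}, \qquad \mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|,$$
where $y$ is the vector of the observed values and $\hat{y}$ is the vector of $N$ predictions. The confusion matrix is an efficient tool in describing the relationship between prediction and observation. The confusion matrix consists of true positive (TP), false positive (FP), true negative (TN), and false negative (FN). For a given DFP type, TP is the number of catchments that are correctly classified as that type, and FP is the number of catchments that are incorrectly classified as that type. The TN is the number of catchments that are correctly classified as one of the two other types, and FN is the number of catchments of the given type that are incorrectly classified as one of the two other types. The higher the TP is and the lower the FP, the better the results [84]. Based on the four possible outcomes, the recall, accuracy, F1-score, and Cohen's kappa criteria are formulated as:
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$
$$F_1 = \frac{2\,TP}{2\,TP + FP + FN}, \qquad \kappa = \frac{p_o - p_e}{1 - p_e},$$
where $p_o$ is the observed agreement between prediction and observation and $p_e$ is the agreement expected by chance. Kappa values of 0.6-0.8 and 0.8-1 represent a substantial and an almost perfect agreement between the estimation and observation, respectively [85].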
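As a worked illustration of these criteria, the following standard-library sketch computes them for a small set of labels. The integer encoding of the three DFP types (0, 1, 2) and the example labels are hypothetical and are not taken from the study's dataset; the study itself used scikit-learn.

```python
import math
from collections import Counter


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_counts(y_true, y_pred, positive):
    """Confusion-matrix cells (TP, FP, TN, FN) for one class against the other two."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, tn, fn


def recall(tp, fn):
    return tp / (tp + fn)


def f1_score(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


def cohen_kappa(y_true, y_pred):
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n        # observed agreement
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: 0 = water-flood, 1 = debris-flood, 2 = debris-flow.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 0, 1]
tp, fp, tn, fn = one_vs_rest_counts(y_true, y_pred, positive=2)
```

For these labels the overall agreement is 6 of 8 catchments, while the chance agreement is 22/64, so kappa is noticeably lower than the raw accuracy.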
The receiver operating characteristic (ROC) curve is another useful and standard way of assessing the predictive power and the quality of probabilistic models [36]. Graphically, the sensitivity is plotted on the y-axis against 100 − specificity on the x-axis [86]. The AUROC is a quantitative index for identifying the general performance of the models [36]. The higher the AUROC, the better the model performance. The AUROC ranges from 0.5 (for an inaccurate model) to 1 (a perfect model) [35], and it is the area under the ROC curve,
$$\mathrm{AUROC} = \int_{0}^{1} \mathrm{TPR}\; d(\mathrm{FPR}),$$
where TPR is the sensitivity and FPR is 1 − specificity.
In this study, the modeling and validation were implemented with "scikit-learn" (http://scikit-learn.org), which is a package for machine learning in Python. Scikit-learn offers packages for ensemble learning, including packages for bagging and averaging methods.
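The area under the ROC curve can be approximated with the trapezoidal rule, as in the minimal sketch below. It is a library-free illustration rather than the scikit-learn routine used in the study; the function names and the example scores are assumptions. For the three DFP types, the same computation would be applied to one type against the other two.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) obtained by lowering the decision threshold step by step.

    labels: 1 for the class of interest, 0 for the rest.
    scores: predicted probability (or any ranking score) for the class of interest.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        j = i
        # Samples with tied scores enter together, giving a diagonal segment.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def auroc(labels, scores):
    """Trapezoidal area under the ROC curve."""
    pts = roc_points(labels, scores)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical scores for six catchments; one negative is ranked above a positive.
example_auroc = auroc([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
```

A model that ranks every positive above every negative reaches 1, whereas a model that gives all catchments the same score yields 0.5, which matches the range quoted above.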

Distribution of the Catchment Shape
The value of Cr ranges from 0.22 to 0.79, with a mean of 0.52. Thresholds of 0.61 and 0.79 were proposed to indicate the approximation to a triangle and rectangle, respectively. The catchments in the study area are more similar to triangles, which indicates that the permeability of the catchments is weak (Figure 4a). The value of Er is in the range of 0.42-0.9, with a mean of 0.67. A total of 74% of the catchments are in the range of 0.6-0.8, indicating that the catchments are in the active process of erosion and accumulation (Figure 4b). The value of Dd ranges from 0 to 1.96 km/km², with a mean of 0.34 km/km². The higher values are mainly obtained in catchments with an area below 10 km², in which the dense surface runoff causes sharp incision of the ground (Figure 4c). The value of Ff varies from 0.14 to 0.63, with an average of 0.36. Based on the equation, the value of 0.79 is a threshold differentiating the catchment from a circle. In contrast, lower values indicate a shorter axial length and a more intense flow discharge (Figure 4d).

Distribution of the Relief Gradient
The four parameters of the relief gradient have a similar spatial pattern, that is, they are high in the southwest and low in the northeast in the Beijing mountainous area. The RI value is in the range of 1.01-1.31, with a mean of 1.11. Figure 4e indicates the concentration of high values in the south of the study area. A low RI is often observed along streams or around lakes. The Mr value ranges from 0.04 to 0.91 and the mean value is 0.23. The Rr value varies from 0.02 to 0.45 and the mean value is 0.14. The distribution of the Rr is consistent with that of the Mr; high values are detected along the valley axes from the southwest to northeast of the study area. However, low values are mainly scattered in the northern part, which is known as the Yan Mountains (Figure 4f,h). The Err value varies from 0.05 to 0.89, with a mean value of 0.63. The value is concentrated in the range of 0.5-0.7. Apart from the catchments on the borderline between the mountains and plain and the catchments that are close to the lakes, the Err values are rather high (Figure 4g).
The boxplot in Figure 5 shows an overview of the basic parameter samples, grouped by the three defined DFP types. To compare the results, all the parameters were normalized using the Min-Max method. Most morphometric variables that were selected in this study were sensitive to the DFP types. The mean value of RI for the water-flood catchments was higher than that for the debris-flood and the debris-flow catchments. The Cr, Mr, and Rr are, on average, higher for the debris-flow catchments and differ significantly from the values for the debris-flood and water-flood catchments. An exception is Dd, which displayed lower values for debris-flow catchments and can therefore also effectively distinguish debris-flow catchments from the other two types. However, the Er, Ff, and Err were not sensitive to the DFP types.
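The Min-Max method rescales each parameter to the interval [0, 1] so that parameters with different units can share one boxplot axis. The sketch below is a minimal standard-library illustration; the function name is an assumption, and the three input values are simply the reported minimum, mean, and maximum of RI, used here as a worked example rather than as the actual sample.

```python
def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant parameter: no spread to rescale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Reported RI statistics (minimum 1.01, mean 1.11, maximum 1.31).
ri_scaled = min_max_normalize([1.01, 1.11, 1.31])
```

On the normalized scale the mean RI sits at roughly one third of the range, that is, (1.11 − 1.01) / (1.31 − 1.01).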

Models Validation and Comparison
The debris flow inventory dataset was partitioned into subsets of 80% and 20% (Pareto principle) to be used for training and testing, respectively. All three models (RF, AdaBoost, and GBDT) that were discussed in the previous sections were fitted and evaluated on the training and testing datasets in the Python environment. The parameters of the models were tuned by five-fold cross-validation on the training dataset, and the optimum ones were used in the final models. The performance of a model is given by the statistical parameters, kappa coefficient, and AUROC, which were evaluated using bootstrap resampling [87].
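The partitioning scheme can be sketched with index arithmetic alone. The example below is a simplified illustration, not the study's code: the inventory size of 100, the random seed, and the function names are hypothetical, and the split is not stratified by DFP type (the source does not state whether stratification was used).

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]   # (training, testing)


def k_fold(indices, k=5):
    """Yield (fit, validation) index lists; each sample is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [idx for m, fold in enumerate(folds) if m != i for idx in fold]
        yield fit, validation


# Hypothetical inventory of 100 catchments: 80 for training, 20 for testing.
train_idx, test_idx = train_test_split_indices(100)
cv_splits = list(k_fold(train_idx, k=5))
```

Each candidate parameter setting would be fitted on the four "fit" folds and scored on the remaining one, and the setting with the best average score would then be refitted on all training samples. The testing subset stays untouched until the final evaluation.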
Table 3 lists the training and testing results of the three models. The comparison of the training and testing metrics indicates a clear decrease in the accuracy and sensitivity of all the models from training to testing. This indicates overfitting of the models to the training data and that further model validation is necessary. The results show that the RF model has the highest accuracy and recall (0.752 and 0.75, respectively), followed by the GBDT and AdaBoost models. Additionally, the RF model has the highest kappa coefficient of 0.625, signifying a substantial consistency between prediction and observation. Moreover, the RF model has the lowest RMSE and MAE values of 0.544 and 0.265, respectively. The ROC curves for the three models were constructed using the training and testing datasets (Figure 6). The AUROC of the training dataset is high, indicating an almost perfect agreement between prediction and observation. In contrast, with respect to the testing dataset, the RF model yields the highest AUROC (0.73), followed by the GBDT (0.7), and then the AdaBoost (0.68) models. All of the models have an acceptable classification capability. With respect to the AUROC of each class using Random Forest, debris-flow has the highest AUROC (0.78), while water-flood and debris-flood yield values of 0.74 and 0.7, respectively.

Parameter Sensitivity Analysis Based on the RF Model
The importance of each parameter can be evaluated based on the worsening of the prediction if the parameter is randomly permuted. The parameter importance of each model was calculated during the training procedure with five-fold cross-validation. At the end of the training procedure, the importance of each parameter was obtained by averaging the difference, which was then normalized using the standard deviation of all importance values of each parameter.
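The permutation idea can be sketched independently of any particular classifier. In the minimal example below, the toy model, the data, and the function names are assumptions for illustration only; the sketch returns the mean accuracy drop per parameter and omits the standard-deviation normalization and the cross-validation loop described above.

```python
import random


def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one column at a time is randomly shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)       # break the link between parameter j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy setup: the label depends on the first parameter only.
X = [[i % 2, (i * 7) % 5] for i in range(20)]
y = [row[0] for row in X]
toy_model = lambda row: row[0]        # stands in for a trained classifier
importances = permutation_importance(toy_model, X, y)
```

Shuffling the second column leaves every prediction unchanged, so its importance is exactly zero, whereas shuffling the first column degrades the accuracy. Dividing each mean drop by the sum over all parameters would give relative contributions of the kind reported as percentages.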
Figure 7 shows that the parameters can be broadly divided into three groups according to the evaluation results. The two parameters that are related to the relief gradient, RI and Rr, occupy the top two ranks. The RI has the largest effect on the identification of the DFP types, contributing 17.7% to the classification. RI is an indicator of the microtopographic features and it affects the overland flow and sediment transport mechanics. As mentioned in Section 4.1 (Figure 5), the water-flood type showed a higher RI value than the other two types. A rougher topographic surface increases the flow resistance during the sediment transport process, making it more difficult for the solid material to move with the flow. Therefore, a catchment with a higher RI value is more likely to induce water-flood. The Rr also influences the type identification, contributing 15.4%. Rr is closely related to the channel gradient and relief, and a high Rr produces a high discharge with more power. The Rr value of the debris-flow type that is displayed in Section 4.1 (Figure 5) was much higher than that of the other types, because the debris-flow process, which carries more solid material, requires a stronger carrying capacity. The second most important group of parameters includes the Mr, Err, Dd, and Cr, which contribute 14.2%, 13.7%, 12.5%, and 10.4% to the total classification, respectively. The last group of parameters comprises the Ff and Er, which rank seventh and eighth, indicating that the two parameters provide less information during the training procedure.

Mapping of the Debris Flow Process Type
The map of the DFP types was generated via the above-mentioned data processing framework. The proportions of the catchments that are dominated by different disaster processes vary. The results show that 179, 306, and 245 catchments are dominated by water-flood, debris-flood, and debris-flow, accounting for 20.04%, 57.32%, and 22.64% of the Beijing mountainous area, respectively (Table 4). Figure 8 shows that the water-flood process dominates 24.52% of the catchments by number. The concentration of water-flood prone catchments, which are significantly influenced by dissected terrain, is higher in the Taihang Mountains. In addition, 41.92% of the catchments in the Beijing mountainous area are dominated by the debris-flood process, which occurs frequently and predominates in the study area. The catchments in the Yan Mountains are dominated by the debris-flood process because of the relatively gentle terrain and the slightly elongated shape. Furthermore, approximately one-third of the catchments (33.56%) belong to the debris-flow process. Catchments that are dominated by the debris-flow process are scattered in the study area. In the Taihang Mountains, catchments that are prone to the debris-flow process are concentrated around coalmines, where the abandoned coal gangue and waste fuel provide the material source for the disaster. In the Yan Mountains, catchments that are dominated by the debris-flow process are found along faults with more active tectonics.

Validation against the Documentary Dataset
To validate the final classification model against the documentary data, several recorded events were considered (mainly based on the field investigation) [88][89][90][91][92][93][94][95][96][97][98]. Table 5 shows the confusion matrix of the types predicted for each catchment of the documentary dataset. Based on the final classification model, 10 of 14 catchments were correctly identified.
The validation results indicate that the model accurately classifies the water-flood process. No clear validation results were acquired for the debris-flood process because of the lack of documentary data. However, only three out of six catchments were correctly predicted as the debris-flow process.

Parameters Sensitivity Analysis
The RI reflects the local variability of the elevation and slope, which indicates the differences of the three process types (Figure 9). The correlation (R² = 0.36) between the RI and elevation can be used as an indicator for the identification of catchments that are dominated by the debris-flood process. However, relationships were not observed for the other two process types. Consistent with the result that was proposed by Heiser et al. [11], the debris-flood process tends to form a distinctive channel-bed morphology, which is different from the other processes.
The ratio of the Rr and slope reveals the transport mechanisms along the flow path (Figure 10). There seems to be a relationship (R² = 0.23) between the Rr and slope of the catchments that are dominated by the water-flood process. A high ratio indicates a long flow path within the catchments; as a result, they are more prone to the water-flood process. In contrast, the debris-flow process tends to be generated in a steep channel with strong entrainment of material and water from the flow path. Additionally, the low Rr-slope ratio of the debris-flow process is in accordance with the results of previous studies [47].

Spatial Differentiation of the DFP Types
There is evidence that several factors could intensify the future debris flow risk, such as global warming and ongoing socioeconomic development in debris flow prone areas [99][100][101][102]. Climate change has caused more frequent extreme precipitation in summer, so the debris flow warning threshold calls for more attention. In addition, the expansion of construction in the mountainous area has disturbed the surface water cycle, while the prosperous economic development has intensified the hazard vulnerability. Therefore, to obtain more details regarding the spatial distribution of the catchments, the maximum continuous precipitation (MCP), moisture index (IM), distance to road, population density, and per capita Gross Domestic Product (GDP) related to debris flow were analyzed (Figure 11).
The MCP is one of the prerequisites that may lead to the outburst of debris flow during a heavy storm. Studies showed that special attention should be paid to the changing climate, which may affect the occurrence and magnitude of hydrometeorological hazards [103]. In the study area, most of the water-flood and debris-flow catchments have an MCP of 200-300 mm, whereas most of the debris-flood catchments have an MCP above 300 mm.
The moisture index (IM) influences the absorption of surface water and the soil water saturation that causes the occurrence of debris flow. According to the IM map that was obtained from the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC; http://www.resdc.cn), water-flood catchments are mostly distributed in the arid region, with a lower IM, while debris-flood catchments are mainly distributed in the region with a higher IM. Debris-flow catchments are distributed in both the arid and humid regions.
Soil offers the growth environment for vegetation, which, to a great extent, determines the stability of surface material. In the study area, cinnamon soil and brown soil are the dominant types, accounting for more than 90% of the total area. Cinnamon soil is distributed in the region with intense sunshine, high soil temperature, and strong evaporation, where it is hard to efficiently conserve soil moisture, resulting in low vegetation coverage. Brown soil is mostly distributed in the region with high elevation, especially on the watersheds between rivers. High-altitude areas with a suitable climate and little human interference are fit for vegetation growth. The spatial distribution of water-flood catchments is consistent with that of brown soil.
Human activities, such as land use, road construction, and river bank invasion, have changed the mountainous environment and disturbed the stability of the catchments in the long term. The artificial impervious surface hinders the discharge of debris flow, resulting in the accumulation of runoff water downstream. Most catchments in the study area are less than 1000 m away from roads. In most of the study area, the population density is 100-300 people per km² and the per capita GDP is approximately 3000 RMB/km². The rapid economic development in the mountainous area has caused an annual increase in the loss of human lives and it has increased the exposure of properties to all kinds of disasters. The lives and properties of both residents and tourists are threatened by debris flow disasters. This issue should be prioritized in disaster planning and prevention.

Model Deficiencies
Although the results of this study somewhat satisfy the classification demand, more work should be performed to make improvements.Here, we list several factors that need to be considered in future studies.

1. When referring to the recorded disaster events, the event location interpretation was difficult. It was also hard to discern the disaster types after human transformation [42]. Usually, only debris flow that caused huge losses is reported or listed in the documents. Thus, debris flow that occurred in a remote area or that did not cause damage to people was ignored. These issues lead to an incomplete disaster inventory.
2. The hydrology model was used to divide the catchments based on the DEM, supplemented by visual interpretation and manual modification. However, the catchment size greatly differs. The larger the catchments are, the more hazardous the events, and the more complex are the types. Therefore, catchments with different types of events are typically insufficient for model training and more information is needed to classify the dominated disaster type.
3. The RF model is regarded as one of the most effective and popular classification models. However, studies showed that the RF model has several drawbacks [71], for example, the algorithm tends to base the classification on the group with a larger number of samples. Therefore, the application of the RF model is limited [104].
4. When dealing with the disaster type of the particular catchment, the final choice of the prediction results should not only depend on the classification accuracy, but also on consideration of the actual field research. It is of vital importance to complete continuous simulation experiments to obtain a more suitable method.
Despite the drawbacks, the contributions of this study represent an approach that can be applied to the identification of the DFP types and to additional decision-making processes in hazard prevention.

Summary and Conclusions
With the consistent warming climate on global and national scales in recent years, severe extreme precipitation frequently occurs, which imposes a greater challenge on people to accurately make preparation for disaster prevention.Identifying the specific debris flow process type may powerfully aid in decision making.The objective of this study was to develop a model framework that can be used to identify the debris flow process (DFP) types in the Beijing mountainous area.This objective was achieved by applying ensemble learning to a dataset that integrated data from multi-sources.The dataset extracted the parameters that are related to the catchment shape and relief gradient.Based on the comprehensive datasets, three ensemble learning models (RF, AdaBoost, and GBDT) were developed.The results show that Random Forest more accurately identifies the DFP types than the other two models, with an overall accuracy of 75%.The key points of this study can be summarized, as follows: 1.
This work generates insights into the suitability of different ensemble learning methods for the identification of the DFP types, demonstrating that Random Forest achieves a better result when compared with AdaBoost and Gradient Boosting.

2.
By developing models with different subsets of parameters, it is possible to derive insights into the individual parameters and their contribution to the classification model. In particular, the RI and Rr are the optimal parameters for the identification of the DFP types, while Ff and Er are not sensitive to the DFP types in the study area.
The study provides knowledge that can guide the decision-making process of the concerned authorities. The proposed strategies, which are associated with the spatial distribution of the various DFP types, will be beneficial for decision-makers.
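As a rough illustration of the catchment shape and relief gradient parameters named in the key points, the sketch below computes the relief ratio (Rr), the form factor (Ff), and the elongation ratio from basic catchment measurements. It assumes the widely used textbook definitions (relief divided by basin length; area divided by squared basin length; diameter of the circle with the same area divided by basin length) and assumes that Er denotes the elongation ratio; the exact definitions used in this study, including that of the RI, are given in its methods section and may differ. All names and input values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Morphometrics:
    relief_ratio: float      # Rr, dimensionless
    form_factor: float       # Ff, dimensionless
    elongation_ratio: float  # assumed meaning of Er, dimensionless


def catchment_morphometrics(area_m2: float, length_m: float,
                            max_elev_m: float, min_elev_m: float) -> Morphometrics:
    """Compute shape and relief parameters under the assumed textbook definitions."""
    if area_m2 <= 0 or length_m <= 0:
        raise ValueError("area and basin length must be positive")
    relief_m = max_elev_m - min_elev_m
    # Relief ratio: total basin relief divided by basin length.
    rr = relief_m / length_m
    # Form factor: catchment area divided by the squared basin length.
    ff = area_m2 / length_m ** 2
    # Elongation ratio: diameter of the equal-area circle divided by basin length.
    er = 2.0 * math.sqrt(area_m2 / math.pi) / length_m
    return Morphometrics(rr, ff, er)


# Hypothetical catchment: 4 km^2 area, 3 km long, 600 m of relief.
example = catchment_morphometrics(area_m2=4.0e6, length_m=3000.0,
                                  max_elev_m=1100.0, min_elev_m=500.0)
print(example)
```

Under these definitions the form factor and the elongation ratio are both functions of area and basin length only, so they carry closely related information about catchment shape, whereas the relief ratio adds the vertical dimension.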

Figure 1. Geographical setting of the study area.

Figure 3. Flow chart of the computational process.

Figure 5. Boxplot of the Min-Max normalized parameters grouped by the DFP types.

Water 2019, 11, x FOR PEER REVIEW 15 of 26

Figure 6. Receiver operating characteristic (ROC) curve and area under the receiver operating characteristic curve (AUROC) for (a) the testing dataset and (b) multi-class with RF.

Figure 7. Relative importance values of the parameters for the DFP types.

Yan Mountains, catchments that are dominated by debris-flow process are found along faults, with the more active tectonics.

Figure 8. Distribution of the DFP types based on the RF model. The black points are coal mines and the brown lines are the main faults in the Beijing mountainous area.

Figure 9. Scatter plot of the RI versus elevation values for the DFP types. (a) Water-flood; (b) Debris-flood; and (c) Debris-flow.

Figure 10. Scatter plot of the Rr versus slope values for the DFP types. (a) Water-flood; (b) Debris-flood; and (c) Debris-flow.

Figure 11. Relationship between the influencing factors and catchment area of different DFP types. (a) Water-flood; (b) Debris-flood; and (c) Debris-flow.

Table 1. Classification criteria for the three types of debris flow events.

Table 3. Model performance for the training and testing datasets.

Table 5. Confusion matrix of the validation against the documentary dataset.
