Article

A Comparative Assessment of Random Forest and k-Nearest Neighbor Classifiers for Gully Erosion Susceptibility Mapping

by Mohammadtaghi Avand 1, Saeid Janizadeh 1, Seyed Amir Naghibi 1, Hamid Reza Pourghasemi 2,*, Saeid Khosrobeigi Bozchaloei 3 and Thomas Blaschke 4

1 Department of Watershed Management Engineering and Sciences, Faculty in Natural Resources and Marine Science, Tarbiat Modares University, Tehran 14115-111, Iran
2 Department of Natural Resources and Environment Engineering, College of Agriculture, Shiraz University, Shiraz 71441-65186, Iran
3 Department of Watershed Management, Faculty in Natural Resources, Tehran University, Tehran 14174-14418, Iran
4 Department of Geoinformatics – Z_GIS, University of Salzburg, 5020 Salzburg, Austria
* Author to whom correspondence should be addressed.
Water 2019, 11(10), 2076; https://doi.org/10.3390/w11102076
Submission received: 8 August 2019 / Revised: 30 September 2019 / Accepted: 2 October 2019 / Published: 5 October 2019
(This article belongs to the Special Issue Spatial Modelling in Water Resources Management)

Abstract

This research was conducted to determine which areas in the Robat Turk watershed in Iran are sensitive to gully erosion, and to define the relationship between gully erosion and geo-environmental factors using two data mining techniques, namely, Random Forest (RF) and k-Nearest Neighbors (KNN). First, 242 gully locations were determined in field surveys and mapped in ArcGIS software. Then, twelve gully-related conditioning factors were selected. Our results showed that, for both the RF and KNN models, altitude, distance to roads, and distance from the river had the highest influence upon gully erosion sensitivity. We assessed the gully erosion susceptibility maps using the Receiver Operating Characteristic (ROC) curve. Validation results showed that the RF and KNN models had Area Under the Curve (AUC) values of 87.4% and 80.9%, respectively. As a result, the RF method performs better than the KNN method for mapping gully erosion susceptibility. Rainfall, altitude, and distance from a river were identified as the most important factors affecting gully erosion in this area. The methodology used in this research is transferable to other regions to determine which areas are prone to gully erosion and to explicitly delineate high-risk zones within these areas.

1. Introduction

Water erosion is considered one of the basic causes of land degradation, particularly in watersheds [1]. Soil loss and soil degradation are among its most significant consequences. Soil erosion has been studied by various researchers; its consequences are very dangerous, and both the flourishing and the destruction of previous civilizations have been attributed to this phenomenon. Although experts studied gully erosion extensively in the twentieth century, and work before the 1930s already surveyed the factors that influence its formation and expansion, gully erosion is still classified in a variety of ways in various regions of the world. Moreover, despite considerable research on soil erosion and its various dimensions, many aspects of this phenomenon remain unknown or are only vaguely understood [2].
Gully erosion is a devastating kind of erosion, which can become a hazard if human pressures such as land-use change, grazing, and inadequate farming intensify. This type of erosion causes sediment production in the environment, and is one of the important signs of land degradation [3]. Gully erosion is a process in which surface soil is degraded and runoff concentrates within channels, thereby deepening them [4]. It is a complicated and devastating form of water erosion, which begins with a depression in the land that is scoured by falling water and develops by headcut [4,5,6]. Because of the damage it causes, gully erosion is a permanent threat to soil ecosystems, land, and economic stability [7,8,9]. In general, gully erosion has three main impacts: (1) eliminating valuable agricultural soil and reducing yield and soil fertility, (2) increasing the mobility of surface water and, thus, the risk of flooding, and (3) causing sedimentation in reservoirs and dams, which reduces their useful lifespan [10,11,12].
Under specific conditions, gully erosion can also cause landslides and debris flow [8,12]. Mechanisms such as penetration, piping, and underground flow can lead to soil dispersion and the collapse of piping walls, and eventually, gully erosion [1,13,14]. The development and growth of rill erosion can also lead to gully erosion [15,16,17]. Gully erosion is an important source of sediment production; it accounts for 10 to 94 percent [18] of sediment yield at the watershed scale [3]. Accordingly, understanding gully erosion is necessary to manage erosion better and to identify and prioritize susceptible areas in order to reduce the associated risks [3]. Various climatic, hydrological, and physical factors have been reported as the most important ones affecting gully erosion [3,19,20,21].
Our literature review revealed that different data mining and statistical methods have been used for gully erosion assessment, including classification and regression trees (CART) [22], logistic regression (LR) [23,24,25,26], BF-Tree for gully headcut [27], and frequency ratio (FR) [28]. Angileri et al. [29] used Stochastic Gradient Treeboost (SGT) to analyze and forecast the spatial occurrence of the rill-interrill and gully erosion types in central-northern Sicily (Italy). They stated that SGT is a good approach for clarifying the relationships between erosion and environmental variables.
Accordingly, the purposes of the current research are (1) to model gully erosion susceptibility using two well-known data mining techniques, namely RF and KNN, (2) to determine the importance of the geo-environmental/conditioning factors for gully erosion susceptibility mapping, and, finally, (3) to provide an applicable guideline for stakeholders to reduce gully damage in the study area. The main novelty of the current research is that it applies and compares the RF and KNN data-driven methods for gully susceptibility mapping for the first time. The resulting gully susceptibility map is an appropriate tool with which to understand the mechanism of gully erosion and to aid scientific planning and decision making. This map is also applicable to the management of land use, gully erosion risk, and sustainable development in the Robat Turk area.

2. Materials and Methods

2.1. Study Area

The Robat Turk Watershed is one of the sub-watersheds of the Shoor River, situated between Markazi and Isfahan Provinces, spanning from 33°42′ to 35°45′ N latitude and 50°46′ to 50°52′ E longitude (Figure 1). The area of Robat Turk is 242 km², with a minimum altitude of 1807 m and a maximum altitude of 2723 m. The climate of the study area is arid and semi-arid; the annual rainfall is 213 mm [30]. Approximately 80% of the annual rainfall in this area occurs in December and April. Peak stream flows occur from February to June [30]. Land-use types in the study region include agriculture, bare land, and pastureland, whereby bare land covers a large share of the region. Most of the gullies are concentrated in the northern part of the region, where bare land and agricultural land are prevalent. Investigation of the morphometric properties of the gullies in two land uses (agriculture and rangeland) showed that they were relatively active in both. The gullies are concave and vertical in shape, and there are soil fragments within the gully channels, as well as wall cracks that cause the longitudinal profile to be convex in some cases. Also, the transverse profiles of the gullies vary widely, indicating that the gullies are active, as the walls are gradually destroyed and fall inwards [31]. Figure 2 depicts two instances of gully erosion in this area.

2.2. Methodology

2.2.1. Gully Dataset

To provide a gully erosion inventory map in the Robat Turk watershed, we first carried out field surveys to identify V- and U-shaped gully occurrences. This revealed a total of 242 gully locations in the study area [32]. These gullies mostly occur in plains and on low slopes, where drainage density is high. In these areas, gully erosion and piping have developed due to the low vegetation cover and the high evaporation, which causes salt and gypsum to form in the soil [1,22]. The same number of non-gully locations was prepared for the training and validation process [1,29,37,38]. To separate the locations for training and validation purposes, a random partition algorithm was used [32,33,34,35,36]: of the 242 gully locations and the 242 non-gully locations, 70% were used for the training stage and 30% for the validation stage [27]. Figure 3 displays a flow diagram of the methodology implemented in this study to create the gully erosion susceptibility maps.
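The 70/30 random partition described above can be sketched in a few lines. This is only an illustration, not the authors' procedure: the point identifiers, the seeds, the per-class (stratified) split, and the choice to round the training share up are all assumptions. Rounding up gives 170 training points per class, which happens to agree with the 340 training observations summarized in the confusion matrix reported in the Results.

```python
import math
import random


def split_points(points, train_fraction=0.7, seed=0):
    """Randomly partition points into (training, validation) lists."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Assumption: the training share is rounded up to a whole point.
    n_train = math.ceil(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the mapped locations.
gully = [("gully", i) for i in range(242)]
non_gully = [("non_gully", i) for i in range(242)]

# Split each class separately so both sets stay balanced (assumed design).
gully_train, gully_valid = split_points(gully, seed=1)
non_train, non_valid = split_points(non_gully, seed=2)

train = gully_train + non_train
valid = gully_valid + non_valid
print(len(train), len(valid))
```

Splitting each class separately keeps the gully/non-gully ratio at 1:1 in both subsets; a single shuffle over all 484 points would only approximate that balance.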

2.2.2. Gully Erosion Geo-Environmental Factors

Based on previous studies [24,25,39,40,41] and the available data, hydrologic-geological-physiographic factors, including altitude, slope degree, slope aspect, plan curvature, profile curvature, distance to rivers, drainage density, distance to roads, lithology, land use, Normalized Difference Vegetation Index (NDVI), and annual mean rainfall were identified as important factors for gully erosion appraisals.
The Digital Elevation Model (DEM) of the study area was derived from ALOS PALSAR Global Radar Imagery with a spatial resolution of 12.5 m × 12.5 m [42] (Figure 4a).
Several layers, namely, slope degree, slope aspect, plan curvature, and profile curvature, were created using the ALOS PALSAR DEM. Slope is an influential variable in the gully erosion process due to its impact on surface flow and drainage density, and because it promotes the expansion of gullies [41]. The ArcGIS 10.5 software (developed by Environmental Systems Research Institute (ESRI) located in Redlands, California, USA) was used to prepare the slope map (Figure 4b), which was subsequently classified into five classes, namely 0–5, 5–12, 12–20, 20–30, and >30 degrees, according to Zabihi et al. [43].
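As a small illustration of the reclassification step, the sketch below assigns a slope value in degrees to one of the five classes listed above. The function name and the treatment of values that fall exactly on a class boundary (assigned to the lower class) are assumptions; the text does not state how ArcGIS was configured in this respect.

```python
import bisect

# Upper bounds (degrees) of the first four slope classes from the text.
SLOPE_BREAKS = [5, 12, 20, 30]
SLOPE_LABELS = ["0-5", "5-12", "12-20", "20-30", ">30"]


def slope_class(slope_deg):
    """Return the slope class label for a slope given in degrees."""
    if slope_deg < 0:
        raise ValueError("slope must be non-negative")
    # bisect_left places a value equal to a break in the lower class
    # (assumed convention).
    return SLOPE_LABELS[bisect.bisect_left(SLOPE_BREAKS, slope_deg)]


for value in (3.0, 5.0, 12.5, 29.9, 41.0):
    print(value, slope_class(value))
```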
Slope aspect is a significant variable due to its effect on the type of vegetation present in a given area. It controls the duration of sunlight, moisture, evaporation, and transpiration, and the distribution of vegetation that indirectly affects the erosion process [44]. The aspect map was extracted from the DEM and classified into nine classes (Figure 4c) according to categorical features.
Useful geomorphological information and morphological land descriptions can be defined by analyzing the slope shape [40]. The plan and profile curvatures affect the convergence or divergence of the flow [45]. Plan and profile curvatures were prepared from the DEM with pixel size 12.5 m × 12.5 m [46] using ArcGIS 10.5, and were subsequently divided into three classes, namely, concave, flat, and convex (Figure 4d–e).
Gullies are always associated with drainage networks [1]. In order to survey the effect of the stream network on gully erosion, the distance to rivers and drainage density factors were used (Figure 4f–g). Surface runoff is high in areas in which the drainage density is high. Drainage density can also affect the drainage pattern in an area, and its development depends on many variables, such as the structure and nature of the geological formation, soil features, vegetation conditions, penetration rate, and slope degree [47,48]. A drainage density map was developed in ArcGIS 10.5 using the Line Density tool. The distance to rivers factor was determined using the Euclidean Distance tool in ArcGIS 10.5.
Gully erosion depends on the lithological features of the surface material and the shape of the Earth's surface [25,49]. A geological map with a scale of 1:100,000 was used to prepare the lithology map [50]. The lithology map was divided into eight classes using ArcGIS 10.5 (Table 1, Figure 4h).
Land-use management has a significant impact on slope stability and the incidence of gully erosion. Generally, vegetation-free and sparsely vegetated areas are more sensitive to erosion than those with good vegetation coverage [37,40]. The land use map for this study was generated using Landsat 8 (OLI) imagery [51] processed in the ENVI 5.4 software (developed by Harris Geospatial Solutions located in Broomfield, Colorado, United States). The land-use classes identified in the region are bare land, rangeland, and agricultural areas (Figure 4i).
The normalized difference vegetation index (NDVI) quantifies vegetation by measuring the difference between near-infrared (which vegetation strongly reflects) and red light (which vegetation absorbs). Areas covered in dense vegetation often have fewer instances of gully erosion [52]. The NDVI map used for this study was generated from Landsat 8 imagery collected on 15 June 2017 (Figure 4j). The NDVI value is computed by the following equation:
$$ \mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}} $$
where near-infrared (NIR) is band 5 of Landsat 8 imagery, with a wavelength range of 0.845–0.885 μm, and Red is band 4 of Landsat 8 imagery, with a wavelength range of 0.63–0.68 μm. The index ranges from −1 to +1: high positive values indicate dense vegetation, values near zero indicate bare or sparsely vegetated surfaces, and negative values are typically associated with water.
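The NDVI formula can be applied pixel by pixel once the two bands are available as reflectance values. The following is a minimal sketch for a single pixel; the reflectance numbers are hypothetical and chosen only to show the sign of the index for contrasting surfaces.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances of Landsat 8 band 5 and band 4.
    Returns None where the index is undefined (both bands are zero).
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


# Hypothetical reflectance pairs (NIR, Red), for illustration only.
pixels = {
    "dense vegetation": (0.50, 0.08),
    "bare soil": (0.30, 0.25),
    "water": (0.02, 0.05),
}
for name, (nir, red) in pixels.items():
    print(name, round(ndvi(nir, red), 3))
```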
Some linear, manmade features, such as roads and canals, can promote gully erosion [1,18,53]. Improper road construction disrupts natural drainage and, as a result, expands erosion. Consequently, in areas with bare soil, inadequate construction can exacerbate gully erosion [54]. Therefore, a distance-from-roads map was created using the Euclidean Distance tool in ArcGIS 10.5 (Figure 4k), to be used in the creation of the gully erosion susceptibility map.
To prepare the rainfall data, three rain gauge stations (inside and outside the basin) were used. After checking the accuracy of different interpolation methods, the annual rainfall map of the Robat Turk watershed was produced by the Inverse Distance Weighting (IDW) method (Figure 4l). IDW is one of the most widely applied deterministic interpolation techniques in the environmental sciences. IDW estimates are based on known values at nearby locations. The weight assigned to each sample point decreases with its distance from the interpolation point, so that nearby points have more weight (and therefore more impact) than distant ones. The method thus assumes that the influence of a known sample point declines with distance [55].
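The IDW estimate described above is a weighted mean of the station values, with weights proportional to an inverse power of distance. The sketch below assumes a power of 2, which is a common default; the power used in the study is not stated. The three station coordinates and rainfall totals are hypothetical placeholders, not the actual gauge data.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    stations is a list of (sx, sy, value) tuples. The power parameter
    is an assumption; higher values give nearby stations more influence.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value  # exactly on a station: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical gauges: (x in km, y in km, annual rainfall in mm).
stations = [(0.0, 0.0, 200.0), (10.0, 0.0, 220.0), (0.0, 10.0, 240.0)]
print(round(idw(2.0, 3.0, stations), 1))
```

Because the weights are positive and sum to one after normalization, an IDW estimate always lies between the smallest and largest station values, so it cannot reproduce rainfall extremes beyond those observed.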

2.2.3. Gully Erosion Susceptibility Mapping Using Data Mining Methods

Random Forest (RF)

The RF technique is a modern, tree-based method that includes a multitude of classification and regression trees [56]. It is a nonparametric method that can model both continuous and discrete data. The main problem faced by individual decision trees is the fluctuation (high variance) in the results of each tree [41]. To reduce these fluctuations and the variance of the estimate, the random forest approach was proposed [57].
A random forest is a combination of several decision trees, each built from a bootstrap sample of the data, with a random subset of the input variables participating in the construction of each tree [58]. Using the bootstrap method, a large number of samples of size n are drawn with replacement from the initial observational data set [59]. For each bootstrap sample, about one-third of the data are left out as out-of-bag (OOB) data and used for model validation. A tree is then grown on each bootstrap sample [32,60]. At each split during tree construction, m variables are randomly chosen from among all M independent variables as candidates for the partition. For regression, the proposed ratio m/M is one-third; for classification, m = √M is proposed [61]. Once all trees have been built, their individual outputs are combined to determine the output of the forest [61]. By averaging these outcomes, the final output of the model is calculated, considering the empirical distribution of outputs, the percentile values, and the range of uncertainty.
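Two quantities in this description are easy to check numerically: the share of observations left out of a bootstrap sample, and the default number m of candidate variables per split. The sketch below is an illustration, not the randomForest implementation; the training-set size of 340 and the helper names are assumptions. For a sample of size n drawn with replacement, the expected OOB share is (1 − 1/n)^n, which is close to 37%, slightly more than the "about one-third" commonly quoted.

```python
import math
import random


def bootstrap_sample(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def oob_fraction(n, rng):
    """Share of the n observations absent from one bootstrap sample."""
    in_bag = set(bootstrap_sample(n, rng))
    return 1.0 - len(in_bag) / n


def default_mtry(n_predictors, task="classification"):
    """Rule-of-thumb m: sqrt(M) for classification, M/3 for regression."""
    if task == "classification":
        return max(1, math.floor(math.sqrt(n_predictors)))
    return max(1, n_predictors // 3)


rng = random.Random(0)
n_obs = 340  # assumed training-set size
mean_oob = sum(oob_fraction(n_obs, rng) for _ in range(200)) / 200
expected_oob = (1.0 - 1.0 / n_obs) ** n_obs
print(round(mean_oob, 3), round(expected_oob, 3))
print(default_mtry(12), default_mtry(12, task="regression"))
```

With the twelve conditioning factors used here, the rule of thumb gives m = 3 for classification. The tuned value can differ from this default, which is why m is usually treated as a parameter to optimize.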
In this research, the random forest model was fitted in the R 3.3.1 software (developed by R core team located in University of Auckland, Auckland, New Zealand) using the “randomForest” package [62]. The ArcGIS software was then used to map the gully erosion susceptibility, while the Gini index was used in R 3.3.1 to calculate the importance of the factors [63]. As a classifier, RF performs an implicit feature selection, relying on a small subset of “robust variables” for classification, which results in superior performance on subsequent data. The outcome of this implicit feature selection can be visualized with the Gini index, which can be used as a general indicator of the significance of each factor.
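To make the Gini-based importance concrete, the sketch below computes the Gini impurity of a set of binary labels and the impurity decrease produced by one split. In a random forest, such decreases are accumulated per variable over all splits and trees to give the importance score; the sketch covers only the single-split building block, and the toy labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_gully = sum(labels) / n
    return 1.0 - p_gully ** 2 - (1.0 - p_gully) ** 2


def gini_decrease(parent, left, right):
    """Impurity decrease when parent is split into left and right."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini(left) + len(right) / n * gini(right)
    )
    return gini(parent) - weighted_children


# Toy node with 4 gully (1) and 4 non-gully (0) observations.
parent = [0, 0, 0, 0, 1, 1, 1, 1]
good_split = ([0, 0, 0, 1], [0, 1, 1, 1])
poor_split = ([0, 0, 1, 1], [0, 0, 1, 1])
print(gini(parent))
print(gini_decrease(parent, *good_split))
print(gini_decrease(parent, *poor_split))
```

A variable that repeatedly yields large decreases (like the first split) ends up with a high importance score, whereas one whose splits leave the class mix unchanged (like the second) contributes nothing.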

K-Nearest Neighbor (KNN)

K-Nearest Neighbor belongs to the class of algorithms that can classify an unknown entity when data with specific properties (X) and the associated response values (Y) are available [64]. The KNN classifier is an instance-based, nonparametric learning algorithm. In the classification setting, the algorithm calculates the distance from the target point to its closest points, whose number is given by the value specified for K, and assigns the target point to the class that receives the most votes among these neighbors [65,66]. By bypassing density estimation and going directly to a decision rule, the KNN algorithm assumes that pixels near each other in the feature space ought to fall into one class [67].
This method is based on the calculation of the similarity (neighborhood) of the real-time prediction vector $X_r = \{x_{1r}, x_{2r}, x_{3r}, \ldots, x_{mr}\}$ to the predictor vector of each historical observation $X_t = \{x_{1t}, x_{2t}, x_{3t}, \ldots, x_{mt}\}$ through the Euclidean distance function ($D_{rt}$) as follows:
$$ D_{rt} = \sqrt{\sum_{i=1}^{m} w_i \left( x_{ir} - x_{it} \right)^2}, \quad t = 1, 2, \ldots, n $$
where $w_i$ ($i = 1, 2, \ldots, m$) is the weight of the $i$-th predictor, and the weights sum to one.
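The weighted Euclidean distance and the majority vote can be written out directly. The sketch below is a plain-Python illustration rather than the CARET implementation; the equal weights, the two-predictor toy observations, and the function names are assumptions made for the example.

```python
import math
from collections import Counter


def weighted_distance(x_r, x_t, weights):
    """Weighted Euclidean distance between two predictor vectors."""
    return math.sqrt(
        sum(w * (a - b) ** 2 for w, a, b in zip(weights, x_r, x_t))
    )


def knn_predict(x_r, history, weights, k):
    """Majority class among the k historical observations nearest to x_r.

    history is a list of (predictor_vector, label) pairs.
    """
    nearest = sorted(
        history, key=lambda item: weighted_distance(x_r, item[0], weights)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, already rescaled predictors; label 1 = gully, 0 = non-gully.
history = [
    ([0.1, 0.2], 0),
    ([0.2, 0.1], 0),
    ([0.9, 0.8], 1),
    ([0.8, 0.9], 1),
    ([0.7, 0.7], 1),
]
weights = [0.5, 0.5]  # assumed equal weights that sum to one
print(knn_predict([0.85, 0.85], history, weights, k=3))
print(knn_predict([0.0, 0.0], history, weights, k=3))
```

Because the distance mixes all predictors, variables measured on large scales would dominate unless the predictors are rescaled first, which is why the toy values are kept within the same range.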

2.2.4. Assessment of Data Mining Based Models

All gully and non-gully points in this area were divided into two groups to create gully erosion susceptibility maps: one group for modeling and one for validation. The accuracy and performance of the maps generated by the RF and KNN models were assessed using a receiver operating characteristic (ROC) curve. The ROC curve represents the ability of a model to forecast the occurrence and non-occurrence of a gully. The area under the ROC curve (AUC) is a threshold-independent performance measure, which is used in various studies to assess the predictive value of a model [68,69]. To calculate the ROC curve, absence points (i.e., places without gullies) were generated using the random extraction algorithm in ArcGIS 10.5, and these points were checked to ensure that they were not located on gully erosion features. The area under the ROC curve was computed for both the success rate and the prediction rate of the gully susceptibility maps [70]. AUC values range from 0.5 (no better than random) to 1 (perfect discrimination), whereby the higher the AUC, the higher the model’s accuracy [43].
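The AUC has a simple interpretation that can be computed without drawing the curve: it equals the probability that a randomly chosen gully point receives a higher susceptibility score than a randomly chosen non-gully point, with ties counted as one half. The sketch below uses this pairwise definition; the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve from the pairwise-ranking definition.

    labels use 1 for gully and 0 for non-gully points.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores at four validation points.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))
```

The double loop is quadratic in the number of points, which is fine for a few hundred validation locations; rank-based formulas give the same value more efficiently for large samples.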

3. Results

The KNN model was fitted in the R 3.3.1 software using the “CARET” package [71]. In this algorithm, there is a basic parameter, K, the number of neighbors, that needs to be optimized; it determines the voting system of the KNN algorithm, in which the K neighbors with the smallest distance are selected and then determine the model output. For this reason, k was tuned between 5 and 45 based on the accuracy of the model [65,66]. A final k value of 13 was obtained with an accuracy of 0.736 (Figure 5). Also, when assessing the effect of input variables on modeling, the results showed that rainfall, altitude, distance to rivers, and drainage density are the most important factors in the modeling process. In contrast, profile curvature, slope aspect, and plan curvature were recognized as the least significant factors in the modeling process, whereby plan curvature has no significance in modeling gully erosion susceptibility (Table 2).
A confusion matrix shows the performance of the random forest model in the training stage; this matrix is not used for overall model evaluation, but is useful when the detection accuracy of a particular class is more important than the overall detection accuracy. The results of the confusion matrix for the RF model are depicted in Table 3. Based on Table 3, the model predictions are shown with “1” and the observations with “0”. The results show that the training dataset and the model agree that there is no gully erosion for 137 observations, and that there is gully erosion for 145 observations. Nevertheless, there are 25 gully erosion observations that the model predicts as non-gully. Similarly, the model predicts gully erosion for 33 observations that are, in fact, non-gully.
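The four counts reported in this paragraph are enough to derive the usual summary measures. The sketch below does this arithmetic; mapping the counts to true/false positives and negatives follows the wording of the paragraph (gully taken as the positive class), which is an interpretation rather than something stated in Table 3 itself.

```python
# Counts from the training-stage confusion matrix of the RF model.
true_negative = 137   # non-gully, predicted non-gully
true_positive = 145   # gully, predicted gully
false_negative = 25   # gully, predicted non-gully
false_positive = 33   # non-gully, predicted gully

total = true_negative + true_positive + false_negative + false_positive
accuracy = (true_positive + true_negative) / total
sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)

print(total)                  # 340
print(round(accuracy, 3))     # 0.829
print(round(sensitivity, 3))  # 0.853
print(round(specificity, 3))  # 0.806
```

The two class totals (145 + 25 and 137 + 33) are both 170, so the training set is balanced and the overall accuracy is simply the mean of sensitivity and specificity.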
The results of the RF prioritization using Gini index are depicted in Table 2. The results show that rainfall (48.74), altitude (30.46), distance to roads (18.36), and distance to rivers (14.95) have the highest importance scores.
Table 4 shows the results of determining the best parameters of the random forest model in the training phase. The results of tuneRF indicated an mtry of 5, while the highest accuracy was found at an ntree of 235.
Figure 6 shows the misclassification (error) rate as a function of the number of trees grown in the random forest. The black line represents the error rate for the entire sample (out-of-bag), the green line represents the error rate for the non-gully class (0), and the red line represents the error rate for the gully class (1).
Finally, the gully erosion susceptibility maps were classified into four classes, namely, low, moderate, high, and very high susceptibility, using the “natural breaks method” in ArcGIS 10.5 (Figure 7 and Figure 8; Table 5) [27,28]. The natural breaks classification system is a method designed to arrange a set of values into natural classes. A natural class is the most suitable class range found naturally in a dataset; it consists of items with similar properties that form a natural group. This classification method seeks to minimize the average deviation of the values from their class mean while maximizing the separation from the other classes. In other words, it reduces the variance within classes and maximizes the variance between them [72].
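The criterion behind natural breaks can be illustrated with a small dynamic-programming search that splits sorted values into contiguous classes with the smallest total within-class sum of squared deviations. This is a sketch of the idea only: it is not the ArcGIS implementation, it is practical only for short lists, and the susceptibility scores below are hypothetical.

```python
def sum_squared_deviation(values):
    """Sum of squared deviations of values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, n_classes):
    """Split values into n_classes contiguous groups of sorted data,
    minimizing the total within-class sum of squared deviations."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")
    inf = float("inf")
    # best[k][j]: minimal cost of splitting data[:j] into k classes.
    best = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    best[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + sum_squared_deviation(data[i:j])
                if cost < best[k][j]:
                    best[k][j] = cost
                    cut[k][j] = i
    # Walk back through the stored cut positions to recover the classes.
    groups = []
    j = n
    for k in range(n_classes, 0, -1):
        i = cut[k][j]
        groups.append(data[i:j])
        j = i
    groups.reverse()
    return groups


# Hypothetical susceptibility scores, for illustration only.
scores = [0.10, 0.95, 0.42, 0.12, 0.70, 0.97, 0.40, 0.15, 0.72, 0.45, 0.99]
for label, group in zip(
    ["low", "moderate", "high", "very high"], natural_breaks(scores, 4)
):
    print(label, group)
```

The class boundaries fall in the widest gaps of the data rather than at equal intervals or equal counts, which is why the share of the study area in each class can differ strongly between two models.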
Based on the results of the KNN model, the classification of the susceptibility map resulted in the following class shares: low (13.23%), moderate (10.83%), high (42.16%), and very high (33.78%). On the other hand, the RF model resulted in different class coverage shares for the study area, namely, low (46.42%), moderate (15.42%), high (19.99%), and very high (18.18%).

Validation of Gully Erosion Susceptibility Maps

The results of the AUC/ROC validation indicated that both models produced good results. For the KNN model, the AUC was 0.809 (80.9%); for the RF model, it was 0.874 (87.4%) (Figure 9).

4. Discussion

In general, our results show that the RF and KNN data mining models have a reasonable accuracy for gully erosion susceptibility mapping in our study area, whereby the RF outperformed the KNN based on the AUC values. These results are consistent with the results of previous studies [42,73,74,75,76], which suggested that the RF model is a robust and well-functioning model. RF is an advanced technique in spatial sciences. One of its benefits is that it can handle many input variables and classes without deleting any variable, while retaining high predictive accuracy [41,77]. The RF model has the ability to use explanatory variables in the modeling process [77]. Its accuracy depends on various factors, such as the quantity, scale, type, and accuracy of the input data [41]. This technique can manage and model a large data set [77] and can combine repeated predictions of any phenomenon using multiple tree algorithms. The RF model can identify nonlinear relationships between independent and dependent variables [41,78]. Therefore, utilizing all the acceptable factors for a modeling task can increase the accuracy of the model. Compared to alternative models, the RF has a strong ability to handle a large range of data sets [41,77]. The RF model can be used to assess environmental problems and hazards for any given area.
The advantages of the RF model include the following: it is a robust and accurate machine learning algorithm that is considered a highly accurate classifier for many datasets; it runs efficiently on large databases; it can handle a very large number of input variables without eliminating any; it estimates which factors are effective in the classification; it generates an internal, unbiased estimate of the generalization error as the forest building progresses; and it provides an effective method for estimating missing data, retaining its validity when a large portion of the data is missing [79].
This study compared the results of the RF and KNN models for gully erosion susceptibility modeling, and determined that the KNN model has a lower validity than the RF method. Because the structure of the KNN model is determined entirely by the data, its accuracy can be severely degraded by the presence of noisy or irrelevant features. Naghibi et al. [80] and Naghibi and Moradi Dashtpagerdi [60] also showed that the KNN model has a lower efficiency than other methods in groundwater potential modeling.
The outcome of the variable importance analysis indicates that rainfall is an effective variable for modeling gully susceptibility. This is plausible because rainfall is the basic source of runoff and an initiating factor in gully erosion. Due to the high amount of rainfall in the region in December and the low vegetation cover, a rapid increase in gully erosion occurred in the area [3,76]. Rainfall on land can cause runoff and soil erosion [81,82]. Also, the spatiotemporal heterogeneity of rainfall plays an effective role in erosion [83]. The spatial distribution of rainfall is irregular in the studied region, and the rainfall regime is characterized by high intensity and short duration. These factors have led to an increase in erosion and contributed to the development of gullies. Altitude was another important factor in this study. Based on the results obtained, the locations at the lowest elevation are most susceptible to gully erosion, which is consistent with the finding of [42]. This may be because gully erosion mechanisms (incision, seepage, or piping) are more likely to occur in lowland areas. Therefore, this factor could be considered as a distinguishing variable in the susceptibility mapping of gully erosion. Areas closer to rivers have more developed drainage systems, which increases the water flow and speed, thereby resulting in gully erosion. This is reflected in the results of this study, as this factor is the third most important one in the modeling procedure. Visual inspection of the gully susceptibility map obtained by the RF model also shows that areas closer to the river system have a higher susceptibility to this erosion type. This can be explained by the fact that the river, through its undercutting action and erosion, disturbs the balance of the slopes overlooking the waterway, increasing the sensitivity to gully erosion along the river’s edge.
These results are consistent with the results of [22,84]. Dube et al. [85] also obtained similar results regarding the inverse influence of distance from a river on gully susceptibility. Roads are anthropogenic structures that artificially concentrate surface runoff and increase its speed, and can therefore intensify gully erosion. This has also been observed and supported by the results of the present study. Roads concentrate surface runoff and divert runoff from other basins into the basin. Hence, gully erosion increases after the construction of a road [54].

5. Conclusions

Due to the destructive nature of gully erosion, researchers and natural resource managers around the world have concentrated on mapping the susceptibility and assessing the hazards of gully erosion. In this research, we employed the RF and KNN data mining models to assess the effects of geo-environmental variables on gully erosion and to identify areas prone to this hazard. For this purpose, we evaluated twelve variables. The methodological framework used in this study has demonstrated that the appropriate choice of effective factors in gully erosion, along with the use of data-driven techniques, can correctly identify gully erosion-prone areas. The expansion and advancement of gully erosion have severe environmental impacts, including the destruction of fertile surface soil, damage to roads, damage to canal routes, and the loss of large volumes of rainfall as floodwater flowing through the gullies. In terms of human impact, the destruction of agricultural land caused by gully erosion can lead to increasing unemployment and the migration of villagers, and, in some cases, entire villages need to be evacuated due to other damage caused by the expansion of gullies. Identifying and predicting which areas may be susceptible to gully erosion can help reduce the destructive effects and prevent the future development of this type of erosion, and can provide significant assistance to the people of the study area. The susceptibility maps prepared using the RF and KNN methods are suitable tools for planning measures to protect land from gully erosion. Accordingly, this methodology can be used to assess gully erosion susceptibility in other, similar regions, especially those with the same arid and semi-arid climate.

Author Contributions

Conceptualization, M.A., S.J., S.A.N. and H.R.P.; methodology, M.A., S.J., S.A.N. and H.R.P.; software, M.A., S.J., S.K.B. and H.R.P.; validation, M.A., S.J., S.A.N., S.K.B. and H.R.P.; formal analysis, M.A., S.J., S.A.N. and T.B.; investigation, M.A., S.J., S.A.N., S.K.B., T.B. and H.R.P.; writing—original draft preparation, M.A., S.J., S.K.B., S.A.N., T.B. and H.R.P., writing—review and editing, S.A.N., H.R.P., T.B.; project administration, M.A., H.R.P. and T.B.; funding acquisition, T.B.

Funding

The study was funded by the Austrian Science Fund FWF through the GIScience Doctoral College (DK W 1237-N23) at the University of Salzburg. Open Access Funding by the Austrian Science Fund (FWF).

Acknowledgments

The study was supported by the College of Agriculture, Shiraz University (Grant No. 96GRD1M271143).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Conoscenti, C.; Angileri, S.; Cappadonia, C.; Rotigliano, E.; Agnesi, V.; Märker, M. Gully erosion susceptibility assessment by means of GIS-based logistic regression: A case of Sicily (Italy). Geomorphology 2014, 204, 399–411. [Google Scholar] [CrossRef] [Green Version]
  2. Bull, L.J.; Kirkby, M.J. (Eds.) Dryland Rivers: Hydrology and Geomorphology of Semi-Arid Channels; Wiley: Chichester, UK, 2002. [Google Scholar]
  3. Valentin, C.; Poesen, J.; Li, Y. Gully erosion: Impacts, factors and control. Catena 2005, 63, 132–153. [Google Scholar] [CrossRef]
  4. Shellberg, J.G.; Spencer, J.; Brooks, A.P.; Pietsch, T.J. Geomorphology Degradation of the Mitchell River fluvial megafan by alluvial gully erosion increased by post-European land use change, Queensland, Australia. Geomorphology 2016, 266, 105–120. [Google Scholar] [CrossRef]
  5. Dymond, J.R.; Herzig, A.; Basher, L.; Betts, H.D.; Marden, M.; Phillips, C.J.; Ausseil, A.-G.E.; Palmer, D.J.; Clark, M.; Roygard, J. Development of a New Zealand SedNet model for assessment of catchment-wide soil-conservation works. Geomorphology 2016, 257, 85–93. [Google Scholar] [CrossRef]
  6. Goodwin, N.R.; Armston, J.D.; Muir, J.; Stiller, I. Monitoring gully change: A comparison of airborne and terrestrial laser scanning using a case study from Aratula, Queensland. Geomorphology 2017, 282, 195–208. [Google Scholar] [CrossRef]
  7. El Maaoui, M.M.; Sfar Felfoul, M.; Boussema, M.R.; Shane, M.H. Sediment yield from irregularly shaped gullies located on the Fortuna lithologic formation in semi-arid area of Tunisia. Catena 2012, 93, 97–104. [Google Scholar] [CrossRef]
  8. Ionita, I.; Fullen, M.A.; Zgłobicki, W.; Poesen, J. Gully erosion as a natural and human-induced hazard. Nat. Hazards 2015, 79, 1–5. [Google Scholar] [CrossRef] [Green Version]
  9. Ibáñez, J.; Contador, J.F.L.; Schnabel, S.; Valderrama, J.M. Evaluating the influence of physical, economic and managerial factors on sheet erosion in rangelands of SW Spain by performing a sensitivity analysis on an integrated dynamic model. Sci. Total Environ. 2016, 544, 439–449. [Google Scholar] [CrossRef]
  10. Ekholm, P.; Lehtoranta, J. Does control of soil erosion inhibit aquatic eutrophication? J. Environ. Manag. 2012, 93, 140–146. [Google Scholar] [CrossRef]
  11. Moradi, H.; Avand, M.T.; Janizadeh, S. Landslide Susceptibility Survey Using Modeling Methods. In Spatial Modeling in GIS and R for Earth and Environmental Sciences, 1st ed.; Pourghasemi, H.R., Gokceoglu, C., Eds.; Elsevier: Amsterdam, The Netherlands, 2019; pp. 259–275. [Google Scholar]
  12. Fox, G.A.; Sheshukov, A.; Cruse, R.; Kolar, R.L.; Guertault, L.; Gesch, K.R.; Dutnell, R.C. Reservoir Sedimentation and Upstream Sediment Sources: Perspectives and Future Research Needs on Streambank and Gully Erosion. Environ. Manag. 2016, 57, 945–955. [Google Scholar] [CrossRef] [Green Version]
  13. Imeson, A.C.; Kwaad, F.J.P.M. Gully types and gully prediction. Geogr. Tydschr. 1980, 14, 430–441. [Google Scholar]
  14. Rijkee, P. Low-land Gully Formation in the Amhara Region, Ethiopia. Minor Master’s Thesis, Wageningen UR, Wageningen, The Netherlands, 15 December 2015. [Google Scholar]
  15. Barnes, N.; Luffman, I.; Nandi, A. Gully erosion and freeze-thaw processes in clay-rich soils, northeast Tennessee, USA. GeoResJ 2016, 9–12, 67–76. [Google Scholar] [CrossRef]
  16. Luffman, I.E.; Nandi, A.; Spiegel, T. Gully morphology, hillslope erosion, and precipitation characteristics in the Appalachian Valley and Ridge province, southeastern USA. Catena 2015, 133, 221–232. [Google Scholar] [CrossRef]
  17. Ollobarren, P.; Capra, A.; Gelsomino, A.; La Spada, C. Effects of ephemeral gully erosion on soil degradation in a cultivated area in Sicily (Italy). Catena 2016, 145, 334–345. [Google Scholar] [CrossRef]
  18. Poesen, J. Gully typology and gully control measures in the European Loess Belt. In Farm Land Erosion in Temperate Plains Environments and Hills; Wicherek, S., Ed.; Elsevier Science Publishers: Amsterdam, The Netherlands, 1993; pp. 221–239. [Google Scholar]
  19. Poesen, J.; Vandekerckhove, L.; Nachtergaele, J.; Oostwoud Wijdenes, D.J.; Verstraeten, G.; van Wesemael, B. Gully erosion in dryland environments. In Dryland Rivers: Hydrology and Geomorphology of Semi-Arid Channels; Bull, L.J., Kirkby, M.J., Eds.; Wiley: Chichester, UK, 2002; pp. 229–262. [Google Scholar]
  20. Nazari Samani, A.; Ahmadi, H.; Jafari, M.; Boggs, G.; Ghoddousi, J.; Malekian, A. Geomorphic threshold conditions for gully erosion in Southwestern Iran (Boushehr-Samal watershed). J. Asian Earth Sci. 2009, 35, 180–189. [Google Scholar] [CrossRef]
  21. McCloskey, G.L.; Wasson, R.J.; Boggs, G.S.; Douglas, M. Timing and causes of gully erosion in the riparian zone of the semi-arid tropical Victoria River, Australia: Management implications. Geomorphology 2016. [Google Scholar] [CrossRef]
  22. Gómez-Gutiérrez, Á.; Conoscenti, C.; Angileri, S.E.; Rotigliano, E.; Schnabel, S. Using topographical attributes to evaluate gully erosion proneness (susceptibility) in two mediterranean basins: Advantages and limitations. Nat. Hazards 2015, 79, 291–314. [Google Scholar] [CrossRef]
  23. Chaplot, V.; Coadou le Brozec, E.; Silvera, N.; Valentin, C. Spatial and temporal assessment of linear erosion in catchments under sloping lands of northern Laos. Catena 2005, 63, 167–184. [Google Scholar] [CrossRef]
  24. Conoscenti, C.; Angileri, S.; Cappadonia, C.; Rotigliano, E.; Agnesi, V.; Märker, M. A GIS-based approach for gully erosion susceptibility modelling: A test in Sicily, Italy. Environ. Earth Sci. 2013, 70, 1179–1195. [Google Scholar] [CrossRef]
  25. Lucà, F.; Conforti, M.; Robustelli, G. Comparison of GIS-based gullying susceptibility mapping using bivariate and multivariate statistics: Northern Calabria, South Italy. Geomorphology 2011, 134, 297–308. [Google Scholar] [CrossRef]
  26. Kornejady, A.; Ownegh, M.; Bahremand, A. Landslide susceptibility assessment using maximum entropy model with two different data sampling methods. Catena 2017, 152, 144–162. [Google Scholar] [CrossRef]
  27. Hosseinalizadeh, M.; Kariminejad, N.; Chen, W.; Pourghasemi, H.R.; Alinejad, M.; Behbahani, A.M.; Tiefenbacher, J.P. Spatial modelling of gully headcuts using UAV data and four best-first decision classifier ensembles (BFTree, Bag-BFTree, RS-BFTree, and RF-BFTree). Geomorphology 2019, 329, 184–193. [Google Scholar] [CrossRef]
  28. Rahmati, O.; Haghizadeh, A.; Pourghasemi, H.R.; & Noormohamadi, F. Gully erosion susceptibility mapping: The role of GIS-based bivariate statistical models and their comparison. Nat. Hazards 2016, 82, 1231–1258. [Google Scholar] [CrossRef]
  29. Angileri, S.E.; Conoscenti, C.; Hochschild, V.; Märker, M.; Rotigliano, E.; Agnesi, V. Water erosion susceptibility mapping by applying Stochastic Gradient Treeboost to the Imera Meridionale River Basin (Sicily, Italy). Geomorphology 2016, 262, 61–76. [Google Scholar] [CrossRef]
  30. Iranian Department of Water Resources Management of Markazi Province. Available online: http://marw.ir (accessed on 15 September 2017).
  31. Shadfar, S.; Davoodirad, A.A.; Peyrowan, H.R. Investigation and comparing gully erosion characteristics in agriculture and rangeland land uses, case study: Robat Tork watershed. J. Watershed Eng. Manag. 2013, 4, 217–222. [Google Scholar] [CrossRef]
  32. Davoodi Rad, A.A. Identification and study of gully erosion in the Robat Turk watershed 2015, Iranian Administration Department of Natural Resources of Markazi Province. Project code 9003-29-29-0. Available online: http://markazi.frw.ir (accessed on 22 June 2016).
  33. Golkarian, A.; Naghibi, S.A.; Kalantar, B.; Pradhan, B. Groundwater potential mapping using C5.0, random forest, and multivariate adaptive regression spline models in GIS. Environ. Monit. Assess. 2018, 190. [Google Scholar] [CrossRef]
  34. Oh, H.-J.; Pradhan, B. Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area. Comput. Geosci. 2011, 37, 1264–1276. [Google Scholar] [CrossRef]
  35. Naghibi, S.A.; Ahmadi, K.; Daneshi, A. Application of Support Vector Machine, Random Forest, and Genetic Algorithm Optimized Random Forest Models in Groundwater Potential Mapping. Water. Resour. Manag. 2017, 31, 2761–2775. [Google Scholar] [CrossRef]
  36. Naghibi, S.A.; Moghaddam, D.D.; Kalantar, B.; Pradhan, B.; Kisi, O. A comparative assessment of GIS-based data mining models and a novel ensemble model in groundwater well potential mapping. J. Hydrol. 2017, 548, 471–483. [Google Scholar] [CrossRef]
  37. Cama, M.; Lombardo, L.; Conoscenti, C.; Rotigliano, E. Improving transferability strategies for debris flow susceptibility assessment: Application to the Saponara and Itala catchments (Messina, Italy). Geomorphology 2017, 288, 52–65. [Google Scholar] [CrossRef]
  38. Rahmati, O.; Tahmasebipour, N.; Haghizadeh, A.; Pourghasemi, H.R.; Feizizadeh, B. Evaluation of different machine learning models for predicting and mapping the susceptibility of gully erosion. Geomorphology 2017, 298, 118–137. [Google Scholar] [CrossRef]
  39. Zakerinejad, R.; Maerker, M. An integrated assessment of soil erosion dynamics with special emphasis on gully erosion in the Mazayjan basin, southwestern Iran. Nat. Hazards 2015, 79, 25–50. [Google Scholar] [CrossRef]
  40. Rahmati, O.; Tahmasebipour, N.; Haghizadeh, A.; Pourghasemi, H.R.; Feizizadeh, B. Evaluating the influence of geo-environmental factors on gully erosion in a semi-arid region of Iran: An integrated framework. Sci. Total Environ. 2017, 579, 913–927. [Google Scholar] [CrossRef] [PubMed]
  41. Arabameri, A.; Rezaei, K.; Pourghasemi, H.R.; Lee, S.; Yamani, M. GIS-based gully erosion susceptibility mapping: A comparison among three data-driven models and AHP knowledge-based technique. Environ. Earth Sci. 2018, 77, 1–22. [Google Scholar] [CrossRef]
  42. Chunxia, Z.; Linlin, G.; Dongchen, E.; & Hsingchung, C. A case study of using external DEM in insar DEM generation. Geo. Spat. Inf. Sci. 2005, 8, 14–18. [Google Scholar] [CrossRef]
  43. Zabihi, M.; Mirchooli, F.; Motevalli, A.; Khaledi Darvishan, A.; Pourghasemi, H.R.; Zakeri, M.A.; Sadighi, F. Spatial modelling of gully erosion in Mazandaran Province, northern Iran. Catena 2018, 161, 1–13. [Google Scholar] [CrossRef]
  44. Jaafari, A.; Najafi, A.; Pourghasemi, H.R.; Rezaeian, J.; Sattarian, A. GIS-based frequency ratio and index of entropy models for landslide susceptibility assessment in the Caspian forest, northern Iran. Int. J. Environ. Sci. Technol. 2014, 11, 909–926. [Google Scholar] [CrossRef] [Green Version]
  45. Yilmaz, C.; Topal, T.; Süzen, M.L. GIS-based landslide susceptibility mapping using bivariate statistical analysis in Devrek (Zonguldak-Turkey). Environ. Earth Sci. 2012, 65, 2161–2178. [Google Scholar] [CrossRef]
  46. Alaska Satelatite Facility. Available online: https://vertex.daac.asf.alaska.edu/# (accessed on 22 December 2010).
  47. Manap, M.A.; Nampak, H.; Pradhan, B.; Lee, S.; Sulaiman, W.N.A.; Ramli, M.F. Application of probabilistic-based frequency ratio model in groundwater potential mapping using remote sensing data and GIS. Arab. J. Geosci. 2014, 7, 711–724. [Google Scholar] [CrossRef]
  48. Pourghasemi, H.R.; Moradi, H.R.; Fatemi Aghda, S.M.; Gokceoglu, C.; Pradhan, B. GIS-based landslide susceptibility mapping with probabilistic likelihood ratio and spatial multi-criteria evaluation models (North of Tehran, Iran). Arab. J. Geosci. 2014, 7, 1857–1878. [Google Scholar] [CrossRef]
  49. Pourghasemi, H.R.; Kerle, N. Random forests and evidential belief function-based landslide susceptibility assessment in Western Mazandaran Province, Iran. Environ. Earth Sci. 2016, 75. [Google Scholar] [CrossRef]
  50. Geological Survey and Mineral Exploration Organization of Iran. 2018. Available online: https://gsi.ir/fa (accessed on 11 November 2018).
  51. United States Geological Survey. Available online: https://earthexplorer.usgs.gov (accessed on 25 June 2017).
  52. Pourghasemi, H.R.; Youse, S.; Kornejady, A.; Cerdà, A. Performance assessment of individual and ensemble data-mining techniques for gully erosion modeling. Sci. Total Environ. 2017, 609, 764–775. [Google Scholar] [CrossRef] [Green Version]
  53. Jungerius, P.D.; Matundura, J.; van de Ancker, J.A.M. Road construction and gully erosion in West Pokot, Kenya. Earth Surf. Process. Landf. 2002, 27, 1237–1247. [Google Scholar] [CrossRef]
  54. Nyssen, J.; Poesen, J.; Moeyersons, J.; Luyten, E.; Veyret-Picot, M.; Deckers, J.; Haile, M.; Govers, G. Impact of road building on gully erosion risk: A case study from the Northern Ethiopian Highlands. Earth Surf. Process. Landf. 2002, 27, 1267–1283. [Google Scholar] [CrossRef]
  55. Bhunia, G.S.; Shit, P.K.; Maiti, R. Comparison of GIS-based interpolation methods for spatial distribution of soil organic carbon (SOC). J. Saudi Soc. Agric. Sci. 2018, 17, 114–126. [Google Scholar] [CrossRef] [Green Version]
  56. Loh, W.Y. Classification and regression trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 14–23. [Google Scholar] [CrossRef]
  57. Kim, J.-C.; Lee, S.; Jung, H.-S.; Lee, S. Landslide susceptibility mapping using random forest and boosted tree models in Pyeong-Chang, Korea. Geocarto Int. 2018, 33, 1000–1015. [Google Scholar] [CrossRef]
  58. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  59. Micheletti, N.; Foresti, L.; Robert, S.; Leuenberger, M.; Pedrazzini, A.; Jaboyedoff, M.; Kanevski, M. Machine Learning Feature Selection Methods for Landslide Susceptibility Mapping. Math. Geosci. 2014, 46, 33–57. [Google Scholar] [CrossRef]
  60. Naghibi, S.A.; Moradi Dashtpagerdi, M. Evaluation of four supervised learning methods for groundwater spring potential mapping in Khalkhal region (Iran) using GIS-based features. Hydrogeol. J. 2017, 25, 169–189. [Google Scholar] [CrossRef]
  61. Svetnik, V.; Liaw, A.; Tong, C.; Culberson, J.C.; Sheridan, R.P.; Feuston, B.P. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling. J. Chem. Inf. Model. 2003, 43, 1947–1958. [Google Scholar] [CrossRef]
  62. R Core Team. R: A Language and Environment for Statistical Computing. 2018. The R Project for Statistical Computing. Available online: https://www.r-project.org (accessed on 22 June 2016).
  63. Breiman, L. Classification and Regression Trees, 1st ed.; Routledge: New York, NY, USA, 1984. [Google Scholar] [CrossRef]
  64. Mitchell, T.M. Machine Learning; McGraw-Hill: New York, NY, USA, 1997. [Google Scholar]
  65. Betrie, G.D.; Tesfamariam, S.; Morin, K.A.; Sadiq, R. Predicting copper concentrations in acid mine drainage: A comparative analysis of five machine learning techniques. Environ. Monit. Assess. 2013, 185, 4171–4182. [Google Scholar] [CrossRef]
  66. Naghibi, S.A.; Vafakhah, M.; Hashemi, H.; Pradhan, B.; Alavi, S.J. Water Resources Management Through Flood Spreading Project Suitability Mapping Using Frequency Ratio, k-nearest Neighbours, and Random Forest Algorithms. Nat. Resour. Res. 2019, 1–19. [Google Scholar] [CrossRef]
  67. Araghinejad, S. Data-Driven Modeling: Using MATLAB® in Water Resources and Environmental Engineering; Springer: Dordrecht, The Netherlands, 2013; Volume 67. [Google Scholar] [CrossRef]
  68. Park, S.; Choi, C.; Kim, B.; Kim, J. Landslide susceptibility mapping using frequency ratio, analytic hierarchy process, logistic regression, and artificial neural network methods at the Inje area, Korea. Environ. Earth Sci. 2013, 68, 1443–1464. [Google Scholar] [CrossRef]
  69. Razandi, Y.; Pourghasemi, H.R.; Neisani, N.S.; Rahmati, O. Application of analytical hierarchy process, frequency ratio, and certainty factor models for groundwater potential mapping using GIS. Earth Sci. Inform. 2015, 8, 867–883. [Google Scholar] [CrossRef]
  70. Yesilnacar, E.K. The Application of Computational Intelligence to Landslide Susceptibility Mapping in Turkey. Ph.D. Thesis, University of Melbourne, Melbourne, Australia, 2005. [Google Scholar]
  71. Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 2008, 28, 1–26. [Google Scholar] [CrossRef]
  72. Jenks, G.F. The Data Model Concept in Statistical Mapping. Int. Yearb. Cartogr. 1967, 7, 186–190. [Google Scholar]
  73. Chen, W.; Pourghasemi, H.R.; Naghibi, S.A. A comparative study of landslide susceptibility maps produced using support vector machine with different kernel functions and entropy data mining models in China. Bull. Eng. Geol. Environ. 2018, 77, 647–664. [Google Scholar] [CrossRef]
  74. Lee, S.; Kim, J.-C.; Jung, H.-S.; Lee, M.J.; Lee, S. Spatial prediction of flood susceptibility using random-forest and boosted-tree models in Seoul metropolitan city, Korea. Geomat. Nat. Haz. Risk. 2017, 8, 1185–1203. [Google Scholar] [CrossRef] [Green Version]
  75. Nicodemus, K.K. Letter to the Editor: On the stability and ranking of predictors from random forest variable importance measures. Brief. Bioinform. 2011, 12, 369–373. [Google Scholar] [CrossRef] [Green Version]
  76. Rizeei, H.M.; Pradhan, B.; Saharkhiz, M.A. An integrated fluvial and flash pluvial model using 2D high-resolution sub-grid and particle swarm optimization-based random forest approaches in GIS. Complex Intell. Syst. 2019, 5, 283–302. [Google Scholar] [CrossRef]
  77. Zhang, X.; Fan, J.; Liu, Q.; Xiong, D. The contribution of gully erosion to total sediment production in a small watershed in Southwest China. Phys. Geogr. 2018, 39, 246–263. [Google Scholar] [CrossRef]
  78. Garosi, Y.; Sheklabadi, M.; Conoscenti, C.; Pourghasemi, H.R.; Van Oost, K. Assessing the performance of GIS-based machine learning models with different accuracy measures for determining susceptibility to gully erosion. Sci. Total Environ. 2019, 664, 1117–1132. [Google Scholar] [CrossRef]
  79. Kantardzic, M. Data Mining: Concepts, Models, Methods, and Algorithms, 2nd ed.; Wiley-IEEE Press: Hoboken, NJ, USA, 2011. [Google Scholar]
  80. Naghibi, S.A.; Pourghasemi, H.R.; Abbaspour, K. A comparison between ten advanced and soft computing models for groundwater qanat potential assessment in Iran using R and GIS. Theor. Appl. Climatol. 2018, 131, 967–984. [Google Scholar] [CrossRef]
  81. Kinnell, P.I.A. Raindrop-impact-induced erosion processes and prediction: A review. Hydrol. Process. 2005, 19, 2815–2844. [Google Scholar] [CrossRef]
  82. Van Dijk, A.I.J.M.; Bruijnzeel, L.A.; Rosewell, C.J. Rainfall intensity–kinetic energy relationships: A critical literature appraisal. J. Hydrol. 2002, 261, 1–23. [Google Scholar] [CrossRef]
  83. Endale, D.M.; Fisher, D.S.; Steiner, J.L. Hydrology of a zero-order Southern Piedmont watershed through 45 years of changing agricultural land use. Part 1. Monthly and seasonal rainfall-runoff relationships. J. Hydrol. 2006, 316, 1–12. [Google Scholar] [CrossRef]
  84. Azareh, A.; Rahmati, O.; Rafiei-Sardooi, E.; Sankey, J.B.; Lee, S.; Shahabi, H.; Ahmad, B.B. Modelling gully-erosion susceptibility in a semi-arid region, Iran: Investigation of applicability of certainty factor and maximum entropy models. Sci. Total Environ. 2019, 655, 684–696. [Google Scholar] [CrossRef]
  85. Dube, F.; Nhapi, I.; Murwira, A.; Gumindoga, W.; Goldin, J.; Mashauri, D.A. Potential of weight of evidence modelling for gully erosion hazard assessment in Mbire District – Zimbabwe. Phys. Chem. Earth 2014, 67–69, 145–152. [Google Scholar] [CrossRef]
Figure 1. (A) The Markazi and Isfahan Provinces in Iran, (B) the study area within the Markazi and Isfahan Provinces, and (C) gully erosion locations on a DEM of the Robat Turk watershed.
Figure 2. Two gullies in the study area, formed in agricultural land (right) and rangeland (left).
Figure 3. Flowchart of the research methodology used to produce the gully erosion susceptibility map.
Figure 4. Gully erosion geo-environmental factor maps of the study area: (a) DEM, (b) slope degree, (c) slope aspect, (d) plan curvature, (e) profile curvature, (f) distance to rivers, (g) drainage density, (h) lithology, (i) land use, (j) NDVI, (k) distance to roads, and (l) annual mean rainfall.
Figure 5. The results of cross-validation in the KNN model.
Figure 6. Misclassification/error rate as a function of trees grown in RF.
Figure 7. Gully erosion susceptibility map using the RF model.
Figure 8. Gully erosion susceptibility map using the KNN model.
Figure 9. The ROC curves of the KNN and RF models used to map the study area.
Table 1. Lithology available in Robat Turk watershed.

| Row | Code | Lithology | Geological Age |
|---|---|---|---|
| 1 | Qft2 | Low level pediment fan and valley terrace deposits | Quaternary |
| 2 | Plc | Polymictic conglomerate and sandstone | Pliocene |
| 3 | pCk | Dull green grey salty shales with subordinate intercalation of quartzitic sandstone (KAHAR FM; Morad series and Kalmard Formation) | Pre-Cambrian |
| 4 | Ekgy | Gypsum | Late Eocene |
| 5 | E2l | Nummulitic limestone | Eocene |
| 6 | pCmt2 | Low-grade, regional metamorphic rocks (Green Schist Facies) | Pre-Cambrian |
| 7 | OMql | Massive to thick-bedded reefal limestone | Oligocene-Miocene |
| 8 | Pd | Red sandstone and shale with subordinate sandy limestone (Dorud Formation) | Permian |
Table 2. Relative importance of the gully erosion conditioning variables in the KNN and RF models.

| Variable | Importance (KNN) | Importance (RF) |
|---|---|---|
| Rainfall | 100.00 | 48.74 |
| Altitude | 74.35 | 30.46 |
| Distance from rivers | 50.64 | 14.95 |
| Drainage density | 30.11 | 6.40 |
| Distance from road | 19.39 | 18.36 |
| Land use | 17.56 | 2.18 |
| NDVI | 5.66 | 8.92 |
| Slope | 5.63 | 6.32 |
| Lithology | 4.54 | 4.07 |
| Profile curvature | 1.23 | 2.70 |
| Slope aspect | 0.92 | 4.82 |
| Plan curvature | 0.00 | 4.94 |
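The two importance columns in Table 2 are on different scales (the KNN column runs from 0.00 to 100.00, while the RF column does not), so the variables are easier to compare by rank than by raw score. The following is a minimal Python sketch of that comparison; the names `IMPORTANCE`, `ranked`, and `moved` are illustrative assumptions and are not part of the original analysis.

```python
# Importance scores copied from Table 2 as (KNN, RF) pairs.
IMPORTANCE = {
    "Rainfall": (100.00, 48.74),
    "Altitude": (74.35, 30.46),
    "Distance from rivers": (50.64, 14.95),
    "Drainage density": (30.11, 6.40),
    "Distance from road": (19.39, 18.36),
    "Land use": (17.56, 2.18),
    "NDVI": (5.66, 8.92),
    "Slope": (5.63, 6.32),
    "Lithology": (4.54, 4.07),
    "Profile curvature": (1.23, 2.70),
    "Slope aspect": (0.92, 4.82),
    "Plan curvature": (0.00, 4.94),
}


def ranked(model_index):
    """Return variable names sorted from most to least important.

    model_index 0 selects the KNN column, 1 selects the RF column.
    """
    return sorted(
        IMPORTANCE,
        key=lambda name: IMPORTANCE[name][model_index],
        reverse=True,
    )


knn_rank = ranked(0)
rf_rank = ranked(1)

# Variables whose rank position differs between the two models.
moved = [
    name for name in IMPORTANCE
    if knn_rank.index(name) != rf_rank.index(name)
]
```

Under this ranking, rainfall and altitude hold the first two positions in both models, while the third position is distance from rivers for KNN and distance from road for RF.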
Table 3. Confusion matrix of the random forest (RF) model (0 = no gully, 1 = gully).

| Observation | Predicted 0 | Predicted 1 | Class Error |
|---|---|---|---|
| 0 | 137 | 33 | 0.1941 |
| 1 | 25 | 145 | 0.1470 |
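The class-error column of Table 3 can be recovered from the counts: for each observed class, it is the number of samples assigned to the other class divided by the row total. The sketch below, written in Python for illustration, reproduces that arithmetic. It assumes rows are observed classes and columns are predicted classes, as the table header indicates. The overall accuracy it derives is not reported in the table and is computed here only from the four counts.

```python
# Counts copied from Table 3; rows = observed class, columns = predicted class.
confusion = {
    0: {0: 137, 1: 33},   # observed "no gully"
    1: {0: 25, 1: 145},   # observed "gully"
}


def class_error(matrix, observed):
    """Fraction of samples of the observed class assigned to another class."""
    row = matrix[observed]
    total = sum(row.values())
    wrong = total - row[observed]
    return wrong / total


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal of the matrix."""
    correct = sum(matrix[c][c] for c in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


errors = {c: class_error(confusion, c) for c in confusion}
accuracy = overall_accuracy(confusion)
```

With these counts, the class errors are 33/170 and 25/170, which match the tabulated 0.1941 and 0.1470 to the precision shown, and 282 of the 340 samples lie on the diagonal.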
Table 4. Best parameters extracted for the RF model.

| Node Size | mtry | Trees | Best Tree |
|---|---|---|---|
| 5 | 5 | 700 | 235 |
Table 5. Percentage distribution of susceptibility classes in the KNN and RF models.

| GPM Zones | RF Range | RF Area (%) | KNN Range | KNN Area (%) |
|---|---|---|---|---|
| Low | <0.217 | 46.42 | <0.2 | 13.23 |
| Moderate | 0.217–0.45 | 15.42 | 0.2–0.5 | 10.83 |
| High | 0.45–0.677 | 19.99 | 0.5–0.8 | 42.16 |
| Very high | >0.677 | 18.18 | >0.8 | 33.78 |
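The class ranges in Table 5 amount to a lookup from a susceptibility index to one of four classes, with different break values for each model. A minimal Python sketch of that lookup follows. The function name `susceptibility_class` and the tie-handling rule are assumptions made for illustration, since the table does not state which class a value exactly equal to a break belongs to.

```python
from bisect import bisect_right

# Class break values copied from Table 5 for each model.
BREAKS = {
    "RF": [0.217, 0.45, 0.677],
    "KNN": [0.2, 0.5, 0.8],
}
LABELS = ["Low", "Moderate", "High", "Very high"]


def susceptibility_class(value, model):
    """Map a susceptibility index in [0, 1] to its Table 5 class.

    Assumption: a value exactly equal to a break falls into the
    higher class; the table does not say how ties are handled.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("susceptibility index must lie in [0, 1]")
    # bisect_right counts how many breaks are <= value, which is the
    # index of the class the value belongs to.
    return LABELS[bisect_right(BREAKS[model], value)]


example = [susceptibility_class(v, "RF") for v in (0.10, 0.30, 0.50, 0.90)]
```

Because the breaks differ between models, the same index can land in different classes: a value of 0.46 is "High" under the RF breaks but "Moderate" under the KNN breaks.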
