Gullying is a type of soil erosion that currently represents a major threat at the societal scale and will likely increase in the future. In Iran, soil erosion, and specifically gullying, is already causing significant distress to local economies by affecting agricultural productivity and infrastructure. Recognition of this threat has recently led the Iranian geomorphology community to focus on the problem across the whole country. This study is in line with other efforts that seek the optimal method for mapping gully-prone areas by testing state-of-the-art machine learning tools. We compare the performance of three machine learning algorithms, namely Fisher’s linear discriminant analysis (FLDA), logistic model tree (LMT) and naïve Bayes tree (NBTree). We also introduce three novel ensemble models by combining these base classifiers with the Random SubSpace (RS) meta-classifier, namely RS-FLDA, RS-LMT and RS-NBTree. The area under the receiver operating characteristic curve (AUROC), the true skill statistic (TSS) and the kappa coefficient are computed on the calibration (goodness-of-fit) and validation (prediction accuracy) datasets to compare the performance of the different algorithms. In addition to susceptibility mapping, we study the association between gully erosion and a set of morphometric, hydrologic and thematic properties by adopting the evidential belief function (EBF). The results indicate that hydrology-related factors contribute the most to gully formation, which is also confirmed by the susceptibility patterns displayed by the RS-NBTree ensemble. RS-NBTree outperforms the other five models, as indicated by the prediction accuracy (area under curve (AUC) = 0.898, Kappa = 0.748 and TSS = 0.697) and goodness-of-fit (AUC = 0.780, Kappa = 0.682 and TSS = 0.618). All analyses are performed with the same balanced gully presence/absence modeling design, so the differences in performance can be attributed to the algorithm architecture. Overall, the EBF model can detect strong and reasonable dependencies towards gully-prone conditions. The RS-NBTree ensemble model performed significantly better than the others, suggesting greater flexibility towards unknown data, which may support the application of these methods in transferable susceptibility models for areas that are potentially erodible but currently lack gully data.
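The three evaluation criteria named above (AUROC, TSS and kappa) can be illustrated with a minimal sketch. This is not the implementation used in the study, whose software is not stated here; the function names, the toy labels and scores, and the 0.5 classification threshold are all assumptions made for illustration. Labels use 1 for gully presence and 0 for absence.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = gully present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def true_skill_statistic(tp: int, fp: int, fn: int, tn: int) -> float:
    """TSS = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random presence outranks a random absence.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical balanced sample: four presence and four absence locations.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

counts = confusion_counts(toy_labels, toy_preds)
print("TSS:", true_skill_statistic(*counts))
print("Kappa:", cohen_kappa(*counts))
print("AUROC:", auroc(toy_labels, toy_scores))
```

AUROC is computed from the continuous susceptibility scores, whereas TSS and kappa depend on the threshold used to convert scores into presence/absence classes, so the latter two can change if a different cut-off is chosen.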
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.