MDPI - Publisher of Open Access Journals

28 pages, 1624 KB

Open Access Article

Domain-Constrained Stacking Framework for Credit Default Prediction

by Ming-Liang Ding, Yu-Liang Ma and Fu-Qiang You

Mathematics 2025, 13(21), 3451; https://doi.org/10.3390/math13213451 - 29 Oct 2025

Viewed by 1604

Accurate and reliable credit risk classification is fundamental to the stability of financial systems and the efficient allocation of capital. However, with the rapid expansion of customer information in both volume and complexity, traditional rule-based or purely statistical approaches have become increasingly inadequate. Motivated by these challenges, this study introduces a domain-constrained stacking ensemble framework that systematically integrates business knowledge with advanced machine learning techniques. First, domain heuristics are embedded at multiple stages of the pipeline: threshold-based outlier removal improves data quality, target variable redefinition ensures consistency with industry practice, and feature discretization with monotonicity verification enhances interpretability. Then, each variable is transformed through Weight-of-Evidence (WOE) encoding and evaluated via Information Value (IV), which enables robust feature selection and effective dimensionality reduction. Next, on this transformed feature space, we train logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), and a two-layer stacking ensemble. Finally, the ensemble aggregates cross-validated out-of-fold predictions from LR, RF and XGBoost as meta-features, which are fused by a meta-level logistic regression, thereby capturing both linear and nonlinear relationships while mitigating overfitting. Experimental results across two credit datasets demonstrate that the proposed framework achieves superior predictive performance compared with single models, highlighting its potential as a practical solution for credit risk assessment in real-world financial applications.
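
The WOE encoding and IV scoring named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the common credit-scoring convention WOE = ln(share of non-events / share of events) per bin, and the function name `woe_iv`, the additive smoothing constant, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict

def woe_iv(bins, targets, smoothing=0.5):
    """Return ({bin: woe}, iv) for a binned feature and a 0/1 target.

    Assumed convention: WOE = ln(share of non-events / share of events).
    Additive smoothing (an assumption) avoids log(0) for empty cells.
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for b, y in zip(bins, targets):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1
    labels = sorted(set(events) | set(non_events))
    total_events = sum(events.values()) + smoothing * len(labels)
    total_non_events = sum(non_events.values()) + smoothing * len(labels)
    woe, iv = {}, 0.0
    for b in labels:
        p_event = (events[b] + smoothing) / total_events
        p_non = (non_events[b] + smoothing) / total_non_events
        woe[b] = math.log(p_non / p_event)
        # Each bin contributes (difference in shares) * WOE to the IV.
        iv += (p_non - p_event) * woe[b]
    return woe, iv

# Toy data (hypothetical): the "low" bin is mostly defaults (1),
# the "high" bin is mostly non-defaults (0).
bins = ["low"] * 4 + ["high"] * 4
targets = [1, 1, 1, 0, 0, 0, 0, 1]
woe, iv = woe_iv(bins, targets)
```

Under this convention a bin dominated by defaults receives a negative WOE, and a feature whose bins separate the two classes more strongly receives a larger IV, which is the quantity a feature-selection step would rank on.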

14 pages, 1436 KB

Open Access Article

Defect Prediction for Capacitive Equipment in Power System

by Qingjun Peng, Zezhong Zheng and Hao Hu

Appl. Sci. 2024, 14(5), 1968; https://doi.org/10.3390/app14051968 - 28 Feb 2024

Cited by 5 | Viewed by 1453

Abstract

As a core component of the smart grid, capacitive equipment plays a critical role in modern power systems. When defects occur, they pose a significant threat to the safety of both other equipment and personnel. Hence, it is important to predict in advance whether defects will occur in capacitive equipment. To achieve this goal, we propose a novel method that integrates weight of evidence (WOE) feature encoding with machine learning (ML). Five models, namely support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), multi-layer perceptron (MLP), and linear classification, are employed with WOE features for defect prediction. Furthermore, for equipment predicted to have defects, an additional prediction is conducted to determine the potential defect level. Experimental results demonstrate that the performance of each algorithm improves significantly with WOE-encoded features. In particular, the RF model with WOE-encoded features performs best. In conclusion, the proposed method offers a promising solution for predicting the occurrence of defects and the corresponding defect levels of capacitive equipment. It enables relevant personnel to focus on and inspect equipment predicted to be at risk of defects, thereby preventing major malfunctions.

31 pages, 4651 KB

Open Access Article

An Integrated Grassland Fire-Danger-Assessment System for a Mountainous National Park Using Geospatial Modelling Techniques

by Olga D. Mofokeng, Samuel A. Adelabu and Colbert M. Jackson

Fire 2024, 7(2), 61; https://doi.org/10.3390/fire7020061 - 19 Feb 2024

Cited by 8 | Viewed by 5086

Abstract

Grasslands are key to the Earth’s system and provide crucial ecosystem services. The degradation of the grassland ecosystem in South Africa is increasing alarmingly, and fire is regarded as one of the major culprits. Globally, anthropogenic climate changes have altered fire regimes in the grassland biome. Integrated fire-risk assessment systems provide an integral approach to fire prevention and mitigate the negative impacts of fire. However, fire-risk assessment is extremely challenging, owing to the myriad of factors that influence fire ignition and behaviour. Most fire danger systems do not consider fire causes; therefore, they are inadequate in validating the estimation of fire danger. Thus, fire danger assessment models should include the potential causes of fire. Understanding the key drivers of fire occurrence is essential to the sustainable management of South Africa’s grassland ecosystems. Therefore, this study explored six statistical and machine learning models in Google Earth Engine (GEE), namely the frequency ratio (FR), weight of evidence (WoE), logistic regression (LR), decision tree (DT), random forest (RF), and support vector machine (SVM), to assess fire danger in an Afromontane grassland protected area (PA). The area under the receiver operating characteristic curve (ROC/AUC) results revealed that DT showed the highest precision on model fit and success rate, while WoE recorded the highest prediction rate (AUC = 0.74). The WoE model showed that 53% of the study area is susceptible to fire. The land surface temperature (LST) and vegetation condition index (VCI) were the most influential factors. Corresponding analysis suggested that the fire regime of the study area is fuel-dominated. Thus, fire danger management strategies within the Golden Gate Highlands National Park (GGHNP) should include fuel management aimed at correctly weighing the effects of fuel on fire ignition and spread.

(This article belongs to the Special Issue Remote Sensing of Wildfire: Regime Change and Disaster Response)

28 pages, 51846 KB

Open Access Article

Landslide Susceptibility Mapping and Interpretation in the Upper Minjiang River Basin

by Xin Wang and Shibiao Bai

Remote Sens. 2023, 15(20), 4947; https://doi.org/10.3390/rs15204947 - 13 Oct 2023

Cited by 14 | Viewed by 2810

Abstract

To enable the accurate assessment of landslide susceptibility in the upper reaches of the Minjiang River Basin, this research spatially compares landslide susceptibility maps obtained directly from unclassified landslides with maps obtained by spatially superposing the susceptibility maps of different landslide types, and explores the interpretability of the two map-making methods using cartographic principles. Using catalogs of rainfall and seismic landslides, nine background factors that affect landslide occurrence were selected through correlation analysis: lithology, NDVI, elevation, slope, aspect, profile curve, curvature, land use, and distance to faults. Rainfall and seismic landslide susceptibility were then assessed separately using a WOE-RF coupling model. Next, landslide susceptibility was evaluated by merging rainfall and seismic landslides into a single dataset that does not distinguish landslide types, and the susceptibility map obtained by superposing the rainfall and seismic susceptibility maps was compared with the map obtained from unclassified landslides. Finally, the confusion matrix and ROC curve were used to verify the accuracy of the model. For rainfall landslides, the accuracies of the WOE-RF model on the training set, testing set, and entire data set were 0.9248, 0.8317, and 0.9347, and the AUC values were 1, 0.949, and 0.955; for seismic landslides, the accuracies were 0.9498, 0.9067, and 0.8329, and the AUC values were 1, 0.981, and 0.921; for unclassified landslides, the accuracies were 0.9446, 0.9080, and 0.8352, and the AUC values were 0.9997, 0.9822, and 0.9207. Both the confusion matrix and the ROC curve indicated that the accuracy of the coupling model is high.
The area southeast of the line from Mount Xuebaoding to Lixian County is highly landslide-prone. Extracting the extremely high susceptibility zones of both types showed that the extremely high susceptibility area of seismic landslides lies at a higher elevation than that of rainfall landslides. The results of the two methods of evaluating landslide susceptibility were also significantly different. For the same background factor, the distribution of the areas occupied by the same landslide occurrence class differed between the two methods, which indicates the need for research that distinguishes landslide types.

(This article belongs to the Topic Earth Observation Systems in Geology Mass Identification, Investigation and Inventory Mapping)

49 pages, 52986 KB

Open Access Article

Investigation of Landslide Susceptibility Decision Mechanisms in Different Ensemble-Based Machine Learning Models with Various Types of Factor Data

by Jiakai Lu, Chao Ren, Weiting Yue, Ying Zhou, Xiaoqin Xue, Yuanyuan Liu and Cong Ding

Sustainability 2023, 15(18), 13563; https://doi.org/10.3390/su151813563 - 11 Sep 2023

Cited by 13 | Viewed by 3539

Abstract

Machine learning (ML)-based methods of landslide susceptibility assessment primarily focus on two dimensions: accuracy and complexity. The complexity is influenced not only by specific model frameworks but also by the type and complexity of the modeling data. Therefore, once good predictive performance has been achieved with ML methods, considering the impact of factor data types on the model’s decision-making mechanism is important for assessing regional landslide characteristics and issuing landslide risk warnings. This study used the Shapley Additive exPlanations (SHAP) method to explain the decision-making mechanism of landslide susceptibility models that couple ML methods with different types of factor data. Furthermore, a comparative analysis examined the differential effects of diverse data types for identical factors on model predictions. The study area was Cenxi, Guangxi, where a geographic spatial database was constructed by combining 23 landslide conditioning factors with 214 landslide samples from the region. Initially, the factors were standardized using five conditional probability models, namely frequency ratio (FR), information value (IV), certainty factor (CF), evidential belief function (EBF), and weights of evidence (WOE), based on the spatial arrangement of landslides. Together with the initial data, this produced six types of factor database. Subsequently, two ensemble-based ML methods, random forest (RF) and XGBoost, were used to build models for predicting landslide susceptibility. Various evaluation metrics were employed to compare the predictive capabilities of the different models and determine the optimal model.
In parallel, the interpretable SHAP method was used to analyze the intrinsic decision-making mechanisms of the different ensemble-based ML models, with a specific focus on explaining and comparing the differential impacts of different types of factor data on prediction results. The results showed that the XGBoost-CF model, constructed with CF values of factors, not only exhibited the best predictive accuracy and stability but also yielded more reasonable landslide susceptibility zoning, and was thus identified as the optimal model. The global interpretation results revealed that slope was the most crucial factor influencing landslides, and that its interaction with other factors in the study area collectively contributed to landslide occurrences. The differences in the internal decision-making mechanisms of models based on different data types for the same factors manifested primarily in the extent of influence on prediction results and in the dependency of factors. This explains the performance of standardized data in ML models and the higher predictive performance of coupled models based on conditional probability models and ML methods. Through comprehensive analysis of the local interpretation results from different models analyzing the same sample with different sample characteristics, the reasons for model prediction errors can be summarized, thereby providing a reference framework for constructing more accurate and rational landslide susceptibility models and facilitating landslide warning and management.

(This article belongs to the Topic Natural Hazards and Disaster Risks Reduction)

11 pages, 4886 KB

Open Access Article

Evaluation of Geological Disaster Sensitivity in Shuicheng District Based on the WOE-RF Model

by Zefang Zhang, Zhikuan Qian, Yong Wei, Xing Zhu and Linjun Wang

Sustainability 2022, 14(23), 16247; https://doi.org/10.3390/su142316247 - 5 Dec 2022

Cited by 1 | Viewed by 1961

Abstract

To improve the prevention and control of geological disasters in Shuicheng District, 10 environmental factors (slope, slope direction, curvature, NDVI, stratum lithology, distance from fault, distance from river system, annual average rainfall, distance from road and land use) were selected as evaluation indicators by integrating factors such as landform, basic geology, hydrometeorology and engineering activities. Building on the weight of evidence (WOE), random forest, support vector machine and BP neural network algorithms were introduced to build WOE-RF, WOE-SVM and WOE-BPNN models. The sensitivity of Shuicheng District to geological disasters was evaluated using the GIS platform, and the region was divided into areas of extremely high, high, medium, low and extremely low sensitivity to geological disasters. Comparison of the ROC curves and the distribution of the sensitivity index gave AUC values of 0.836, 0.807 and 0.753 for the WOE-RF, WOE-SVM and WOE-BPNN models, respectively; the WOE-RF model was shown to be the most effective. In the WOE-RF model, the extremely high-, high-, medium-, low- and extremely low-sensitivity areas accounted for 15.9%, 16.9%, 19.3%, 21.0% and 26.9% of the study area, respectively. The extremely high- and high-sensitivity areas are mainly concentrated in areas with large slopes, broken rock masses, river systems and intensive human engineering activity. These research results are consistent with the actual situation and can provide a reference for the prevention and control of geological disasters in this and similar mountainous areas.

(This article belongs to the Special Issue Applications of GIS and Remote Sensing for Sustainable Spatial Planning)

19 pages, 3910 KB

Open Access Article

Gully Erosion Susceptibility Mapping in Highly Complex Terrain Using Machine Learning Models

by Annan Yang, Chunmei Wang, Guowei Pang, Yongqing Long, Lei Wang, Richard M. Cruse and Qinke Yang

ISPRS Int. J. Geo-Inf. 2021, 10(10), 680; https://doi.org/10.3390/ijgi10100680 - 9 Oct 2021

Cited by 52 | Viewed by 5466

Abstract

Gully erosion is the most severe type of water erosion and is a major land degradation process. The efficiency and interpretability of gully erosion susceptibility mapping (GESM) remain a challenge, especially in complex terrain areas. In this study, a WoE-MLC model, which combines machine learning classification algorithms with the statistical weight of evidence (WoE) model, was applied in the Loess Plateau to address this problem. The three machine learning (ML) algorithms utilized in this research were random forest (RF), gradient boosted decision trees (GBDT), and extreme gradient boosting (XGBoost). The results showed that: (1) both the machine learning regression models and the WoE-MLC models predicted GESM well, with area under the curve (AUC) values greater than 0.92 in both cases, and the latter was more computationally efficient and interpretable; (2) the XGBoost algorithm was more efficient in GESM than the other two algorithms, with the strongest generalization ability and best performance in avoiding overfitting (averaged AUC = 0.947), followed by the RF algorithm (averaged AUC = 0.944) and the GBDT algorithm (averaged AUC = 0.938); and (3) slope gradient, land use, and altitude were the main factors for GESM. This study may provide a possible method for gully erosion susceptibility mapping at large scale.

(This article belongs to the Special Issue Geomorphometry and Terrain Analysis)

28 pages, 10688 KB

Open Access Article

Uncertainties Analysis of Collapse Susceptibility Prediction Based on Remote Sensing and GIS: Influences of Different Data-Based Models and Connections between Collapses and Environmental Factors

by Wenbin Li, Xuanmei Fan, Faming Huang, Wei Chen, Haoyuan Hong, Jinsong Huang and Zizheng Guo

Remote Sens. 2020, 12(24), 4134; https://doi.org/10.3390/rs12244134 - 17 Dec 2020

Cited by 53 | Viewed by 6352

Abstract

To study the uncertainties of a collapse susceptibility prediction (CSP) under the coupled conditions of different data-based models and different connection methods between collapses and environmental factors, An’yuan County in China with 108 collapses is used as the study case, and 11 environmental factors are acquired by data analysis of Landsat TM 8 and high-resolution aerial images, using a hydrological and topographical spatial analysis of Digital Elevation Modeling in ArcGIS 10.2 software. Accordingly, 20 coupled conditions are proposed for CSP with five different connection methods (Probability Statistics (PSs), Frequency Ratio (FR), Information Value (IV), Index of Entropy (IOE) and Weight of Evidence (WOE)) and four data-based models (Analytic Hierarchy Process (AHP), Multiple Linear Regression (MLR), C5.0 Decision Tree (C5.0 DT) and Random Forest (RF)). Finally, the CSP uncertainties are assessed using the area under the receiver operating characteristic curve (AUC), mean value, standard deviation and significance test. Results show that: (1) the WOE-based models have the highest AUC accuracy, lowest mean values and average rank, and a relatively large standard deviation; the mean values and average rank of all the FR-, IV- and IOE-based models are relatively large with low standard deviations; meanwhile, the AUC accuracies of FR-, IV- and IOE-based models are consistent but higher than those of the PS-based model. Hence, the WOE exhibits a greater spatial correlation performance than the other four methods. (2) Among all the data-based models, the RF model has the highest AUC accuracy, lowest mean value and mean rank, and a relatively large standard deviation. The CSP performance of the RF model is followed by the C5.0 DT, MLR and AHP models, respectively. (3) Under the coupled conditions, the WOE-RF model has the highest AUC accuracy, a relatively low mean value and average rank, and a high standard deviation.
The PS-AHP model shows the opposite pattern, and the CSP performance of the other models falls somewhere in between. (4) In addition, the coupled models show slightly better CSP performances than the single data-based models that do not consider connection methods. It is concluded that the WOE-RF is the most appropriate coupled condition for CSP.
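
Of the connection methods listed in this abstract, the frequency ratio (FR) is the simplest to show in code. The sketch below uses the standard definition, FR = (share of events falling in a class) / (share of the study area in that class); the function name `frequency_ratio`, the slope classes, and all counts are hypothetical and are not taken from the article.

```python
def frequency_ratio(class_cells, event_cells):
    """Return {class: FR}, where FR = event share / area share per class.

    class_cells: number of raster cells in each factor class.
    event_cells: number of event (e.g. collapse) cells in each class.
    """
    total_cells = sum(class_cells.values())
    total_events = sum(event_cells.get(c, 0) for c in class_cells)
    if total_cells == 0 or total_events == 0:
        raise ValueError("need at least one cell and one event")
    fr = {}
    for c, n_cells in class_cells.items():
        event_share = event_cells.get(c, 0) / total_events
        area_share = n_cells / total_cells
        # A class with no area cannot carry a meaningful ratio.
        fr[c] = event_share / area_share if area_share > 0 else 0.0
    return fr

# Hypothetical slope classes (degrees) with made-up cell counts.
class_cells = {"0-10": 500, "10-25": 300, "25+": 200}
event_cells = {"0-10": 10, "10-25": 30, "25+": 60}
fr = frequency_ratio(class_cells, event_cells)
```

An FR above 1 marks a class where events are over-represented relative to its area, and below 1 the reverse; in a coupled model these per-class values replace the raw factor values before a learner such as RF is trained.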

(This article belongs to the Special Issue Spatial Modelling of Natural Hazards and Water Resources through Remote Sensing, GIS and Machine Learning Methods)

21 pages, 8386 KB

Open Access Article

Performance Evaluation and Comparison of Bivariate Statistical-Based Artificial Intelligence Algorithms for Spatial Prediction of Landslides

by Wei Chen, Zenghui Sun, Xia Zhao, Xinxiang Lei, Ataollah Shirzadi and Himan Shahabi

ISPRS Int. J. Geo-Inf. 2020, 9(12), 696; https://doi.org/10.3390/ijgi9120696 - 24 Nov 2020

Cited by 21 | Viewed by 3551

Abstract

The purpose of this study is to compare nine models, composed of certainty factors (CFs), weights of evidence (WoE), evidential belief function (EBF) and two machine learning models, namely random forest (RF) and support vector machine (SVM). In the first step, fifteen landslide conditioning factors were selected to prepare thematic maps, including slope aspect, slope angle, elevation, stream power index (SPI), sediment transport index (STI), topographic wetness index (TWI), plan curvature, profile curvature, land use, normalized difference vegetation index (NDVI), soil, lithology, rainfall, distance to rivers and distance to roads. In the second step, 152 landslides were randomly divided into two groups at a ratio of 70/30 as the training and validation datasets. In the third step, the weights of the CF, WoE and EBF models for each conditioning factor were calculated separately, and the weights were used to generate the landslide susceptibility maps. The weights of each bivariate model were then substituted into the RF and SVM models, yielding six integrated models and their landslide susceptibility maps. In the fourth step, the receiver operating characteristic (ROC) curve and related parameters were used for verification and comparison, and then the success rate curve and the prediction rate curves were used for re-analysis. The comprehensive results showed that the hybrid models are superior to the bivariate models, and that all nine models have excellent performance. The WoE–RF model has the highest predictive ability (AUC_T: 0.9993, AUC_P: 0.8968). The landslide susceptibility maps produced in this study can be used to manage landslide hazard and risk in Linyou County and other similar areas.

35 pages, 6562 KB

Open Access Article

Application of Probabilistic and Machine Learning Models for Groundwater Potentiality Mapping in Damghan Sedimentary Plain, Iran

by Alireza Arabameri, Jagabandhu Roy, Sunil Saha, Thomas Blaschke, Omid Ghorbanzadeh and Dieu Tien Bui

Remote Sens. 2019, 11(24), 3015; https://doi.org/10.3390/rs11243015 - 14 Dec 2019

Cited by 70 | Viewed by 6241

Abstract

Groundwater is one of the most important natural resources, as it regulates the earth’s hydrological system. The Damghan sedimentary plain, located in a semi-arid region of Iran, has very critical groundwater conditions due to massive pressure on the resource and is in need of robust models for identifying the groundwater potential zones (GWPZ). The main goal of the current research is to prepare a groundwater potentiality map (GWPM) considering the probabilistic, machine learning, data mining, and multi-criteria decision analysis (MCDA) approaches. For this purpose, 80 wells, collected from the Iranian groundwater resource department and from field investigation with a global positioning system (GPS), were selected randomly and considered as the groundwater inventory dataset. Of the 80 wells, 56 (70%) were used for modeling and 24 (30%) for validation. Elevation, slope, aspect, convergence index (CI), rainfall, drainage density (Dd), distance to river, distance to fault, distance to road, lithology, soil type, land use/land cover (LU/LC), normalized difference vegetation index (NDVI), topographic wetness index (TWI), topographic position index (TPI), and stream power index (SPI) were used for modeling. The area under the receiver operating characteristic (AUROC), sensitivity (SE), specificity (SP), accuracy (AC), mean absolute error (MAE), and root mean square error (RMSE) were used to check the goodness-of-fit and prediction accuracy of the approaches and to compare their performance. In addition, the influence of groundwater determining factors (GWDFs) on groundwater occurrence was evaluated by performing a sensitivity analysis.
The GWPMs produced by the technique for order preference by similarity to ideal solution (TOPSIS), random forest (RF), binary logistic regression (BLR), weight of evidence (WoE) and support vector machine (SVM) were classified into four categories, i.e., low, medium, high and very high groundwater potentiality, using the natural break classification method in the GIS environment. The very high groundwater potentiality class covers 15.09% of the entire plain area for TOPSIS, 15.46% for WoE, 25.26% for RF, 15.47% for BLR, and 18.74% for SVM. Based on the sensitivity analysis, distance from river and drainage density have significant effects on groundwater occurrence. Validation results show that the BLR model, with the best prediction accuracy and goodness-of-fit, outperforms the other models, although all models perform very well in modeling groundwater potential. Results of the seed cell area index model, used to check the classification accuracy of the models, show that all models have suitable performance. Therefore, these are promising models that can be applied for GWPZ identification, which will support necessary action in these areas.

(This article belongs to the Special Issue Remote Sensing and Geoscience Information Systems Applied to Groundwater Research)

26 pages, 7494 KB

Open Access Editor’s Choice Article

Landslide Susceptibility Modeling Using Integrated Ensemble Weights of Evidence with Logistic Regression and Random Forest Models

by Wei Chen, Zenghui Sun and Jichang Han

Appl. Sci. 2019, 9(1), 171; https://doi.org/10.3390/app9010171 - 4 Jan 2019

Cited by 165 | Viewed by 10220

Abstract

The main aim of this study was to compare the performances of the hybrid approaches of traditional bivariate weights of evidence (WoE) with multivariate logistic regression (WoE-LR) and machine learning-based random forest (WoE-RF) for landslide susceptibility mapping. The performance of the three landslide models was validated with receiver operating characteristic (ROC) curves and area under the curve (AUC). The results showed that the areas under the curve obtained using the WoE, WoE-LR, and WoE-RF methods were 0.720, 0.773, and 0.802 for the training dataset, and were 0.695, 0.763, and 0.782 for the validation dataset, respectively. The results demonstrate the superiority of hybrid models and that the resultant maps would be useful for land use planning in landslide-prone areas.
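
The AUC values used to validate these models can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The Python sketch below computes AUC that way by direct pairwise comparison; the function name `auc_score` and the toy labels and scores are hypothetical, and the study itself does not state how its AUC values were computed.

```python
def auc_score(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = landslide cell, 0 = non-landslide cell.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = auc_score(labels, scores)
```

The pairwise loop is quadratic in the sample count, which is fine for illustration; a rank-based formulation gives the same value more cheaply on large rasters. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.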

(This article belongs to the Special Issue Machine Learning Techniques Applied to Geoscience Information System and Remote Sensing)

21 pages, 4359 KB

Open Access Article

Spatial Modelling of Gully Erosion Using GIS and R Programing: A Comparison among Three Data Mining Algorithms

by Alireza Arabameri, Biswajeet Pradhan, Hamid Reza Pourghasemi, Khalil Rezaei and Norman Kerle

Appl. Sci. 2018, 8(8), 1369; https://doi.org/10.3390/app8081369 - 14 Aug 2018

Cited by 124 | Viewed by 9507

Abstract

Gully erosion triggers land degradation and restricts the use of land. This study assesses the spatial relationship between gully erosion (GE) and geo-environmental variables (GEVs) using Weights-of-Evidence (WoE) Bayes theory, and then applies three data mining methods, namely Random Forest (RF), boosted regression tree (BRT), and multivariate adaptive regression spline (MARS), for gully erosion susceptibility mapping (GESM) in the Shahroud watershed, Iran. Gully locations were identified by extensive field surveys, and a total of 172 GE locations were mapped. Twelve gully-related GEVs (elevation, slope degree, slope aspect, plan curvature, convergence index, topographic wetness index (TWI), lithology, land use/land cover (LU/LC), distance from rivers, distance from roads, drainage density, and NDVI) were selected to model GE. The variable importance results from the RF and BRT models indicated that distance from road, elevation, and lithology had the highest effect on GE occurrence. The area under the curve (AUC) and seed cell area index (SCAI) methods were used to validate the three GE maps. The results showed that the AUC for the three models varied from 0.911 to 0.927; compared with the other models, the RF model had a prediction accuracy of 0.927 as per SCAI values. The findings will be of help for planning and developing the studied region.

(This article belongs to the Special Issue Machine Learning Techniques Applied to Geoscience Information System and Remote Sensing)

Search Results (12)
