
Hybrid Machine Learning Approach for Gully Erosion Mapping Susceptibility at a Watershed Scale

Geosciences Laboratory, Department of Geology, Faculty of Sciences, University Ibn Tofail, BP 133, Kenitra 14000, Morocco
ITC-CNR, Construction Technologies Institute, National Research Council, 70124 Bari, Italy
Geo-Engineering and Environment Laboratory, Water Sciences and Environment Engineering Team, Department of Geology, Faculty of Sciences, Moulay Ismail University, Meknes 50050, Morocco
Department of Geography, Faculty of Sciences, Aligarh Muslim University (AMU), Aligarh 202002, UP, India
Institute of Applied Technology, Thu Dau Mot University, Thu Dau Mot City 820000, Vietnam
Department of Geography, Hong Kong Baptist University, Hong Kong, China
School of Aerospace Engineering, University of Rome “La Sapienza”, 00138 Rome, Italy
Earth Sciences Institute (ICT) and Department of Geosciences, Environment and Land Planning, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
ISPRS Int. J. Geo-Inf. 2022, 11(7), 401;
Submission received: 1 June 2022 / Revised: 6 July 2022 / Accepted: 12 July 2022 / Published: 14 July 2022
(This article belongs to the Special Issue Integrating GIS and Remote Sensing in Soil Mapping and Modeling)


Gully erosion is a serious threat to ecosystems all around the world. Safeguarding the soil, both for our own benefit and from our own actions, is therefore essential for guaranteeing the long-term viability of a variety of ecosystem services, and developing gully erosion susceptibility maps (GESM) is both recommended and necessary. In this study, we compared the effectiveness of three machine learning (ML) algorithms hybridized with the bivariate statistical index frequency ratio (FR), namely random forest-frequency ratio (RF-FR), support vector machine-frequency ratio (SVM-FR), and naïve Bayes-frequency ratio (NB-FR), in mapping gully erosion in the GHISS watershed in the northern part of Morocco. The models were implemented based on an inventory of 178 gully erosion points randomly divided into two groups (70% of points for training the models and 30% for validation), and 12 conditioning variables (i.e., elevation, slope, aspect, plan curvature, topographic wetness index (TWI), stream power index (SPI), precipitation, distance to road, distance to stream, drainage density, land use, and lithology). Using the equal interval reclassification method, the spatial distribution of gully erosion was categorized into five classes: very high, high, moderate, low, and very low. Our results showed that the very high susceptibility classes derived using the RF-FR, SVM-FR, and NB-FR models covered 25.98%, 22.62%, and 27.10% of the total area, respectively. The area under the receiver operating characteristic curve (AUC), precision, and accuracy were employed to evaluate the performance of these models. Based on the receiver operating characteristic (ROC), the RF-FR achieved the best performance (AUC = 0.91), followed by SVM-FR (AUC = 0.87) and then NB-FR (AUC = 0.82).
Our contribution, in line with the Sustainable Development Goals (SDGs), plays a crucial role for understanding and identifying the issue of “where and why” gully erosion occurs, and hence it can serve as a first pathway to reducing gully erosion in this particular area.

1. Introduction

Gully erosion is considered the most destructive type of soil erosion, and it is associated with various topographic, climatic, and anthropogenic factors [1], causing serious environmental and human issues across the world [2], especially in arid and semi-arid regions. Gully erosion occurs over a short period of time. Gullies are a common cause of land degradation, as inappropriate land management and land use practices can lead to increased soil erosion, with gullies as the primary landform [3].
The effective functioning of soil has a significant impact on ecosystem services and is linked to the attainment of the Sustainable Development Goals (SDGs). The soil-water system is the most important component in achieving multiple SDGs, with a focus on neutralizing land degradation and restoring land [4,5]. Preventing land degradation is therefore one of the most significant issues for the long-term development of the environment and of economic activity, and extensive planning and erosion protection have always been essential. Soil erosion is a serious environmental issue that removes a considerable quantity of productive soil each year all around the world [6,7]. Hence, mapping gully erosion is essential for communicating spatial risk information to managers and decision-makers for conservation and management planning.
Soil erosion in Morocco has increased dramatically, leading to severe negative effects on crop production, water ecosystems, and the environment. It is estimated that at least 13% of Moroccan lands are affected by soil erosion [8]. However, there is little research on gully erosion in Morocco in the literature [8,9]. Azedou et al. [10] used frequency ratio (FR), logistic regression (LR), and random forest (RF) to project the spatial distribution of gully erosion in the Souss-Massa watershed, Morocco. The results revealed that among the models tested, the RF model had the best prediction performance. Tairi et al. [11] used the revised universal soil loss equation (RUSLE) for estimating soil erosion in the Tifnout Askaoun watershed in Morocco. Such efforts resulted in a vital tool for the local region’s long-term land management. It is important to perform soil erosion research in this environment to add to the current literature and assist local governments in developing suitable plans for soil and land management, watershed management, and infrastructure planning.
Recently, a wide range of methods has been applied to soil erosion mapping: expert knowledge methods such as the analytical hierarchy process (AHP) [12,13,14]; bivariate statistical methods (BSMs) such as FR [15,16], certainty factors (CF) [17,18], weight of evidence (WoE) [19], information value (InfVal) [20], the evidential belief function (EBF) [16], conditional probability (CP) [21], and index of entropy (IOE) [22]; multivariate statistical methods (MSMs) such as linear regression (LiR) [23] and logistic regression (LR) [24]; and machine learning (ML) methods such as artificial neural networks (ANN) [25,26], support vector machine (SVM) [27,28], RF [29,30], classification and regression trees (CART) [31], and naïve Bayes (NB) [32,33]. ML algorithms are widely employed for a variety of purposes, including soil erosion mapping, due to their superior prediction capacity compared to traditional methods [34,35]. Each approach has its own set of pros and cons. ML models have proved useful for gully erosion susceptibility mapping [36,37,38] and for piping erosion susceptibility prediction [35,39]. The FR model and the information value approach are the most often utilized methods in the bivariate model category [36,40]. The RF model has produced positive outcomes in various studies [36,41,42]. Bivariate models can be simply applied within a geographic information system (GIS) due to their straightforward interpretation [43]. They have yielded positive findings in the literature, both in Morocco through studies in the Ourika and Rheraya watersheds [10,20] and elsewhere [38,44,45,46]. GIS- and ML-based models have been used to select gully triggering factors, generate susceptibility maps, implement land management decisions, and establish future strategies [47].
Indeed, ML approaches allow for the evaluation of the role of various components and their interactions, which has significant potential and has been increasingly applied in recent years [34]. With the rapid advancement of different ML algorithms in recent years, determining which model is optimal for a certain location has become difficult. It is critical to look into a variety of algorithms and determine which one is best for each situation.
More recently, hybrid/ensemble models have been developed by integrating individual ML models with statistical approaches. The usefulness of hybrid models, discussed in previous publications [48,49], lies in their higher accuracies in comparison to individual models [50,51]. In this study, the efficiency of different approaches for gully erosion susceptibility, namely FR, RF, SVM, and NB, is investigated. Although these ML techniques have been employed in the past, they have only been used infrequently for gully erosion modeling.
Gullies inflict severe damage in arid and semi-arid areas, such as the GHISS watershed in northern Morocco, and are regarded as a major environmental hazard [52,53]. As a result, executive agencies must continue to identify the causes of gully erosion development and to zone susceptible areas in order to build comprehensive management plans. This will be crucial in the development of restoration methods that are based on natural solutions to ensure long-term sustainability. Despite the high susceptibility of this study region to gully erosion, no thorough research has been conducted to date to identify the places that are particularly vulnerable, although the scientific community is actively working on the topic and producing promising results. In this analysis, a hybrid methodology combining bivariate statistical approaches and ML algorithms was employed to identify locations susceptible to gully erosion. The main objective of this study is to create a gully erosion susceptibility map for an area with identified soil erosion processes, the GHISS watershed of northern Morocco. To that end, the efficacy of the FR statistical model and of several ML techniques was evaluated in terms of their applicability for predicting gully erosion-prone areas.

2. Materials and Methods

2.1. Study Area

The study area is the GHISS watershed, located in the province of Al Hoceima, in the northern part of Morocco. It lies between longitudes 3°45′ and 4°30′ W and latitudes 34°15′ and 35°17′ N, covering an area of 837.69 km² (Figure 1). Elevation ranges from 0 to 2032 m a.s.l., and the relief is mostly steep, with slopes of more than 35%. The climate in this region is semi-arid, and the majority of rainfall occurs from September to May, with an average annual rainfall of 300 mm [54] and both seasonal and interannual variability [55]. The average temperature ranges from 21 °C in July to 10 °C in January [56].
From a geological point of view, the GHISS basin is characterized by the Ketama unit, which outcrops in the central Rif and is essentially formed of flysch of the Albo-Aptian domain. It belongs to the external domain (intra-Rif) and the flyschs nappes deposits.
Due to its topographical conditions (i.e., slopes higher than 55°) and geological formations (shale, marl, and marl-limestone) [57,58,59], the watershed has suffered severe soil erosion and a decline in forest ecosystem resources [57].
The study area belongs to the GHISS-Nekkor aquifer, which is an important source of groundwater in that region [60]. However, insufficient and poor sanitation facilities, as well as unsustainable agriculture in the study area, have resulted in deterioration of groundwater quality in this region [60,61]. Previously, a study [62] using the RUSLE method reported that more than 50% of the watershed is considered as moderately eroded. Thus, our research effort is directed towards a deep understanding of gully erosion in this study area.

2.2. Data Used and Methodology

The methodology adopted for this research is divided into three steps, as shown in Figure 2. The first step is data gathering, in which different datasets, including topographic, climatic, and human variables, were preprocessed, and the geo-environmental variables were generated. In addition to the geo-environmental variables, a gully erosion inventory map was prepared for use in step 2. The second step is the analysis, in which the gully inventory data from step 1 were used to prepare the training and validation data. FR and normalization were computed for the training data along with the geo-environmental variables, and all resulting data were then fed to three different hybrid ML algorithms (RF-FR, SVM-FR, and NB-FR). In the third step, the validation data prepared in step 2 were used to perform an accuracy assessment for each of the three hybrid classifiers. Based on the training dataset, the gully erosion maps were generated using the natural breaks classification available in ArcGIS 10.4, yielding five gully erosion susceptibility classes: very low, low, moderate, high, and very high. Finally, the gully erosion susceptibility maps were created using the best-suited hybrid classifier. The complete ML implementation was performed in Python, utilizing GIS tools and the Jupyter environment.

2.3. Gully Erosion Inventory Map

The elaboration of the gully erosion inventory map (i.e., the target variable) is the first task, and it aims to statistically elucidate the relationship between the distribution of gully erosion (dependent variable) and the conditioning factors (independent variables) of gully erosion hazards [40]. In this study, gully erosion points were identified during field surveys using Global Positioning System (GPS) receivers, and once these locations were recorded, high-resolution images from Google Earth were interpreted. The 178 gully erosion points were then randomly split into 70% (125) for training and 30% (53) for validation. An equal number of non-gully erosion points was randomly selected and split in the same way: 70% (125) for training and 30% (53) for validation. Some examples of point locations and their field photographs are shown in Figure 3.
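The 70/30 partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the integer point IDs, the function name `split_points`, and the random seeds are placeholders introduced here.

```python
import random

def split_points(points, train_fraction=0.7, seed=42):
    """Shuffle a list of points and split it into training and validation subsets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder IDs standing in for the 178 gully and 178 non-gully locations.
gully_ids = list(range(178))
non_gully_ids = list(range(178, 356))

gully_train, gully_valid = split_points(gully_ids)
non_train, non_valid = split_points(non_gully_ids, seed=7)

print(len(gully_train), len(gully_valid))  # 125 53
print(len(non_train), len(non_valid))      # 125 53
```

Rounding 178 × 0.7 = 124.6 gives the 125 training points reported in the text, leaving 53 for validation.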

2.4. Parameters’ Description

For building binary predictive models, it is necessary to gather both a dependent variable (i.e., target) and a set of independent variables. In this study, 12 variables were selected from different sources (Table 1) due to their importance in gully erosion, as discussed in previous works [37,40]. The selected gully erosion factors, classified as topographic, hydrologic, and geologic [37], were used to derive the following variables: elevation, slope, aspect, plan curvature, topographic wetness index (TWI), stream power index (SPI), precipitation, distance to road, distance to stream, drainage density, land use/land cover (LULC), and lithology. All these gully erosion controlling factors were prepared and reclassified based on expert knowledge and statistical analysis, using the natural breaks classification method in GIS tools [63]. To calculate the proportion of gully/non-gully data in each class of each variable, we reclassified each continuous conditioning factor into a set of classes. It should be noted that we adopted an automatic classification for some variables, whereas for other parameters the classification remains the same as that provided in the source data. The digital elevation model (DEM), with a pixel size of 30 × 30 m, was downloaded from the USGS Earth Explorer website (accessed on 20 August 2021).
The DEM is considered in this study because of its importance in the gully erosion process [37]. Using the spatial analysis tools available in ArcGIS 10.4, the DEM was used to calculate the other topographic parameters, including slope, aspect, plan curvature, TWI, and SPI. Due to its effects on vegetation and microclimate [40], elevation plays an important role in gully erosion. It was classified automatically into five classes: 0–417, 417–799, 799–1125, 1125–1427, and 1427–2032 m (Figure 4a).
As reported in previous studies [64,65], slope has an influence on gully erosion, and it becomes more serious in the upslope. In this study, the generated slope factor was classified automatically into five classes: <5, 5–10, 10–20, 20–30, and >30° (Figure 4b).
Aspect has an important effect on gully erosion, as it can influence evapotranspiration, vegetation cover, and incoming solar radiation [66]. In this study, it was classified automatically into nine classes: (1) Flat, (2) North, (3) Northeast, (4) East, (5) Southeast, (6) South, (7) Southwest, (8) West, and (9) Northwest (Figure 4c).
Plan curvature is the curvature of a contour line formed by intersecting a horizontal plane with the surface [67], and it plays an important role in divergence or convergence of water during downslope flow [68]. It was classified automatically into three classes: concave, flat, and convex (Figure 4d).
TWI has been shown to be useful for gully erosion [40], and it was calculated through Equation (1) [69]:
$$\mathrm{TWI} = \ln\left(\frac{A_s}{\tan\beta}\right) \quad (1)$$
where $A_s$ is the specific catchment area (m² m⁻¹) and $\beta$ is the slope gradient. This index was classified into five classes (Figure 5b): (18–21), (21–22), (22–23), (23–24), and (24–34).
SPI measures the erosive power of surface water flow, which, together with runoff and infiltration, controls the development of gullies [70]. It was calculated using Equation (2), proposed in [69]:
$$\mathrm{SPI} = A_s \tan\beta \quad (2)$$
where $A_s$ is the specific catchment area (m² m⁻¹) and $\beta$ is the slope, in degrees.
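As an illustration of Equations (1) and (2), the two indices can be evaluated per cell as in the sketch below. The function names and the `min_slope_deg` clamp (used to avoid dividing by tan 0 on flat cells) are assumptions made for this example; the paper derived both indices with ArcGIS tools and does not describe how flat cells were handled.

```python
import math

def twi(specific_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index, ln(As / tan(beta)).

    min_slope_deg is a hypothetical lower bound on the slope so that
    flat cells do not cause a division by zero.
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_area / math.tan(beta))

def spi(specific_area, slope_deg):
    """Stream power index, As * tan(beta)."""
    return specific_area * math.tan(math.radians(slope_deg))

# Example cell: specific catchment area of 1000 m2/m on a 45 degree slope.
# tan(45 deg) = 1, so TWI = ln(1000) and SPI = 1000.
print(round(twi(1000.0, 45.0), 3))  # 6.908
print(round(spi(1000.0, 45.0), 3))  # 1000.0
```

For a fixed specific catchment area, TWI decreases and SPI increases as the slope steepens, which is why the two indices carry complementary information.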
It is well-established that gullies occur more in the areas near roads [38]. Hence, distance to road, distance to stream, and drainage density were considered in this study. Using the Euclidean Distance tool available in ArcGIS 10.4, these factors were generated (Figure 5c).
LULC was prepared using the Landsat-8-OLI image acquired on 12 June 2019 downloaded from the United States Geological Survey (USGS) website. First, it was radiometrically and atmospherically calibrated, and then the maximum likelihood supervised classification algorithm using the ENVI software tool was employed. A total of 1079 ground-truth points were randomly selected based on visual interpretation and high-resolution orthorectified Google Earth imagery. Afterwards, based on field investigation, five classes were generated, including: water bodies, forestlands, agricultural lands, buildings/settlements, and bare lands (Figure 6a). The accuracy assessment showed that the generated map had an overall accuracy of 92.4%.
This study used monthly precipitation data for the period 2010 to 2019 (accessed on 18 August 2021), with a pixel size of 0.25 × 0.25°, resampled to a 30 m pixel size using the nearest neighbor resampling method. A rainfall map was generated using the inverse distance-weighted (IDW) interpolation method, and it was classified into five groups (Figure 6c): (315–461 mm), (461–560 mm), (560–634 mm), (634–778 mm), and (778–1042 mm).
A lithology map of the watershed was digitized from the geological map of Morocco at a scale of 1:000 000. The lithology classes in the study area include four classes, as shown in Figure 6b.

2.5. Multicollinearity Analysis

Multicollinearity analysis examines the correlation among conditioning variables and was used to select the optimal controlling factors for gully erosion susceptibility mapping [71]. Many researchers have examined the value of controlling variables, particularly for gully susceptibility mapping, and virtually all have concluded that each variable’s importance is mostly determined by its surroundings [37,40]. In this study, multicollinearity analysis was performed using the variance inflation factor (VIF), which is the multiplicative inverse of tolerance (TOL). TOL is computed as 1 − R², where R² is obtained by regressing each variable on all the other variables in a multivariate regression [40]. The analysis was performed using the 12 geo-environmental variables prepared earlier. From the literature, TOL ≤ 0.10, or equivalently VIF ≥ 10, indicates a multicollinearity problem [72]. Tolerance and VIF are given by Equations (3) and (4), respectively.
$$\mathrm{Tolerance}_j = 1 - R_j^2 \quad (3)$$
$$\mathrm{VIF}_j = \frac{1}{\mathrm{Tolerance}_j} \quad (4)$$
where $R_j^2$ is the coefficient of determination of the regression of variable $j$ on all the other variables.
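To make Equations (3) and (4) concrete, the sketch below regresses each variable on the others by ordinary least squares and reports tolerance and VIF. It uses only the standard library; the helper names and the eight-sample toy values are invented for illustration and are not the study data.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(y, predictors):
    """R^2 of an OLS regression of y on the predictor columns, with intercept."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

def tolerance_and_vif(columns):
    """Return {name: (tolerance, VIF)} for a dict of equally long numeric columns."""
    result = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        tol = 1.0 - r_squared(values, others)
        result[name] = (tol, 1.0 / tol)
    return result

# Toy values for three factors at eight sample points (hypothetical).
factors = {
    "slope": [5.0, 12.0, 18.0, 25.0, 31.0, 8.0, 22.0, 15.0],
    "aspect": [90.0, 10.0, 270.0, 180.0, 45.0, 300.0, 135.0, 225.0],
    "twi": [24.0, 22.0, 21.0, 19.0, 18.0, 23.0, 22.0, 20.0],
}
diagnostics = tolerance_and_vif(factors)
for name, (tol, vif) in diagnostics.items():
    print(f"{name}: TOL = {tol:.3f}, VIF = {vif:.3f}")
```

A factor would be flagged as collinear when its tolerance falls below 0.1, which by Equation (4) is the same as its VIF exceeding 10.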

3. Models and Methods Background

3.1. Frequency Ratio (FR)

The FR model adopts a theory of probability to define the relationship between independent and dependent variables of spatial information using the multi-class mapping approach [73]. It has been used for a variety of environmental hazards, such as landslide [74], flood [75], forest fire [76], and gully erosion susceptibility mapping [77]. It is a bivariate probability statistic index, used to identify the spatial relationship between erosion and different factors contributing towards gully erosion in the region. The FR model can be defined as (Equation (5)):
$$FR_i = \frac{b}{a} \quad (5)$$
where $b$ is the ratio of the erosion pixels in class $i$ to the total number of erosion pixels, $a$ is the ratio of the non-erosion cells in class $i$ to the total number of non-erosion cells, and $FR_i$ denotes the importance of class $i$ of the conditioning factor in relation to erosion occurrence. FR > 1 indicates a high correlation with erosion probability, while FR < 1 indicates a low correlation.
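A short sketch of the frequency ratio calculation for one conditioning factor follows. It uses the definition given with Equation (5), in which both ratios are taken per class; the per-class counts are hypothetical and are not taken from Table 2.

```python
def frequency_ratio(gully_counts, non_gully_counts):
    """Frequency ratio per class: share of gully points over share of non-gully points."""
    total_gully = sum(gully_counts.values())
    total_non_gully = sum(non_gully_counts.values())
    fr = {}
    for cls, count in gully_counts.items():
        b = count / total_gully                          # share of gully points in the class
        a = non_gully_counts[cls] / total_non_gully      # share of non-gully points in the class
        fr[cls] = b / a if a > 0 else float("nan")
    return fr

# Hypothetical counts of training points per slope class (125 + 125 points).
gully = {"<5": 50, "5-10": 30, "10-20": 25, "20-30": 15, ">30": 5}
non_gully = {"<5": 20, "5-10": 25, "10-20": 30, "20-30": 30, ">30": 20}

fr_slope = frequency_ratio(gully, non_gully)
for cls, value in fr_slope.items():
    print(cls, round(value, 2))
```

In this toy case the "<5" class has FR = 2.5 (strong association with gullies) and the ">30" class has FR = 0.25 (weak association). In the hybrid workflow, such FR values replace the raw class labels before the data are passed to the ML classifiers.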

3.2. Random Forest (RF)

The RF model works on the principle of constructing multiple decision trees from different subsets of the data. RF is an integrated approach that combines the ideas proposed in [78] with the methods described in [79]. Each tree is grown by the algorithm from the predictor variables and the target, resulting in a decision tree that can be further pruned [80]. Because an RF contains many trees, it is difficult to interpret directly, and its information must be summarized using quantitative indicators. The most widely used indicators are the mean decrease in the Gini index and the mean decrease in accuracy [80], both of which RF uses to rank factors [40,81].

3.3. Support Vector Machine (SVM)

SVM is a typical ML algorithm that uses statistical learning theory based on structural risk minimization (SRM) [82]. This algorithm is well-suited to regression and classification problems [64]. Generally, four kinds of kernel functions are used in SVM: linear kernel (LN), polynomial kernel (PL), sigmoid kernel, and radial basis function (RBF) [70,83]. The accuracy of the prediction usually depends on the type of kernel selected [84]. In its basic form, the SVM model works well only for linearly separable data; for nonlinear datasets, it maps the data into a space in which they become linearly separable by using the so-called “kernel trick” [85].
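For reference, the four kernel types named above can be written in their common textbook forms as below. The parameter names (`gamma`, `coef0`, `degree`) and default values are illustrative assumptions; the paper does not report which kernel or hyperparameters were used.

```python
import math

def dot(x, y):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

u = [1.0, 2.0]
v = [2.0, 0.0]
print(linear_kernel(u, v))         # 2.0
print(polynomial_kernel(u, v))     # 27.0
print(round(rbf_kernel(u, v), 4))  # 0.0821
print(rbf_kernel(u, u))            # 1.0
```

Each kernel returns the inner product of the two samples in an implicit feature space, so the classifier never has to compute the nonlinear mapping explicitly.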

3.4. Naïve Bayes (NB)

The NB model is a classification approach based on Bayes’ theorem [86]. It is a family of algorithms that share a common principle: all explanatory variables are assumed to be independent of each other given the class [87]. The NB model is robust against noise and irrelevant attributes [88]. It can also be used with a relatively small amount of training data to estimate the parameters needed for classification [89].
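The independence assumption can be illustrated with a small categorical naive Bayes classifier. This is a generic sketch with Laplace smoothing (`alpha`), written for this example; the feature values and labels are invented, and the paper does not state which NB variant was used.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Count class frequencies and per-feature value frequencies for each class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    feature_values = defaultdict(set)    # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return class_counts, value_counts, feature_values, alpha

def predict_proba(model, row):
    """Posterior class probabilities, multiplying per-feature likelihoods."""
    class_counts, value_counts, feature_values, alpha = model
    n = sum(class_counts.values())
    log_post = {}
    for label, count in class_counts.items():
        lp = math.log(count / n)
        for j, value in enumerate(row):
            numerator = value_counts[(j, label)][value] + alpha
            denominator = count + alpha * len(feature_values[j])
            lp += math.log(numerator / denominator)
        log_post[label] = lp
    top = max(log_post.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_post.items()}
    z = sum(unnormalized.values())
    return {k: v / z for k, v in unnormalized.items()}

# Invented training rows: (land cover, slope class); label 1 = gully, 0 = non-gully.
rows = [("bare", "steep"), ("bare", "gentle"), ("forest", "steep"),
        ("forest", "gentle"), ("bare", "steep"), ("forest", "gentle")]
labels = [1, 1, 0, 0, 1, 0]

model = train_nb(rows, labels)
proba = predict_proba(model, ("bare", "steep"))
print(round(proba[1], 3))  # 0.857
```

For the query cell, the gully class scores 0.5 × 0.8 × 0.6 = 0.24 and the non-gully class 0.5 × 0.2 × 0.4 = 0.04, giving a posterior of 0.24 / 0.28 ≈ 0.857.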

3.5. Model Validation

Validation of the developed models is an essential part of any modeling study [90,91]. Several statistical indices are widely used for this purpose; among them, accuracy, specificity, sensitivity, and precision were calculated in this research. Overall accuracy (OA) is the proportion of correctly classified pixels, computed as the sum of true positives and true negatives divided by the total number of test samples (Equation (6)). Sensitivity, also known as the true-positive rate, represents the proportion of gully erosion pixels correctly predicted as gully erosion and is calculated by dividing the true positives by the sum of true positives and false negatives (Equation (7)). Specificity, the true-negative rate, is the proportion of non-gully pixels correctly predicted as non-gully and is calculated by dividing the true negatives by the sum of true negatives and false positives (Equation (8)). Precision is a measure of the quality of the positive predictions and is calculated by dividing the true positives by the sum of true positives and false positives (Equation (9)).
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (6)$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \quad (7)$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \quad (8)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (9)$$
where TP is true positive, TN is true negative, FP is false positive, and FN is false negative. For the receiver operating characteristic (ROC) curve, sensitivity is plotted on the y-axis against 1 − specificity (the false-positive rate) on the x-axis as the gully erosion probability threshold varies; the AUC is the area under this curve.
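The four indices and the AUC can be computed as in the sketch below, which uses the standard definitions (sensitivity = TP/(TP + FN), specificity = TN/(TN + FP)). The confusion-matrix counts and scores are hypothetical and do not reproduce Table 5; the AUC is computed through its rank interpretation, the probability that a randomly chosen gully point receives a higher score than a randomly chosen non-gully point.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as one half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical outcome for 53 gully and 53 non-gully validation points.
metrics = classification_metrics(tp=45, tn=44, fp=9, fn=8)
for name, value in metrics.items():
    print(name, round(value, 3))

# Tiny hypothetical example: three of the four positive/negative pairs are ranked correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise form is quadratic in the number of points, which is acceptable for validation sets of this size.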

3.6. Variable Importance Using Information Gain Index

Various statistical indices have been used for feature selection, including one-rule attribute evaluation (ORAE) [92], forward elimination [93], backward elimination [93], and information gain (IG) [94]. Based on the results of the RF-FR model, IG was used to reveal the importance of each conditioning factor for the modeling process [95].
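As a reminder of what information gain measures, the sketch below computes the textbook entropy-based IG of a categorical factor with respect to the gully/non-gully label. This is the generic definition, which may differ from the exact formulation the authors used with the RF-FR model; the sample values are invented.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on the feature's values."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Invented sample: 1 = gully, 0 = non-gully.
labels = [1, 1, 1, 0, 0, 0]
lulc = ["bare", "bare", "bare", "forest", "forest", "forest"]
aspect = ["N", "S", "N", "S", "N", "S"]

ig_lulc = information_gain(lulc, labels)
ig_aspect = information_gain(aspect, labels)
print(round(ig_lulc, 3), round(ig_aspect, 3))
```

In this toy sample, land cover separates the two classes perfectly and therefore has a much higher gain than aspect, which barely changes the class mix.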

4. Results

4.1. Results of Frequency Ratio

The FR was calculated to reveal the relationship between gully erosion as the dependent variable and each gully erosion conditioning factor as an independent variable. Based on the findings from this analysis (Table 2), the highest FR value (3.528) was found in the lithology class of Cenomanian to Santonian with “flysch” Rif facies of Tisrin slick, which is mainly composed of alternating sandstone marl-limestone flysch and sandstone flysch and represents the area most susceptible to gully erosion. It was followed by the highest rainfall class (885–1042), with an FR value of 3.058.
The lowest classes of slope, elevation, curvature, plan curvature, distance from roads, distance from stream, drainage density, TWI, and SPI are more susceptible to gully erosion. From the analysis of the LULC parameter, it was observed that this erosion occurred more in the bare lands class, followed by the buildings/settlement class. For the aspect factor, it was also observed that the susceptibility to gully erosion is more pronounced in flat areas.

4.2. Results of Multicollinearity Assessment

Multicollinearity analysis assesses whether the conditioning variables are correlated or interrelated [46]. It was applied in this study to analyze the correlation among the gully erosion factors (independent variables). To perform this, we used two indexes: TOL and VIF [64]. If the value of TOL is less than 0.1 or the value of VIF is greater than 10 [96], collinearity exists amongst the variables. Our results (Table 3) showed that LULC had the lowest tolerance value of the gully erosion conditioning factors (0.454), while aspect had the highest tolerance value (0.893). Regarding the variance inflation factor (VIF), the highest value was 2.203 (LULC), and the lowest value was 1.120 (aspect). All gully erosion conditioning factors had tolerance values greater than 0.1 and VIF values less than 10, indicating that no collinearity exists amongst these factors. Therefore, all 12 conditioning factors were kept in this research.
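Because VIF is the reciprocal of tolerance, the two extreme values quoted above can be cross-checked directly. The snippet below only re-derives the VIF figures from the tolerance values reported in the text; the variable names are chosen for this example.

```python
# Tolerance values reported in the text for the two extreme factors.
tolerance = {"LULC": 0.454, "aspect": 0.893}

# VIF = 1 / tolerance, rounded to three decimals as in the text.
vif = {name: round(1.0 / tol, 3) for name, tol in tolerance.items()}
print(vif)  # {'LULC': 2.203, 'aspect': 1.12}

# Collinearity is flagged when tolerance < 0.1 or VIF > 10.
no_collinearity = all(tol > 0.1 and 1.0 / tol < 10 for tol in tolerance.values())
print(no_collinearity)  # True
```

Both reciprocals match the reported VIF values of 2.203 and 1.120, and both factors sit well within the thresholds.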

4.3. Identification of Gully Zones

Based on the training dataset, the gully erosion maps were generated using the natural breaks classification available in ArcGIS 10.4, yielding five gully erosion susceptibility classes: very low, low, moderate, high, and very high. Figure 7 and Table 4 present the spatial distribution of the susceptibility classes in the gully erosion susceptibility maps. In the map constructed using the RF-FR model, 25.98% of the study area had a very high susceptibility to erosion, while 17.88%, 19.30%, 19.64%, and 17.20% of the area was classified as very low, low, moderate, and high susceptibility, respectively. In the case of the SVM-FR model, 22.62% of the area was classified as very high susceptibility, while 15.01%, 14.4%, 20.00%, and 27.95% had very low, low, moderate, and high susceptibilities, respectively. For the NB-FR model, 27.10% of the study area was assigned to the very high gully hazard category, while 17.40%, 17.67%, 20.37%, and 17.44% had very low, low, moderate, and high susceptibilities, respectively. Comparing the spatial distributions obtained by all the models shows that the most erosion-prone areas are located quite consistently throughout the watershed. The most eroded areas were found in different parts of the watershed and were characterized mainly by variations in slope, which rapidly increase sediment transport. The characteristic lithology of this watershed might be another reason [97]. In addition, inappropriate agricultural practices and overgrazing also acted as driving forces of gully erosion in this study area [62].
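The step from a continuous susceptibility score to class area percentages can be sketched as follows. For simplicity this example uses equal-interval breaks, the method named in the abstract; the natural breaks classification in ArcGIS places the cut points differently, so this is an illustration of the bookkeeping rather than a reproduction of Table 4. The ten scores are invented.

```python
LABELS = ["very low", "low", "moderate", "high", "very high"]

def equal_interval_class(value, vmin, vmax, n_classes=5):
    """Index of the equal-width class that contains value (the top edge joins the last class)."""
    if vmax == vmin:
        return 0
    index = int((value - vmin) / (vmax - vmin) * n_classes)
    return min(index, n_classes - 1)

def class_area_percent(scores):
    """Percentage of cells falling in each susceptibility class."""
    vmin, vmax = min(scores), max(scores)
    counts = [0] * len(LABELS)
    for score in scores:
        counts[equal_interval_class(score, vmin, vmax, len(LABELS))] += 1
    return {label: 100.0 * c / len(scores) for label, c in zip(LABELS, counts)}

# Invented susceptibility scores for ten equally sized cells.
scores = [0.0, 0.12, 0.33, 0.47, 0.55, 0.68, 0.83, 0.87, 0.94, 1.0]
area = class_area_percent(scores)
print(area)
```

Because every cell has the same area, the share of cells per class equals the share of the watershed area per class, and the five percentages always sum to 100.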

4.4. Variable Importance

The importance of the variables for gully erosion mapping was assessed based on the RF model. As can be seen in Figure 8, TWI (1.78), LULC (1.73), distance from stream (1.47), drainage density (1.45), slope (1.42), aspect (1.34), rainfall (1.28), SPI (1.06), and distance from road (1.02) were the most important factors for gully erosion susceptibility mapping, whereas elevation (0.79), plan curvature (0.76), and lithology (0.74) were of the least importance.

4.5. Validation of Gully Erosion Models

The validation results for both training and validation datasets using accuracy, precision, and AUC are presented in Table 5 and Figure 9. The statistical parameters for each model were almost the same for both training and validation datasets, with a slight difference in AUC for the NB-FR model.

5. Discussion

The effective functioning of soil has a significant impact on ecosystem services and is linked to the attainment of the SDGs. The soil-water system is the most important component in achieving multiple SDGs, with a focus on neutralizing land degradation and restoring land [4,5]. With the shifting land-use trends and compaction of productive soil, excessive degradation in productive land is observed globally due to gully erosion. As a result, one of the most significant issues for the long-term development of the environment and economic activity is preventing land degradation. Therefore, extensive planning and erosion protection have always been essential.
Machine learning methods are reliable tools for mitigating and controlling the influence of gully erosion in different regions all over the world. Based on the Web of Science (WoS) database and using the common keywords “gully erosion susceptibility” and “machine learning algorithms”, nine papers published between 2019 and 2021 were selected from different parts of the world, and their results are reported in Table 6. RF generates models with high accuracy in comparison to the different approaches, and this is due to its ability to handle large datasets and produce fast classifications, based on multiple features. Additionally, RF is widely used to assess the importance of each variable used in order to calculate a multi-classifier and evaluates its own accuracy and its suitability for the modeling process [98].
In this research, the relationship between gully erosion occurrence and various environmental factors was investigated for the GHISS watershed. We used three hybrid ML models (i.e., RF-FR, SVM-FR, and NB-FR) for gully erosion susceptibility mapping. We found that the RF-FR model performed better than the other models. Our findings are consistent with previous studies, for example [36,40]. In one study [103], the authors argued that RF performs better because it is less prone to both over-fitting and outliers in the training dataset.
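
To make the FR component of the hybrid models concrete, the sketch below computes the frequency ratio in its commonly used form: the share of gully points falling in a class divided by the share of the study area covered by that class. The class names and counts are invented, and the final step (replacing each class label with its FR value before training RF, SVM, or NB) is an assumption about how such hybrids are typically built, since this section does not describe the coupling.

```python
def frequency_ratio(gully_counts, pixel_counts):
    """FR per class = (share of gully points in class) / (share of area in class)."""
    total_gullies = sum(gully_counts.values())
    total_pixels = sum(pixel_counts.values())
    return {
        cls: (gully_counts[cls] / total_gullies) / (pixel_counts[cls] / total_pixels)
        for cls in pixel_counts
    }


# Hypothetical slope classes; all counts are invented for illustration only.
gully_counts = {"0-10": 10, "10-20": 30, ">20": 60}
pixel_counts = {"0-10": 4000, "10-20": 3000, ">20": 3000}

fr = frequency_ratio(gully_counts, pixel_counts)

# Assumed hybrid step: each point's class label is replaced by the FR value
# of that class, and the resulting numbers are used as classifier inputs.
point_classes = ["0-10", ">20", "10-20"]
fr_features = [fr[c] for c in point_classes]
print(fr, fr_features)
```

An FR above 1 indicates that gullies are over-represented in a class relative to its area, and an FR below 1 indicates the opposite.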
In terms of accuracy, the RF-FR model is followed by the SVM-FR model. SVM can handle non-linear data and has yielded good results for both classification and regression problems in many applications [104].
The performance of NB-FR was slightly weaker than that of the other models. NB assumes conditional independence between features [105]; nevertheless, it has produced good results in previous studies [106].
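
The conditional independence assumption can be written out as follows. Here $y$ denotes the class (gully or non-gully) and $x_1, \dots, x_{12}$ the twelve conditioning factors; the notation is introduced for illustration and is not taken from the paper.

```latex
P(y \mid x_1, \dots, x_{12}) \;\propto\; P(y) \prod_{i=1}^{12} P(x_i \mid y)
```

The product form keeps the model cheap to train, but it also makes the model sensitive to correlated predictors, because each factor contributes to the product as if it carried independent evidence.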
Based on our results, LULC and the topographic moisture index (TWI) were the most important variables in enhancing the performance of the hybrid models. Land-use change affects gully erosion by altering the hydrological and physicochemical properties of the soil. Vegetation stabilizes gullies through the action of plant roots; thus, areas with no or sparse vegetation are widely exposed to rainfall and runoff and are the most affected by gully erosion. Other factors, such as distance from stream, drainage density, and slope, were the next most important after LULC and TWI.
The land resources in the northern part of Morocco are influenced by environmental factors and anthropogenic activities, mainly rainfall irregularity [107,108], steep slopes, and weak geologic units; that is, the geological formations of the northern region make the land more prone to erosion. Moreover, Cannabis cultivation has become a complex and challenging practice to control, leading to the loss of forestlands and accelerating soil erosion processes [76].
The approach developed in this study could therefore be effective in gully erosion prediction. The maps generated here can serve as a reference for reducing gully erosion and as a tool for establishing sustainable strategies and actions. It should be highlighted that no single algorithm is perfect in comparison to the others: each has its specific advantages and drawbacks, and its usefulness depends on the study case.

6. Conclusions

The use of ML algorithms for environmental hazard modeling is an emerging focus in many studies, thanks to technological advancements in the Internet of Things (IoT). In this study, the relationship between gully erosion occurrence and various environmental factors was investigated for the GHISS watershed, and three hybrid models (RF-FR, SVM-FR, and NB-FR) were developed for gully erosion susceptibility mapping. The results showed that the RF-FR hybrid model outperformed the other models. In summary, this study showed that hybrid ML models are better suited to gully erosion susceptibility mapping than single ML models, which is consistent with previous studies. The proposed methodology can be applied to map gully erosion in areas influenced by similar environmental conditions and anthropogenic activities, including, for instance, rainfall irregularity, steep slopes, and weak geologic units. In new studies, researchers are encouraged to use the above three models to address new questions and research directions. Further research may consider the application of deep learning approaches to gully erosion mapping at local to regional scales.

Author Contributions

Conceptualization, Sliman Hitouri and Antonietta Varasano; Formal analysis, Antonietta Varasano; Funding acquisition, Antonietta Varasano; Investigation, Meriame Mohajane, Mirza Waleed and Sasi Kiran Palateerdham; Methodology, Antonietta Varasano; Resources, Meriame Mohajane and Ali Essahlaoui; Software, Antonietta Varasano, Meriame Mohajane, Safae Ijlil and Narjisse Essahlaoui; Supervision, Ali Essahlaoui; Visualization, Narjisse Essahlaoui; Writing—original draft, Sliman Hitouri, Antonietta Varasano, Meriame Mohajane, Safae Ijlil, Sk Ajim Ali and Quoc Bao Pham; Writing—review & editing, Ana Cláudia Teodoro. Sliman Hitouri and Antonietta Varasano contributed equally to this paper by performing analyses and designing the study. All the co-authors drafted and revised the article together. All authors have read and agreed to the published version of the manuscript.


Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.


References

  1. Domazetović, F.; Šiljeg, A.; Lončar, N.; Marić, I. Development of Automated Multicriteria GIS Analysis of Gully Erosion Susceptibility. Appl. Geogr. 2019, 112, 102083.
  2. Fadul, H.M.; Salih, A.A.; Ali, I.A.; Inanaga, S. Use of Remote Sensing to Map Gully Erosion along the Atbara River, Sudan. Int. J. Appl. Earth Obs. Geoinf. 1999, 1, 175–180.
  3. Magliulo, P. Assessing the Susceptibility to Water-Induced Soil Erosion Using a Geomorphological, Bivariate Statistics-Based Approach. Environ. Earth Sci. 2012, 67, 1801–1820.
  4. Dooley, E.; Roberts, E.; Wunder, S. Land Degradation Neutrality under the SDGs: National and International Implementation of the Land Degradation Neutral World Target. Elni. Rev. 2015, 1, 2–9.
  5. Safriel, U. Land Degradation Neutrality (LDN) in Drylands and beyond—Where Has It Come from and Where Does It Go. Silva Fenn. 2017, 51, 20–24.
  6. Durán Zuazo, V.H.; Rodríguez Pleguezuelo, C.R. Soil-Erosion and Runoff Prevention by Plant Covers. A Review. Agron. Sustain. Dev. 2008, 28, 65–86.
  7. Lal, R. Soil Erosion Impact on Agronomic Productivity and Environment Quality. Crit. Rev. Plant Sci. 1998, 17, 319–464.
  8. Peter, K.D.; d’Oleire-Oltmanns, S.; Ries, J.B.; Marzolff, I.; Ait Hssaine, A. Soil Erosion in Gully Catchments Affected by Land-Levelling Measures in the Souss Basin, Morocco, Analysed by Rainfall Simulation and UAV Remote Sensing Data. CATENA 2014, 113, 24–40.
  9. Simonneaux, V.; Cheggour, A.; Deschamps, C.; Mouillot, F.; Cerdan, O.; Le Bissonnais, Y. Land Use and Climate Change Effects on Soil Erosion in a Semi-Arid Mountainous Watershed (High Atlas, Morocco). J. Arid Environ. 2015, 122, 64–75.
  10. Azedou, A.; Lahssini, S.; Khattabi, A.; Meliho, M.; Rifai, N. A Methodological Comparison of Three Models for Gully Erosion Susceptibility Mapping in the Rural Municipality of El Faid (Morocco). Sustainability 2021, 13, 682.
  11. Tairi, A.; Elmouden, A.; Bouchaou, L.; Aboulouafa, M. Mapping Soil Erosion–Prone Sites through GIS and Remote Sensing for the Tifnout Askaoun Watershed, Southern Morocco. Arab. J. Geosci. 2021, 14, 811.
  12. Kachouri, S.; Achour, H.; Abida, H.; Bouaziz, S. Soil Erosion Hazard Mapping Using Analytic Hierarchy Process and Logistic Regression: A Case Study of Haffouz Watershed, Central Tunisia. Arab. J. Geosci. 2015, 8, 4257–4268.
  13. Saha, S.; Gayen, A.; Pourghasemi, H.R.; Tiefenbacher, J.P. Identification of Soil Erosion-Susceptible Areas Using Fuzzy Logic and Analytical Hierarchy Process Modeling in an Agricultural Watershed of Burdwan District, India. Environ. Earth Sci. 2019, 78, 649.
  14. Senouci, R.; Taibi, N.-E.; Teodoro, A.C.; Duarte, L.; Mansour, H.; Yahia Meddah, R. GIS-Based Expert Knowledge for Landslide Susceptibility Mapping (LSM): Case of Mostaganem Coast District, West of Algeria. Sustainability 2021, 13, 630.
  15. Senanayake, S.; Pradhan, B.; Huete, A.; Brennan, J. Assessing Soil Erosion Hazards Using Land-Use Change and Landslide Frequency Ratio Method: A Case Study of Sabaragamuwa Province, Sri Lanka. Remote Sens. 2020, 12, 1483.
  16. Tehrany, M.S.; Shabani, F.; Javier, D.N.; Kumar, L. Soil Erosion Susceptibility Mapping for Current and 2100 Climate Conditions Using Evidential Belief Function and Frequency Ratio. Geomat. Nat. Hazards Risk 2017, 8, 1695–1714.
  17. Arabameri, A.; Pradhan, B.; Rezaei, K. Gully Erosion Zonation Mapping Using Integrated Geographically Weighted Regression with Certainty Factor and Random Forest Models in GIS. J. Environ. Manage. 2019, 232, 928–942.
  18. Azareh, A.; Rahmati, O.; Rafiei-Sardooi, E.; Sankey, J.B.; Lee, S.; Shahabi, H.; Ahmad, B.B. Modelling Gully-Erosion Susceptibility in a Semi-Arid Region, Iran: Investigation of Applicability of Certainty Factor and Maximum Entropy Models. Sci. Total Environ. 2019, 655, 684–696.
  19. Hembram, T.K.; Paul, G.C.; Saha, S. Comparative Analysis between Morphometry and Geo-Environmental Factor Based Soil Erosion Risk Assessment Using Weight of Evidence Model: A Study on Jainti River Basin, Eastern India. Environ. Process. 2019, 6, 883–913.
  20. Meliho, M.; Khattabi, A.; Mhammdi, N. A GIS-Based Approach for Gully Erosion Susceptibility Modelling Using Bivariate Statistics Methods in the Ourika Watershed, Morocco. Environ. Earth Sci. 2018, 77, 655.
  21. Pourghasemi, H.R.; Mohammady, M.; Pradhan, B. Landslide Susceptibility Mapping Using Index of Entropy and Conditional Probability Models in GIS: Safarood Basin, Iran. CATENA 2012, 97, 71–84.
  22. Pournader, M.; Ahmadi, H.; Feiznia, S.; Karimi, H.; Peirovan, H.R. Spatial Prediction of Soil Erosion Susceptibility: An Evaluation of the Maximum Entropy Model. Earth Sci. Inform. 2018, 11, 389–401.
  23. Nosrati, K. Assessing Soil Quality Indicator under Different Land Use and Soil Erosion Using Multivariate Statistical Techniques. Environ. Monit. Assess. 2013, 185, 2895–2907.
  24. Sarkar, T.; Mishra, M. Soil Erosion Susceptibility Mapping with the Application of Logistic Regression and Artificial Neural Network. J. Geovisualization Spat. Anal. 2018, 2, 8.
  25. Gholami, V.; Sahour, H.; Hadian Amri, M.A. Soil Erosion Modeling Using Erosion Pins and Artificial Neural Networks. CATENA 2021, 196, 104902.
  26. Gholami, V.; Booij, M.J.; Nikzad Tehrani, E.; Hadian, M.A. Spatial Soil Erosion Estimation Using an Artificial Neural Network (ANN) and Field Plot Data. CATENA 2018, 163, 210–218.
  27. Dinh, T.V.; Nguyen, H.; Tran, X.-L.; Hoang, N.-D. Predicting Rainfall-Induced Soil Erosion Based on a Hybridization of Adaptive Differential Evolution and Support Vector Machine Classification. Math. Probl. Eng. 2021, 2021, 6647829.
  28. Arabameri, A.; Asadi Nalivan, O.; Chandra Pal, S.; Chakrabortty, R.; Saha, A.; Lee, S.; Pradhan, B.; Tien Bui, D. Novel Machine Learning Approaches for Modelling the Gully Erosion Susceptibility. Remote Sens. 2020, 12, 2833.
  29. Ghosh, A.; Maiti, R. Soil Erosion Susceptibility Assessment Using Logistic Regression, Decision Tree and Random Forest: Study on the Mayurakshi River Basin of Eastern India. Environ. Earth Sci. 2021, 80, 328.
  30. Phinzi, K.; Ngetar, N.S.; Ebhuoma, O. Soil Erosion Risk Assessment in the Umzintlava Catchment (T32E), Eastern Cape, South Africa, Using RUSLE and Random Forest Algorithm. S. Afr. Geogr. J. 2021, 103, 139–162.
  31. Choubin, B.; Moradi, E.; Golshan, M.; Adamowski, J.; Sajedi-Hosseini, F.; Mosavi, A. An Ensemble Prediction of Flood Susceptibility Using Multivariate Discriminant Analysis, Classification and Regression Trees, and Support Vector Machines. Sci. Total Environ. 2019, 651, 2087–2096.
  32. Lee, S.; Lee, M.-J.; Jung, H.-S.; Lee, S. Landslide Susceptibility Mapping Using Naïve Bayes and Bayesian Network Models in Umyeonsan, Korea. Geocarto Int. 2020, 35, 1665–1679.
  33. Mosavi, A.; Sajedi-Hosseini, F.; Choubin, B.; Taromideh, F.; Rahi, G.; Dineva, A. Susceptibility Mapping of Soil Water Erosion Using Machine Learning Models. Water 2020, 12, 1995.
  34. Lei, X.; Chen, W.; Avand, M.; Janizadeh, S.; Kariminejad, N.; Shahabi, H.; Costache, R.; Shahabi, H.; Shirzadi, A.; Mosavi, A. GIS-Based Machine Learning Algorithms for Gully Erosion Susceptibility Mapping in a Semi-Arid Region of Iran. Remote Sens. 2020, 12, 2478.
  35. Saha, S.; Roy, J.; Arabameri, A.; Blaschke, T.; Tien Bui, D. Machine Learning-Based Gully Erosion Susceptibility Mapping: A Case Study of Eastern India. Sensors 2020, 20, 1313.
  36. Gayen, A.; Pourghasemi, H.R.; Saha, S.; Keesstra, S.; Bai, S. Gully Erosion Susceptibility Assessment and Management of Hazard-Prone Areas in India Using Different Machine Learning Algorithms. Sci. Total Environ. 2019, 668, 124–138.
  37. Pal, S.C.; Arabameri, A.; Blaschke, T.; Chowdhuri, I.; Saha, A.; Chakrabortty, R.; Lee, S.; Band, S.S. Ensemble of Machine-Learning Methods for Predicting Gully Erosion Susceptibility. Remote Sens. 2020, 12, 3675.
  38. Soleimanpour, S.M.; Pourghasemi, H.R.; Zare, M. A Comparative Assessment of Gully Erosion Spatial Predictive Modeling Using Statistical and Machine Learning Models. CATENA 2021, 207, 105679.
  39. Razavi-Termeh, S.V.; Sadeghi-Niaraki, A.; Choi, S.-M. Gully Erosion Susceptibility Mapping Using Artificial Intelligence and Statistical Models. Geomat. Nat. Hazards Risk 2020, 11, 821–844.
  40. Pourghasemi, H.R.; Sadhasivam, N.; Kariminejad, N.; Collins, A.L. Gully Erosion Spatial Modelling: Role of Machine Learning Algorithms in Selection of the Best Controlling Factors and Modelling Process. Geosci. Front. 2020, 11, 2207–2219.
  41. Avand, J.; Janizadeh, S.; Naghibi, S.A.; Pourghasemi, H.R.; Bozchaloei, S.K.; Blaschke, T. A Comparative Assessment of Random Forest and K-Nearest Neighbor Classifiers for Gully Erosion Susceptibility Mapping. Water 2019, 11, 2076.
  42. Pham, Q.B.; Mukherjee, K.; Norouzi, A.; Linh, N.T.T.; Janizadeh, S.; Ahmadi, K.; Cerdà, A.; Doan, T.N.C.; Anh, D.T. Head-Cut Gully Erosion Susceptibility Modelling Based on Ensemble Random Forest with Oblique Decision Trees in Fareghan Watershed, Iran. Geomat. Nat. Hazards Risk 2020, 11, 2385–2410.
  43. Arabameri, A.; Chen, W.; Lombardo, L.; Blaschke, T.; Tien Bui, D. Hybrid Computational Intelligence Models for Improvement Gully Erosion Assessment. Remote Sens. 2020, 12, 140.
  44. Ahmadpour, H.; Bazrafshan, O.; Rafiei-Sardooi, E.; Zamani, H.; Panagopoulos, T. Gully Erosion Susceptibility Assessment in the Kondoran Watershed Using Machine Learning Algorithms and the Boruta Feature Selection. Sustainability 2021, 13, 10110.
  45. Chen, Y.; Chen, W.; Janizadeh, S.; Bhunia, G.S.; Bera, A.; Pham, Q.B.; Linh, N.T.T.; Balogun, A.-L.; Wang, X. Deep Learning and Boosting Framework for Piping Erosion Susceptibility Modeling: Spatial Evaluation of Agricultural Areas in the Semi-Arid Region. Geocarto Int. 2021, 12, 1–27.
  46. Yang, A.; Wang, C.; Pang, G.; Long, Y.; Wang, L.; Cruse, R.M.; Yang, Q. Gully Erosion Susceptibility Mapping in Highly Complex Terrain Using Machine Learning Models. ISPRS Int. J. Geo-Inf. 2021, 10, 680.
  47. Saha, S.; Sarkar, R.; Thapa, G.; Roy, J. Modeling Gully Erosion Susceptibility in Phuentsholing, Bhutan Using Deep Learning and Basic Machine Learning Algorithms. Environ. Earth Sci. 2021, 80, 295.
  48. Kadavi, P.; Lee, C.-W.; Lee, S. Application of Ensemble-Based Machine Learning Models to Landslide Susceptibility Mapping. Remote Sens. 2018, 10, 1252.
  49. Lee, S.; Pradhan, B. Probabilistic Landslide Hazards and Risk Mapping on Penang Island, Malaysia. J. Earth Syst. Sci. 2006, 115, 661–672.
  50. Costache, R.; Ngo, P.T.T.; Bui, D.T. Novel Ensembles of Deep Learning Neural Network and Statistical Learning for Flash-Flood Susceptibility Mapping. Water 2020, 12, 1549.
  51. Ijlil, S.; Essahlaoui, A.; Mohajane, M.; Essahlaoui, N.; Mili, E.M.; Van Rompaey, A. Machine Learning Algorithms for Modeling and Mapping of Groundwater Pollution Risk: A Study to Reach Water Security and Sustainable Development (Sdg) Goals in a Mediterranean Aquifer System. Remote Sens. 2022, 14, 2379.
  52. Arabameri, A.; Cerda, A.; Rodrigo-Comino, J.; Pradhan, B.; Sohrabi, M.; Blaschke, T.; Bui, D.T. Proposing a Novel Predictive Technique for Gully Erosion Susceptibility Mapping in Arid and Semi-Arid Regions (Iran). Remote Sens. 2019, 11, 2577.
  53. Bouamrane, A.; Bouamrane, A.; Abida, H. Water Erosion Hazard Distribution under a Semi-Arid Climate Condition: Case of Mellah Watershed, North-Eastern Algeria. Geoderma 2021, 403, 115381.
  54. Nouayti, N.; Cherif, E.K.; Algarra, M.; Pola, M.L.; Fernández, S.; Nouayti, A.; Esteves da Silva, J.C.G.; Driss, K.; Samlani, N.; Mohamed, H.; et al. Determination of Physicochemical Water Quality of the Ghis-Nekor Aquifer (Al Hoceima, Morocco) Using Hydrochemistry, Multiple Isotopic Tracers, and the Geographical Information System (GIS). Water 2022, 14, 606.
  55. Bouhout, S.; Haboubi, K.; Zian, A.; Elyoubi, M.S.; Elabdouni, A. Evaluation of Two Linear Kriging Methods for Piezometric Levels Interpolation and a Framework for Upgrading Groundwater Level Monitoring Network in Ghiss-Nekor Plain, North-Eastern Morocco. Arab. J. Geosci. 2022, 15, 1016.
  56. Benabdelouahab, S.; Salhi, A.; Stitou, J.; Himi, M.; Draoui, M.; Casas, A. Application Des SIG et de La Tomographie Électrique Pour Contribuer à La Protection de l’aquifère de Martil-Alila (Maroc). In Proceedings of the Euromediterranean Scientific Congress on Engineering, Algeciras, Spain, 19–20 May 2011.
  57. Bourjila, A.; Dimane, F.; EL Ouarghi, H.; Nouayti, N.; Taher, M.; EL Hammoudani, Y.; Saadi, O.; Bensiali, A. Groundwater Potential Zones Mapping by Applying GIS, Remote Sensing and Multi-Criteria Decision Analysis in the Ghiss Basin, Northern Morocco. Groundw. Sustain. Dev. 2021, 15, 100693.
  58. El Motaki, H.; El-Fengour, A.; Aissa, E.; Madureira, H.; Monteiro, A. The Global Change Impacts on Forest Natural Resources in Central Rif Mountains in Northern Morocco: Extensive Exploration and Planning Perspective. GOT—J. Geogr. Spat. Plan. 2019, 17, 75–92.
  59. Leikine, M.; Asebriy, L.; Bourgois, J. About the Age of the Ketama Unit’s Anchi-Epizonal Metamorphism, Central Rif, Morocco. Comptes Rendus—Acad. Sci. Ser. II 1991, 313, 787–793.
  60. Mansour, S.; Kouz, T.; Thaiki, M.; Ouhadi, A.; Mesmoudi, H.; Hassani Zerrouk, M.; Mourabit, T.; Dakak, H.; Cherkaoui Dekkaki, H. Spatial Assessment of the Vulnerability of Water Resources against Anthropogenic Pollution Using the DKPR Model: A Case of Ghiss-Nekkour Basin, Morocco. Arab. J. Geosci. 2021, 14, 699.
  61. Benyoussef, S.; Arabi, M.; El Ouarghi, H.; Ghalit, M.; Azirar, M.; El Midaoui, A.; Ait Boughrous, A. Impact of Anthropic Activities on the Quality of Groundwater in the Central Rif (North Morocco). Ecol. Eng. Environ. Technol. 2021, 22, 69–78.
  62. Taher, M.; Mourabit, T.; Bourjila, A.; Saadi, O.; Errahmouni, A.; El Marzkioui, F.; El Mousaoui, M. An Estimation of Soil Erosion Rate Hot Spots by Integrated USLE and GIS Methods: A Case Study of the Ghiss Dam and Basin in Northeastern Morocco. Geomat. Environ. Eng. 2022, 16, 95–110.
  63. Tsangaratos, P.; Ilia, I.; Hong, H.; Chen, W.; Xu, C. Applying Information Theory and GIS-Based Quantitative Methods to Produce Landslide Susceptibility Maps in Nancheng County, China. Landslides 2017, 14, 1091–1111.
  64. Amiri, M.; Pourghasemi, H.R.; Ghanbarian, G.A.; Afzali, S.F. Assessment of the Importance of Gully Erosion Effective Factors Using Boruta Algorithm and Its Spatial Modeling and Mapping Using Three Machine Learning Algorithms. Geoderma 2019, 340, 55–69.
  65. Valentin, C.; Poesen, J.; Li, Y. Gully Erosion: Impacts, Factors and Control. CATENA 2005, 63, 132–153.
  66. Hembram, T.K.; Paul, G.C.; Saha, S. Modelling of Gully Erosion Risk Using New Ensemble of Conditional Probability and Index of Entropy in Jainti River Basin of Chotanagpur Plateau Fringe Area, India. Appl. Geomat. 2020, 12, 337–360.
  67. Conforti, M.; Aucelli, P.P.C.; Robustelli, G.; Scarciglia, F. Geomorphology and GIS Analysis for Mapping Gully Erosion Susceptibility in the Turbolo Stream Catchment (Northern Calabria, Italy). Nat. Hazards 2011, 56, 881–898.
  68. Arabameri, A.; Chandra Pal, S.; Costache, R.; Saha, A.; Rezaie, F.; Seyed Danesh, A.; Pradhan, B.; Lee, S.; Hoang, N.-D. Prediction of Gully Erosion Susceptibility Mapping Using Novel Ensemble Machine Learning Algorithms. Geomat. Nat. Hazards Risk 2021, 12, 469–498.
  69. Moore, I.D.; Grayson, R.B.; Ladson, A.R. Digital Terrain Modelling: A Review of Hydrological, Geomorphological, and Biological Applications. Hydrol. Process. 1991, 5, 3–30.
  70. Band, S.S.; Janizadeh, S.; Chandra Pal, S.; Saha, A.; Chakrabortty, R.; Shokri, M.; Mosavi, A. Novel Ensemble Approach of Deep Learning Neural Network (DLNN) Model and Particle Swarm Optimization (PSO) Algorithm for Prediction of Gully Erosion Susceptibility. Sensors 2020, 20, 5609.
  71. Chowdhuri, I.; Pal, S.C.; Saha, A.; Chakrabortty, R.; Roy, P. Evaluation of Different DEMs for Gully Erosion Susceptibility Mapping Using In-Situ Field Measurement and Validation. Ecol. Inform. 2021, 65, 101425.
  72. Bui, D.T.; Lofman, O.; Revhaug, I.; Dick, O. Landslide Susceptibility Analysis in the Hoa Binh Province of Vietnam Using Statistical Index and Logistic Regression. Nat. Hazards 2011, 59, 1413–1444.
  73. Rahmati, O.; Pourghasemi, H.R.; Zeinivand, H. Flood Susceptibility Mapping Using Frequency Ratio and Weights-of-Evidence Models in the Golastan Province, Iran. Geocarto Int. 2016, 31, 42–70.
  74. Lee, S.; Sambath, T. Landslide Susceptibility Mapping in the Damrei Romel Area, Cambodia Using Frequency Ratio and Logistic Regression Models. Environ. Geol. 2006, 50, 847–855.
  75. Samanta, R.K.; Bhunia, G.S.; Shit, P.K.; Pourghasemi, H.R. Flood Susceptibility Mapping Using Geospatial Frequency Ratio Technique: A Case Study of Subarnarekha River Basin, India. Model. Earth Syst. Environ. 2018, 4, 395–408.
  76. Mohajane, M.; Costache, R.; Karimi, F.; Bao Pham, Q.; Essahlaoui, A.; Nguyen, H.; Laneve, G.; Oudija, F. Application of Remote Sensing and Machine Learning Algorithms for Forest Fire Mapping in a Mediterranean Area. Ecol. Indic. 2021, 129, 107869.
  77. Boroughani, M.; Pourhashemi, S.; Hashemi, H.; Salehi, M.; Amirahmadi, A.; Asadi, M.A.Z.; Berndtsson, R. Application of Remote Sensing Techniques and Machine Learning Algorithms in Dust Source Detection and Dust Source Susceptibility Mapping. Ecol. Inform. 2020, 56, 101059.
  78. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  79. Ho, T.K. The Random Subspace Method for Constructing Decision Forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
  80. Rodriguez-Galiano, V.F.; Chica-Olmo, M.; Chica-Rivas, M. Predictive Modelling of Gold Potential with the Integration of Multisource Information Based on Random Forest: A Case Study on the Rodalquilar Area, Southern Spain. Int. J. Geogr. Inf. Sci. 2014, 28, 1336–1354.
  81. Arabameri, A.; Pradhan, B.; Rezaei, K. Spatial Prediction of Gully Erosion Using ALOS PALSAR Data and Ensemble Bivariate and Data Mining Models. Geosci. J. 2019, 23, 669–686.
  82. Pai, P.-F.; Hsu, M.-F. An Enhanced Support Vector Machines Model for Classification and Rule Generation. In Computational Optimization, Methods and Algorithms; Koziel, S., Yang, X.-S., Eds.; Studies in Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2011; Volume 356, pp. 241–258, ISBN 978-3-642-20858-4.
  83. Rahmati, O.; Tahmasebipour, N.; Haghizadeh, A.; Pourghasemi, H.R.; Feizizadeh, B. Evaluation of Different Machine Learning Models for Predicting and Mapping the Susceptibility of Gully Erosion. Geomorphology 2017, 298, 118–137.
  84. Garosi, Y.; Sheklabadi, M.; Conoscenti, C.; Pourghasemi, H.R.; Van Oost, K. Assessing the Performance of GIS-Based Machine Learning Models with Different Accuracy Measures for Determining Susceptibility to Gully Erosion. Sci. Total Environ. 2019, 664, 1117–1132.
  85. Phinzi, K.; Abriha, D.; Bertalan, L.; Holb, I.; Szabó, S. Machine Learning for Gully Feature Extraction Based on a Pan-Sharpened Multispectral Image: Multiclass vs. Binary Approach. ISPRS Int. J. Geo-Inf. 2020, 9, 252.
  86. Ranjitha, K.V. Classification and Optimization Scheme for Text Data Using Machine Learning Naïve Bayes Classifier. In Proceedings of the 2018 IEEE World Symposium on Communication Engineering (WSCE), Singapore, 28–30 December 2018; IEEE: Singapore, 2018; pp. 33–36.
  87. Serrano-Cinca, C.; Gutiérrez-Nieto, B. Partial Least Square Discriminant Analysis for Bankruptcy Prediction. Decis. Support Syst. 2013, 54, 1245–1255.
  88. Pourghasemi, H.; Gayen, A.; Park, S.; Lee, C.-W.; Lee, S. Assessment of Landslide-Prone Areas and Their Zonation Using Logistic Regression, LogitBoost, and NaïveBayes Machine-Learning Algorithms. Sustainability 2018, 10, 3697.
  89. Bhargavi, P.; Tech, M.; Jyothi, D.S. Applying Naive Bayes Data Mining Technique for Classification of Agricultural Land Soils. Int. J. Comput. Sci. Netw. Secur. 2009, 6, 189–193.
  90. Pham, B.T.; Jaafari, A.; Avand, M.; Al-Ansari, N.; Dinh Du, T.; Yen, H.P.H.; Phong, T.V.; Nguyen, D.H.; Le, H.V.; Mafi-Gholami, D.; et al. Performance Evaluation of Machine Learning Methods for Forest Fire Modeling and Prediction. Symmetry 2020, 12, 1022.
  91. Domazetović, F.; Šiljeg, A.; Lončar, N.; Marić, I. GIS Automated Multicriteria Analysis (GAMA) Method for Susceptibility Modelling. MethodsX 2019, 6, 2553–2561.
  92. Yildirim, P. Filter Based Feature Selection Methods for Prediction of Risks in Hepatitis Disease. Int. J. Mach. Learn. Comput. 2015, 5, 258–263.
  93. Mao, K.Z. Orthogonal Forward Selection and Backward Elimination Algorithms for Feature Subset Selection. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2004, 34, 629–634.
  94. Lee, C.; Lee, G.G. Information Gain and Divergence-Based Feature Selection for Machine Learning-Based Text Categorization. Inf. Process. Manag. 2006, 42, 155–165.
  95. Dash, M.; Liu, H. Feature Selection for Classification. Intell. Data Anal. 1997, 1, 131–156.
  96. Imdadullah, M.; Aslam, M.; Altaf, S. Mctest: An R Package for Detection of Collinearity among Regressors. R J. 2016, 8, 495–505.
  97. Salhi, A.; Benabdelouahab, T.; Martin-Vide, J.; Okacha, A.; El Hasnaoui, Y.; El Mousaoui, M.; El Morabit, A.; Himi, M.; Benabdelouahab, S.; Lebrini, Y.; et al. Bridging the Gap of Perception Is the Only Way to Align Soil Protection Actions. Sci. Total Environ. 2020, 718, 137421.
  98. Phinzi, K.; Holb, I.; Szabó, S. Mapping Permanent Gullies in an Agricultural Area Using Satellite Images: Efficacy of Machine Learning Algorithms. Agronomy 2021, 11, 333.
  99. Lana, J.C.; Castro, P.D.; Lana, C.E. Assessing Gully Erosion Susceptibility and Its Conditioning Factors in Southeastern Brazil Using Machine Learning Algorithms and Bivariate Statistical Methods: A Regional Approach. Geomorphology 2022, 402, 108159.
  100. Hembram, T.K.; Saha, S.; Pradhan, B.; Abdul Maulud, K.N.; Alamri, A.M. Robustness Analysis of Machine Learning Classifiers in Predicting Spatial Gully Erosion Susceptibility with Altered Training Samples. Geomat. Nat. Hazards Risk 2021, 12, 794–828.
  101. Bouramtane, T.; Hilal, H.; Rezende-Filho, A.T.; Bouramtane, K.; Barbiero, L.; Abraham, S.; Valles, V.; Kacimi, I.; Sanhaji, H.; Torres-Rondon, L.; et al. Mapping Gully Erosion Variability and Susceptibility Using Remote Sensing, Multivariate Statistical Analysis, and Machine Learning in South Mato Grosso, Brazil. Geosciences 2022, 12, 235.
  102. Arabameri, A.; Chen, W.; Loche, M.; Zhao, X.; Li, Y.; Lombardo, L.; Cerda, A.; Pradhan, B.; Bui, D.T. Comparison of Machine Learning Models for Gully Erosion Susceptibility Mapping. Geosci. Front. 2020, 11, 1609–1620.
  103. Costache, R.; Popa, M.C.; Tien Bui, D.; Diaconu, D.C.; Ciubotaru, N.; Minea, G.; Pham, Q.B. Spatial Predicting of Flood Potential Areas Using Novel Hybridizations of Fuzzy Decision-Making, Bivariate Statistics, and Machine Learning. J. Hydrol. 2020, 585, 124808.
  104. Tien Bui, D.; Bui, Q.-T.; Nguyen, Q.-P.; Pradhan, B.; Nampak, H.; Trinh, P.T. A Hybrid Artificial Intelligence Approach Using GIS-Based Neural-Fuzzy Inference System and Particle Swarm Optimization for Forest Fire Susceptibility Modeling at a Tropical Area. Agric. For. Meteorol. 2017, 233, 32–44.
  105. Ng, S.S.Y.; Xing, Y.; Tsui, K.L. A Naive Bayes Model for Robust Remaining Useful Life Prediction of Lithium-Ion Battery. Appl. Energy 2014, 118, 114–123.
  106. Nguyen, P.T.; Tuyen, T.T.; Shirzadi, A.; Pham, B.T.; Shahabi, H.; Omidvar, E.; Amini, A.; Entezami, H.; Prakash, I.; Phong, T.V.; et al. Development of a Novel Hybrid Intelligence Approach for Landslide Spatial Prediction. Appl. Sci. 2019, 9, 2824.
  107. Gourfi, A.; Daoudi, L.; Shi, Z. The Assessment of Soil Erosion Risk, Sediment Yield and Their Controlling Factors on a Large Scale: Example of Morocco. J. Afr. Earth Sci. 2018, 147, 281–299.
  108. Salhi, A.; Martin-Vide, J.; Benhamrouche, A.; Benabdelouahab, S.; Himi, M.; Benabdelouahab, T.; Casas Ponsati, A. Rainfall Distribution and Trends of the Daily Precipitation Concentration Index in Northern Morocco: A Need for an Adaptive Environmental Policy. SN Appl. Sci. 2019, 1, 277.
Figure 1. Study area map showing GHISS basin, its elevation profile, and training and validation points used for the gully inventory map.
Figure 2. Methodology followed in this study.
Figure 3. Gully erosion photos in the GHISS watershed area.
Figure 4. Gully erosion conditioning factors: (a) elevation, (b) slope, (c) aspect, and (d) plan curvature.
Figure 5. Gully erosion conditioning factors: (a) SPI, (b) TWI, (c) distance to road, and (d) distance to stream.
Figure 6. Gully erosion conditioning factors: (a) LULC, (b) lithology, (c) rainfall, and (d) drainage density.
Figure 7. Gully erosion susceptibility mapping using: (a) NB-FR, (b) RF-FR, and (c) SVM-FR.
Figure 8. The importance of conditioning factors.
Figure 9. (a) ROC curves of success rate. (b) ROC curves of prediction rate.
Table 1. Data used in this study.
| Conditioning Factor | Unit | Source | Resolution |
|---|---|---|---|
| Slope | Degrees (°) | DEM 30 m, from (accessed on 20 August 2021) | 30 m |
| Elevation | Meters (m) | DEM 30 m, from (accessed on 20 August 2021) | 30 m |
| Plan curvature | - | Morocco DEM 30 m, from (accessed on 20 August 2021) | 30 m |
| Aspect | - | DEM 30 m, from https (accessed on 20 August 2021) | 30 m |
| Land cover | - | Landsat-8-OLI image, from (accessed on 12 July 2021) | 30 m |
| Rainfall | mm | ERA-Interim, from (accessed on 18 July 2021) | 30 m |
| Distance from road | m | Road map of Morocco | 30 m |
| Distance from stream | m | Stream map of Morocco | 30 m |
| Drainage density | - | DEM 30 m, from (accessed on 20 August 2021) | 30 m |
| Lithology | - | Geological map of Morocco 1/1,000,000 | 30 m |
| TWI | - | DEM 30 m, from (accessed on 20 August 2021) | 30 m |
| SPI | - | DEM 30 m, from (accessed on 20 August 2021) | 30 m |
Table 2. Frequency ratio.
| Factors | Classes | No. of Points | % of Points | Classes Area | % of Class Area | FR |
|---|---|---|---|---|---|---|
| | 799–1.125 | 11,700 | 7.602 | 119,356 | 12.7332 | 0.597 |
| | 1.427–2.032 | 5400 | 3.509 | 162,771 | 17.3648 | 0.202 |
| Plan curvature (100/m) | Concave | 50,400 | 32.749 | 237,242 | 25.310 | 1.294 |
| Distance from road (m) | 0–1.308 | 103,500 | 65.714 | 339,677 | 36.495 | 1.801 |
| | 1.308–2.956 | 5400 | 3.429 | 51,948 | 5.581 | 0.614 |
| Distance from stream (m) | 0–312 | 37,800 | 24.000 | 290,993 | 31.049 | 0.773 |
| | 1.050–1.520 | 24,300 | 15.429 | 134,222 | 14.321 | 1.077 |
| Drainage density (km/km²) | 0–905 | 83,700 | 53.143 | 362,876 | 38.719 | 1.373 |
| Rainfall (mm) | 315–472 | 53,100 | 33.908 | 231,033 | 18.601 | 1.823 |
| Lithology | Alluvium (Holocene) | 9000 | 5.714 | 40,715 | 4.344 | 1.315 |
| | Lower Pleistocene “Villafranchian” | 22,500 | 14.286 | 52,106 | 5.560 | 2.569 |
| | Cenomanian to Santonian with “flysch” Rif facies of Tisrin slick | 34,200 | 21.714 | 57,680 | 6.155 | 3.528 |
| | Lower and Middle Cretaceous with “flysch” facies | 91,800 | 58.286 | 786,673 | 83.941 | 0.694 |
| LULC | Water bodies | 0 | 0.000 | 134 | 0.014 | 0.000 |
| | Agricultural lands | 10,800 | 6.857 | 124,566 | 13.392 | 0.512 |
| | Bare lands | 88,200 | 56.000 | 427,319 | 45.941 | 1.219 |
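The FR column of Table 2 can be checked from the two percentage columns. The following is a minimal Python sketch, assuming the usual definition of the frequency ratio as the percentage of gully points in a class divided by the percentage of the study area occupied by that class. The function name `frequency_ratio` and the short class labels are illustrative; the numbers are copied from a few rows of the table.

```python
def frequency_ratio(pct_points: float, pct_area: float) -> float:
    """Frequency ratio = % of gully points in a class / % of area in that class."""
    if pct_area <= 0:
        raise ValueError("class area percentage must be positive")
    return pct_points / pct_area


# (illustrative label, % of points, % of class area, FR reported in Table 2)
rows = [
    ("Plan curvature, concave", 32.749, 25.310, 1.294),
    ("Distance from road, first class", 65.714, 36.495, 1.801),
    ("LULC, bare lands", 56.000, 45.941, 1.219),
    ("LULC, water bodies", 0.000, 0.014, 0.000),
]

for label, pct_points, pct_area, reported in rows:
    fr = frequency_ratio(pct_points, pct_area)
    print(f"{label}: FR = {fr:.3f} (reported {reported:.3f})")
```

Under this definition, a value above 1 means a class contains more gully points than its share of the area would suggest, and a value below 1 means it contains fewer.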
Table 3. Multicollinearity analysis.
Collinearity statistics (tolerance and VIF) for each factor:

| Factors | Tolerance | VIF |
|---|---|---|
| Plan curvature | 0.747 | 1.338 |
| Distance to stream | 0.797 | 1.254 |
| Distance to road | 0.782 | 1.279 |
| Drainage density | 0.757 | 1.321 |
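In Table 3 the second numeric column is, up to rounding, the reciprocal of the first. That is the usual relationship between tolerance and the variance inflation factor (VIF = 1 / tolerance). The following is a minimal Python sketch of that check, assuming the columns are tolerance and VIF in that order. The cut-offs used here (VIF below 5 and tolerance above 0.2) are commonly quoted rules of thumb and are an assumption of this sketch, not thresholds stated in this section.

```python
def vif_from_tolerance(tolerance: float) -> float:
    """Variance inflation factor as the reciprocal of tolerance."""
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must lie in (0, 1]")
    return 1.0 / tolerance


def passes_rule_of_thumb(tolerance: float, max_vif: float = 5.0,
                         min_tolerance: float = 0.2) -> bool:
    """Assumed screening rule: low VIF and high tolerance suggest no serious collinearity."""
    return vif_from_tolerance(tolerance) < max_vif and tolerance > min_tolerance


# factor -> (tolerance, VIF as reported in Table 3)
table3 = {
    "Plan curvature": (0.747, 1.338),
    "Distance to stream": (0.797, 1.254),
    "Distance to road": (0.782, 1.279),
    "Drainage density": (0.757, 1.321),
}

for factor, (tol, reported_vif) in table3.items():
    vif = vif_from_tolerance(tol)
    status = "ok" if passes_rule_of_thumb(tol) else "check"
    print(f"{factor}: VIF = {vif:.3f} (reported {reported_vif:.3f}) -> {status}")
```

The recomputed values differ from the reported ones only in the third decimal place, which is consistent with the tolerances having been rounded before printing.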
Table 4. Percentages of gully erosion susceptibility classes.
| Susceptibility Class | RF: Class | RF: % of Area | SVM: Class | SVM: % of Area | NB: Class | NB: % of Area |
|---|---|---|---|---|---|---|
| Very low | 5954 | 17.88 | 4996 | 15.01 | 5791 | 17.40 |
| Very high | 8648 | 25.98 | 7529 | 22.62 | 9021 | 27.10 |
Table 5. Model statistical measures assigned to the training and validation datasets.
Table 6. A comparison of machine learning models in gully erosion susceptibility.
| Region | ML Model | Performances Based on Accuracy/AUC | Paper Reference |
|---|---|---|---|
| Brazil (Rio das Velhas watershed) | RF | 0.996 | [99] |
| Iran (Robat Turk Watershed) | RF | 0.893 | [34] |
| | Naïve Bayes | 86.37 | |
| Brazil (South Mato Grosso) | MDA | 78.47 | [101] |
| India (Hinglo River basin) | RF | 0.87 | [35] |
| Iran (Bastam watershed) | ADTree | 0.922 | [102] |
| Iran (Fars province) | RF | 0.958 | [64] |
Abbreviations: random forest (RF), logistic regression (LR), naïve Bayes (NB), artificial neural network (ANN), credal decision trees (CDTree), kernel logistic regression (KLR), best-first decision tree (BFTree), boosted regression tree (BRT), multivariate discriminant analysis (MDA), classification and regression tree (CART), gradient boosted decision trees (GBDT), extreme gradient boosting (XGBoost), multivariate additive regression splines (MARS), flexible discriminant analysis (FDA), support vector machine (SVM), gradient boosted regression tree (GBRT), naïve Bayes tree (NBT), tree ensemble (TE), alternating decision tree (ADTree), logistic model tree (LMT).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Hitouri, S.; Varasano, A.; Mohajane, M.; Ijlil, S.; Essahlaoui, N.; Ali, S.A.; Essahlaoui, A.; Pham, Q.B.; Waleed, M.; Palateerdham, S.K.; et al. Hybrid Machine Learning Approach for Gully Erosion Mapping Susceptibility at a Watershed Scale. ISPRS Int. J. Geo-Inf. 2022, 11, 401.
