Modeling Gully Erosion Susceptibility to Evaluate Human Impact on a Local Landscape System in Tigray, Ethiopia

: In recent years, modeling gully erosion susceptibility has become an increasingly popular approach for assessing the impact of different land degradation factors. However, different forms of human inﬂuence have so far not been identiﬁed in order to form an independent model. We investigate the spatial relation between gully erosion and distance to settlements and footpaths, as typical areas of human interaction, with the natural environment in rural African areas. Gullies are common features in the Ethiopian Highlands, where they often hinder agricultural productivity. Within a catchment in the north Ethiopian Highlands, 16 environmental and human-related variables are mapped and categorized. The resulting susceptibility to gully erosion is predicted by applying the Random Forest (RF) machine learning algorithm. Human-related and environmental factors are used to generate independent susceptibility models and form an additional inclusive model. The resulting models are compared and evaluated by applying a change detection technique. All models predict the locations of most gullies, while 28% of gully locations are exclusively predicted using human-related factors.


Introduction
Gullies are linear depressions of constant grade incised deeper than 0.3 m, resulting from the removal of soil and weathered bedrock by concentrated runoff [1,2]. Where gully erosion appears, it poses a problem for the conservation of arable land and thus for longterm food production [3][4][5]. Poesen et al. [6] conclude that 10%-94% of overall soil loss volume due to water erosion is caused by gullying. In order to predict and subsequently prevent gully erosion, several remote-sensing-based models are available that use machine learning algorithms and statistical methods, i.e., Random Forest (RF), Support Vector Machines (SVM), or Weights-of-Evidence (WoE).
These models aim to assess the geomorphic threshold that must be exceeded for the initiation of gully erosion based on classification rules and include a combination of environmental and land use factors [7][8][9][10]. The distinction between environmental and human-driven factors is only rarely addressed in the available modeling approaches, although soil erosion is primarily understood as a process of accelerated erosion resulting from human impact [11][12][13][14][15]. Gully erosion, in particular, has been shown to be related to different aspects of land use such as overgrazing, topsoil crusting, and unadapted irrigation techniques [5,16]. Furthermore, it is suggested that the occurrence of pathways, i.e., unpaved roads and footpaths, trigger the generation of concentrated runoff [14,[16][17][18]; hollow ways are a typical indication of past gullying processes along pathways [19][20][21].
In the present study, we focus on the role of land use and settlement activities in gully erosion. We aim to generate a gully erosion susceptibility model which will: (a) distinguish between the influences of human and environment-related factors on the spatial distribution of gullies; and (b) evaluate the influence of pathways on the formation of gullies. A gully erosion. We aim to generate a gully erosion susceptibility model which will: a) distinguish between the influences of human and environment-related factors on the spatial distribution of gullies; and b) evaluate the influence of pathways on the formation of gullies. A catchment in the north Ethiopian Highlands served as a study site (Figure 1), an area which is well known for current gully erosion dynamics [3,14,17,22].

Study Area
The study area is located in the Tigray region in the north Ethiopian Highlands, about 30 km north of Aksum and ca. 9 km south of the Mareb River (called the Gash River further downstream) which marks the border between Ethiopia and Eritrea here ( Figure 1). The topography of the study area is characterized by a graben-like structure with a major S-N strike direction. A road connecting the Ethiopian town of Adwa with the Eritrean town of Adi Ugri runs through and follows this structure. The depression is drained by the River Inda Shawit which is a tributary to the Mareb River. The studied catchment is tributary to the Inda Shawit and lies on the eastern side of the depression c. 16 km south of the confluence of the Inda Shawit and Mareb River, covering ca. 6.7 km 2 . At the mouth to the Inda Shawit, the catchment is situated at 1375 m (a.s.l.). Due to its topography the study catchment can be subdivided into three zones: (a) the lowlands (< 1463 m a.s.l.) predominant in the western and central parts, and (b) a transition zone (1463-1575 m a.s.l.) connecting the lowlands with (c) the hill country (> 1575 m a.s.l.) that is dominated by short steep slopes with inclinations up to 40° (Figure 2a).
The annual precipitation cycle follows a bimodal pattern and reaches an average annual value of 300-800 mm [23]. At the Adwa weather station (1908 m a.s.l.), ca. 25 km to the south of the study area, annual rainfall amounts to 668 mm, and the annual temperature average is 19.6 °C (1982-2012) [24].

Study Area
The study area is located in the Tigray region in the north Ethiopian Highlands, about 30 km north of Aksum and ca. 9 km south of the Mareb River (called the Gash River further downstream) which marks the border between Ethiopia and Eritrea here ( Figure 1). The topography of the study area is characterized by a graben-like structure with a major S-N strike direction. A road connecting the Ethiopian town of Adwa with the Eritrean town of Adi Ugri runs through and follows this structure. The depression is drained by the River Inda Shawit which is a tributary to the Mareb River. The studied catchment is tributary to the Inda Shawit and lies on the eastern side of the depression c. 16 km south of the confluence of the Inda Shawit and Mareb River, covering ca. 6.7 km 2 . At the mouth to the Inda Shawit, the catchment is situated at 1375 m (a.s.l.). Due to its topography the study catchment can be subdivided into three zones: (a) the lowlands (<1463 m a.s.l.) predominant in the western and central parts, and (b) a transition zone (1463-1575 m a.s.l.) connecting the lowlands with (c) the hill country (>1575 m a.s.l.) that is dominated by short steep slopes with inclinations up to 40 • (Figure 2a).
The annual precipitation cycle follows a bimodal pattern and reaches an average annual value of 300-800 mm [23]. At the Adwa weather station (1908 m a.s.l.), ca. 25 km to the south of the study area, annual rainfall amounts to 668 mm, and the annual temperature average is 19.6 • C (1982-2012) [24].
Rainfall occurs in the "belg"-season (March-May) and the "kiremt"-season (June-September), which corresponds to the major rainy season [25]. The potential mean annual evapotranspiration in the Tigray region varies between 1600 and 2100 mm; thus, up to 90% of the region is classified as semi-arid [26].

Figure 2.
Overview map of the studied catchment area (red frame in Figure 1) and the mainstream (river) to which the watershed streams are tributary (ALOS Global DSM). (a) includes the drainage network, differentiated into gullies, main channels, and rivers. The white line marks the local topography-based division into lowlands (<1463 m above sea level (a.s.l.), transition (1463-1575 m a.s.l.), and hill country (>1575 m a.s.l.). (b) shows areas of human interaction (residential areas, pathways and hollow ways).
Rainfall occurs in the "belg"-season (March-May) and the "kiremt"-season (June-September), which corresponds to the major rainy season [25]. The potential mean annual evapotranspiration in the Tigray region varies between 1600 and 2100 mm; thus, up to 90% of the region is classified as semi-arid [26].
Referring to the Soil Atlas of Africa [27], ca. 93% of the study area is characterized by eutric cambisols, while in the headwater area, eutric leptosols are widespread. On-site soil identification showed that soils in the transition zone and the hill country consisted mainly of leptosols developed in highly weathered and eroded bedrock, indicating high morpho-dynamics [27].
Due to the periodic rainfall pattern and the annual negative water balance, local agriculture depends on irrigation. Land use in the western part of the lowlands is characterized by croplands and plantations where mainly avocado and mango trees are cultivated. The high water demand of these cash crops is met using a motorized pump system that lifts water from the Inda Shawit River. Field crops (maize, sorghum, teff, and finger millet) dominate the central lowlands. A small residential area is located in the northwest  Figure 1) and the mainstream (river) to which the watershed streams are tributary (ALOS Global DSM). (a) includes the drainage network, differentiated into gullies, main channels, and rivers. The white line marks the local topography-based division into lowlands (<1463 m above sea level (a.s.l.), transition (1463-1575 m a.s.l.), and hill country (>1575 m a.s.l.). (b) shows areas of human interaction (residential areas, pathways and hollow ways).
Referring to the Soil Atlas of Africa [27], ca. 93% of the study area is characterized by eutric cambisols, while in the headwater area, eutric leptosols are widespread. On-site soil identification showed that soils in the transition zone and the hill country consisted mainly of leptosols developed in highly weathered and eroded bedrock, indicating high morpho-dynamics [27].
Due to the periodic rainfall pattern and the annual negative water balance, local agriculture depends on irrigation. Land use in the western part of the lowlands is characterized by croplands and plantations where mainly avocado and mango trees are cultivated. The high water demand of these cash crops is met using a motorized pump system that lifts water from the Inda Shawit River. Field crops (maize, sorghum, teff, and finger millet) dominate the central lowlands. A small residential area is located in the northwest of the lowlands, complemented by scattered housing distributed over the entire catchment area. Amid the lowland's agricultural areas, badlands have developed, mostly along the main channels and close to pathways. Sloping areas of the hill country and the transition zone are either terraced and used to cultivate finger millet and teff, or uncultivated land mostly covered by shrubs.

Modeling Gully Erosion
The formation of gullies is determined by the erodibility of the source material and the force of water as the eroding agent [14,28]. The erodibility is related to the geomorphic threshold; thus, it is a determinant for a landscape system's stability [11,29,30]. In order to reproduce the terrain-forming process of gullying, various models aim to predict the influences on the erodibility threshold. Conditioning influences are approached mainly by topographic, climatic, pedogenic, and geological factors [21,[31][32][33][34]. Additionally, Valentin et al. [5] emphasize the importance of land use change on the formation of gully erosion. Accordingly, Boardman [35] criticizes common soil erosion models stating that they underestimate the influence of socio-economic variables. This observation is underscored by the suggestion that, e.g., the construction of roads causes a reduction in the infiltration capacity and results in a concentration of superficial water, leading to surface runoff. This, in turn, has been shown to affect the formation of gullies [36,37]. Following similar principles, Schütt et al. [14] observe that a vast number of gullies formed along paths and cattle trails as well as in the vicinity of settlements. The compaction of sediment underneath paths and residential areas results in a reduction of infiltration capacity, in turn generating surface runoff [14,16,21]. Given these observations, it is evident that there is need for a model describing the spatial relation between gully erosion, pathways, and settlements.
Modeling gully erosion involves a high degree of complexity. Standard approaches for soil erosion modeling, such as the Universal Soil Loss Equation USLE [38] and its subsequent extensions (RUSLE; MUSLE) express the basic principle of erosion but cannot address the gully erosion process [39,40]. Other models, such as the Chemicals, Runoff, and Erosion from Agricultural Management Systems (CREAMS; [41]), the European Soil Erosion Model (ESEM; [42]), the Ephemeral Gully Erosion Model (EGEM; [43]) or the Water Erosion Prediction Project (WEPP; [44]), aim not only to address soil erosion, but also to reproduce erosion rates both quantitatively and qualitatively [7,15].
In order to understand the dynamics and emergence of gully erosion in terms of the variability of the conditioning factors, it is necessary to consider the spatial distribution of gully erosion as a first step. Correspondingly, the locations of gully heads can be associated with a combination of factors that cause instability and trigger erosion. The transition between stability and instability is described by a geomorphic threshold [29,30]. This dimensionless state is composed of environmental and human-related factors that describe and affect the physical basis of a specific location. Even though the above-mentioned models help to understand the fundamental processes of gully erosion, none of them addresses the spatial distribution of this terrain-forming process. Consequently, most recent studies about gully development using statistical models are based on remote sensing data and apply data mining methods and geographic information systems (GIS). Machine learning algorithms such as Support Vector Machines (SVM), AdaBoost, Artificial Neuronal Networks (ANN), Boosted Regression Tree (BRT), or Random Forest (RF) are frequently used for gully erosion modeling [7,8,31,45]. Statistical methods applied include Frequency Ratio (FR), Multivariate Adaptive Regression Splines (MARS), Weights-of-Evidence (WoE), and Maximum Entropy (ME) [7,8,10,21].
Comparison of the different approaches reveals that the Random Forest algorithm can handle large datasets and generate models with high accuracy. It can produce fast and stable classifications, especially given the need to incorporate multiple features. Additionally, RF is able to assess the importance of each variable used in order to calculate a multi-classifier, and evaluates its own accuracy [7, 9,46].

Database
Mapping of the gully, settlement, and pathway networks is based on Google Earth imagery (CNES/Airbus, Maxar Technologies, Map data ©2020) imported into QGIS (2.18.16) via the "QuickMapServices" plug-in. Mapping results were complemented and validated during a field campaign in November 2019. During the field campaign area-wide mapping of gullies and gully heads was also undertaken.

Gully Erosion Conditioning Factors
Sixteen factors potentially determining the location of gullies are considered; the compilation of these factors is based on a literature review. The factors are grouped into human-related and environmental influences, whereby environmental factors are understood as influences that also exist without human intervention ( Figure 3, Table 1) [7- 10,47]. Three factors were selected to reflect possible human drivers of gully erosion: distance to settlements, distance to pathways, and land use land cover (LULC). These three factors are associated with the occurrence of gullies in the literature [5,14,16,21,48]. The number of variables is not equal when comparing environmental and human-related factors. However, the lower amount of these human-related factors should not affect the prediction of gullies due to their independent evaluation and the overall similarity in magnitude.

Database
Mapping of the gully, settlement, and pathway networks is based on Google Earth imagery (CNES/Airbus, Maxar Technologies, Map data ©2020) imported into QGIS (2.18.16) via the "QuickMapServices" plug-in. Mapping results were complemented and validated during a field campaign in November 2019. During the field campaign area-wide mapping of gullies and gully heads was also undertaken.

Gully Erosion Conditioning Factors
Sixteen factors potentially determining the location of gullies are considered; the compilation of these factors is based on a literature review. The factors are grouped into human-related and environmental influences, whereby environmental factors are understood as influences that also exist without human intervention ( Figure 3, Table 1) [7- 10,47]. Three factors were selected to reflect possible human drivers of gully erosion: distance to settlements, distance to pathways, and land use land cover (LULC). These three factors are associated with the occurrence of gullies in the literature [5,14,16,21,48]. The number of variables is not equal when comparing environmental and human-related factors. However, the lower amount of these human-related factors should not affect the prediction of gullies due to their independent evaluation and the overall similarity in magnitude. Several

Model C
Land use land cover (LULC) Human-related On-site mapping/digital mapping in QGIS/rasterized (30 × 30 m)/ArcMap [5,16] Distance to pathways Human-related On-site mapping/digital mapping in QGIS/Euclidean distance in ArcMap [14,48] Distance to residential areas Human-related On-site mapping/digital mapping in QGIS/Euclidean distance in ArcMap [14,17] Several  Figure S1n). The Sentinel scene was taken at the end of the kiremt-season (24.09.2019). This period not only provides a lower cloud cover, but also represents a potential phenological peak of the study area and consequently maximum vegetative-induced stability [54]. The land use land cover (LULC) was manually mapped during the field survey using the classes of cropland, mixed cropland and plantations, residential areas, badlands, erosion protection terraces, and currently uncultivated land (Supplementary Material, Figure S1m). The Euclidean distance to the mapped and digitized pathways network, river courses, and residential areas was calculated in ArcMap

Data Processing
The calculations are based on selectively extracted factors at the origin of the mapped gullies (gully heads). The ArcMap (10.5) tool "point extract" was used to obtain the respective starting points of the digitalized gully-polylines.
The Random Forest (RF) algorithm [59] was applied in R environment using Rstudio (3.5.0) and the "randomForest" package of Liaw and Wiener [60]. From the combinations of classified human-related and environmental factors, a total of three models were trained to predict the formation of gullies within the studied catchment. Each of the models should allow susceptibility to gully erosion to be estimated for any given location.
At the mapped gully heads, human-related and environmental factors (Table 1) were extracted from the GIS environment and combined in order to train and validate the models in a ratio of 70:30 with 30% of the data for cross-validation ( Figure 3). As a control group, the same amount of "non-gully" points were randomly created, visually validated, and added in the same ratio to the training and test datasets. Thus, in the bootstrap aggregation process of the random forest algorithm (so-called "bagging"), training points were randomly selected and used to derive a newly assembled bootstrapped dataset. In order to estimate a classification rule for each of the three models, it was specified how many of the variables are used for a decision at each split node of the decision trees (mtry). The "randomForest" package by Liaw and Wiener [60] contains the function "tuneRF", which was used to identify the best amount of mtry variables dependent on the resulting prediction error. With the fixed number of mtry variables and trees, the algorithm built up the individual decision trees using a random subset of the bootstrapped dataset at each split [46,59,61].
Model A applies all 16 factors (Table 1) and was derived by using 500 trees at four mtry variables. In Model B, 13 variables incorporating only the environmental factors are processed in the RF algorithm by 600 trees and two mtry variables. Model C is exclusively based on the three human-related factors ( Table 1) and was trained by 600 trees at three mtry variables. The prediction of gully erosion susceptibility in the study area by Models A-C led to pixel-wise majority votes, which were equally classified into five susceptibility classes (very low, low, moderate, high, very high). Each class corresponds to a share of 20%. Therefore, pixels representing classes from very low to low were determined to be susceptible to gully erosion by only 40% or less of all decision trees, indicating that gullies are less likely to develop in these areas. The moderate class addresses pixels close to the majority vote threshold and aims to prevent uncertain classifications (40-60%). High to very high scores indicate a substantial positive agreement between the decision trees (>60%), indicating that gullies are more likely to develop in these areas. Results were incorporated in a Gully Erosion Susceptibility Map (GESM) displayed for each model [32,33,45]. Three different methods were used to validate the model accuracies: (1) supervised validation by a subset and unbiased test dataset, (2) "out of bag" (OOB) error, automatically estimated by the random forest algorithm, and (3)

Results
In the study catchment (6.7 km 2 ) a total of 667 gullies and 546 gully heads were mapped. The variance inflation factors among all variables indicate no multicollinearity (Supplementary Material, Table S3). The most relevant human-related and environmental factors influencing the spatial distribution of gullies and gully heads are derived in the following (for remaining factors, see Table S1 in the Supplementary Material).

Gully Head Distribution
The majority of all gully heads are found in the hill country (51%), while 38% of the gully heads are in the transition zone, and 11% in the lowlands. Gully heads occur primarily on slopes with an inclination of 15%-35 • (71%). The location of gully heads in the vicinity of pathways follows a bimodal distribution: 89% of the gully heads are located up to 100 m from pathways, while 21% of all gully heads occur at distances of less than 5 m distance from pathways. Distances of 5-25 m between gully heads and pathways are underrepresented (4%-7%) (Supplementary Material, Table S1). About 11% of all gully heads are located more than 100 m from pathways (Supplementary Material, Table S1). Furthermore, 71% of all gully heads are located more than 100 m from settlements (Supplementary Material, Table S1). Analyzing the distribution of gully heads among the different land use land cover classes shows that most gully heads are either located on uncultivated bare land (64% of all gully heads) or in areas covered by erosion protection terraces (25% of all gully heads). Only 7% of the gully heads occur on cultivated croplands (Supplementary Material, Table S1).

Model Performance
To train gully erosion susceptibility models, a training dataset of 764 points was applied. Half of these points correspond to mapped gully heads, while the other half are located in supervised areas that are not affected by gullying. Considering all factors (Model A, Table 1), 340 of 382 (89%) of the gully heads were correctly predicted, and 336 of 382 (88%) areas unaffected by gully heads were correctly predicted as "non-gully head locations". The out-of-bag (OOB) error (11.52%) results in an accuracy of 88.48%; the verification of the test data reveals an accuracy of 89.26% for Model A ( Table 2). When training gully erosion susceptibility by including environmental factors only (Model B), gully heads are predicted with an accuracy of 87.44% and non-gully heads with an accuracy of 86.13%. The validation of the test data for Model B reveals an accuracy of 88.34% (Table 2) and an OOB error that corresponds to an accuracy of 86.78%. According to the comparison of receiver operating characteristic (ROC) curves, the area under the curve (AUC) of Model B is slightly lower than for Model A (Figure 4, Table 2). In Model C, applying exclusively human-related factors, the prediction of gully heads achieves an accuracy of 87.44% and, thus, is widely similar to that of Model B. According to the validation of the test data, Model C predicts the location of gully heads with 84.97% accuracy ( Table 2). The AUC of Model C is lower than for Model A and Model B, indicating a higher false-positive rate for the classifications of Model C ( Figure 4, Table 2). For Model C, the OOB error (18.32%) yields a precision of 81.68% and Model C predicts the location of non-gullies with an error of 24.08%.  When training gully erosion susceptibility by including environmental factors only (Model B), gully heads are predicted with an accuracy of 87.44% and non-gully heads with an accuracy of 86.13%. The validation of the test data for Model B reveals an accuracy of 88.34% (Table 2) and an OOB error that corresponds to an accuracy of 86.78%. According to the comparison of receiver operating characteristic (ROC) curves, the area under the curve (AUC) of Model B is slightly lower than for Model A (Figure 4, Table 2). In Model C, applying exclusively human-related factors, the prediction of gully heads achieves an accuracy of 87.44% and, thus, is widely similar to that of Model B. According to the validation of the test data, Model C predicts the location of gully heads with 84.97% accuracy ( Table 2). The AUC of Model C is lower than for Model A and Model B, indicating a higher false-positive rate for the classifications of Model C ( Figure 4, Table 2). For Model C, the OOB error (18.32%) yields a precision of 81.68% and Model C predicts the location of non-gullies with an error of 24.08%.

Variable Importance
The Mean Decrease in Accuracy (MDA) and the Mean Decrease in Gini (MDG) scores estimate the drainage density, Land Use Land Cover (LULC), and elevation as the most important variables included in Model A (Table 3). According to the estimated MDA of Model A, these variables are followed by the slope aspect, distance to residential areas, slope degree, NDVI, and the distance to pathways (in descending relevance). In contrast to the MDA, the MDG indicates that the slope degree and distance to pathway have a higher impact on the performance of Model A. In contrast to MDA slope length, steepness (LS) and curvature are of higher influence for Model A in the MDG ranking. The least influential factor for both rankings of Model A is the stream power index (SPI). Essential variables for Model B are the drainage density and the elevation (Table 3). Furthermore, the MDG ranking indicates that the factors drainage density and elevation

Variable Importance
The Mean Decrease in Accuracy (MDA) and the Mean Decrease in Gini (MDG) scores estimate the drainage density, Land Use Land Cover (LULC), and elevation as the most important variables included in Model A (Table 3). According to the estimated MDA of Model A, these variables are followed by the slope aspect, distance to residential areas, slope degree, NDVI, and the distance to pathways (in descending relevance). In contrast to the MDA, the MDG indicates that the slope degree and distance to pathway have a higher impact on the performance of Model A. In contrast to MDA slope length, steepness (LS) and curvature are of higher influence for Model A in the MDG ranking. The least influential factor for both rankings of Model A is the stream power index (SPI). Essential variables for Model B are the drainage density and the elevation (Table 3). Furthermore, the MDG ranking indicates that the factors drainage density and elevation are followed in importance by slope degree, and after a considerable gap by slope aspect. In contrast to the MDG ranking, the MDA assessment for Model B estimates the NDVI and slope aspect factors as being more eminent for the model performance than the slope degree. MDA and MDG rankings for variables in Models A and B show great similarities (Table 3). For Model C, the land use land cover represents the most crucial factor applying both ranking systems, followed by the distance to residential areas and the distance to pathways (Table 3).

Spatial Distribution of Gully Erosion Susceptibility
The spatially differentiated prediction of gully erosion susceptibility for the study catchment by Models A and B results in a pixel-wise maximum majority voting of 80.5%. This implies that the applied random forest algorithm of both models was not able to achieve an absolute distinction between gully erosion susceptibility and non-gully erosion susceptibility in all of its decision trees. In contrast, Model C achieves a maximum majority voting of 100%. All three models A-C clearly point out that about 50% of the study catchment displays very low susceptibility to gully erosion (Figure 5a-c).
The spatially differentiated prediction of gully erosion susceptibility for the stu catchment by Models A and B results in a pixel-wise maximum majority voting of 80.5 This implies that the applied random forest algorithm of both models was not able achieve an absolute distinction between gully erosion susceptibility and non-gully e sion susceptibility in all of its decision trees. In contrast, Model C achieves a maximu majority voting of 100%. All three models A-C clearly point out that about 50% of t study catchment displays very low susceptibility to gully erosion (Figure 5a-c). Areas with low gully erosion susceptibility cover about 21% (Model A) with resp to. 24% (Model B) of the study area. Predominantly the lowlands show overall low su ceptibility to gully erosion; however, low gully erosion susceptibility is also predicted the proximity of mapped gully courses (Figure 2). Areas that are moderately suscepti to gully erosion cover 13% of the study catchment. Locations with high susceptibility gully erosion show a widely similar spatial pattern with 15 (Model A) with respect 14% (Model B). These areas that are highly susceptible to gully erosion predominan occur in the transition zone and hill country as well as in the changeover between t transition zone and the lowlands. According to Model A, areas of very high gully erosi susceptibility cover 3%, while according to Model B they cover 0% of the study catc ment and occur exclusively within the changeover between the hill country and t transition zone. Applying exclusively human-related factors (Model C) shows a sligh different spatial pattern of gully erosion susceptibility from those of Models A and Model C classifies 49% of the catchment area as having very low susceptibility to gu erosion and 23% as having low susceptibility to gully erosion; these areas predominan occur in the western part of the lowlands. Compared to Models A and B, the outcomes Model B in general predict a lower gully erosion susceptibility in the lowlands and Areas with low gully erosion susceptibility cover about 21% (Model A) with respect to. 24% (Model B) of the study area. Predominantly the lowlands show overall low susceptibility to gully erosion; however, low gully erosion susceptibility is also predicted for the proximity of mapped gully courses (Figure 2). Areas that are moderately susceptible to gully erosion cover 13% of the study catchment. Locations with high susceptibility for gully erosion show a widely similar spatial pattern with 15 (Model A) with respect to 14% (Model B). These areas that are highly susceptible to gully erosion predominantly occur in the transition zone and hill country as well as in the changeover between the transition zone and the lowlands. According to Model A, areas of very high gully erosion susceptibility cover 3%, while according to Model B they cover 0% of the study catchment and occur exclusively within the changeover between the hill country and the transition zone. Applying exclusively human-related factors (Model C) shows a slightly different spatial pattern of gully erosion susceptibility from those of Models A and B. Model C classifies 49% of the catchment area as having very low susceptibility to gully erosion and 23% as having low susceptibility to gully erosion; these areas predominantly occur in the western part of the lowlands. Compared to Models A and B, the outcomes of Model B in general predict a lower gully erosion susceptibility in the lowlands and a higher gully erosion susceptibility in the hill country and the transition zone. Areas of moderate (1%) and high (2%) gully erosion susceptibility are scarce in Model C (Figure 5c). In contrast to Models A and B, Model C estimates 25% of the catchment to be highly susceptible to gully erosion (Model A: 1%, Model B: 0%). The areas of high gully erosion susceptibility predicted by Model C primarily occur in the transition zone and hill country and in general occur in association with footpaths and residential areas (Figures 2b and 5c).

Change Detection
Outputs of Model B (exclusively considering environmental factors) and Model C (exclusively considering human-related factors) are compared and assessed by change detection. The results exclude Model A (all factors) and are differentiated into five classes: 5.5.1. Areas Predicted as Susceptible to Gully Erosion by Model B and Model C Comparing the gully erosion susceptibility predictions by Models B and C reveals that both models similarly predict 9% of the catchment to be highly to very highly susceptible to gully erosion. Comparison with the field datasets confirms that 183 (34%) of the gully heads mapped occur in areas predicted to be highly to very highly susceptible to gully erosion by both models with a core area of these 183 gully heads predominantly occurring in the hill country (120 gully heads) and the transition zone (63 gully heads), primarily on slopes with inclinations >10 • . According to the LULC classification, 144 of the 183 gully heads are located in uncultivated areas and 39 gully heads can be found within erosion protection terraces. Of the 183 mapped gully heads found in areas predicted as being highly to very highly susceptible to gully erosion, 125 gully heads occur within 100-200 m of settlement areas and 126 (69%) occur at less than 50 m to pathways (Supplementary Material, Table S2).

Areas Predicted as Susceptible to Gully Erosion Only by Model B
Areas that are predicted as highly to very highly susceptible to gully erosion by Model B but not by Model C cover about 11% of the total study catchment. 156 (29%) of the mapped gully heads are located in areas predicted exclusively by Model B as being highly to very highly susceptible to gully erosion. 90% of these 156 gully heads occur at distances of more than 200 m from residential areas, while 58% of these 156 gully heads are located within 50 m of pathways (Supplemental Material, Table S2).

Areas Predicted as Susceptible to Gully Erosion Only by Model C
Overall, 18% of the study area is predicted as having high to very high susceptibility to gully erosion by Model C but not by Model B. Of the mapped gully heads, 151 (28%) are located in areas predicted exclusively by Model C as highly to very highly susceptible to gully erosion, with 91 of these gully heads being located in the hill country, 51 in the transition zone, and 9 in the lowlands. According to the distribution of land use land cover classes, these 151 gully heads primarily occur on uncultivated land (119 gully heads) and to a minor degree on erosion protection terraces (32 gully heads). Gully heads located within 50 m of the pathway network comprise 75% of those located in areas that are highly to very highly susceptible to gully erosion as predicted by Model C only. 86% of the gully heads occurring in areas predicted as susceptible to erosion exclusively by Model C are located within a distance of 50-200 m to settlement areas. Slopes with 15-30 • inclination accommodate 77% of these 151 gully heads (Supplementary Material, Table S2).

Areas Predicted to Have Moderate Gully Erosion Susceptibility by Either Model B and Model C or by One of Them
Areas classified as being moderately susceptible to gully erosion either by Model B or Model C or by both show majority votes in the random forest between 40%-60% of all trees. These areas only cover about 1% of the total study catchment and do not follow a distinctive spatial pattern (Figure 5d). Only 7 (1%) of all mapped gully heads are located in such areas. In consequence, these areas are used to exclude uncertain classifications from the comparisons.

Areas Predicted as Non-Susceptible to Gully Erosion by Either Model B and Model C or by One of Them
Up to 60% of the study area is classified as having low to very low susceptibility to gully erosion by Model B as well as by Model C. These areas predominantly occur in the lowlands and partly the hill country (Figure 5d). In total, 49 of the mapped gully heads (9%) are located in areas predicted as having low to very low susceptibility to gully erosion either by Model B or by Model C, and 30 of these 49 mapped gully heads are located in the lowlands, 11 in the transition zone and 8 in the hill country. Correspondingly, 28 of these 49 gully heads are located on slopes with an inclination of less than 5 • (Supplemental Material, Table S2). 31 of these 49 gully heads located in areas predicted to have low to very low gully erosion susceptibility can be found at distances of more than 200 m to settlement areas; 25 of these 49 gully heads are located less than 50 m from pathways.

Modeling Gully Erosion Susceptibility
Comparing performances between the three Models A-C designed to predict gully erosion susceptibility by applying different conditioning factors in general certifies that all three models achieve a satisfying classification accuracy for gully erosion susceptibility ( Table 2). The achieved accuracies are in agreement with performances recently published for gully erosion susceptibility assessment [7,64].
Model B, which includes just environmental factors, and Model C, exclusively including human-related factors, correspond to subsets of Model A. Therefore, the relevance of the implemented variables can be determined by comparing the accuracies of Models B and C. Comparing these two modeling approaches reveals that a considerably higher degree of accuracy is achieved with exclusively environmental rather than strictly human-related factors (Table 2). Consequently, Model A, which integrates all factors, generated the highest accuracy in predicting an area's susceptibility to gully erosion. Taking both human and non-human related types of factor into account using a single model has previously produced highly accurate GESMs [7, 47,64]. However, in these models distances to pathways and settlements were not considered as input factors.
The importance of the factor "distance to pathways" is neither represented in the mean decrease in accuracy (MDA) nor the mean decrease in Gini (MDG) ranking of Model A or Model C (Table 3). However, areas close to settlements, which like pathways are similarly exposed to strong trampling and resulting soil compaction [14], are less likely to have gully heads in their immediate vicinity than pathways. The MDA and MDG rankings indicate a more eminent impact of distance to residential areas on the performance in Models A and C than for distances to pathways (Table 3). Methodologically, this inverse importance of both factors relates to the training of Models A-C, as each model was trained by a supervised dataset to create a classification rule, i.e., a multi-classifier, that is able to effectively differentiate between factors associated with gully head points and random non-gully head points. This training implies that a factor that supports the prediction of non-gully heads (true negative classification) but decreases the performance for the prediction of gully heads (true positive classification) can still have a high value for the models and therefore a high importance rank. Consequently, an exclusive comparison of the rankings does not provide robust information about which of the factors have a decisive influence on controlling gully erosion. The root of this problem corresponds to the problems described by Braun [65] while discussing remote-sensing-based land use analysis. His "More Accuracy Less Meaningful" (MALM) approach criticizes the desire for the highest accuracy of predictions seen in many land use modeling studies, which often leads to a neglect of environmental validity for the modeling results. However, the chosen procedure of MDA and MDG rankings reflects how effectively an individual factor contributes to the model's decision on a distinctive differentiation between gully heads and non-gully heads. Although the evaluation of MDA and MDG rankings is a valuable method to reflect the accuracy of predictions for each model and estimate valuable factors [63,66], it does not allow delineation between the different factor-sets applied in the miscellaneous model approaches.

Human Influence on Gully Erosion
It is assumed that morpho-dynamics underlie a dynamic equilibrium which can be disturbed by natural triggers as well as human impact [67]. Especially human settlement activities affect this dynamic equilibrium and change tipping points of surface shaping processes [12,[67][68][69]. While settlement activities mostly cause an acceleration of erosion processes corresponding to soil erosion [12,68,69], targeted soil conservation and water harvesting measures increase landscape stability [70,71].

Land Use and Land Cover
The land use land cover factor (LULC) is one of the most eminent factors for Models A and C, which corresponds to the results of Gayen et al. [64] and the observations of Valentin et al. [5]. The land use land cover factor primarily displays areas that are utilized by humans (cropland, terraces, settlement areas) combined with different environmental characteristics and may also indicate land degradation [5,16,32]. One indicator for humanaccelerated soil degradation is the extensive badlands found in the lowlands within the main agricultural area of the study catchment, which is predominantly used for cash crop cultivation. Human-driven interventions contributing to such soil disturbance are unsuitable soil cultivation practices and unsuitable crops [72], as well as insufficient maintenance of the cropping areas [14].
The majority of gully heads occur in areas that are classified as uncultivated. These areas, especially in the hill country, are primarily covered by sparse shrubland with poor ground vegetation. The resulting poor leaf coverage increases exposure to raindrop impact and generation of saturation overland flow [73,74] and causes low rooting density that reduces ground stabilization [75,76], and both contexts increase exposure to soil erosion and gully erosion, especially when combined with steep slope inclinations [74]. Moreover, pathways crossing these uncultivated areas are not considered in the land use land cover classification, thus the effects of gully erosion due to concentrated runoff developing along pathways cannot be differentiated [14,17]. Gully head density in areas of erosion protection terraces (192 gully heads/km 2 ) is higher than gully head density in uncultivated areas (163 gully heads/km 2 ) and, thus, indicates a higher susceptibility to gully erosion. The construction of erosion protection terraces aims to increase slope stability in the headwater area and thus to reduce linear erosion. However, although they are intended to prevent erosion, terraces in the study catchment are an area highly affected by gully erosion (Table S1). Similar to the descriptions of Nyssen et al. [17] and Schütt et al. [14], most of the stone terrace risers in the study catchment are unmaintained. Accordingly, their protective function is impaired, and especially during heavy rainfall events, backward erosion causes dissection of the terrace risers (cf. 'bank gullies') [6]. Further, uncontrolled excess water in the terraces infiltrates and soaks the riser, putting pressure on it in line with the hydraulic gradient [77]. Especially at the transition from dry season to rainy season, desiccation cracks frequently foster infiltration and the generation of subsurface runoff [77][78][79]. Large mud cracks observed on the surface of the terrace treads indicate such subsurface processes. For semi-arid areas, Bocco [11] observes that 60% of the gullies formed as a result of piping. The rare occurrence of gullies on croplands and plantations corresponds to the relatively flat terrain of these areas, which are mostly located in the lowlands, and thus the lack of the minimum hydraulic gradient for gully head development [11].

Distance from Pathways
Mapping of gully heads shows that about a quarter of the gully heads are located within 5 m of pathways, and almost all gully heads are located less than 100 m from pathways (Table S1). These findings coincide with the observations of Schütt et al. [14], who suggest that the initiation of gully erosion in the immediate vicinity of pathways is due to runoff concentration along the pathways. These observations were later validated for the Tigray region [21]. During the field survey it was observed that mainly cattle herds and occasional groups of people moved along the pathways to and from the Adwa-Adi Ugri main road, forming pathways by repeated trampling along the same route. The surface of pathways is usually bare, and trampling compacts the near-surface underground and reduces its pore volume [80]. Consequentially, during rainfall events, the reduced infiltration capacity of these compacted surfaces favors the generation of surface runoff [14,16,48,81]. Where this surface runoff causes erosion along the pathway, a hollow way can develop. It has been observed that most of these situations occur in areas of moderate relief, such as the transition zone. In contrast, in areas of stronger relief, such as in the hill country, surface runoff along the pathways is likely to result in downslope wash-over of the downhill pathway edge, consequently resulting in linear erosion in the direct slope direction, frequently at right angles to the pathway.

Distance from Settlements
Gully heads usually do not develop in the direct vicinity of settlements, with the majority of gully heads (71%) developing >100 m from settlements. As the immediate area around settlements is greatly affected by trampling and soil compaction, this is in sharp contrast to the development of gullies close to pathways. This contradiction can be explained by the fact that the terrain in settlement areas is mostly flat to moderately sloped, thus lacking the necessary hydraulic gradient to initiate gullies [11]. This coincides with the observation that the occurrence of gully heads in the vicinity of settlements in general coincides with an increase in slope inclination. In consequence, the results do not indicate a direct positive connection between the location of gully heads and settlement areas. However, this result is weakened by the distinct connection between the positions of gully heads and pathways and the merging of pathways at settlements (Figure 2b).

Modeling Environmental and Human-Related Factors
In order to assess areas of natural stability and areas affected by human influences and settlement activities, a change detection was conducted between the Gully Erosion Susceptibility Maps (GESM) with models based exclusively on natural factors (Model B) and exclusively on human-related factors (Model C). The change detection allows estimation of the influence of these two types of factor on triggering gully erosion. Change detection of gully erosion susceptibility prediction by Model B and Model C reveals that ca. 25% of the mapped gully heads are due to their location as distinctly controlled by human-related factors (LULC, distance to pathways, distance to settlement areas). Furthermore, according to the MDA and MDG evaluations, all three human-related factors are eminent for the RF algorithm of Model A, taking all factors into account (Table 3).
Independently from each other, Models B and C predicted the location of 29% and 28% of gully heads. Gully heads that occur in an overlap prediction of both Models B and C (34%) are more likely to be interpreted as environmentally triggered. However, the accelerating effect of diffuse human activities on land degradation and erosion processes (e.g., overgrazing, reducing infiltration capacity by soil compaction, damaging irrigation techniques, tillage) should not be ignored [13,68]. Accordingly, areas assessed as susceptible to gully erosion by Models B and C cannot be strictly attributed to a distinctively environmental origin. Considering the number of gully heads that are predicted by applying exclusively human factors (Model C) and those which are potentially influenced by an accelerating human-effect (overlapping predictions of Models B and C), the maximum human impact influences the development of up to 62% of the mapped gully heads.

Conclusions
It is evident from this work that gullies tend to occur in the immediate vicinity of footpaths, but not necessarily in the direct vicinity of residential areas. In general, gullies develop close to human-influenced areas, but environmental factors control their ultimate formation. In this respect, it is necessary to consider human influence as an accelerating factor when predicting Gully Erosion Susceptibility. In this research, different types of human influence were combined for the first time to generate a Gully Erosion Susceptibility Map (GESM) using a study catchment in the Ethiopian Highlands. The application of the data-mining algorithm "Random Forest" provided powerful models with stable and accurate predictions of Gully Erosion Susceptibility by incorporating several different natural and human-related factors. Applying the change detection tool enabled us to distinguish between natural and human-related drivers for gully head development and to identify intersections of both types of factors. Including the distance to pathways as an additional variable that affects gully erosion and distinguishing between natural and human-related factors by comparing Gully Erosion Susceptibility Maps can both be used to identify, monitor, and prevent further human acceleration of land degradation. Further research is required to better understand the impact of settlement activities on landscape dynamics taking both spatial and temporal aspects into account. Within a heterogeneous environment, modeling approaches should integrate such activities to provide applicable predictions of gully erosion susceptibility.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/rs13102009/s1, Figure S1: Gully erosion conditioning factors of the study catchment tributary to the Inda Shawit River; Table S1: Spatial coverage and distribution of gully heads among all conditioning factors; Table S2: Distribution of gully heads among all differences between Models B and C; Table S3