Object-Based Mapping of Gullies Using Optical Images: A Case Study in the Black Soil Region, Northeast of China

: Gully erosion is a widespread natural hazard. Gully mapping is critical to erosion monitoring and the control of degraded areas. The analysis of high-resolution remote sensing images (HRI) and terrain data mixed with developed object-based methods and ﬁeld veriﬁcation has been certiﬁed as a good solution for automatic gully mapping. Considering the availability of data, we used only open-source optical images (Google Earth images) to identify gully erosion through image feature modeling based on OBIA (Object-Based Image Analysis) in this paper. A two-end extrusion method using the optimal machine learning algorithm (Light Gradient Boosting Machine (LightGBM)) and eCognition software was applied for the automatic extraction of gullies at a regional scale in the black soil region of Northeast China. Due to the characteristics of optical images and the design of the method, unmanaged gullies and gullies harnessed in non-forest areas were the objects of extraction. Moderate success was achieved in the absence of terrain data. According to independent validation, the true overestimation ranged from 20% to 30% and was mainly caused by land use types with high erosion risks, such as bare land and farm lanes being falsely classiﬁed as gullies. An underestimation of less than 40% was adjacent to the correctly extracted gullied areas. The results of extraction in regions with geographical object categories of a low complexity were usually more satisfactory. The overall performance demonstrates that the present method is feasible for gully mapping at a regional scale, with high automation, low cost, and acceptable accuracy.


Introduction
Gully erosion, one type of soil erosion, occurs due to the removal of the top soil along the drainage channels through surface water runoff [1]. Gully erosion is recognized as a key indicator of land degradation [2], which threatens agricultural production, economic development, and the ecological environment [3][4][5]. Identifying gully erosion and vulnerable areas is thus of great importance to soil and water conservation and land use planning and concerns scientists, farmers, and land managers [6][7][8].
Field investigation methods were initially used to map gullies. However, these methods are time-consuming and laborious, making them infeasible in large areas. With more readily available remote sensing images and the development of GIS, image-based interpretation methods have been widely employed. There are two directions for these techniques: One is visual interpretation, and the other is automatic extraction. Visual interpretation relies on the image's characteristics and expert knowledge to extract gullies [9][10][11][12], so low efficiency and the subjectivity of manual work

Study Area
The black soil region is a significant production base for commodity grain in China. However, soil erosion has become increasingly more serious in recent years, especially gully erosion, which disconnects and devours farmland, threatening food security [42]. The study area lies in the west of Baiquan County in the black soil region of Northeast China (Figure 1), including one training area (2.66 km 2 ), one testing area (0.95 km 2 ), and three validation areas (2.69 km 2 , 1.10 km 2 and 2.00 km 2 ), each of which is a complete watershed response system. In the study area, several common stratigraphic compositions in the black soil area can be seen. Black soil, chernozem, and meadow soil form the main components of the substrate, and other dark brown soil is sporadically distributed. Its topographic characteristics feature long and gentle slopes. Influenced by its topographic factors, the water system of this area is relatively developed, with a considerable number of tributaries. Regarded as the earliest part of the water system to develop, gullies are widely distributed. The annual rainfall ranges from 450 mm to 790 mm (mainly occurring in June to September), and the annual average runoff depth is about 70 mm. Affected by human activities, cultivated land is the major land use type for maize, soybean, and rice cultivation, and natural forest land is only distributed in areas with steep slopes that are unsuitable for crop growth.
The natural environment of the study area is typical for the black soil region of northeast China. As the key area of gully erosion control, the density of the gullies in this area is high, showing a significant increase [43]. Moreover, the research data on gully erosion in this area are sufficient, which helped ensure a smooth study [44][45][46]. Thus, it was necessary, representative, and feasible to study the extraction of gully erosion in the chosen study area.

Image Data
In the study area, GE images are available with sub-meter and meter resolution acquired at different times. Studies on the identification of gully erosion conducted in an environment similar to the black soil region of Northeast China indicated that late spring, autumn, and early summer images were the best choices, as they experienced minimal effects from vegetation cover or snowfall [47]. Another point to consider is spatial resolution. The higher the spatial resolution, the greater the probability that gully erosion will appear as pure pixels on the images and the clearer the outline, shape, and internal details will be. On the contrary, an excessively large spatial resolution makes the internal details overly detailed, which increases the noise information for object extraction [48]. Thus, we used GE images of a 1.19 m resolution acquired on 9 May 2018 (downloaded with the software Rivermap) to extract the gullies.

Reference Data from the Field Survey
The study sites were investigated during two field trips that lasted one week each for the establishment of interpretation marks and the verification of polygons. During the field investigations, the Global Positioning System (GPS) was the main instrument used to precisely locate gully erosion. Since ephemeral gullies with small widths can be easily filled with normal tillage, we did not work on them. Therefore, the 1.19 m spatial resolution images were reasonable for gully mapping. In addition, as the penetration ability of the optical images is weak, the gully erosion in forest areas was beyond the scope of this study. Based on field investigations and expert knowledge, the manually polygonised reference datasets of gully erosion were drawn for training and validating the gully extraction model within the ArcGIS software. The reference data for training are shown below as an example ( Figure 2).

Reference Data from the Field Survey
The study sites were investigated during two field trips that lasted one week each for the establishment of interpretation marks and the verification of polygons. During the field investigations, the Global Positioning System (GPS) was the main instrument used to precisely locate gully erosion. Since ephemeral gullies with small widths can be easily filled with normal tillage, we did not work on them. Therefore, the 1.19 m spatial resolution images were reasonable for gully mapping. In addition, as the penetration ability of the optical images is weak, the gully erosion in forest areas was beyond the scope of this study. Based on field investigations and expert knowledge, the manually polygonised reference datasets of gully erosion were drawn for training and validating the gully extraction model within the ArcGIS software. The reference data for training are shown below as an example ( Figure 2).

Method
The presented method for the extraction of gully erosion was implemented by employing OBIA and the optimal method based on machine learning algorithms on GE images. Figure 3 shows a survey of the workflow, which can be summed up as follows: (1) Creating the sample dataset within the eCognition and ArcGIS software, (2) determining the optimal machine learning method for preliminary extraction of gullies, (3) revising the extraction results using eCognition, and (4) validating the method presented. These steps are explained in further detail below.

Method
The presented method for the extraction of gully erosion was implemented by employing OBIA and the optimal method based on machine learning algorithms on GE images. Figure 3 shows a survey of the workflow, which can be summed up as follows: (1) Creating the sample dataset within the eCognition and ArcGIS software, (2) determining the optimal machine learning method for preliminary extraction of gullies, (3) revising the extraction results using eCognition, and (4) validating the method presented. These steps are explained in further detail below.

Method
The presented method for the extraction of gully erosion was implemented by employing OBIA and the optimal method based on machine learning algorithms on GE images. Figure 3 shows a survey of the workflow, which can be summed up as follows: (1) Creating the sample dataset within the eCognition and ArcGIS software, (2) determining the optimal machine learning method for preliminary extraction of gullies, (3) revising the extraction results using eCognition, and (4) validating the method presented. These steps are explained in further detail below.

GE Image Segmentation
For the initial step, we segmented GE images to derive objects in eCognition for the subsequent analysis; this step is a vital part of OBIA processing. The multiresolution segmentation method, an optimal method for high-quality multi-scale image segmentation [49], was chosen to achieve segmentation in this study. This process began with an individual pixel and further amalgamated the most similar neighbor pixel by a minimum heterogeneity increment until the heterogeneity of the generated object exceeded that of the preset threshold. Three visible bands (RGB) were involved in the segmentation process, each weighted with a value of 1.
There were three input parameters: Scale, shape, and compactness. Segmentation parameter determination was an iterative process. In the first iteration, the ranges of values for shape and compactness were 0 to 0.9 and 0 to 1.0, respectively. To obtain the appropriate ranges of shape and compactness, a variable-controlling approach was adopted. In this process, the scale parameter was temporarily set to 50 by visual assessment. After determining these ranges, the optimal combination of both was determined. The appropriate scale parameter, a key value that heavily affects the emergence of objects and facilitates further analysis [50], was explored with the help of the Estimation of Scale Parameter 2 (ESP2), proposed by Drăguţ et al. [51]. Regarded as a credible tool, the ESP2 can determine suitable scale parameters based on an object's shape, compactness, and other parameters (e.g., the number of loops and starting scales), which are input based on Local Variance (LV) values [52]. The scale parameter obtained in the first iteration was used as the initial parameter in the second iteration to find the appropriate combination of shape and compactness within their respective ranges obtained by the first iteration. Iterations continued until the results no longer changed.
In order to evaluate the goodness of the image segmentation's potential parameter combinations, over-segmentation (OS), under-segmentation (US), and Euclidian distance (ED) were computed, as follows [53]: where r i is part of the reference data, and s k is one of the corresponding segmentation objects (a segment that overlaps r i by no less than 50% is considered as s k , which can provide valuable information for training the subsequent extraction models). OS and US indicate how well the segments match the reference data. As a composite index, ED is explained by its proximity to perfect segmentation. Ideally, the value of each index will be 0.

Object-Based Explanatory Features Generation
In this study, the spectral, textural, and geometrical features of the segments were selected and used for gully mapping, as shown in Table 1. To obtain as many variables as possible, 37 variables were derived within eCognition. The spectral features comprised the mean band value, band ratios, mean brightness, maximum difference index, standard deviation of the band, and VI'. Among them, the band ratios and VI' were customized features. The calculation formulas are as follows [36]: Remote Sens. 2020, 12, 487 where R, G, and B are the mean values of the red, green, and blue bands, respectively. The built-in geometrical features for planar objects in eCognition were all selected, including the area, length, width, length/width, border length, rel. border to image border, border index, asymmetry, compactness, roundness, density, main direction, elliptic fit, rectangular fit, shape index, radius of largest enclosed ellipse, and radius of smallest enclosing ellipse.
The image texture expresses the connection of the adjacent pixels, which has been used to improve the classification accuracy [54]. Eight texture features based on the Grey Level Co-occurrence Matrix (GLCM), which is a tabulation illustrating the frequency of various combinations of pixel brightness values (gray levels) on the image [55], were computed for all directions using eCognition. Thus, the training, testing, and validation data were generated.
Acronyms: R: Red, G: Green, B: Blue, max. diff: Maximum difference index, ang. 2nd moment: Angular second moment, stdDev: Standard deviation, rel. border to image border: Relative border to image border.

Gully Mapping Using Machine Learning Methods
In this section, an optimal method based on machine learning algorithms is used for gully mapping, and the evaluation indexes, precision, recall, and F-score are introduced. These factors all have an important role to play in the validation of the results, as detailed in Section 3.4. We selected as many features as possible, but an overabundance of features may have caused data redundancy and noise interference. Thus, the importance of the features was explored in the machine learning model mentioned below.
Compared to machine learning methods, the popular methods for deep learning require large datasets (with millions of scales) to achieve high performance. Moreover, these methods are too complex to establish a model with good properties, so this study did not consider deep learning methods.

Tree-Based Pipeline Optimization Tool (TPOT)
TPOT was implemented to determine the optimum method simply and rapidly [56,57]. TPOT is a tool for automated machine learning in Python, which intelligently determines the best pipeline for Remote Sens. 2020, 12, 487 8 of 20 data among thousands of possible pipelines using genetic programming ( Figure 4). Multiple machine learning methods can be integrated into one pipeline by taking into account the data preprocessing steps (normalization, missing value imputation, transformation, scaling, etc.), the hyper parameter settings, and the ways to stack or ensemble the methods within the pipeline. However, although TPOT can function normally with almost all machine learning methods, it cannot function with LightGBM, which is a fairly efficient method. TPOT was implemented to determine the optimum method simply and rapidly [56,57]. TPOT is a tool for automated machine learning in Python, which intelligently determines the best pipeline for data among thousands of possible pipelines using genetic programming ( Figure 4). Multiple machine learning methods can be integrated into one pipeline by taking into account the data preprocessing steps (normalization, missing value imputation, transformation, scaling, etc.), the hyper parameter settings, and the ways to stack or ensemble the methods within the pipeline. However, although TPOT can function normally with almost all machine learning methods, it cannot function with LightGBM, which is a fairly efficient method. TPOT needs to run for enough time to achieve convergence. In this study, we installed XGBoost to supplement TPOT and ran TPOT until the same pipeline was recommended for the same dataset by two TPOT runs.

LightGBM
For the above-mentioned reasons, LightGBM alone was analyzed. Using tree-based learning methods, LightGBM offers a gradient boosting framework that includes two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) [28]. Due to the elimination of a high percentage of data instances with few gradients using GOSS, GOSS only employs the remainder to assess the information gain. With EFB, mutually exclusive features are packed so that the number of features decreases. The implementation of GOSS and EFB increased the speed of the method by dozens of times. To achieve accuracy optimization, LightGBM grows trees by a leaf-wise (best-first) approach that grows leaves with a max delta loss ( Figure 5) [58]. In addition, considering the possible leaf-wise over-fitting phenomenon with a small dataset, LightGBM restricts tree depth using a parameter. However, the trees still grow leaf-wise even when this parameter is identified. TPOT needs to run for enough time to achieve convergence. In this study, we installed XGBoost to supplement TPOT and ran TPOT until the same pipeline was recommended for the same dataset by two TPOT runs.

LightGBM
For the above-mentioned reasons, LightGBM alone was analyzed. Using tree-based learning methods, LightGBM offers a gradient boosting framework that includes two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) [28]. Due to the elimination of a high percentage of data instances with few gradients using GOSS, GOSS only employs the remainder to assess the information gain. With EFB, mutually exclusive features are packed so that the number of features decreases. The implementation of GOSS and EFB increased the speed of the method by dozens of times. To achieve accuracy optimization, LightGBM grows trees by a leaf-wise (best-first) approach that grows leaves with a max delta loss ( Figure 5) [58]. In addition, considering the possible leaf-wise over-fitting phenomenon with a small dataset, LightGBM restricts tree depth using a parameter. However, the trees still grow leaf-wise even when this parameter is identified.
We used the LightGBM Python-package established by Microsoft Corporation for statistical computing to associate gully presence with explanatory variables. Parametric adjustment was the most important step affecting accuracy. A large max_bin, small learning_rate, and large num_leaves were possibilities considered to improve accuracy. We used the LightGBM Python-package established by Microsoft Corporation for statistical computing to associate gully presence with explanatory variables. Parametric adjustment was the most important step affecting accuracy. A large max_bin, small learning_rate, and large num_leaves were possibilities considered to improve accuracy.

Stacking
Considering multiple machine learning methods, including LightGBM, stacking was applied. The idea of this approach is derived from stackedgeneralization, proposed by Wolpert [59]. Stacking is a hierarchical fusion model ( Figure 6). For example, when we fuse three machine learning models trained with different characteristics, we use them as the basic models. Then, we train a sub-learner on the basic models; the sub-learner is used to organize the basic learner's predictions or the predictions of the basic models as input, so the sub-learner learning organization assigns weight to the predictions of the basic models to obtain the final prediction. The following points should be noted in the selection of each model: First, basic models should be strong models, while the sublearner of the second layer may use a simple classifier to prevent over-fitting. Second, the number of the basic models can be neither too small nor too big. Third, the basic models need to be accurate and have little correlation.

Stacking
Considering multiple machine learning methods, including LightGBM, stacking was applied. The idea of this approach is derived from stackedgeneralization, proposed by Wolpert [59]. Stacking is a hierarchical fusion model ( Figure 6). For example, when we fuse three machine learning models trained with different characteristics, we use them as the basic models. Then, we train a sub-learner on the basic models; the sub-learner is used to organize the basic learner's predictions or the predictions of the basic models as input, so the sub-learner learning organization assigns weight to the predictions of the basic models to obtain the final prediction. The following points should be noted in the selection of each model: First, basic models should be strong models, while the sub-learner of the second layer may use a simple classifier to prevent over-fitting. Second, the number of the basic models can be neither too small nor too big. Third, the basic models need to be accurate and have little correlation. We used the LightGBM Python-package established by Microsoft Corporation for statistical computing to associate gully presence with explanatory variables. Parametric adjustment was the most important step affecting accuracy. A large max_bin, small learning_rate, and large num_leaves were possibilities considered to improve accuracy.

Stacking
Considering multiple machine learning methods, including LightGBM, stacking was applied. The idea of this approach is derived from stackedgeneralization, proposed by Wolpert [59]. Stacking is a hierarchical fusion model ( Figure 6). For example, when we fuse three machine learning models trained with different characteristics, we use them as the basic models. Then, we train a sub-learner on the basic models; the sub-learner is used to organize the basic learner's predictions or the predictions of the basic models as input, so the sub-learner learning organization assigns weight to the predictions of the basic models to obtain the final prediction. The following points should be noted in the selection of each model: First, basic models should be strong models, while the sublearner of the second layer may use a simple classifier to prevent over-fitting. Second, the number of the basic models can be neither too small nor too big. Third, the basic models need to be accurate and have little correlation.  On the basis of the above considerations, Xgboost, LogisticRegression, and GradientBoost were selected as the basic models, and LightGBM assumed the role of the sub-learner in this study. The whole process was carried out in a Python environment.

Revision of Preliminary Gully Extraction
To improve the results of the machine learning method, false positives were removed. In the absence of topographic data, the main idea of this part of the study was to remove residential areas, roads, arable land, and forest areas from the above extraction results, which have their own characteristics distinct from gully erosion; further, some of their segmentation objects are easily misclassified when extracting gullies based on OBIA. The principle of this step was to remove as many of the abovementioned ground objects as possible without removing other objects not addressed. There are many similar features between unused land and gully erosion that are relatively difficult to distinguish. Moreover, unused land is vulnerable to erosion and is often distributed near gully erosion; thus, its removal was not considered.
Residential areas include houses with blue tiled roofs and courtyards. Each independent residential area appears as a regular rectangle. Roads are strip-like and belong to one of two types: Cement roads or asphalt roads. The cultivated land presents a regular, uniform texture, with no vegetation on the surface. Forest land appears as contiguous or striped green areas.
On the basis of the optimal segmentation, target features, and the preliminary results above, the four types of objects used the nearest neighbor rule or membership function rule to extract as many corresponding pure objects as possible within eCognition. The houses of residential areas and cement roads were identified by higher values of the blue band and brightness, respectively. In addition, one obvious difference between houses and cement roads is the border length that roads have along the border length (houses have the opposite). The minimum boundary geometry of each housing agglomeration area was considered a residential area. Forest areas were distinguished mainly based on high values of the VI', whereas arable land was detected with low values of the VI' and brightness and a homogeneous texture. Next, these four types of objects were removed from the results above in ArcGIS to map the gullies.

Validation
Validation of the extraction is essential for judging the fitness of the model presented in this study for gully extraction. Validation was implemented by an overlay analysis between the gully map obtained from the model and the inventory map of gully erosion provided by field surveys and expert knowledge. Adopting different approaches to assess the extraction results highlighted the added value [60]. In this study, we used two strategies to estimate the accuracy of the extraction: (1) The conventional method, whereby the classification results were divided into two types-correct or wrong compared to the reference data. (2) Inspired by similar studies [19,61], there were two kinds of errors of commission. One was a limited error of commission, which bordered on the correctly extracted areas and displayed similar spectral, textural, and geometric properties. The remaining error of commission was regarded as a true error of commission, which was entirely falsely extracted.
The precision, recall, and F-score were used to process the quantitative analysis of the extraction results. The effects of these indicators are as follows. Precision indicated how many areas that the model determined to be positive were true positive areas. Recall indicated how many positive samples were judged to be positive by the model. F-score was an index balancing precision against recall. Ideally, all three indexes were 0.
Here, A is the area of extracted gully erosion and B is the gully area in the reference data.

Results
The determination of the shape and compactness parameters was based on the value of ED. When the ED value reached its minimum, the corresponding parameters were defined as best. Take the result of the first iteration as an example, the goodness assessment is shown in Tables 2-4. The primary range of the parameters of compactness and shape are obtained based on Tables 2 and 3, respectively: (0.8, 1.0) and (0.1, 0.4). Each combination of the compactness and shape parameters was screened as a candidate to obtain the optimum value (compactness: 0.8; shape: 0.3) shown in Table 4. Based on the acquired shape and compactness parameter, estimation of the scale parameter is provided to help determine the most proper scale (Figure 7). Because over-segmentation produces fewer threats to classification accuracy than under-segmentation [62], and based on an exploration of the scale parameters in alternating regions, 34 was chosen as the value of the scale parameter.
Based on the acquired shape and compactness parameter, estimation of the scale parameter is provided to help determine the most proper scale (Figure 7). Because over-segmentation produces fewer threats to classification accuracy than under-segmentation [62], and based on an exploration of the scale parameters in alternating regions, 34 was chosen as the value of the scale parameter.
Finally, the segmentation parameter set (scale, shape, and compactness) was established as (34, 0.3, 0.8). The segmentation results were obtained. Figure 8 illustrates some generated objects that were divided into objects of gully and objects of non-gully, which depended on the relationship between the percentage of gully erosion area and 50%, together with a gully system boundary that was obtained by digitizing the gully systems using visual image interpretation based on field investigations and expert knowledge. Objects with a percentage of gully erosion area greater than or equal to 50% were considered as objects of gully. On the contrary, other objects are objects of nongully. The segmentation results were obtained. Figure 8 illustrates some generated objects that were divided into objects of gully and objects of non-gully, which depended on the relationship between the percentage of gully erosion area and 50%, together with a gully system boundary that was obtained by digitizing the gully systems using visual image interpretation based on field investigations and expert knowledge. Objects with a percentage of gully erosion area greater than or equal to 50% were considered as objects of gully. On the contrary, other objects are objects of non-gully. The training data were randomly categorized into the training subset or testing subset with a ratio of 3. By setting and adjusting the parameters of each method based on machine learning algorithms to fit them with the training subset, the initial models were constructed. Table 5 reveals the goodness-of-fit results. TPOT and LightGBM obtained similar performance, and both were better than stacking-gully. In addition, TPOT was a time-consuming procedure; TPOT required almost a week to finish its search, while LigthGBM could finish its search within minutes. Therefore, LightGBM was determined to be the best choice for extracting gully erosion. The training data were randomly categorized into the training subset or testing subset with a ratio of 3. By setting and adjusting the parameters of each method based on machine learning algorithms to fit them with the training subset, the initial models were constructed. Table 5 reveals the goodness-of-fit results. TPOT and LightGBM obtained similar performance, and both were better than stacking-gully. In addition, TPOT was a time-consuming procedure; TPOT required almost a week to finish its search, while LigthGBM could finish its search within minutes. Therefore, LightGBM was determined to be the best choice for extracting gully erosion. Table 5. Goodness-of-fit results for the three machine-learning methods.  Figure 9 shows that the extraction accuracy first increased with an increment of the number of features and remained basically stable when the number exceeds 15. Thus, it was unnecessary to select and remove features in this study, thereby simplifying the extraction process of gullies. In addition, VI' was sixth in feature importance, which demonstrates the significance of its introduction to the extraction of gullies.

Model
Remote Sens. 2020, 12, 487 14 of 20 select and remove features in this study, thereby simplifying the extraction process of gullies. In addition, VI' was sixth in feature importance, which demonstrates the significance of its introduction to the extraction of gullies. Based on initial extraction by LightGBM and subsequent modification, an overview of the extraction result in the test area is displayed in Figure 10. The dark green polygons indicate correctly extracted areas of gully erosion. The light green and blue polygons indicate the limited and true error of omission, respectively, and the light red polygons show the error of commission. The reference data are represented as colorless polygons with black boundaries.  importance.
Based on initial extraction by LightGBM and subsequent modification, an overview of the extraction result in the test area is displayed in Figure 10. The dark green polygons indicate correctly extracted areas of gully erosion. The light green and blue polygons indicate the limited and true error of omission, respectively, and the light red polygons show the error of commission. The reference data are represented as colorless polygons with black boundaries.  Table 6 shows the accuracy evaluation of the extraction results. The comparison between the results of preliminary extraction and revised extraction showed that although the area of gully erosion extracted correctly decreased, revised extraction significantly improved the situation that non-gully was wrongly classified into gully erosion, and the performance of gully extraction was better in general. Finally, this study had a recall of 75.6%. A total of 24.4% of gully areas were not correctly extracted, which constitutes an omission error that mainly occurs at inflection points, bifurcations, and small-scale separate erosion gullies. A total of 38.4% of the extracted results were mistakenly extracted as gullies, while almost half of the errors were in the vicinity of gully and exhibited similar spectral, textural, and geometric properties to gully erosion, which was regarded as the limited error of commission. The true error of commission accounted for 21.1% of the extracted  Table 6 shows the accuracy evaluation of the extraction results. The comparison between the results of preliminary extraction and revised extraction showed that although the area of gully erosion extracted correctly decreased, revised extraction significantly improved the situation that non-gully was wrongly classified into gully erosion, and the performance of gully extraction was better in general. Finally, this study had a recall of 75.6%. A total of 24.4% of gully areas were not correctly extracted, which constitutes an omission error that mainly occurs at inflection points, bifurcations, and small-scale separate erosion gullies. A total of 38.4% of the extracted results were mistakenly extracted as gullies, while almost half of the errors were in the vicinity of gully and exhibited similar spectral, textural, and geometric properties to gully erosion, which was regarded as the limited error of commission. The true error of commission accounted for 21.1% of the extracted results. These errors occurred mainly in areas containing roadsides, farm paths, weed land, bare land, and sparse woodland. Overall, most gullies were correctly identified, and this method enabled an appropriate extraction of the gullies. To examine the performance of this approach, validation datasets were used ( Figure 11). Table 7 shows the validation results. The gullied areas had an underestimation of less than 40%. These areas were mostly adjacent to the correctly extracted gullied areas and almost certainly did not indicate the absence of a whole gully. A total of 20% to 30% of the areas were entirely classified falsely and thus assigned as the overestimation. That is, more than 70% of the extraction results were correct or important indicators for the location of gullies. The range of underestimation and overestimation varied as a result of considering areas with different complexities of geographical object categories for validation. The extraction results in the watershed with residential areas and more roads were worse, as they were either underestimated or overestimated. To examine the performance of this approach, validation datasets were used ( Figure 11). Table  7 shows the validation results. The gullied areas had an underestimation of less than 40%. These areas were mostly adjacent to the correctly extracted gullied areas and almost certainly did not indicate the absence of a whole gully. A total of 20% to 30% of the areas were entirely classified falsely and thus assigned as the overestimation. That is, more than 70% of the extraction results were correct or important indicators for the location of gullies. The range of underestimation and overestimation varied as a result of considering areas with different complexities of geographical object categories for validation. The extraction results in the watershed with residential areas and more roads were worse, as they were either underestimated or overestimated.

Discussion
GE imagery, a type of optical imaging technology, is the only data source used for gully mapping in this paper. Although there is an unquestionable link between topographic attributes and the occurrence of gullies, which helps to improve the extraction of gullies (as suggested by recent studies) [23,25,63], the acquisition and update of high-precision topographic data remain difficult for a vast number of users compared to optical imagery, especially when conducting regional gully erosion surveys and mapping. Hence, this type of data selection allows more people to use the method presented and facilitates large-scale gully erosion investigations at the sacrifice of some accuracy. Considering the scope of application, the lack of topographic data is regarded as an advantage. In addition, other optical imaging techniques can be used as the data source due to the adequate spatial and spectral information provided, such as images captured by QuickBird, WorldView, and GeoEye (if affordable). The approach developed using GE imagery provides good cross-data portability, which increases our ability to study the dynamic changes of gully erosion because there are more data to choose from at different points in time.
This study combines direct extraction and indirect extraction by removing non-eroded gully areas from the preliminary results of direct extraction to extract gully erosion; this is described as two-end extrusion. Various methods based on machine learning algorithms were evaluated and selected to obtain an optimal model for the direct extraction of gully erosion. This step explored the effectiveness of methods based on machine learning algorithms other than the random forest methods commonly used in previous studies for the extraction of gully erosion. We determined that LightGBM may be a better choice. Based on the preliminary extraction results, the final extraction results were obtained, via eCognition, by removing the areas identified not to be eroded by the gully. Compared with the studies on direct extraction [24,25], two-end extrusion has the capacity to improve the extraction results when the experimental conditions are consistent. In addition, the inclusion of a machine learning method simplifies the extraction process and enhances the automaticity of the approach by establishing rule sets within eCognition [19,20].
Although the extraction accuracy is not completely satisfactory, it is still acceptable considering that this method uses no topographic data. The lack of topographic data may lead to inadequate object features, which could have contributed to the extraction errors. False negatives are mainly isolated fragments and directly adjacent to the correctly extracted areas of gully erosion (true positives), which indicates that the main body of gullies can achieve good initial recognition. Almost all false positives are segments of roadsides, farm paths, weed land, bare land, and sparse woodland. Studies have shown that gully erosion develops seriously in areas with greater disturbances and lower vegetation coverage. Indeed, one of the primary causes of gully erosion is roads and paths [39,64,65]. Hence, false positives are mainly concentrated in high risk areas of gully erosion, which deserve attention. For the above-mentioned reasons, the method presented in this paper is a viable option for initially determining gully erosion, with acceptable accuracy to provide information for the pertinent stakeholders.
This study was conducted in a typical black soil region of Northeast China. Our results can be generalized to other areas of the black soil region for two reasons: (1) There are similarly distributed rules for gully erosion across the entire black soil region, despite variations in density and shape. (2) The results apply to a study area with the most serious and complex gully erosion, which demonstrates its gully mapping capabilities in areas with lighter and simpler gully erosion. However, when attempting to extend the present approach to areas other than the black soil region, an elaborate field investigation should be carried out. Based on the geographic environmental information collected, as well as expert knowledge, this approach should be adjusted accordingly.
Gullies, after harnessing, show the characteristics of woodland in the optical images, as the management of gully erosion in Northeast China has been vigorously put into practice since 2002 by the Chinese Government [66]. In light of the weak penetration of optical images and the removal of woodland, gullies after harnessing, and other gullies in forest areas, cannot be extracted. However, the stable state of gullies with less damage makes this issue less serious and less conspicuous. In this paper, unmanaged gullies and gullies harnessed in non-forest areas were extracted, but the distinction between these two types of gullies for the control and monitoring of gully erosion cannot be achieved and requires further study.

Conclusions
In this paper, we successfully built a two-end extrusion method integrating methods based on machine learning algorithms and OBIA using GE images as the data source to obtain the initial extraction of a gully in the black soil region. Several conclusions can be summarized: (1) This highly automatic method simplifies the extraction process and is applicable to diverse image data, which makes it broadly available and transferable to a large extent. Among mainstream methods based on machine learning algorithms, LightGBM is regarded as the best choice to preliminarily extract a gully due to its fast speed and high accuracy. (2) The performance of this method was modest but acceptable, as it was verified to have a true overestimation in the range of 20% to 30%, which was mainly caused by land use types with a high erosion risk, such as bare land and farm lanes, being falsely classified as gullies, and an underestimation of less than 40%, which was largely adjacent to the correctly extracted areas of the gully. Further, the lower the complexity of the geographical object categories of the study area, the more satisfactory its accuracy of extraction. (3) Due to the weak penetration of optical images and the removal of woodland, the extracted gullies consisted of unmanaged gullies and gullies harnessed in the non-forest areas, while it excluded gullies after harnessing and other gullies in the forest areas.
Meanwhile, this method has various limitations that must be overcome in future studies. On the one hand, the extraction accuracy of this method is not completely satisfactory. On the other hand, it is not possible to extract and distinguish all types of gully erosion, and each gully cannot be extracted as an individual. In addition, the applicability of this method needs to be verified in more areas in the black soil region and beyond.