Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

: Artificial terraces are of great importance for agricultural production and soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. As a result of the importance of the contextual information for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies are widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three different commonly-used ML classifiers, namely, extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field sur-vey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces. in remote sensing image classification and essentially consists of image segmentation, feature selection, and classification. In this we explored the potential of an object-based ML framework for artificial terrace mapping. Three object-based ML classifiers (i.e., XGBoost, RF, and KNN) were evaluated in an artificial terrace mapping comparison. We conducted experiments on an area of the Loess Plateau, China, through 1-m resolution aerial imagery and DEM from UAV photogrammetry. The results indicate that the RF model obtains the most accurate classification results, and the KNN and XGBoost classifiers influence classification


Introduction
Artificial terraces, as a typical artificial landform, are of great importance to agricultural production and soil and water conservation [1]. The construction of artificial terraces enhances water infiltration, reduces the risk of soil erosion, and improves biodiversity [2,3]. Artificial terraces are widely distributed around the world because of these advantages [4]. Many previous studies reported their effects on soil and water processes [5][6][7]. Nowadays, many artificial terraces are threatened by land degradation and soil erosion because of land abandonment and a lack of maintenance [8]. The Loess Plateau in China, which is a major agricultural production region in China, suffers from severe soil erosion and is a fragmented terrain [9]. Since the 1960s, lots of artificial terraces have been built on the top of the loess hills and hillslopes to improve agricultural productivity. However, the "Grain for Green" project was implemented by the Chinese government in 1999, with the aim to "convert all reclaimed farmland to forestland step by step within a definite time in a planned way" [10]. Through this project, hillsides were closed to actively facilitate afforestation, the ecological environment was improved, and vegetation recovery was accelerated [11]. Currently, the terraces are consistently at risk from gully erosion or gravity erosion because of the lack of maintenance [12,13]. The evident changes in certain artificial terrace areas can be clearly observed ( Figure 1). Hence, mapping artificial terraces will help us better understand their spatial distribution and other effects, such as geomorphology, soil erosion, and ecology. Terrace mapping is the basis of this field of study. Traditional methods, including field investigation and visual interpretation [14,15], are simple, highly accurate, and easy to implement. However, the time expenditure, financial costs, and inconvenience make long-term monitoring over large areas difficult. Automatic methods are possible with the technological development of high spatial resolution remote sensing and terrain modeling by unmanned aerial vehicle (UAV) photogrammetry [16][17][18] or light detection and ranging (lidar) technology [8]. Highly accurate terrain modeling by UAV photogrammetry is consistently difficult because fully and automatically interpolating the surface model into the terrain model, i.e., the digital elevation model (DEM), is problematic in hilly areas covered with dense vegetation [19]. The DEM-based mapping method is usually influenced by elevation noises caused by vegetation or other features. Lidar-based terrain modeling usually solves this problem using the multiple echo filtering technique, while the high economic cost limits its application. However, image-based methods utilize spectrum information and can achieve a better result. These methods include two types: pixelbased [16,20,21] and object-based methods [22].
Object-based image analysis (OBIA) transfers from pixels to objects [23]. Therefore, spectral as well as spatial information of target features [24] can be better utilized for the mapping task. This characteristic makes OBIA superior to pixel-based methods, which usually ignore these contents and texture information, leading to a discontinuity of mapping results. A general OBIA framework basically consists of three steps. First, an image is segmented into several individual polygons with approximately the same spectral or spatial features, namely, objects. Then, each object is treated as the analysis unit, and the corresponding features are calculated for incoming classification. Finally, a specific classifier is selected to achieve a certain classification task. The first step in OBIA, i.e., image segmentation, can directly affect the analysis [25]. Commonly-used segmentation methods, such as multi-resolution segmentation (MRS) implemented in eCognition software [26][27][28], are strongly defined by the parameters (scale, shape, and compactness in this MRS). Therefore, the inherent properties of geographic entities are necessity as the prior knowledge to obtain suitable segmentations.
Another significant factor affecting the efficiency of the OBIA framework is the application of the appropriate classifier. Previous studies have successfully utilized machine learning (ML) classifiers to classify highly complex data, which illustrate the great potential of ML in effective classification and object recognition [29]. Zhao et al. (2017) proposed a random forest-based method using DEMs and UAV imagery for terrace mapping with an overall accuracy of 94% [22]; however, the geometric features of terraces were not considered, which limits the accuracy. Dai et al. (2020) used edge detection combined with terrain attributes to extract the ridge lines of artificial terraces with an overall accuracy of 90.81%-97.57% [13]. While the inconsistency caused by soil erosion between terrace ridges will strongly influence the success of this method, among the various existing ML-based classifications, ensemble decision tree (DT) classifiers, such as random forest (RF), are considered as effective methods for mapping purposes in remote sensing. Recently, as a novel state-of-the-art DT-based ensemble classifier, extreme gradient boosting (XGBoost) is reported as a strong and effective method that has been successfully used for OBIA-based classification, such as landscape classification and gully feature mapping [25]. However, the capabilities of XGBoost for artificial terrace mapping remain to be demonstrated. Moreover, other algorithms under the OBIA framework are also commonly used for remote sensing classification, such as k-nearest neighborhood (KNN) [30]. Although a few studies have focused on the comparison of ML performance in land use and land cover [31] or natural hazard susceptibility (e.g., landslide) mapping [32][33][34], no comprehensive comparison with fine-scale artificial terrace mapping has been performed with an OBIA framework.
Considering the aforementioned limitations, this work focuses on the comparison of different ML classifiers in the OBIA framework for artificial terrace mapping. The contributions to this field are as follows: 1) image segmentation was optimized and feature selection was performed with an objective method; 2) the performance of three ML classifiers in the OBIA framework was compared and their potential for artificial terrace mapping was explored.

Study Area and Data
The Zhifang catchment (109°14′25″E-109°15′20″E, 36°43′10″N-36°44′30″N, with an area of 2.33 km2) was selected as the study area (Figure 2a,b). It is located in the southeastern part of Ansai County, Shaanxi Province, China. The study area is composed of loess ridge and hill, with a 289-m elevation difference, and is covered with massive ridges and tablelands. The study area is characterized by a semiarid climate with mean annual precipitation of 528.5 mm and a mean temperature of 8.8 °C [35][36][37]. Soil erosion is evidently severe, with a great number of artificial terraces built in this area, making the terrain fragmented. In this study, high-resolution DEMs and imagery were used to extract the terraces. A DEM with a 1-m resolution was generated using UAV photogrammetry in March 2016 ( Figure 2c). DEM acquisition can be divided into two main phases: the outdoor field survey and subsequent indoor image processing. For the outdoor survey, two steps are needed. First, an MD4-1000 drone with a Sony ILCE-7R digital camera system was used to capture optical aerial photographs of the study site. Secondly, the horizontal and vertical accuracies were maintained based on three base control points and 53 photo control points. Then, two indoor image processing steps were implemented: aerial triangulation and DEM generation. Trimble's Inpho 6.0 was used to perform the aerial triangulation. Subsequently, the DSM was generated using MapMatrix 4.0. Then, a 1-m resolution DEM was achieved after eliminating buildings and vegetation. The WorldView-3 imagery from April 2016 (Figure 2d) was utilized in this study. Launched in August 2014, the WorldView-3 satellite has higher spatial resolution than the previous WorldView-2 satellite. The resolution of the panchromatic band is 0.31 m, and the multispectral (red, green, blue, and near-infrared) bands are 1.24 m.
A field investigation was then conducted to generate the ground truth of the artificial terraces to evaluate the mapping accuracy. Reference polygons ( Figure 2d) were manually delineated from both the field investigation and the details of the terrace based on WorldView-3 Imagery, DEM information, and Ortho-image ( Figure 3).

Methods
The mapping task began with image segmentation, which is an essential step that affects the accuracy of terrace mapping. In this study, segmentation by weighted aggregation (SWA) was utilized for this step because it can generate multiple level segments without many predefined parameters. Subsequently, a watershed-based optimization strategy was initiated to optimize the segmentation. Then, the spectrum, topography, terrain texture, and geometry were calculated for each suitable segment. Next, a correlationbased feature selection method was adopted to acquire the relevant feature subsets. Subsequently, three object-based ML classifiers, namely, RF, XGBoost, and KNN, were selected for terrace mapping. Finally, the mapping results were validated and compared through an accuracy assessment analysis. Figure 4 illustrates the entire workflow of this study.

Optimized Image Segmentation
SWA, firstly proposed by Sharon et al. (2000Sharon et al. ( , 2001Sharon et al. ( , 2006 [38][39][40], mainly consists of four steps: (1) constructing the fine-level graph; (2) creating the coarse-level graph; (3) evaluating the saliency of segments; and (4) determining the boundaries of salient segments based on a top-down procedure. In this study, the SWA was used to obtain the multi-level segmentation results based on three visible bands, one near-infrared band, and one panchromatic band. Then, the segments were used as the basic processing units for terrace mapping. Therefore, selecting an appropriate level segmentation from multi-level results is necessary to optimize the segments that are generated by SWA. Present studies have pointed out that the category's effect on the segmentation selection needs to be considered [30]. This finding indicates that using different levels to segment an image is better than using a universal one [29]. In order to achieve a satisfactory segmentation, it is necessary to consider the homogeneity of segments and significant difference with the adjacent regions [41]. Hence, a watershed-based optimized strategy was used to select the optimal segmentation level on the basis of our previous knowledge [25]. This procedure first chooses the watershed as the basic processing unit because it is usually considered as the fundamental unit of landform [42]. The three following steps are used to extract sub-watersheds: (1) filling the sinks; (2) extracting the stream network; and (3) extracting the watersheds. Eighteen sub-watersheds were extracted; two were used for model training, with the remaining 16 sub-watersheds being utilized for validation. Every sub-watershed has its own segmentation level by considering between-segment heterogeneity and within-segment homogeneity [43]. To meet these requirements, an unsupervised F-measure was utilized [44]. Subsequently, the optimized segmentation level of each sub-watershed was achieved, and the final segmentation result was obtained.

Feature Extraction and Selection
For the ML-based terrace mapping, certain features need to be extracted to train the classifiers. In this study, 55 features were adopted for training the terrace mapping model. An overview of these features is illustrated in Table 1. The detailed descriptions for each feature can be found in Appendix A. Given that the spectrum and geometric features are widely used in the OBIA community [25], this information was used. Five basic topographic factors, namely, elevation, slope, roughness, shaded relief, and curvature, were utilized. Terrain textures, which are beneficial for the extraction of landform entities [45,46], were also adopted. Eight terrain texture measures derived from the gray-level cooccurrence matrix (GLCM) [47] were then calculated on the basis of five topographic features. All these object features were calculated based on the segmentation result using the eCognition software. Figure 5 illustrates the schematic for object-based feature extraction. Terrain texture Gray-level co-occurrence matrix (GLCM) homogeneity, GLCM contrast, GLCM dissimilarity, GLCM entropy, GLCM angular second moment, GLCM mean, GLCM standard deviation, and GLCM correlation based on Elevation, Curvature, Roughness, Slope, and Shaded relief A correlation-based feature selection (CFS) method [48], which is a popular filter approach, was used to select the feature subset in terrace mapping in this study. In this method, the feature subsets were evaluated by maximizing the dependency of a feature subset with the target class while minimizing the intercorrelation within the subset [49]: where S denotes a candidate feature subset, f(S) denotes the evaluation score, SCi denotes the average correlation between subset S and the dependent variable, and ASCi denotes the average intercorrelation within subset S. Given an initial set of inputs, a set of features meeting the best f(S) was identified. Each time the feature set proposed by the CFS was removed from the initial input, the algorithm was then reapplied to the set of reduced variables until no features were left. This selection algorithm in this work was implemented using the Weka 3.8 package.

Terrace Mapping Using ML Classifiers
After image segmentation and feature selection, three different widely-used ML classifiers, namely, XGBoost, RF, and KNN, were selected for terrace mapping.
(I) XGBoost XGBoost [50], as a boosting-based ensemble classifier, serially trains individual DTs, and each DT is an improved version of the previous one, resulting in a smaller error rate [51]. This serialized training method is beneficial to reduce the residuals. To avoid over-fitting the model, XGBoost adds a regularization term to the objective function for controlling the entire complexity of the model [52]. Another significant enhancement of XGBoost is the parallel processing ability. The algorithm uses presorting and block storage methods to determine the best split point rather than growing the DT in parallel. Then, linear scanning can be performed across the blocks to determine where the best split point is for each feature under test. Such characteristics make XGBoost very popular for the classification and mapping of remote sensing images.
In this study, the "XGBoost" package [53] in R language was employed to generate the terrace mapping model. For the sake of modeling, three significant parameters need to be determined: (1) the number of trees (nrounds) to grow; (2) the depth of the trees (max_depth), which controls the complexity of the model; and (3) the minimum sum of the instance weight (min_child_weight), which determines the stop point of the tree splitting. For tuning the XGBoost model, 10-fold cross-validation was selected on the training dataset.
(II) RF RF is another type of ensemble classifier, proposed by Leo Breiman in 2001 [54]. It uses bagging to generate different training samples to reduce the correlation between classification models, thereby improving the overall prediction accuracy of the classifier. The classification process of RF is simple and easy to implement. The training samples are randomly and repeatedly selected from the original training set through bootstrap sampling. Every classification and regression tree (CART) DT is based on every bootstrap dataset and is split and grown by independently and identically distributed random vectors. The final classification result is synthesized from the classification results trained by each CART DT through simple majority voting.
We used the R language to implement the RF package and build the terrace mapping model. Two important parameters affect the model performance, namely, the number of trees (nTree) needed to be grown and the number of features (mTry) that are randomly selected at each node. The nTree and mTry were set to 500 and the square root of the total number of features because they are a commonly used parameter combination for RF classifier [55].
(III) KNN KNN is a powerful tool used in many object-based procedures [56,57] because of its flexibility and simplicity. This classification method is frequently utilized in object-based software frameworks. Compared with model-based learning, KNN assigns objects to classes on the basis of the proximity or neighborhood in feature space rather than learning from models [58]. The nearest K neighbors can be achieved from the training data and used to vote for the new target prediction [59]. In this study, KNN was adopted using the eCognition software (Trimble Geospatial, Munich, Germany).

Mapping Accuracy Assessment
After the terraces were predicted, the predicted results were validated on the basis of the reference data. In this study, precision (Pr), recall (Re), F1-score (F1), overall accuracy (OA), and kappa coefficient (k) were adopted for the accuracy assessment. These metrics can be calculated as follows: , , , , (6) where Po is the overall accuracy, and Pe can be defined as , (7) where TP, FP, and FN denote true positive, false positive, and false negative, respectively. TP refers to the terrace area for which the extraction result is correct. FP refers to the terrace area for which the extraction result is incorrect. FN refers to the terrace area that has not been extracted. Therefore, the Pr metric indicates to what extent the extraction result represents the real target terrace area, and the Re metric indicates to what extent the real target can be extracted. The former metric can characterize the classification commission error of the model, and the latter can represent the omission error. F1 is the harmonic mean of the two metrics, reflecting their balance. Thus, it is used to characterize the OA of the model [60].

Image Segmentation and Feature Selection Results
The segmentation result can be achieved using the watershed-based SWA optimization strategy, as presented in Figure 6. In accordance with the segmentation results, 18 sub-watersheds and 3445 segments were generated. Two sub-watersheds were selected to train the ML models. The training area consists of 849 segments, including 78 positive samples and 731 negative samples. They were used to train the terrace mapping model. The remaining 16 sub-watersheds (2596 segments) were used to validate the model. For selecting the relevant feature subset, 20-fold replicate runs of the CFS method were implemented. The top 20 features were chosen, as shown in Table 2. The GLCM-based terrain texture can fully consider the spatial relationship among pixels, thereby providing the terrain texture a reliable basis for reflecting the assemblages of repeating patterns of landform elements [61]. Therefore, the terrain texture features play a dominant role in feature ranking. Eleven terrain texture features were selected. The terrain texture based on GLCM can be regarded as a second-order geomorphic parameter. These second-order parameters can improve the ability of traditional terrain factors to differentiate landform types. The GLCM angular second moment measures the uniformity of the texture feature. The smaller the value is, the more disorderly the image pixels will be. Therefore, the angular second moment of curvature (Ang_Cur) can be used to reflect the internal change pattern for the curvature. The larger the value of Ang_Cur, the more orderly the curvature, indicating that the change in the curvature is gentle. Given that the curvature measures the convexity of the landscape, a small Ang_Cur likely represents a complicated area. These areas usually have great potential to develop bank gullies. Therefore, compared with the traditional topographic features, terrain textures can preferably distinguish differences in landforms on a certain scale [52]. In addition to terrain texture features, three geometric and two topographic features were selected in the ranking list, indicating that they do not have a great deal of influence in terms of improving the accuracy of terrace mapping. The Pan feature ranks in the top positions, as shown in Table 2. This condition might be attributed to the high spatial resolution of the panchromatic band with a 0.31-m resolution. Thus, the features obtained from the panchromatic band are more important than those from the 1-m resolution DEM. . Figure 6. Segmentation result and training samples.  Figure 7 shows the terrace mapping results of XGBoost, RF, and KNN. The results illustrate that terraces and non-terraces were successfully detected. The results indicate that the RF classifier performed excellently by creating an accurate terrace inventory map with only limited incorrectly extracted and unextracted terraces. The visual assessment indicates that for terrace mapping, RF is more reliable than KNN or XGBoost. Most of the incorrectly detected terraces (false positives) in the study area were identified with the KNN classifier rather than RF or XGBoost. Most of the non-detected terraces (false negatives) in the study area were identified with the XGBoost classifier rather than RF or KNN.

Terrace Mapping Results and Accuracy Assessment
Four types of belief measurements, namely, Pr, Re, OA, and k, were adopted to validate the mapping results in this study because of their high performance in previous studies. As previously mentioned, F1 can reflect the balance between Pr and Re. Thus, F1 was utilized to evaluate the pixel coverage for quantitatively detecting terraces. The belief masses for labeled features caused by the classifiers were calculated with a confusion matrix. Table 3 indicates the performance of three classifiers in the testing area. Among the three classifiers, RF performed best in terms of Pr, Re, and F1 with 88.15%, 83.63%, and 85.83%, respectively, followed by XGBoost classifiers with 81.96%, 81.48%, and 81.72%. The KNN classifier obtained the lowest F1 score (77.19%) because of its lower Pr (73.36%) than the others, although its recall value was good (81.44%). In accordance with the accuracy assessment result, the RF classifier achieved an OA of up to 95.60% and accurately identified the terraces among the object-based classifiers. This result was followed by XGBoost and KNN classifiers with 94.19% and 92.33%, respectively. Although the three classifiers had an OA of more than 90%, RF had the highest k value of 0.83, which is considered as approximately perfect. The accuracy assessment proved that the RF classifier was effective and may be applied to other regions that exhibit similar conditions to those in the present study.

Influence of Class Imbalance and Feature Selection on Terrace Mapping
Class imbalance refers to when the total number of one class far exceeds the number of another, thereby giving rise to a low predictive accuracy for the infrequent class. The class imbalance effect needs to be considered because the covered areas of terraces are discrete and small in the study area. Previous studies demonstrated that undersampling of the majority class is a beneficial strategy to overcome the class imbalance problem [27,55,62]. To estimate the class imbalance effect, the training samples were divided into subtraining and subtesting samples. Later, a metric R was defined as the ratio of terrace and non-terrace segments in the current subtraining. Then, an iterative procedure was used to estimate a target R value for the emerging balance between Pr and Re on the subtesting samples. In each iteration, all the terrace segments and R-fold numbers of nonterrace samples were randomly sampled from the subtraining datasets to train the ML classifiers and assess the accuracies on the remaining subtesting samples. The procedure began from R = 1 and increased by 0.2 in each step. For evaluating the effect of different feature combinations on terrace mapping, five feature combinations were utilized, where the top 5, top 10, top 15, and 20 features based on Table 2 were selected, and all the 55 features were chosen.
In this study, the training area was divided into subtraining and subtesting at a ratio of 1:3 because the number of samples in the validation datasets (2596 segments) was approximately three times higher than the training samples (849 segments). Therefore, 221 segments, including 20 terrace and 183 non-terrace segments, were randomly selected as subtraining samples, and the remaining 646 segments were treated as subtesting samples. For each R, the procedure was repeated 10 times on the basis of randomly sampled subtraining and subtesting segments from training datasets based on different feature combinations. Then, the mean error rates and their standard deviations were calculated. The experimental results are shown in Figure 8.
The trends of three ML classifiers in terms of the accuracy value based on different class ratios (R) and feature combinations are similar. As shown in Figure 5, Re is approximately 100% when the value of the class balanced ratio R is set to one. However, the precision is lower than 20%, implying that the extent of terraces are overestimated. As shown in Figure 8, converging curves of the Re and Pr accuracy appear when the value of R changes from one to nine. For the RF, the converging curves can be crossed using five different feature combinations. However, for the XGBoost and KNN, this unique crossing for the converging curves can be achieved only when all the features are used, as shown in Figure 5. The highest balanced Re and Pr for RF that can be achieved was approximately 86% when R ≈ 4.8, which is significantly higher than XGBoost (approximately 81%) and KNN (approximately 75%). RF can obtain 85% Re and Pr when 20 features are applied. This circumstance may imply that training an RF model does not need large datasets.

General Discussion
Although experiments show that the RF classifier outperforms other state-of-the-art ML classifiers in terms of classification accuracy, some important considerations about its efficiency need to be further discussed. KNN has a lower efficiency as compared with ensemble classifiers because it is unsuitable for parallel processing. XGBoost and RF have a higher accuracy than KNN. The basic principle behind ensemble learning is that by combining a series of classifiers the performance is slightly better than "random guessing", similar to KNN [51].
The comparison in this work was conducted in a hilly area characterized by severe soil erosion in the Loess Plateau, China. Although the landform in the Loess Plateau has spatial heterogeneity, the distribution pattern of loess landform has a certain regularity. Therefore, the entire OBIA-ML framework is universal, including image segmentation, feature selection, and ML classification. The results from this study area are to some extent transferable to other Loess Plateau regions.
Although ML in the OBIA framework demonstrated a good performance in artificial terrace mapping, the classification works have several limitations. Only the artificial terrace area can be identified rather than the boundaries of each individual terrace (known as terrace ridges). To improve the mapping task, other data sources, such as lidar, along with a variety of terrain attributes, can be considered. Apart from ML classifiers, deep learning (DL) has become popular. The feature selection in the numbers and choices will certainly affect the final classification accuracy. By contrast to the mandatory feature selection in ML, the features in DL are automatically learned during classifier training. These distinctive machine-learned features, such as spectral, contextual, and spatial features, will lead to the increase in their generalizability. The training process in DL is time-consuming, and an improved model network manual design is strongly required. These topics will be explored in future studies.

Conclusions
Artificial terraces are common around the world and are of great importance in food production, water and soil conservation, and ecologic protection. Automatic, efficient, high-accuracy mapping of artificial terraces is the basis of this field of study. OBIA has been widely applied in remote sensing image classification and essentially consists of image segmentation, feature selection, and classification. In this study, we explored the potential of an object-based ML framework for artificial terrace mapping. Three object-based ML classifiers (i.e., XGBoost, RF, and KNN) were evaluated in an artificial terrace mapping comparison. We conducted experiments on an area of the Loess Plateau, China, through 1-m resolution aerial imagery and DEM from UAV photogrammetry. The results indicate that the RF model obtains the most accurate classification results, and the KNN and XGBoost classifiers produce acceptably accurate classification results. The influence of class imbalance and feature selection was analyzed and discussed. In this study, one of our hypotheses was that the use of objective feature selection would improve the classification results.
The final conclusion drawn from this study is that the OBIA framework for artificial terrace mapping can still be improved. DL and certain other data sources, such as lidar, may help to enhance the OBIA framework. This will require several in-depth studies in the future.  Data Availability Statement: Data from this research will be available upon request to the authors.

Conflicts of Interest:
The authors declare no conflicts of interest.

Appendix A
The detailed descriptions of the features used in this study for mapping terraces. The mean intensity of all pixels forming an image object within each band, , where Ci denotes the intensity value at the pixel in an image object; and n is the total number of an object.

Brightness
The is the mean intensity of image layer i of image object k; and is the mean intensity of image layer j of image object k.

Shape index
The smoothness of an image object border, , where bk is the border length of image object k, which is defined as the sum of the edges of the object k. Pk is the total number of pixels contained in object k. The smoother the border of an image object, the lower its shape index.

Length-width
A length-to-width ratio of an image object, , where ; a and b are the length and width of the bounding box of the image object k. Pk is the total number of pixels contained in object k.

Roundness
Similarity of an object to an ellipse, , where and are the radius of the largest and smallest enclose ellipse of image object k, respectively.

Area
The number of pixels forming an image object.

Topography
Mean value (Elevation, Curvature, Roughness, Slope, and Shaded relief) The mean intensity of all pixels forming an image object within each topographic layer.
Terrain texture Gray-level co-occurrence matrix (GLCM) homogeneity The GLCM measures how often different combinations of pixel gray levels occur in a scene. In this study, the terrain texture features were derived from GLCM based on five topographic layers. The detail for calculating GLCM was taken from the study by Haralick et al. (1973) [63].