Bag of Geomorphological Words: A Framework for Integrating Terrain Features and Semantics to Support Landform Object Recognition from High-Resolution Digital Elevation Models

Abstract: High-resolution digital elevation models (DEMs) and their derivatives (e.g., curvature, slope, aspect) offer great potential for representing the details of the Earth's surface in three-dimensional space. Previous research has shown that geomorphological variables and region-level features alone cannot precisely characterize the main structure of landforms; in particular, geomorphological variables are not sufficient to represent the whole structure of a complex landform object in a high-resolution DEM. Moreover, DEM datasets that include landform objects are limited in number. Considering these challenges, this paper reports an integrated model called the bag of geomorphological words (BoGW), which enables automatic landform recognition by integrating point and linear geomorphological variables, region-based features (e.g., shape, texture), and high-level landform descriptions. First, BoGW semantically characterizes the composition of geomorphological variables and meaningful parcels for each type of landform. Based on a landform's semantics, the proposed method then integrates geomorphological variables and region-level features (e.g., shape, texture) to create the feature vector for the landform. Finally, BoGW uses the feature vector to classify a region derived from a high-resolution DEM into a predefined type of landform. The experimental results on crater and cirque detection indicated that the proposed BoGW could support landform object recognition from high-resolution DEMs.


Introduction
Research on landform recognition enables mapping the historical trajectory, current condition, and future tendency of terrain objects at different scales. High-resolution digital elevation models (DEMs) and their derivatives (e.g., curvature, slope, aspect) make it possible to represent the details of the Earth's surface in three-dimensional space, which supports various applications including vulnerability estimation for natural disasters [1], urban landscape analysis [2], urban planning [3], and ecological sustainability [4]. In recent decades, a number of landform classification systems have been proposed to facilitate landform characterization, including taxonomies of geomorphological variables and their corresponding explicit descriptions [5][6][7][8][9][10]. Although previous research on landform characterization acknowledged the significance of geomorphological variables in depicting the main structure of land surfaces, point or linear geomorphological variables alone cannot precisely characterize the structure of a regional landform (e.g., crater, cirque).
The first challenge originates from the heterogeneity between the point and linear geomorphological parameters derived from the DEM dataset and the whole structure of a landform object [11]. In other words, geomorphological parameters of different shapes are insufficient to represent a landform, which is generally assembled from multiple fragments in a meaningful organization. Up until now, few studies have reported an approach that exploits the semantics of landforms to facilitate landform object recognition from high-resolution DEMs.
Moreover, the limitations of geomorphological variables in landform characterization have led to the emergence of a number of region-based geomorphological features such as shape, texture, and context [12][13][14][15][16][17][18]. Although many solutions have been proposed [12][13][14][15][16][17], several challenges remain unsolved among landform characterization approaches. In particular, large-scale ground truth, or a benchmark high-resolution DEM dataset of landform objects, has not been available in the geomorphology and terrain analysis community. The lack of a well-prepared training dataset is a crucial obstacle to applying cutting-edge machine learning algorithms to landform object recognition [19].
In summary, landform object recognition from high-resolution DEMs faces two challenges: (1) the heterogeneity between the point and linear geomorphological parameters and the whole structure of a landform object, and (2) the limited availability of suitable DEM datasets of landform objects. Thus, this paper reports an integrated model called the bag of geomorphological words (BoGW), which enables automatic landform recognition by integrating point and linear geomorphological variables, region-based features (e.g., shape, texture), and high-level landform descriptions. The remainder of this manuscript is organized as follows. Section 2 reports the works related to the focus of this paper. Section 3 presents the architecture and details of the proposed BoGW, including feature generation, codebook generation, and classification. Section 4 presents the results of crater and cirque detection via BoGW. Section 5 provides the conclusions and perspectives related to our effort on geomorphological object detection from high-resolution DEMs.

From Geomorphological Variables to Landform Object
As mentioned above, geomorphological variables alone cannot effectively represent a landform object. For example, the ridge and depression shown in the left part of Figure 1 cannot, on their own, represent the characteristics of a crater. In Figure 1, the crater is composed of several geomorphological variables, namely ridgelines and a depression. Moreover, the ridgelines and the depression are close to circular, which is a region-level property. Thus, recognizing this crater object requires not only detecting ridgelines and a depression from high-resolution DEMs, but also determining whether their shapes are close to circular.
To bridge the gap between geomorphological parameters and landform objects, we designed the BoGW based on the relationship between the geomorphological variables and the whole landform object. In detail, the first part of BoGW extracts a variety of geomorphological variables, and the second part of BoGW discovers how geomorphological variables semantically form a landform object.

Bag-of-Words Model in Text and Image Analysis
Bag-of-words (BoW) is a model for discovering the topic of a text with machine learning techniques [20], and it is widely used in natural language processing and information retrieval. The bag in this model refers to a document that includes several words. Without considering the order of words in a sentence, the frequency of each word in a bag is used as the feature for determining the topic of this bag (document). BoW generally comprises two steps: designing a vocabulary list from the documents, and then creating a feature vector (or semantic matrix) to represent each document. Moreover, other operations are used to manage vocabularies, such as word hashing, n-grams, stopword removal, and Term Frequency-Inverse Document Frequency (TF-IDF).
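The two BoW steps described above (building a vocabulary, then counting word frequencies per document) can be illustrated with a minimal sketch. The function names and the example sentences are hypothetical and chosen only for illustration; no stopword removal, hashing, or TF-IDF weighting is applied here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Step 1: collect the sorted set of all tokens seen in the documents."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def bow_vector(document, vocabulary):
    """Step 2: represent one document by the frequency of each vocabulary word."""
    counts = Counter(tokenize(document))
    return [counts[word] for word in vocabulary]


# Illustrative documents (not taken from the sources used in this paper).
docs = ["A crater is a depression", "A cirque is a valley"]
vocab = build_vocabulary(docs)
vectors = [bow_vector(doc, vocab) for doc in docs]
```

Word order is discarded on purpose: two documents with the same word counts receive the same vector.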
The idea of BoW has attracted attention in image processing and pattern analysis. Image scenes and objects usually contain multiple meaningful elements. Thus, building on BoW, a model named bag-of-visual-words (BoVW) was proposed to represent the constitution of image scenes and objects via local visual features [21]. The bag in BoVW refers to an image, and visual words refer to semantically grouped local features (e.g., the scale-invariant feature transform, SIFT). BoVW generally consists of three steps: detecting features via robust feature descriptors, generating a codebook to organize the detected features, and classifying an image with the codebook by a generative or a discriminative learning model. The "code" in this codebook refers to the results derived by feature descriptors, which are analogous to the words in a document.
Although BoW and BoVW have been employed in many applications in recent years, they have not yet been adopted in terrain analysis. The idea behind BoW and BoVW, which decomposes an object into multiple parcels that are easier to recognize, inspired us to design a model that brings the advantages of BoVW to landform recognition based on high-resolution DEMs. The details of our proposed BoGW are presented in Section 3.

Bag-of-Geomorphometric Words (BoGW) for Landform Recognition
Following the workflow of BoW and BoVW, Figure 2 shows the architecture of Bag-of-geomorphometric words (BoGW), which consists of four parts: (1) Semantics enriching: enriching the semantics of landforms from open linked data sources. (2) Feature generation: creating a feature vector including geomorphological variables and region-level features. (3) Codebook generation: generating a codebook to fuse geomorphological variables and region-level features (quantitative parameters) with the semantics of landforms (qualitative weights). (4) Classification: classifying each object into a predefined class based on the output of the codebook. In BoGW, geomorphological words refer to geomorphological variables and region-based features including shape, texture, etc. Bag refers to the cluster of geomorphological variables and region-based features. Semantics enriching integrates the information derived from existing terrain domain ontologies, terrain taxonomies and open linked data sources (e.g., volunteered geographical information, Wikipedia). The existing terrain ontologies and taxonomies might contain limited semantic information. Thus, we design this step to enrich the content of the semantics.
Feature generation creates a feature vector that includes geomorphological variables and region-based features corresponding to elevation variation, elevation gradient, slope direction, etc.: {feature 1, feature 2, ..., feature n}, where n is the number of features. The details of geomorphological variables and region features are listed in Table 1 of the next subsection. Codebook generation aims to extract the semantics of each terrain class from the existing taxonomy and to enrich the semantics with external open data sources, such as Wikipedia and online dictionaries. Then, we derive from the enriched semantics the keywords that explicitly characterize terrain classes. Moreover, since the significance of each keyword in characterizing a landform class varies, we quantitatively weight the priority of each keyword via a weighting vector, {weight 1, weight 2, ..., weight n}, created by latent semantic analysis (LSA). Finally, we combine the feature vector and the weighting vector into a new weighted feature vector, {weight 1 × feature 1, weight 2 × feature 2, ..., weight n × feature n}. The weighted feature vector is then used as the input feature for SVM-based classification.
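The combination of the weighting vector and the feature vector described above is an element-wise product. A minimal sketch follows; the function name and the numeric values are hypothetical and serve only to show the operation.

```python
def weighted_feature_vector(weights, features):
    """Return {weight_i * feature_i} for two vectors of equal length n."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return [w * f for w, f in zip(weights, features)]


# Illustrative values only: three features with semantic weights.
example_weights = [0.5, 0.25, 0.25]
example_features = [2.0, 4.0, 8.0]
example_weighted = weighted_feature_vector(example_weights, example_features)
```

A feature whose keyword carries a low weight contributes little to the classifier input, which is how the semantics steer the classification toward the features that matter for a given landform class.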

Geomorphological Variable Extraction
We employ a spatial-contextual approach [14] to detect the geomorphological variables based on aspect and curvature. This approach was chosen for its capability of detecting geomorphological variables from both high- and low-resolution DEMs. Figure 3 illustrates the principle of this method. In Figure 3, the red cell is a pixel (CP) in a DEM for which we need to determine whether it belongs to a geomorphological variable. Orange cells are its adjacent pixels (AP), and gray cells are its neighboring pixels (NP) over multiple distances. d and i denote the distance and the index of direction, respectively. This spatial-contextual approach measures the aspect difference and elevation difference between the red pixel (CP) and its neighboring pixels (NP) along each direction axis over multiple distances, and the aspect difference and elevation difference between each of its adjacent pixels (AP) and that AP's neighboring pixels (NP) along each direction axis over multiple distances. The direction set includes the east-west, north-south, northeast-southwest, and northwest-southeast direction axes. Then, the aspect differences and elevation differences are fused to determine whether the red pixel belongs to a predefined geomorphological variable.
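To make the neighborhood sampling concrete, the following sketch collects the elevation differences between a center pixel and its neighbors along the four direction axes over multiple distances. It covers only the elevation part for the center pixel; the aspect differences and the adjacent-pixel comparisons are omitted. The fusion rule in `is_ridge_candidate` is a simplified assumption made for illustration and is not the rule used in [14]; all function names are hypothetical.

```python
# Unit steps (row, col) for the four direction axes named in the text.
AXES = {"E-W": (0, 1), "N-S": (1, 0), "NE-SW": (-1, 1), "NW-SE": (1, 1)}


def axis_differences(grid, row, col, distances):
    """Elevation of the center pixel minus each of its two opposite neighbors,
    for every axis and distance. Neighbors outside the grid give None."""
    center = grid[row][col]
    result = {}
    for name, (dr, dc) in AXES.items():
        for d in distances:
            pair = []
            for sign in (1, -1):
                r, c = row + sign * dr * d, col + sign * dc * d
                inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
                pair.append(center - grid[r][c] if inside else None)
            result[(name, d)] = tuple(pair)
    return result


def is_ridge_candidate(grid, row, col, distances, threshold):
    """Assumed rule: the pixel is a ridge candidate if, along at least one axis,
    it is higher than both opposite neighbors by `threshold` at every distance."""
    diffs = axis_differences(grid, row, col, distances)
    for name in AXES:
        if all(
            v is not None and v >= threshold
            for d in distances
            for v in diffs[(name, d)]
        ):
            return True
    return False


# Toy DEM with a north-south ridge along the middle column.
toy_dem = [[1, 2, 5, 2, 1] for _ in range(5)]
```

In this toy grid the middle pixel is higher than its east and west neighbors at both distances, so it satisfies the assumed rule, while a pixel on the grid edge does not.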

Region-Based Feature Detection
The approaches for region-based feature extraction are summarized as follows. (1) Moments measure the elevation variation and include the first raw moment (mean), the second central moment (variance, whose square root is the standard deviation), the third central moment (skewness), and the fourth central moment (kurtosis). The expressions for these four moments are shown in the following equation, where $V_i$ denotes the elevation of the i-th pixel in the DEM, $N$ refers to the number of pixels in the whole DEM or in a local region of the DEM, $\mu$ is the mean, and $m_k$ is the k-th central moment.
$$\mu = \frac{1}{N}\sum_{i=1}^{N} V_i, \qquad m_k = \frac{1}{N}\sum_{i=1}^{N} (V_i - \mu)^k, \quad k = 2, 3, 4 \qquad (1)$$
(2) Slope represents the steepness of the land surface over the vertical and horizontal dimensions.
Curvature represents the "slope" of the slope. In detail, profile curvature describes the convexity and concavity of a slope in the vertical dimension, and planform curvature describes the convexity and concavity of a slope in the horizontal dimension. (3) Local binary pattern (LBP) [22] calculates the direction of each pixel based on the histogram of gradient (HOG). Unlike the LBP in computer vision, which measures the gradient of intensity, this paper calculates LBP based on the gradient of elevation [14]. The direction of one pixel is represented by a vector [d1, d2, d3, d4, d5, d6, d7, d8], where d1 to d8 refer to the difference between the center pixel and its neighboring pixel in each of eight directions. If the center pixel is lower than, similar to, or higher than its neighboring pixel, the corresponding value is set to −1, 0, or 1, respectively.
A previous work acknowledged that the pattern generated by LBP could provide more details [15]. In Figure 4, two pixels belong to a summit and a cliff, respectively. The aspect of both pixels equals 315°, meaning that aspect alone cannot distinguish them. Meanwhile, the LBP patterns of these two pixels differ. The approaches for detecting the following region-based features are based on the result of geomorphological variable extraction. (4) The Hough circle transform determines whether a circle exists in the result of geomorphological variable extraction. This feature is helpful for detecting circular landforms, such as craters and volcanoes. By defining the minimal and the maximal radius, the Hough circle transform can identify all possible circles. (5) Contour approximation aims at detecting rectangular shapes in the result of geomorphological variable extraction. Rectangular landforms can be seen on land surfaces formed by carving and deposition, such as canyons and karst.
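The eight-direction encoding of item (3) can be sketched as follows. The ordering of the eight directions and the `tolerance` parameter (used to decide when two elevations count as "similar") are assumptions made for illustration; the function handles interior pixels only.

```python
# Assumed direction order: N, NE, E, SE, S, SW, W, NW as (row, col) offsets.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def elevation_lbp(grid, row, col, tolerance=0.0):
    """Return [d1, ..., d8] for an interior pixel: 1 if the center is higher
    than the neighbor, -1 if it is lower, 0 if the two are similar."""
    center = grid[row][col]
    code = []
    for dr, dc in OFFSETS:
        diff = center - grid[row + dr][col + dc]
        if diff > tolerance:
            code.append(1)
        elif diff < -tolerance:
            code.append(-1)
        else:
            code.append(0)
    return code


# A summit is higher than all neighbors; a uniform slope is not.
summit = [[1, 1, 1], [1, 2, 1], [1, 1, 1]]
slope = [[3, 3, 3], [2, 2, 2], [1, 1, 1]]
```

The two toy grids produce different codes even where a single direction value such as aspect would not separate them, which is the property the text relies on.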
In summary, the structure of the feature vector is shown as follows, where $y$ is the index of the terrain category:
$$F_y = \{ f_{D_{mom}}, f_{D_{lbp}}, f_{D_{shp}}, f_{D_{slp}}, f_{D_{curv}} \} \qquad (2)$$
$f_{D_{mom}}$ refers to the moment feature carrying four kinds of moments: $\dim(f_{D_{mom}}) = 4$. $f_{D_{lbp}}$ refers to the LBP pattern map, which contains the LBP of every pixel over eight directions: $\dim(f_{D_{lbp}}) = 1$. $f_{D_{shp}}$ refers to the binary result of the Hough transform and contour approximation: if a circle or rectangle shape can be detected, $f_{D_{shp}}$ is 1, otherwise 0. $f_{D_{slp}}$ and $f_{D_{curv}}$ refer to the results of slope and mean curvature, respectively.
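As an illustration of the four-dimensional moment feature, the sketch below computes the mean and the second, third, and fourth central moments of the elevations in a region. It assumes plain (unstandardized) central moments, as the wording in the text suggests; the function name and the sample elevations are hypothetical.

```python
def moment_feature(elevations):
    """Return [mean, 2nd, 3rd, 4th central moment] of a list of elevations."""
    n = len(elevations)
    mean = sum(elevations) / n

    def central(k):
        # k-th central moment: average of (V - mean)^k over the region.
        return sum((v - mean) ** k for v in elevations) / n

    return [mean, central(2), central(3), central(4)]


# Illustrative elevations of a small region (not taken from the experiment).
region = [1.0, 2.0, 3.0, 4.0, 5.0]
f_mom = moment_feature(region)
```

For a two-dimensional region, the pixel elevations would first be flattened into one list; the result always has four entries, matching the stated dimension of the moment feature.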

Codebook Generation
In practice, not all items in the feature vector of Equation (2) are useful for recognizing specific landform classes. Previous works showed that the performance of machine learning in classification relies heavily on sparse features that help represent the data [23]. Thus, codebook generation aims at selecting the sparse features suited to representing each landform class. Figure 5 shows the workflow of codebook generation, which consists of four steps: selecting the keywords from ontologies and open external data resources, filtering irrelevant keywords, collecting the keywords by latent semantic analysis, and assigning a priority to each keyword.

Keyword Selection by Ontology and Open Data Resources
People define and categorize landforms based on high-level explicit descriptions, rather than pixel-level or linear features derived from DEMs. The domain concepts that define the characteristics of landforms have been proposed in ontologies, classification systems, and open linked sources, such as the landform ontology [24,25], geological ontology [26], hydrogeology ontology [27], and topography ontology [28].
However, formal definitions of objects, features, events and phenomena in ontologies and classification systems are always limited to a specific background within a given period. Thus, besides the semantic information in terrain domain ontologies, we attempt to extend the scope and amount of semantic information using three external open linked resources: online dictionaries (Dictionary and Merriam-Webster), Wikipedia, and Thesaurus. Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation. The information from Wikipedia has been employed in many fields [29][30][31]. In comparison to the information from other resources, the origins of definitions and introductions in Wikipedia are labelled. Moreover, these origins include educational materials, peer-reviewed literature, and books. This makes the information stored in Wikipedia more reliable than the information from other volunteered resources.
First, we derive keywords (words and phrases) from the annotation and definition of each landform class in domain ontologies and taxonomies. An example of keyword extraction is shown in Table 1. The class name and class annotation are from USTopographic [28]. Two phrases, "caused by the impact of a meteorite" and "caused by an explosion", are not selected as keywords, since they express a dynamic action that changes over time, which is impossible to recognize from a DEM. Prepositions, definite articles, and indefinite articles are all removed here. The operations for removing irrelevant keywords are introduced in Section 3.2.3.
Then, the keywords derived from domain ontologies and taxonomies shown in Table 1 are used to collect all sentences that contain these keywords from online dictionaries and Wikipedia. Table 2 shows an example of extending crater-related keywords from online dictionaries (Dictionary). All keywords derived from USTopographic in Table 1 are marked as black bold text. Moreover, we label different types of keywords extracted from the sentences collected from Dictionary in different colors, to clearly illustrate the origin of each keyword.

Table 2. Keyword extraction result regarding "Crater" from Dictionary.

Sentences collected from Dictionary:
1. The cup-shaped depression or cavity on the surface of the earth or other heavenly body marking the orifice of a volcano.
2. Also called impact crater, meteorite crater (on the surface of the earth, moon, etc.), a bowl-shaped depression with a raised rim, formed by the impact of a meteoroid.
3. (On the surface of the moon) A circular or almost circular area having a depressed floor, almost always containing a central mountain and usually completely enclosed by walls that are often higher than those of a walled plain; ring formation; ring.
4. Impact crater is a depression in the ground believed to have been caused by a meteorite.
5. The bowlike orifice of a geyser.
6. The hole or pit in the ground.

Keywords extracted from the above sentences:
1. Impact crater, impact of a meteoroid;
2. Meteorite crater, meteorite;
3. Cup-shaped, bowl-shaped, circular area, almost circular area, bowl-like;
4. Depression, cavity, depression with a raised rim, depressed floor, hole, central mountain completely enclosed by walls higher than a walled plain, central mountain completely enclosed by ring formation, ring formation, ring;
5. Surface of the earth, in the ground, on the surface of the moon;
6. Surface of other heavenly body marking the orifice of a volcano;
7. Orifice of a geyser;
8. Hole, pit.
Another resource for enriching keywords is the set of synonyms of each keyword given by Thesaurus. Thesaurus synonyms have been used in previous cases, such as information query [32,33] and information retrieval [34]. In this paper, we rank the derived keywords in terms of relevance and remove the synonyms that carry lower priority in the ranking result.

Keywords-Based Text Post-Processing and Filtering
The keywords extracted from ontologies, dictionaries and Wikipedia may contain prefixes and suffixes, such as multi-, semi-, -based, -shaped, -driven, and -like. We reduce such a compound to the word that the prefix or suffix is attached to. For example, bowl-shaped is reduced to the keyword bowl. Moreover, for other n-grams that contain multiple words without such a prefix or suffix, we divide the n-gram into two keywords. For example, shock-metamorphic effects is divided into two keywords: shock effects and metamorphic effects.
Besides suffixes, prefixes and n-grams, the content of some keywords might be meaningless for LiDAR-based landform recognition. Table 3 summarizes these keywords to be removed, grouped into multiple categories.
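The two post-processing rules above can be sketched as follows. The affix lists contain only the examples named in the text, and both helper functions are hypothetical illustrations rather than the exact procedure used to produce the tables.

```python
PREFIXES = ("multi-", "semi-")
SUFFIXES = ("-based", "-shaped", "-driven", "-like")


def strip_affixes(keyword):
    """Reduce a compound such as 'bowl-shaped' to the word the affix attaches to."""
    word = keyword.lower()
    for prefix in PREFIXES:
        if word.startswith(prefix):
            word = word[len(prefix):]
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            word = word[: -len(suffix)]
    return word


def split_compound(keyword):
    """Split 'shock-metamorphic effects' into one keyword per hyphenated modifier."""
    parts = keyword.split()
    if len(parts) >= 2 and "-" in parts[0]:
        head = " ".join(parts[1:])
        return [f"{modifier} {head}" for modifier in parts[0].split("-")]
    return [keyword]
```

For instance, `strip_affixes("bowl-like")` and `strip_affixes("bowl-shaped")` map to the same keyword, so the two variants are counted together when keyword frequencies are computed.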

Weighted Feature Vector Generation
We count the frequency of each keyword in the results obtained by text post-processing and filtering. The keyword frequencies are organized as a weight vector, which is shown in the following equation,
$$W = \{ w_{D_{mom}}, w_{D_{lbp}}, w_{D_{shp}}, w_{D_{slp}}, w_{D_{curv}} \} \qquad (3)$$
where $w_{D_{mom}}$, $w_{D_{lbp}}$, $w_{D_{shp}}$, $w_{D_{slp}}$ and $w_{D_{curv}}$ provide the weights for $D_{mom}$, $D_{lbp}$, $D_{shp}$, $D_{slp}$ and $D_{curv}$, respectively, in Equation (2). Then, the weighted feature vector shown in Figure 1 is expressed as follows,
$$WF = \{ w_{D_{mom}} \times f_{D_{mom}}, \; w_{D_{lbp}} \times f_{D_{lbp}}, \; w_{D_{shp}} \times f_{D_{shp}}, \; w_{D_{slp}} \times f_{D_{slp}}, \; w_{D_{curv}} \times f_{D_{curv}} \} \qquad (4)$$
Table 4 lists the details of geomorphological variables and region-based features. The column titled Data attribute lists the data that supports feature detection. The column titled High-level abstract includes the keywords for characterizing landforms that geomorphological variables and region-based features may represent.
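A minimal sketch of turning keyword frequencies into feature weights is given below. The mapping from keywords to features and the normalization of counts to relative frequencies are assumptions made for illustration; the actual mapping is the one listed in the tables of this paper.

```python
from collections import Counter


def keyword_weights(keywords, keyword_to_feature):
    """Count how often each feature is supported by a keyword and normalize
    the counts so that the weights sum to 1. Unmapped keywords are ignored."""
    counts = Counter(
        keyword_to_feature[k] for k in keywords if k in keyword_to_feature
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {feature: count / total for feature, count in counts.items()}


# Hypothetical mapping and keyword list, for illustration only.
mapping = {"circular": "D_shp", "depression": "D_curv", "steep": "D_slp"}
observed = ["circular", "circular", "depression", "steep", "meteorite"]
weights = keyword_weights(observed, mapping)
```

Keywords that cannot be tied to any DEM-derived feature, such as "meteorite" in this toy list, contribute nothing to the weight vector.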

Classification
The classification section aims to learn the weighted features of landform classes and to predict whether a detected region belongs to a landform class. The input features for training and testing follow the structure of the weighted feature vector shown in Equation (4). The workflow of classification is summarized as follows.
Training Portion:
Step A1. Labeling the minimal bounding box (MBB) of multiple objects belonging to a predefined landform class based on high-resolution DEMs.
Step A2. Calculating the frequency of keywords for this predefined landform class, and creating a referenced weight vector.
Step A3. Creating a referenced feature vector based on the MBBs of this predefined landform class.
Step A4. Creating a referenced weighted feature vector by combining the referenced weight vector and the referenced feature vector. The structure of the referenced weighted feature vector is shown in Equation (4).
Test Portion:
Step T1. Detecting geomorphological variables with the spatial-contextual approach reported in [14].
Step T2. Generating multiple MBBs based on the result of geomorphological variable detection.
Step T3. Creating the feature vector for each MBB obtained in Step T2.
Step T4. Creating a weighted feature vector for each MBB obtained in Step T2, by combining the referenced weight vector obtained in Step A2 and the feature vector generated in Step T3.
Prediction Portion:
Conducting classification via an SVM classifier: the training data is the weighted feature vector obtained in Step A4, and the test data is the weighted feature vector obtained in Step T4.
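The prediction portion can be sketched with a small linear soft-margin SVM trained by sub-gradient descent on the hinge loss. This is a dependency-free stand-in written for illustration; the paper does not state which SVM implementation or kernel was used, and the hyperparameters and the two-dimensional weighted feature vectors below are hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.001, lr=0.1, epochs=300):
    """Fit w, b by minimizing lam/2 * |w|^2 + hinge loss; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            # Shrink step from the L2 regularizer.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step, applied only when the margin is violated.
            if y * score < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return +1 (landform class) or -1 (other) for one weighted feature vector."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Step A4 stand-in: weighted feature vectors of labeled MBBs (toy values).
train_wf = [[0.9, 0.8], [0.8, 0.9], [1.0, 0.7], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
train_labels = [1, 1, 1, -1, -1, -1]
model = train_linear_svm(train_wf, train_labels)

# Step T4 stand-in: weighted feature vectors of candidate MBBs (toy values).
test_wf = [[0.9, 0.9], [0.1, 0.1]]
predictions = [predict(model, x) for x in test_wf]
```

In practice a library SVM with a suitable kernel would replace `train_linear_svm`; the point here is only the data flow from Step A4 and Step T4 into the classifier.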

Experimental Analysis
Based on the high-resolution DEMs, we selected crater, cirque and cliff as the landform classes to be detected in the experimental section. Our previous work [14] found that some commonly used approaches for moderate-spatial-resolution DEMs could not perform well on high-resolution DEMs. The results in Reference [14] showed that the traditional algorithms for crater detection, without additional processing, could not effectively extract the whole structure of craters. In our experiment, we also found that the crater and cirque structures could not be obtained with these approaches on the high-resolution DEM used here. Thus, this paper only reports the performance of our method.
The domain ontology and taxonomy is USTopographic, and the open linked data sources include Dictionary, Merriam-Webster, and Wikipedia. In the first part of the experiment, we extracted all information regarding crater, cirque and cliff from USTopographic and the open linked data sources, and then semantically organized the information to enrich the semantics in USTopographic. In the second part, we provide the detection results for crater, cirque and cliff via BoGW.

Enriching Semantics
We show the result of building enriched semantics, taking the crater as an example. The workflow includes the three steps introduced in Section 3: extracting keywords from USTopographic, deriving sentences and documents from open linked data sources (Dictionary, Merriam-Webster, and Wikipedia) based on the extracted keywords, and selecting useful sentences and documents. Table 5 lists the frequency of keywords. Keywords were extracted from USTopographic (Table 1) and from Dictionary, Merriam-Webster and Wikipedia (Table 2). The irrelevant keywords shown in Table 3 were filtered out, and the remaining keywords were matched with features (Table 4). The details of the mapping between keywords and features are listed in Table 5. Specifically, the term "crater" was strongly related to "basin" and "depression" in USTopographic, Dictionary, Merriam-Webster, and Wikipedia. Thus, we viewed basin and depression as two keywords for the representation of crater. The content of Table 5 revealed several patterns. First, three keywords included in USTopographic (circular-shaped depression, volcanic cone summit, and land surface) were ranked as the primary, the secondary and the quaternary keyword, respectively. This means that the domain ontology and taxonomy could provide professional and commonly used terms to formally describe a landform class. Moreover, some irrelevant keywords (such as rock, contain top, volcano flack, etc.) and the top-ranked keywords (e.g., hole/pit/sinkhole/circular openings, raised rim/ring formation) were commonly observed in both the domain ontology and taxonomy and the open linked data resources. This indicates that the characteristics of a landform class are generally defined by similar descriptions across miscellaneous resources. It further suggests that the information from both professionally established ontologies and taxonomies and volunteered datasets is useful for supporting landform recognition and classification.
Moreover, Table 5 lists the keyword-associated geomorphological variables and features used for crater and cirque, and the corresponding algorithms for detecting those variables and features. Thus, according to the results listed in Table 5, the WF of crater and cirque is shown as follows: … [15, 25)), 0.14 × ($D_{slp2}$ ∈ [25, 90)), where $D_{shp1}$ and $D_{shp2}$ denote the circular shape and the closed circular shape, respectively, and $D_{slp1}$ and $D_{slp2}$ denote the slope and the cliff, respectively.
Furthermore, we organized the extracted keywords of crater as semantics by creating triple stores [35]. Figure 6 compares the conceptual hierarchy from the existing ontology with the enriched semantics. The triple stores enclosed by orange rounded rectangles denote the keywords that define the relationship between crater and other landform classes, and the triple stores enclosed by yellow rectangles denote the keywords that characterize the crater class. The information derived by discovering relevant documents from open linked data resources can effectively enrich the semantics of a landform class, and further provides more features for characterizing this landform.

Crater and Cirque Detection
To verify the contribution of enriched semantics and the performance of BoGW in landform object detection from DEMs, we illustrate the detection results for craters and cirques based on a large-scale DEM dataset accessed from the 3D Elevation Program established by the U.S. Geological Survey [36]. The experimental dataset mainly covers Sunset Crater Volcano National Monument, which is located north of Flagstaff in the U.S. State of Arizona. Sunset Crater Volcano National Monument is an important place for observing and studying young volcanic craters and cirques. Figure 7 shows the location and visualization of the experimental dataset. A number of craters are clearly visible in the National Map and TIN data. The extent of the DEM is a rectangular area with top-left coordinate (35.430462963, −111.573981482) and bottom-right coordinate (35.2937037037, −111.248425926), referenced to the coordinate system GCS_North_American_1983. To detect craters and cirques more accurately, we used a DEM with 1 m spatial resolution and a dimensionality of 3516 × 1477. As shown in Table 4, the ridgeline is a fundamental element of craters and cirques. Thus, we detected ridgelines with the spatial-contextual approach. Here, we selected 0.46 as the threshold of elevation difference in the method. Figure 8A shows the result of ridgeline detection. The detected ridgelines are labeled as red lines, and the background is a curvature map. A majority of the ridgelines belonging to the rims of craters and cirques could be detected. Because a high-resolution DEM represents the rough terrestrial surface in detail, some ridgelines were neither linear nor continuous. However, no further correction was applied to these disconnected ridgelines. We found that many ridgelines were not represented as linear features, which differs from the representation observed in a low-resolution DEM.
Then, we calculated the MBB that encloses the result of geomorphological variable (ridgeline) detection. Moreover, considering that the rims of a crater might be detected as separate strings of ridgelines, we created each MBB at multiple scales. Figure 8B shows the result of the multi-scale MBBs. Black lines denote detected ridgelines. Red, orange and blue boxes denote the MBBs built at small, medium and large scales, respectively. In Table 3, crater and cirque share some similar keywords, such as depression, bowl-shaped, and partially enclosed. Moreover, Figure 9 visualizes the similarity between crater and cirque with satellite images. The red ellipse denotes the ideal rim of a crater, and the cyan line is the ridgeline that is invisible in the satellite image. Figure 9A shows a crater that has a flat boundary in its southeastern part, which is labeled as a cyan curve. Such craters, which have a breach in the surrounding rim, can also be seen in the experimental dataset. The object in Figure 9B appears to be a hollow rather than a crater, since it has a wide and flat bottom area. Figure 9C shows a crater that has characteristics similar to the object shown in Figure 9B: a breach in the surrounding rim, an ambiguous depression, and a flat bottom. The features listed in Table 4 and the objects shown in Figure 9 indicate that crater and cirque might be difficult to distinguish for some detected objects. Thus, we evaluated the proposed BoGW model for landform recognition without distinguishing crater and cirque. We calculated the weighted feature vector based on the area enclosed by each MBB, and then used an SVM classifier to classify the category of each MBB. Specifically, if MBBs that were classified as crater or cirque overlapped each other, we selected the MBB with the smallest size as the final detection result. Figure 10 shows the result of crater and cirque detection.
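The MBB handling described above (enclosing detected ridgeline pixels, enlarging the box to several scales, and keeping the smallest of any overlapping detections) can be sketched as follows. Boxes are (min_row, min_col, max_row, max_col) tuples in pixel coordinates; the scale factors and the greedy overlap resolution are assumptions made for illustration.

```python
import math


def bounding_box(pixels):
    """Minimal bounding box of a list of (row, col) ridgeline pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


def scale_box(box, factor, shape):
    """Enlarge a box around its center by `factor`, clipped to the DEM shape."""
    r0, c0, r1, c1 = box
    center_r, center_c = (r0 + r1) / 2, (c0 + c1) / 2
    half_r, half_c = (r1 - r0) / 2 * factor, (c1 - c0) / 2 * factor
    return (
        max(0, math.floor(center_r - half_r)),
        max(0, math.floor(center_c - half_c)),
        min(shape[0] - 1, math.ceil(center_r + half_r)),
        min(shape[1] - 1, math.ceil(center_c + half_c)),
    )


def area(box):
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)


def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def keep_smallest(boxes):
    """Visit boxes from smallest to largest and drop any box that overlaps
    a box already kept."""
    kept = []
    for box in sorted(boxes, key=area):
        if not any(overlaps(box, k) for k in kept):
            kept.append(box)
    return kept
```

For example, `[scale_box(b, f, dem_shape) for f in (1.0, 1.5, 2.0)]` would give small, medium and large candidates for one ridgeline string, where `dem_shape` is the (rows, cols) size of the DEM and the three factors are placeholders.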
In Figure 10, red boxes, yellow boxes, and cyan boxes denote true-positive, false-negative, and false-positive detections, respectively. To illustrate the detection results more clearly, we superimposed them onto a satellite image and a curvature map. User's accuracy and producer's accuracy were 84% and 93%, respectively. The visual assessment and the accuracy indicate that the proposed BoGW can support landform object detection from high-resolution DEMs. Moreover, we found that the proposed BoGW generally produced much higher recall than precision in detecting craters and cirques. Three reasons might account for this. First, previous works reported that the algorithm for detecting the circular shape of geomorphological variables could effectively extract a majority of the objects that are similar to craters and cirques. This means that few objects were filtered out, which led to a high producer's accuracy, or recall. We believe that this result highlights the significance of shape in landform detection. Second, the existing approaches for detecting shapes, such as the Hough transform and contour approximation, might face difficulty in detecting the exact shape of geomorphological variables from high-resolution DEMs. For example, many false-negative detections occurred due to the difficulty of distinguishing bent linear features from curved features. Finally, the accuracy of geomorphological variable detection from high-resolution DEMs plays a key role in landform object detection, which led to the user's accuracy being much lower than the producer's accuracy. Table 6 quantitatively evaluates the detection results.
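For reference, user's accuracy corresponds to precision and producer's accuracy corresponds to recall, both computed from the counts of true-positive, false-positive and false-negative detections. The sketch below uses made-up counts purely to show the arithmetic; they are not the counts behind the figures reported in this paper.

```python
def users_accuracy(true_pos, false_pos):
    """Precision: share of detected boxes that are real craters or cirques."""
    return true_pos / (true_pos + false_pos)


def producers_accuracy(true_pos, false_neg):
    """Recall: share of real craters or cirques that were detected."""
    return true_pos / (true_pos + false_neg)


# Illustrative counts only (hypothetical, not the experimental results).
tp, fp, fn = 8, 2, 1
precision = users_accuracy(tp, fp)
recall = producers_accuracy(tp, fn)
```

A detector that keeps most candidate boxes tends to raise recall and lower precision, which matches the pattern discussed above.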

Conclusions
A variety of classification systems for geomorphological variables have served as the fundamental parameters for landform characterization. However, geomorphological variables cannot effectively represent a landform object, which is always represented by a regional feature instead of a point or linear one. Previous efforts on landform detection with regional features, such as template-based methods, object-based segmentation, and machine learning algorithms, face several challenges in bridging the gap between the features derived from a DEM and the descriptions of the landform. Applying the idea of BoVW for visual recognition, this paper proposes a new algorithm called BoGW, which aims at representing the landform object via DEM-derived features and descriptions semantically defined by human beings.
In comparison with object detection from satellite images, landform object detection from DEMs lacks a massive ground truth dataset that represents the characteristics of landform classes. Moreover, the role of semantics has always been ignored in landform recognition from remote sensing data. This paper analyzed how semantics impact and facilitate landform object detection from high-resolution DEMs, and integrated geomorphological variables and semantic descriptions to facilitate landform object detection.
In the future, building benchmark datasets of high-resolution DEMs is a pressing need, in order to take full advantage of deep learning techniques in DEM processing. Moreover, the integration of convolutional neural networks (or recurrent neural networks) and explicitly programmed rules (e.g., knowledge graphs) that aim to exploit high-level data features and knowledge would be worthy of further attention.