Ontology-Based Probabilistic Estimation for Assessing Semantic Similarity of Land Use/Land Cover Classification Systems

To accurately and formally represent the historical trajectory and current situation of land use/land cover (LULC), numerous classification standards for LULC have been developed by different nations, institutes, and organizations; however, these land cover classification systems and legends generate polysemy and ambiguity in integration and sharing. Approaches for dealing with this semantic heterogeneity have been developed in terms of semantic similarity. Generally speaking, these approaches lack domain ontologies, which might be a significant barrier to implementing them for semantic similarity assessment. In this paper, we propose an ontological approach to assess the similarity of LULC classification systems and standards. We develop domain ontologies to explicitly define the descriptions and codes of different LULC classification systems and standards as semantic information, and formally organize this semantic information as rules for logical reasoning. Then, we utilize a Bayes algorithm to create a conditional probabilistic model for computing the semantic similarity of terms in two separate LULC classification systems. The experiment shows that semantic similarity can be effectively measured by integrating a probabilistic model based on the content of the ontology.


Introduction
Mapping land use/land cover (LULC) provides important support for representing the historical trajectory and present situation in earth observation [1,2], land management [3], pattern analysis [4], settlement monitoring [5], landscape planning [6], etc. LULC classification maps are available at multiple spatial and temporal scales and are generated according to numerous classification standards for LULC. Currently, tens of LULC classification systems have been developed by different nations, institutes, and organizations, such as NLCD 1992 and NLCD 2006 developed by the USGS (U.S. Geological Survey), C-CAP developed by NOAA (National Oceanic and Atmospheric Administration), the land cover classification systems and legends developed by the UN (United Nations), and the Chinese Current Land Use Classification.
These land cover classification systems and legends generate two significant challenges in integration and sharing: (1) polysemy: a land parcel might be defined as different LULC types by various LULC classification systems; (2) ambiguity: the same LULC term might be defined differently by various LULC classification systems. Polysemy and ambiguity are forms of semantic heterogeneity [7], a topic concerned with confusion of expression in natural language processing. Li and Ling divided the semantic heterogeneity of LULC classification systems and standards into three major factors [8]. (1) Confounding conflicts: the same definition or concept represents diverse meanings. For example, the notion "commercial/industrial" belongs to the category "Commercial/Industrial/Transportation" in NLCD 1992 but belongs to the category "Developed High Intensity" in NLCD 2006. (2) Scaling and unit conflicts: the same definition is represented at different scales and units. For example, the term "Low Density" is defined differently in NLCD 1992 and NLCD 2006. (3) Naming conflicts: one word has multiple meanings, or one meaning can be expressed by multiple words. For example, "perennial" in NLCD 2006 and "long-term" in NLCD 1992 represent the same meaning.
To address these semantic heterogeneities, a number of works have proposed semantic harmonization approaches that integrate multi-source information and features into a consistent representation. Since psychological studies show that similar features attract more attention than different ones [9], semantic harmonization mainly relies on semantic similarity to deal with semantic heterogeneity. Some previous works have used metadata to define the characteristics of the relationships between LULC types; however, Comber, Fisher, and Wadsworth [10] claimed that metadata cannot explicitly describe the meaning of LULC information. To deal with this challenge, a number of semantic harmonization studies regarding LULC focus on statistical learning-based semantic similarity assessment, such as conceptual spaces [11], semantic metrics [12], integrating post-classification and semantic metrics [13], regression integrated correlation matrix [14], etc. Moreover, a user-machine interactive approach [15] and an expert-enhanced system [16] have been developed to facilitate understanding the semantics for assessing semantic similarity.
Assessing the semantic similarity of various LULC terms requires consideration of the explicit meanings of domain knowledge and the hidden expressions/relationships between a term and its neighboring terms. Thus, the lack of domain ontologies could be a significant barrier to implementing those approaches for semantic similarity assessment. For example, although a statistical model performs well in measuring the similarity of "high intensity" between high-intensity residential (NLCD 1992) and high-intensity developed (NLCD 2006), it cannot measure the relevance between "developed" and "residential". An ontology can semantically define and formally represent domain knowledge based on a hierarchical taxonomy, including classes, instances, attributes, and relationships. For semantic similarity measuring, previous works claimed that an ontology could systematically organize the domain knowledge and explicitly discover the relevance and correlations among domain individuals [17,18].
To date, state-of-the-art ontology-based semantic similarity assessment for language recognition and knowledge modeling consists of edge-based, feature-based, information content (IC)-based, and gloss-based similarity measuring [17,19-21]. Edge-based approaches are simple and easy to compute, but they cannot satisfy the demand for precision and accuracy in semantic similarity measures. Moreover, although IC-based approaches successfully handle many applications of semantic similarity measures, informativeness or content is difficult to obtain from the limited volume of information in LULC classification systems and standards. When the features are inadequate, feature-based approaches cannot accurately distinguish small differences. Gloss-based similarity methods require massive text information stored in a word base such as WordNet or Wiktionary; however, to our knowledge, no such word base has been reported for LULC classification and mapping. Thus, gloss-based similarity measuring might not be appropriate for measuring the semantic similarity of LULC classification systems and standards.
To accurately assess the semantic similarity of LULC classification systems with a limited amount of text information, we propose an ontology-enhanced probabilistic approach to semantic similarity measuring in the domain of LULC classification systems and standards. The remainder of this paper is organized as follows: Section 2 discusses the works relevant to ontology-based semantic similarity assessment; Section 3 presents our proposed methods for measuring semantic similarity, which include an ontology named LuLcSys-Ontology for a formal representation of LULC and a probabilistic model for semantic similarity based on LuLcSys-Ontology; Section 4 compares the semantic similarity assessments produced by other approaches and by our proposed one; Section 5 concludes our work, details our contributions to the literature, and outlines several prospective research directions.

Edge-Based Similarity Measuring
Edge-based similarity measuring calculates the links or depth between the terms in a conceptual hierarchy. The link and depth of a path are computed as follows:
$$\mathrm{link}(a, b) = \min\big(\mathrm{len}(\mathrm{path}(a, b))\big), \qquad \mathrm{depth}(a) = \min\big(\mathrm{len}(\mathrm{path}(a, r))\big) \tag{1}$$
where path(a, b) is the set of all paths between two separate terms a and b, len(path(a, b)) is the set of the lengths of those paths, and r is the root of a hierarchical taxonomy that includes both a and b.
Other works on edge-based similarity measuring include the approaches proposed by Li, Bandar, and McLean [22] and Al-Mubaid and Nguyen [23]. The edge-based similarity measure is straightforward and computationally cheap; however, it might be ineffective for assessing semantic similarity in a hierarchical taxonomy with a complex structure. Additionally, the path and depth of a term vary across ontologies, which means that the same term might be measured differently. Finally, it cannot represent the hidden information in ontologies.
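To make the link and depth quantities concrete, the following minimal Python sketch computes both by breadth-first search. The taxonomy, its class names, and the function names are hypothetical and chosen only for illustration; they are not taken from LuLcSys-Ontology.

```python
from collections import deque

# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {
    "Deciduous Forest": "Forest",
    "Evergreen Forest": "Forest",
    "Mixed Forest": "Forest",
    "Forest": "Land Cover",
    "Cultivated Crops": "Agriculture",
    "Agriculture": "Land Cover",
}
ROOT = "Land Cover"


def neighbors(term):
    """Return the terms directly linked to `term` (its children and its parent)."""
    linked = [child for child, parent in PARENT.items() if parent == term]
    if term in PARENT:
        linked.append(PARENT[term])
    return linked


def link(a, b):
    """Length of the shortest path between a and b, i.e. min(len(path(a, b)))."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        term, dist = queue.popleft()
        if term == b:
            return dist
        for nxt in neighbors(term):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"no path between {a!r} and {b!r}")


def depth(a):
    """Length of the shortest path from a to the root r."""
    return link(a, ROOT)
```

In this toy hierarchy, two sibling forest classes are two links apart, while a forest class and Cultivated Crops are four links apart. The same pair of terms would yield different values under a differently shaped taxonomy, which is the ontology-dependence noted above.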

Information Content (IC)-Based Similarity Measuring
IC-based similarity measuring assesses semantic similarity based on the informativeness of a concept [24]. For a concept a, let p(a) be the probability of observing this concept; the informativeness of this concept, IC(a), is:
$$IC(a) = -\log p(a) \tag{2}$$
Resnik [24] and subsequent methods measure the semantic similarity between two concepts based on this informativeness:
$$\mathrm{sim}(a, b) = \max_{c \in Sub(a, b)} IC(c) \tag{3}$$
where a and b are two independent concepts, and Sub(a, b) denotes the set of all concepts that contain both a and b. Building on Equation (3), subsequent studies on IC-based similarity measures follow two directions [19]: corpora-based IC computation and intrinsic IC computation. The corpora-based method computes IC by using external information. In contrast, the intrinsic method, which utilizes the knowledge included in the ontology, is more popular. Related applications include measuring IC from a conceptual hierarchy with optimized depth calculation [25], measuring IC from a conceptual hierarchy without depth calculation [26], and measuring IC from a conceptual hierarchy via a weight-setting mechanism [27,28].
In general, an IC-based similarity measure relies on massive well-prepared data to discover the heterogeneous meanings of each term. In comparison to the volume of training data from semantic bases such as WordNet, the number of terms in state-of-the-art LULC classification systems and standards is inadequate for generating an accurate measuring result. Moreover, although intrinsic IC computation methods can derive knowledge from an ontology without the support of massive external information, the hierarchical taxonomy in an ontology might be too complex for this method.
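As a rough illustration of the IC-based idea, the sketch below uses the standard definitions IC(a) = -log p(a) and Resnik's maximum IC over the concepts that subsume both terms. The taxonomy and the occurrence counts are made up for the example; in practice, p(a) would come from a corpus or from the ontology structure itself.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and made-up occurrence counts.
PARENT = {
    "Deciduous Forest": "Forest",
    "Evergreen Forest": "Forest",
    "Forest": "Land Cover",
    "Cultivated Crops": "Land Cover",
}
COUNTS = {
    "Deciduous Forest": 2,
    "Evergreen Forest": 2,
    "Forest": 0,
    "Cultivated Crops": 4,
    "Land Cover": 0,
}


def ancestors(term):
    """Return the term itself plus every concept that subsumes it, up to the root."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def probability(concept):
    """p(concept): share of occurrences of the concept or any of its descendants."""
    total = sum(COUNTS.values())
    covered = sum(n for term, n in COUNTS.items() if concept in ancestors(term))
    return covered / total


def ic(concept):
    """Informativeness IC(a) = -log p(a)."""
    return -math.log(probability(concept))


def resnik_similarity(a, b):
    """Maximum IC over the concepts that subsume both a and b."""
    shared = set(ancestors(a)) & set(ancestors(b))
    return max(ic(c) for c in shared)
```

With only a handful of terms and counts, most pairs share nothing but the root, whose IC is zero. This mirrors the concern above that the small vocabulary of LULC classification systems leaves little informativeness to work with.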

Feature-Based Similarity Measuring
Feature-based similarity measuring focuses on the similarity between the properties of two concepts, which is based on the set theory proposed by Tversky [29]:
$$\mathrm{sim}(a, b) = \frac{|d(a) \cap d(b)|}{|d(a) \cap d(b)| + \mu\,|d(a)/d(b)| + (1 - \mu)\,|d(b)/d(a)|} \tag{4}$$
where d(a) and d(b) are the descriptions of concepts a and b, respectively, µ is the weight, d(a)/d(b) denotes the descriptions that belong to a but not b, and d(b)/d(a) denotes the descriptions that belong to b but not a.
Since the hierarchical taxonomies in ontologies have become more and more complex, the investigation of semantic similarity has concentrated on the similarity of features rather than of terms [30]. Rodriguez and Egenhofer [31] proposed a feature-based semantic similarity measure that considers the relationships between terms:
$$\mathrm{sim}(a, b) = \mu_s\,\mathrm{sim}_s(A, B) + \mu_f\,\mathrm{sim}_f(A, B) + \mu_n\,\mathrm{sim}_n(A, B) \tag{5}$$
where A and B are the sets corresponding to terms a and b, respectively; $\mathrm{sim}_s()$, $\mathrm{sim}_f()$, and $\mathrm{sim}_n()$ are the similarities of synsets, features, and neighbor concepts; and $\mu_s$, $\mu_f$, and $\mu_n$ are the weights of these three components, respectively. More details on computing $\mathrm{sim}_s()$, $\mathrm{sim}_f()$, and $\mathrm{sim}_n()$ can be found in Reference [31]. Other feature-based similarity measures include X-similarity [32], integrating information-theoretical domain [33], using taxonomical features [34], measuring similarity without pre-defined ontology [35], matching concepts from diverse ontologies [36], etc. Appropriate weighting is the most significant limitation of the feature-based similarity measure. In general, a feature-based similarity measure assigns a weight to each feature by a trial-and-error procedure. Moreover, a feature-based similarity measure assigns a weight to each independent term; however, the terms in various LULC classification systems and standards might have overlapping features, making it difficult to determine an appropriate weight.
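The set-based idea behind feature-based measures can be sketched in a few lines of Python. The function below implements the commonly used ratio form of Tversky's model with a single weight; the feature sets are hypothetical and only loosely paraphrase class descriptions, and the default weight of 0.5 is an arbitrary assumption.

```python
def tversky_similarity(d_a, d_b, mu=0.5):
    """Ratio-model similarity between two description sets.

    common / (common + mu * |d_a - d_b| + (1 - mu) * |d_b - d_a|)
    """
    d_a, d_b = set(d_a), set(d_b)
    common = len(d_a & d_b)
    only_a = len(d_a - d_b)   # descriptions that belong to a but not b
    only_b = len(d_b - d_a)   # descriptions that belong to b but not a
    denominator = common + mu * only_a + (1 - mu) * only_b
    return common / denominator if denominator else 0.0


# Hypothetical feature sets, for illustration only.
high_intensity_residential = {"developed", "high intensity", "constructed materials"}
developed_high_intensity = {"developed", "high intensity", "impervious surfaces"}

score = tversky_similarity(high_intensity_residential, developed_high_intensity)
```

The result depends directly on the chosen weight: for the sets {"x", "y"} and {"x", "y", "z"}, setting mu to 1.0 gives a similarity of 1.0, whereas setting it to 0.0 gives 2/3. This sensitivity is one way to see why weighting is described above as the main limitation.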

LuLcSys-Ontology
Using the Protégé software [37], we developed a domain ontology named LuLcSys-Ontology to semantically define and formally organize the information extracted from LULC classification systems and standards. Figure 1 illustrates the conceptual model of LuLcSys-Ontology, which includes five components: Classes, Instances, Properties, Restrictions, and Function terms. Instances includes the individuals that belong to a class item defined in Classes. The items in Properties refer to relationships, the items in Restrictions refer to the preconditions and context of relationships, and Function terms describe the characteristics of properties. The details of LuLcSys-Ontology are shown in Table 1.

| Component | Triple Relationship | Content |
| --- | --- | --- |
| Classes | "subject" or "object" in the triple | Three subclasses. Categories: the categories or classes of LULC classification systems and standards. Codes: the codes corresponding to categories or classes. Features: the characteristics of each category or class. |
| Instances | "subject" or "object" in the triple | The terms or notations derived from the textual descriptions of LULC classification systems and standards. |
| Properties | "predicate" in the triple | Three types of properties: the details are shown in Table 2. |
| Restrictions | "predicate" in the triple | Restrictions define the validity of a property under specific conditions. |
| Function terms | "predicate" in the triple | The characteristics of properties. |
Moreover, each item in Instances should belong to at least one class in Classes. Based on the W3C Semantic Web Standard [38,39], all relationships in LuLcSys-Ontology were created as triple relationships: "subject-predicate-object". In LuLcSys-Ontology, properties are either defined by W3C standards, including RDFS (Resource Description Framework Schema) and OWL (Web Ontology Language), or predefined by LuLcSys-Ontology itself. Since the annotation property mainly represents the meta-information of the ontology, we focus on the data property-based triple (subject-data property-object) and the object property-based triple (subject-object property-object). In some cases, a data property-based triple might be incorporated into an object property-based triple. Table 2 lists the details of Properties, Restrictions, and Function terms. The prefixes lulcsys:, rdf:, and owl: indicate that a property is defined by LuLcSys-Ontology, RDFS, and OWL, respectively. The items of Restrictions and Function terms are defined by the W3C Semantic Web Standard.
Taking three categories of NLCD 2006 (Deciduous Forest, Evergreen Forest, and Mixed Forest) as an example, Figure 2 shows the transformation from the descriptions of these categories into the semantic information of LuLcSys-Ontology. Figure 2A shows the descriptions of the three categories, and Figure 2B shows the semantics explicitly defined in LuLcSys-Ontology. We label the components in different colors: the orange texts refer to Classes, the italic black texts are Properties, the red texts are Property restrictions, the green texts are the instances defined by Object properties, and the blue texts are the instances defined by Data properties. Based on these components, all descriptions are organized as triple relationships, as shown in Figure 2B.
Moreover, Figure 3 shows the partial structure of the LuLcSys-Ontology developed for NLCD 1992, including three classes of NLCD_1992: Categories, Codes, and Features. The yellow rectangles refer to the subclasses of these three classes, and the purple rectangles refer to the instances. All properties are represented by arrow lines. When an arrow line connects two rectangles, the rectangle that connects to the starting point of the arrow line refers to the "object" in the triple relationship, and the other rectangle refers to the "subject" in the triple relationship.
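The "subject-predicate-object" organization described in this section can be illustrated with a small Python sketch. It is a plain-Python stand-in rather than an OWL serialization, and the lulcsys: property names and values are placeholders invented for the example; they are not the properties listed in Table 2.

```python
from collections import namedtuple

# One "subject-predicate-object" statement.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical triples; the lulcsys: property names are placeholders.
TRIPLES = [
    Triple("Deciduous Forest", "rdf:type", "Categories"),
    Triple("41", "rdf:type", "Codes"),
    Triple("Deciduous Forest", "lulcsys:hasCode", "41"),             # object property
    Triple("Deciduous Forest", "lulcsys:hasDominantCover", "trees"),  # object property
    Triple("Deciduous Forest", "lulcsys:minTreeHeightMeters", 5),     # data property
]


def is_literal(value):
    """Treat numeric objects as literals, i.e. targets of data properties."""
    return isinstance(value, (int, float))


def objects_of(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [t.obj for t in TRIPLES if t.subject == subject and t.predicate == predicate]


def split_by_property_kind(subject):
    """Separate a subject's triples into object property-based and data property-based ones."""
    own = [t for t in TRIPLES if t.subject == subject and t.predicate != "rdf:type"]
    object_triples = [t for t in own if not is_literal(t.obj)]
    data_triples = [t for t in own if is_literal(t.obj)]
    return object_triples, data_triples
```

Calling split_by_property_kind("Deciduous Forest") separates the two kinds of triples that the reasoning rules and the probabilistic model operate on.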

Rules Building
In comparison to a spatial database, the key advantage of an ontology is the capability of discovering hidden knowledge through rule-based reasoning supported by triple relationships. In this paper, we built reasoning rules with SWRL (Semantic Web Rule Language) [39], which is defined by the W3C Semantic Web Standard. We denote a triple relationship (subject-predicate-object) in the ontology as P(Sub, Obj), where Sub, P(), and Obj denote the subject, property, and object, respectively. Additionally, $Sub_{new}$, $P_{new}()$, and $Obj_{new}$ denote the new subject, property, and object obtained by reasoning on P(Sub, Obj). The basic structure of SWRL rules in this paper is as follows:
$$P(Sub, Obj) \rightarrow P_{new}(Sub_{new}, Obj_{new}) \tag{6}$$
Then, based on the data properties and object properties, we develop two types of rules: the rule of the data property-based triple and the rule of the object property-based triple.
Denoting an object property and a data property as oP() and dP(), respectively, and following Equation (6), the rule based on object property-based triples and data property-based triples is as follows:
$$[oP_1(?s_1, ?o_{11}) \cap dP_1(?s_1, ?o_{12})] \cap \dots \cap [oP_i(?s_i, ?o_{i1}) \cap dP_i(?s_i, ?o_{i2})] \rightarrow oP_{new}(?x_{new}, ?y_{new}) \tag{7}$$
where i is the total number of object property-based triples, and $oP_{new}(?x_{new}, ?y_{new})$ denotes a new object property-based triple. According to Equation (6), this new triple is also the result of logical reasoning.

Table 3. Examples of semantic modeling for LULC classification systems and standards.
| | High-Intensity Residential class in NLCD 1992 | Developed High-Intensity class in NLCD 2001/2006/2011 |
| --- | --- | --- |
| Description | Constructed materials account for 80 to 100 percent of the cover. | Impervious surfaces account for 80% to 100% of the total cover. |
| $Pr(S \mid O)$ | $Pr(S_1 \mid O_1)$: the probability of observing the coverage of constructed materials. | $Pr(S_2 \mid O_2)$: the probability of observing the coverage of impervious surfaces. |
| $Pr(S \mid (D \mid O))$ | $Pr(S_1 \mid (D_1 \mid O_1))$: the probability that the coverage is no less than 80%, when the coverage of constructed materials is observed. | $Pr(S_2 \mid (D_2 \mid O_2))$: the probability that the coverage is no less than 80%, when the coverage of impervious surfaces is observed. |

We present an example of the reasoning based on Deciduous Forest in Figure 2 and in Table 3. Assume we have a tree called "target_tree" that is described by two data property-based triples and two object property-based triples. Based on these triples, we can deduce a hidden relationship that is not supported by a spatial database: target_tree rdf:isInstanceOf Deciduous Forest.
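The target_tree reasoning step can be mimicked in plain Python as a conjunction of conditions over triples that yields a new triple. This is only a sketch of the antecedent-to-consequent pattern, not SWRL itself: the lulcsys: property names, the facts, and the threshold values are all hypothetical, since the concrete triples of the example are not reproduced here.

```python
# Facts about a hypothetical individual. The lulcsys: property names and the
# threshold values below are placeholders chosen for illustration.
FACTS = {
    ("target_tree", "lulcsys:hasCoverType", "tree"),                    # object property
    ("target_tree", "lulcsys:hasFoliageBehavior", "sheds_seasonally"),  # object property
    ("target_tree", "lulcsys:heightMeters", 8.0),                       # data property
    ("target_tree", "lulcsys:vegetationCoverPercent", 35.0),            # data property
}


def value_of(facts, subject, predicate):
    """Return the object of the first matching triple, or None if absent."""
    for s, p, o in facts:
        if s == subject and p == predicate:
            return o
    return None


def deciduous_forest_rule(facts, subject):
    """Antecedent -> consequent: emit a new triple only if every condition holds."""
    height = value_of(facts, subject, "lulcsys:heightMeters")
    cover = value_of(facts, subject, "lulcsys:vegetationCoverPercent")
    antecedent = (
        value_of(facts, subject, "lulcsys:hasCoverType") == "tree",
        value_of(facts, subject, "lulcsys:hasFoliageBehavior") == "sheds_seasonally",
        height is not None and height > 5,
        cover is not None and cover > 20,
    )
    if all(antecedent):
        return (subject, "rdf:isInstanceOf", "Deciduous Forest")
    return None


inferred = deciduous_forest_rule(FACTS, "target_tree")
```

The inferred triple is not stored anywhere in FACTS; it only emerges from combining the object property-based and data property-based conditions, which is the kind of hidden knowledge the rules are meant to expose.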

Probabilistic Reasoning Embedded Ontology-Based Semantic Similarity Measuring
As mentioned previously, feature-based measuring struggles to weight each feature accurately without massive training samples. Thus, semantically modeling the features, rather than quantitatively weighting them, would be an alternative solution. We integrate a probabilistic (Bayes) model with the feature-based measuring method to assess semantic similarity. Based on the object property-based triples and data property-based triples in LuLcSys-Ontology, we create Bayes-based conditional probabilities to assess the semantic similarity.
For two separate terms (subjects) $S_1$ and $S_2$ in two LULC classification systems and standards, we denote the object property-based triple and data property-based triple of $S_1$ as $P(S_1, O_1)$ and $P(S_1, D_1)$, respectively. Similarly, for $S_2$, we denote its object property-based triple and data property-based triple as $P(S_2, O_2)$ and $P(S_2, D_2)$. Moreover, the common features of objects and data between $S_1$ and $S_2$ are $O_c$ and $D_c$, where $O_c \subseteq O_1 \cap O_2$ and $D_c \subseteq D_1 \cap D_2$. The semantic similarity of $S_1$ and $S_2$, $sim(S_1, S_2)$, is measured by Equation (8). In Equation (8), we transform the semantic similarity of $S_1$ and $S_2$ into the probability of observing that they are similar, which is denoted as $Pr(S_1, S_2)$. The similarity is measured based on their common object features ($O_c$) and common data features ($D_c$), which is represented by $Pr(S_c)$. $Pr(S_c)$ is obtained by Equation (9).
In Equation (9), $Pr(S_c \mid O_c)$ refers to the probability of observing that $S_1$ and $S_2$ are similar based on $O_c$, and $Pr(S_c \mid (D_c \mid O_c))$ is the probability of observing that $S_1$ and $S_2$ are similar based on $D_c$. Table 3 shows an example that explains the parameters in Equation (9).
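To give a feel for how the common features $O_c$ and $D_c$ could feed a conditional calculation, the following Python sketch chains two factors: one over shared object features and one over shared data features. Both the overlap-ratio estimate of each factor and the product used to combine them are assumptions made purely for illustration; they should not be read as the exact form of Equations (8) and (9). The feature sets are likewise hypothetical.

```python
def overlap_ratio(features_1, features_2):
    """Assumed stand-in for a conditional probability: shared / all distinct features."""
    features_1, features_2 = set(features_1), set(features_2)
    union = features_1 | features_2
    return len(features_1 & features_2) / len(union) if union else 0.0


def similarity(o_1, d_1, o_2, d_2):
    """Assumed combination: Pr(S_c | O_c) multiplied by Pr(S_c | (D_c | O_c))."""
    pr_given_objects = overlap_ratio(o_1, o_2)
    if pr_given_objects == 0.0:
        # Data features are only compared once some object feature is shared.
        return 0.0
    pr_given_data = overlap_ratio(d_1, d_2)
    return pr_given_objects * pr_given_data


# Hypothetical object (O) and data (D) features for two classes.
o_residential = {"cover", "constructed materials"}
d_residential = {"coverage 80-100 percent"}
o_developed = {"cover", "impervious surfaces"}
d_developed = {"coverage 80-100 percent"}

score = similarity(o_residential, d_residential, o_developed, d_developed)
```

Under these assumptions, the data features only contribute once the two classes share at least one object feature, which reflects the conditioning of $D_c$ on $O_c$ in the notation above.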

Experiments
The datasets for the experiment include three major regional LULC classification systems and standards: NLCD 1992 and NLCD 2001/2006/2011 from the USGS, and the NOAA Regional Land Cover Classification Scheme from NOAA. The first experiment assesses the semantic similarity between NLCD 1992 and NLCD 2001/2006/2011. Considering that the difference between NLCD 2001/2006/2011 and the NOAA Regional Land Cover Classification Scheme has attracted much attention, the second experiment focuses on assessing the semantic similarity of these two land cover classification systems and legends. The classes of these land cover classification systems and legends are listed in Table 4.
According to the categories and descriptions of NLCD 1992, NLCD 2001/2006/2011, and the NOAA Regional Land Cover Classification Scheme, we develop three separate LuLcSys-Ontologies for these land cover classification systems and legends: NLCD92_Ontology for NLCD 1992, NLCD11_Ontology for NLCD 2001/2006/2011, and NOAA_Ontology for the NOAA Regional Land Cover Classification Scheme. Then, we compute the semantic similarity based on the triples of each pair of ontologies: NLCD92_Ontology and NLCD11_Ontology, and NLCD11_Ontology and NOAA_Ontology. The computing methods include three existing ontology-based approaches, namely edge-based measures [23], feature-based measures [26], and information content-based measures [25], together with our proposed approach.
Table 5 shows the result of the semantic similarity assessment between NLCD 1992 and NLCD 2001/2006/2011. By comparing the textual descriptions of these two LULC classification systems and standards, both polysemy and ambiguity can be observed. In other words, no two classes are exactly the same, even when they are defined by the same term. Because it relies on the path and depth between two terms in the ontologies, the path- and depth-based (edge-based) measure (PDBM) cannot effectively assess the semantic similarities between most of the classes in NLCD 1992 and NLCD 2001/2006/2011. Meanwhile, we can observe that information content-based measures (ICBM) cannot assess the semantic similarities of some classes in these two LULC classification systems and standards. When there is a limited volume of common features between two classes, the informativeness of their similarities is challenging to assess; however, ICBM performs well in distinguishing some small differences between two classes. For example, although the four classes of NLCD 1992 (Row Crops, Small Grains, Fallow, and Orchards/Vineyards/Other) are all similar to the NLCD 2001/2006/2011 class named Cultivated Crops, the similarities between each of these four classes and Cultivated Crops are different.
ICBM can produce more accurate results than feature-based measures (FBM) in measuring these similarities. Moreover, many results of FBM are closer to the results of our proposed approach; however, FBM struggles to assess the small differences between two classes. For example, the semantic similarity of Grasslands/Herbaceous and Sedge/Herbaceous is not the same as the semantic similarity of Grasslands/Herbaceous and Lichens and Moss, because Lichens and Moss are specifically defined for the landscape of Alaska; however, FBM produces the same similarity result. Thus, without the support of a conditional probabilistic model, ICBM and FBM are limited in measuring the semantic similarity of LULC classification systems and standards based on ontology.
Table 6 shows the result of the semantic similarity assessment between NLCD 2001/2006/2011 and the NOAA Regional Land Cover Classification Scheme. The results include both polysemy and ambiguity. PDBM cannot effectively assess the semantic similarities for a majority of classes between NLCD 2001/2006/2011 and the NOAA Regional Land Cover Classification Scheme. Without manual interpretation, ICBM has difficulty measuring the semantic similarities of some classes (e.g., Barren Land (Rock/Sand/Clay) and Barren Land) between these two LULC classification systems and standards. Moreover, although FBM outperforms ICBM, it still cannot recognize the hidden differences. For example, the semantic similarity assessment of Palustrine Emergent Wetland (Persistent) and Emergent Herbaceous Wetlands, and of Estuarine Emergent Wetland (Persistent) and Emergent Herbaceous Wetlands, requires discovering the hidden relationship among Palustrine, Estuarine, and Emergent; however, this hidden relationship might not be explicit without the domain knowledge semantically organized by the conceptual hierarchy of the ontology.
As we can see from Tables 5 and 6, when existing ontology-based semantic similarity measures are applied to LULC classification systems and standards, their performance ranks as FBM > ICBM > PDBM; however, the weaknesses of each approach prevent them from producing an accurate semantic similarity result. By incorporating probabilistic models into FBM, our proposed approach can measure semantic similarity more accurately.
The result of semantic similarity measuring could be useful for a number of applications. First, LULC change has been a significant research focus of remote sensing and land planning. Because LULC maps from different periods were generated with various LULC classification systems, LULC changes might not be directly derivable from those maps. The degrees of similarity among these LULC classification systems can help analysts quantify LULC changes more accurately. Moreover, LULC classification systems are generated based on the specific LULC conditions of different areas, countries, or regions. The semantic similarity of the LULC classification systems of different places therefore reflects, to some extent, the LULC characteristics of these places.

Conclusions
The emergence of multi-type LULC classification systems and standards facilitates the generation of LULC classification maps and digital products; however, the heterogeneities of diverse LULC classification systems and standards impact the efficiency of using these products in land monitoring, management, and utilization. To address the heterogeneities, ontology-based approaches have been commonly exploited in information science. This paper integrates probabilistic models and ontologies to facilitate measuring the semantic similarity of different LULC classification systems and standards.
In this paper, we developed domain ontologies to explicitly define the descriptions and codes of different LULC classification systems and standards as semantic information and rules for logical reasoning. Based on the semantics and rules, we applied the Bayes algorithm to create a conditional probabilistic model for computing the semantic similarity of LULC categories in separate LULC classification systems and standards. The experiment shows that semantic similarity can be effectively measured by integrating a probabilistic model based on the content of the ontology.
There are several possible extensions of this research that focus on integrating the content of different LULC classification systems and standards. To explicitly represent the hidden semantic information, the fusion of various domain ontologies for LULC classification systems and standards still needs to be investigated. Moreover, since LULC information inherently carries geographical context, geo-referenced information could be incorporated into semantic similarity measuring. Based on the discussions of the feature-based approach and the IC-based approach, it might also be useful to study the integration of informativeness and features to assess the semantic similarity of LULC classification systems and standards.