Knowledge Embedding with Geospatial Distance Restriction for Geographic Knowledge Graph Completion

Abstract: A Geographic Knowledge Graph (GeoKG) links geographic relation triplets into a large-scale semantic network using the semantics of geo-entities and geo-relations. Unfortunately, geo-related information is sparsely distributed on the web, so information extraction systems can hardly detect enough references to geographic information in the massive web resource to build relatively complete GeoKGs. This incompleteness, due to missing geo-entities or geo-relations in GeoKG fact triplets, seriously impacts the performance of GeoKG applications. In this paper, a method with geospatial distance restriction is presented to optimize knowledge embedding for GeoKG completion. This method aims to encode both the semantic information and the geospatial distance restriction of geo-entities and geo-relations into a continuous, low-dimensional vector space. The missing facts of the GeoKG can then be supplemented through vector operations. Specifically, the geospatial distance restriction is realized as weights in the objective functions of current translation knowledge embedding models. These optimized models output optimized representations of geo-entities and geo-relations for GeoKG completion. The effects of the presented method are validated with a real GeoKG. Compared with the results of the original models, the presented method improves the metric Hits@10(Filter) by an average of 6.41% for geo-entity prediction, and Hits@1(Filter) by an average of 31.92% for geo-relation prediction. Furthermore, the capacity of the proposed method to predict the locations of unknown entities is validated. The results show that the geospatial distance restriction reduces the average error distance of prediction by between 54.43% and 57.24%. All the results indicate that the geospatial distance restriction hidden in the GeoKG helps to refine the embedding representations of geo-entities and geo-relations, which plays a crucial role in improving the quality of GeoKG completion.


Introduction
A Knowledge Graph (KG) is a system that understands facts about people, places and things, and how these entities are all connected [1]. To emphasize the "connection", facts in a KG are represented as triplets of the form <head entity, relation, tail entity> or <entity, property, value> (both abbreviated as <h, r, t>). For example, the knowledge "Gorgona is a major island of Elba" is represented as <Elba, major island, Gorgona>, and "Elba has seven islands" is represented as <Elba, number of islands, 7>. The knowledge facts then form a graph similar to the example illustrated in Figure 1. The semantic differences between entities and relations can be distinguished by mining the graph links of these entities and relations. These differences are more significant than those found in a traditional knowledge base, where semantic differences between entities are only expressed by the independent property values of the entities. Thus, KGs can facilitate artificial intelligence (AI) applications, such as semantic searching [2,3], question answering (QA) [4], and smart education [5]. As a special KG, a geographic KG (GeoKG) has become an effective way of organizing geographic information, especially the geographic relation triplets extracted from web texts in newswires, collaborative encyclopedias, social media, official or domain websites, and so on.
However, current GeoKGs are far from complete, due to missing geographic entities (geo-entities) or geographic relations (geo-relations) in their fact triplets. In English DBpedia (https://wiki.dbpedia.org/develop/datasets/dbpedia-version-2016-10), 53.86% of lake entities lack any source of water (property "inflow"), and 85.80% of mountain entities have no fact describing where their parent peaks are (property "parent mountain peak"). The sparse distribution of geographic information on the web is the major reason this data is unavailable. During the construction of a GeoKG, information extraction systems are often unable to detect enough references to geographic information in the massive web resource in a limited amount of time. Furthermore, the specialized nature of geographic knowledge reduces the likelihood that collected geographic information will be fully transformed into structured knowledge. Consequently, the generated GeoKGs inevitably miss a large number of geographic knowledge facts. This incompleteness seriously impacts the performance of GeoKG applications, because GeoKGs can only supply limited facts for a query or inference. Therefore, KG completion, used to supplement missing facts and guarantee basic data completeness, has become an increasingly important task of KG research [6].
Translation knowledge embedding (translation KE) models are effective tools for completing KGs, and are able to achieve state-of-the-art completion performance [7,8]. These methods use the known entities and relations in a KG to fill in the missing facts by mining the potential semantics between the known entities. More concretely, translation KE models encode (or "embed") the semantics of knowledge (both entities and relations) into a continuous, low-dimensional vector space [9]. The missing entities or relations of facts are then predicted by a vector operation. For example, once the vectors $l_h$ and $l_r$ of the entity h and the relation r have been encoded by a translation KE model, the missing t of the triplet <h, r, ?> can be predicted with the operation $l_h + l_r$. The capacities of the translation KE methods have been verified in many studies, but these methods may perform poorly in GeoKG completion. The translation KE models used in these methods assume that an entity should have distinct representations for different relations to improve the rationality of knowledge embedding. Unfortunately, the sparse distribution of geographic information on the web results in sparse links in GeoKGs. The geo-entities in GeoKGs connect to only a few other geo-entities, and each type of geo-relation contains a limited number of fact triplets. Thus, these models cannot obtain enough triplets of each type of geo-relation as training data to adequately represent geo-entities and geo-relations.
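The prediction step described above can be sketched as a nearest-neighbour search around $l_h + l_r$. This is a minimal illustration, not the paper's implementation: the function names, the L1 distance, and the 2-D toy vectors are assumptions chosen for readability, and the vectors are not learned from any real GeoKG.

```python
def l1_distance(u, v):
    """L1 distance between two equally sized vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def predict_tail(l_h, l_r, entity_vectors, top_k=3):
    """Rank candidate tail entities for <h, r, ?> by their distance to l_h + l_r."""
    target = [a + b for a, b in zip(l_h, l_r)]
    ranked = sorted(entity_vectors,
                    key=lambda name: l1_distance(entity_vectors[name], target))
    return ranked[:top_k]


# Toy embeddings (hypothetical values for illustration only).
entity_vectors = {
    "Black Sea": [0.0, 0.0],
    "Dniester": [1.0, 1.0],
    "Elba": [5.0, -3.0],
}
l_inflow = [0.9, 1.1]  # hypothetical vector of the relation "inflow"

ranking = predict_tail(entity_vectors["Black Sea"], l_inflow, entity_vectors)
print(ranking)
```

In this toy setting the candidate closest to $l_h + l_r$ is "Dniester", which mirrors the <Black Sea, inflow, Dniester> example; a real system would usually also exclude the head entity itself from the candidates.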
Meanwhile, the geospatial information hidden in a GeoKG can serve as crucial additional information to enhance the representations of geographic knowledge in a link-sparse GeoKG. Intuitively, a symbolic geo-entity in a GeoKG must refer to a precise (or vague) geospatial location or scope. The symbolic geo-relations between two geo-entities therefore reflect the geospatial distance restriction between these two geo-entities. That is, the geo-relations which can be used to describe two geo-entities are restricted by their geospatial pattern. However, current models fail to utilize these geospatial patterns. In this paper, a method with geospatial distance restriction is presented to optimize knowledge embedding for supplying the missing geographic facts of a link-sparse GeoKG. This method aims to optimize the training process of current translation KE models using the geospatial distance restriction. As a result, geo-entities separated by the same distance in geographic space maintain a similar distance from each other in the embedding vector space. Finally, the missing geographic facts can be predicted by a vector operation according to both the semantic relationship and the geospatial distance feature between geo-entities in the GeoKG.
To summarize, our main contributions are as follows:
1. The presented method introduces a geospatial distance restriction to refine the embedding representations of geographic knowledge in a link-sparse GeoKG, which fuses geospatial information and semantic information into a low-dimensional vector space;
2. From the viewpoint of GIS (Geographic Information System), a novel task is designed to predict the geospatial locations of unknown geo-entities. This task is different from the tasks of current translation KE research, which only focus on measuring the semantic relationship.
The rest of this paper is organized as follows. Section 2 gives a brief review of translation KE models. Section 3 presents the GeoKG datasets used in this study, the optimization method with geospatial distance restriction for a translation KE model, and the workflow of GeoKG completion with a translation KE model. The experimental datasets, evaluation tasks, results, and result analysis are presented in Section 4. Section 5 is devoted to discussion, and Section 6 concludes this work.

Review of Translation KE Model
Translation KE is inspired by word embedding methods. In word embedding, the words of a corpus are encoded into a continuous, low-dimensional semantic vector space, where each word is represented by a fixed-dimensional real-valued vector [10,11]. If two words are close to each other in this space, they have similar or related semantics [12]. For example, the distance between "France" and "U.S.A" (or "France" and "French") is less than the distance between "France" and "Mountain" in the vector space. Likewise, a translation KE model aims to encode both the entities and the relations of a KG into a continuous, low-dimensional semantic vector space. In this space, the vector of the head entity projected ("translated") by the vector of the relation should be similar to the vector of the tail entity. For example, as shown in Figure 2, after embedding the triplet <Black Sea, inflow, Dniester>, the result $l_{\text{Black Sea}} + l_{\text{inflow}} \approx l_{\text{Dniester}}$ is obtained, where $l_{\text{Black Sea}}$, $l_{\text{inflow}}$ and $l_{\text{Dniester}}$ are the vectors of "Black Sea", "inflow" and "Dniester".
A typical translation KE model consists of three steps [13]: (1) representing entities and relations, (2) defining a scoring function, and (3) learning entity and relation representations from observed facts in the current KG. Let us take the first translation KE model, TransE [9], as an illustration. Firstly, TransE represents entities and relations as vectors in the same space. The relation vectors are used to translate the head vector to the tail vector.
Secondly, a scoring function is defined to measure the plausibility of the representations as:

$$f_r(h, t) = \lVert l_h + l_r - l_t \rVert_{L_1/L_2} \qquad (1)$$

where $l_h$, $l_t$ and $l_r$ are the vectors of h, t and r of a triplet <h, r, t>. If a triplet <h, r, t> appears in the KG, it is regarded as a positive triplet. $f_r(h, t)$ should be low for a positive triplet and high otherwise. Thirdly, TransE learns the representations of entities and relations by minimizing the following objective function, which is a margin-based ranking loss:

$$L = \sum_{(h, r, t) \in S} \; \sum_{(h', r, t') \in S'} \max\left(0,\; f_r(h, t) + \gamma - f_r(h', t')\right) \qquad (2)$$

where S is the set of positive triplets, S' is the set of negative triplets, h' and t' are the head entity and tail entity of a negative triplet (a triplet that does not appear in the KG), and γ > 0 is a margin hyperparameter separating the positive triplets and negative triplets. This objective function is designed to widen the gap between positive triplets and negative triplets. Because no unreasonable triplets are collected into the KG, the negative triplets are generally generated by replacing the head entity or tail entity of a positive triplet with a random entity. The optimization is realized by stochastic gradient descent (SGD) [14].
Extension models of TransE have been proposed to improve the accuracy of completion based on more complex restrictions. These extension models improve entity and relation representation through: (1) mining the internal restrictions in KG fact triplets, for example, one head entity (tail entity) having multiple tail entities (head entities) under a relation, addressed by TransH [15]; one relation being used for different semantics of entities, addressed by TransR [7], CTransR [7], TransG [16], TransD [8], and TransA [17]; the heterogeneity and imbalance that exist in the KG, addressed by TranSparse [18]; and so forth; (2) adding extra information aside from fact triplets, such as entities' types by TKRL [19], entities' attributes by TransEA [20], entities' textual descriptions by DKRL [21], relation paths by PTransE [22], graph structures by GAKE [23] and TCE [24], facts' temporal information [25], and so on.
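The three TransE ingredients described above (scoring function, negative triplet generation, and margin-based ranking loss) can be sketched as follows. This is a minimal illustration under stated assumptions: the L1 norm is used for the score, all function names and the toy 2-D vectors are hypothetical, and the SGD update step is omitted.

```python
import random


def score(l_h, l_r, l_t):
    """f_r(h, t) = ||l_h + l_r - l_t||_1; low values mean a plausible triplet."""
    return sum(abs(h + r - t) for h, r, t in zip(l_h, l_r, l_t))


def corrupt(triplet, entities, rng):
    """Build a negative triplet by replacing the head or the tail with a random entity.

    Simplification: the result is not checked against the KG, so it may
    occasionally coincide with a positive triplet.
    """
    h, r, t = triplet
    replacement = rng.choice(entities)
    if rng.random() < 0.5:
        return (replacement, r, t)
    return (h, r, replacement)


def margin_loss(pos, neg, ent, rel, gamma=1.0):
    """max(0, f_r(h, t) + gamma - f_r(h', t')) for one positive/negative pair."""
    h, r, t = pos
    h_neg, _, t_neg = neg
    return max(0.0, score(ent[h], rel[r], ent[t]) + gamma
               - score(ent[h_neg], rel[r], ent[t_neg]))


# Toy embeddings (hypothetical values for illustration only).
ent = {"Black Sea": [0.0, 0.0], "Dniester": [1.0, 1.0], "Elba": [3.0, 3.0]}
rel = {"inflow": [1.0, 1.0]}

positive = ("Black Sea", "inflow", "Dniester")
negative = corrupt(positive, list(ent), random.Random(0))
loss = margin_loss(positive, negative, ent, rel)
```

The loss is zero once the negative triplet scores at least γ higher than the positive one, which is why training only moves embeddings for pairs that still violate the margin.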
All of the above translation KE models assume that the entity should have distinct representations for different relations to improve the rationality of knowledge embedding. Unfortunately, these models will achieve poor performance on the link-sparse GeoKG because the geo-entities in GeoKGs only connect a few other geo-entities, and one type of geo-relation contains a limited number of fact triplets.

Materials and Methods
First, we extract two GeoKG datasets from the general datasets. Next, we present the method to optimize the translation KE model with geospatial distance restriction. Then, we introduce the workflow for completing a GeoKG by translation KE model.

GeoKG Extraction
There are currently no mature and open GeoKG datasets available, so we extract two GeoKGs separately from open datasets DBpedia and GADM.

GeoKG from DBpedia
DBpedia is a KG project designed to extract structured information from Wikipedia (https://wiki.dbpedia.org/about), such as an entity's category, type, properties, coordinates (for geo-entities), abstract, context, and so on. DBpedia publishes its data as text dump files, in which the raw structured information is organized in the N-Triples format, for example "<http://dbpedia.org/resource/Lake_Erie http://dbpedia.org/property/inflow http://dbpedia.org/resource/Detroit_River>". This information can be easily parsed into a fact triplet such as <Lake Erie, inflow, Detroit River>. As a general KG, DBpedia contains different types of entities, so its geo-entities can reflect the sparseness of geographic information on the web. We use the English DBpedia version 2016-10 to build a GeoKG. This version contains 6.6 million entities and 1.7 billion triplets in 57 dump files. The details of generation are as follows.
(1) Use the "category" and "entities' types" files to retain the triplets whose head entities and tail entities are both geo-entities. Here, the entities belonging to the DBpedia categories "Agent-Organization" and "Place" are designated as geo-entities;
(2) Use the "entities' coordinates" files to filter, from the above results, the triplets whose head and tail geo-entities both have geographic coordinates (the longitude and latitude of the center point of each geo-entity in DBpedia).
Finally, the GeoKG "GeoDBpedia" contains 44,819 geo-entities, 86 geo-relations, and 107,133 fact triples. The spatial distribution of geo-entities is shown in Figure 3 (blue points).
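The parsing and filtering steps above can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline: it assumes standard N-Triples lines in which each IRI sits in its own angle brackets and the line ends with a dot, and the names `parse_line`, `filter_by_coordinates` and `has_coordinates` are hypothetical. Percent-decoding of IRIs and literal objects are deliberately not handled.

```python
import re

# Three IRIs in angle brackets followed by the terminating dot of an N-Triples line.
TRIPLE_RE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def local_name(iri):
    """Keep the last path segment of an IRI and turn underscores into spaces."""
    return iri.rsplit("/", 1)[-1].replace("_", " ")


def parse_line(line):
    """Return (head, relation, tail), or None for lines that are not IRI-only triples."""
    match = TRIPLE_RE.match(line.strip())
    if match is None:
        return None  # e.g. comment lines or triples whose object is a literal
    return tuple(local_name(iri) for iri in match.groups())


def filter_by_coordinates(triplets, has_coordinates):
    """Keep triplets whose head and tail both appear in the set of located entities."""
    return [(h, r, t) for h, r, t in triplets
            if h in has_coordinates and t in has_coordinates]


line = ("<http://dbpedia.org/resource/Lake_Erie> "
        "<http://dbpedia.org/property/inflow> "
        "<http://dbpedia.org/resource/Detroit_River> .")
triplet = parse_line(line)

# Hypothetical set of entities for which a coordinates record was found.
has_coordinates = {"Lake Erie", "Detroit River"}
kept = filter_by_coordinates([triplet], has_coordinates)
```

Returning `None` for unparsable lines keeps the loop over a large dump file simple: the caller can skip those lines without exception handling.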

GeoKG from GADM
GADM, the Database of Global Administrative Areas, provides maps and spatial data for all countries and their sub-divisions (https://www.gadm.org/index.html). As a geographic domain database, the GADM contains abundant geo-entities (divisions). Although the explicit geo-relations of GADM are only the administrative relationships between geo-entities, these relationships are essential information for each geo-entity in this database. Thus, the GADM can be used to simulate an ideal link-dense GeoKG, while the GADM itself is not a KG dataset. We use the GADM version 3.6 and extract the data located in the range of France to generate a GeoKG dataset.
To enrich the types of geo-relations, we design nine geo-relations.
(1) Four types of geo-relations are extended from the administrative relationship: ispartof1, ispartof2, ispartof3, and ispartof4. An administrative relationship means that a given geo-entity is a portion of a higher-level geo-entity for the purpose of administration, so this relationship can be considered a semantic relationship. The level of the given geo-entity may be 2, 3, 4, or 5 in GADM, and the level of its higher-level geo-entity may be 1, 2, 3, or 4. Among these levels, level 1 is the top level, and level 5 is the lowest level. For example, <Ambronay, ispartof3, Belley> represents "the division 'Ambronay' is a part of the administrative level 3 division 'Belley'".
The details of generation are described in Appendix A. Finally, the GeoKG "GADM-KG-FRA" contains 40,799 geo-entities and 555,443 fact triples. The spatial distribution of geo-entities is shown in Figure 4 (blue points).

KG Node Degree
We use the node degree to explore the quantitative difference in entity linking between our GeoKG datasets and general KG datasets. In graph theory, the node degree is a measure indicating how many edges (relations) link with a node (entity) [26]. A high degree means that the entity connects to more other entities. WN18 and FB15K are two general KG datasets used in many KG completion research publications. WN18 is a triplet dataset containing 40,943 entities and 18 types of relations, which is extracted from WordNet (a lexical knowledge base to support dictionaries and thesauri). FB15K is a triplet dataset containing 14,951 entities and 1345 types of relations, which is extracted from Freebase (a fact knowledge base like DBpedia). Figure 5 shows the cumulative frequency of degrees 1 to 9 for each dataset. It is apparent from this figure that the percentage of entities with degree 1 in GeoDBpedia (70.14%) far exceeds that in GADM-KG-FRA (0%), WN18 (14.87%), and FB15K (2.18%). Moreover, the percentage of entities with degree ≤ 2 in GeoDBpedia is 86.52%, while this percentage for WN18 is just over half this amount. Consequently, GeoDBpedia, as a link-sparse GeoKG, cannot supply enough training data for current translation KE models to learn an entity's distinct representations for different relations. Thus, the performance of the above translation KE models on real GeoKG completion is expected to be poor.
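The degree statistic discussed above can be computed directly from a list of fact triplets. The following is a minimal sketch, assuming that every triplet contributes one edge to both its head and its tail; the function name and the abstract toy triplets are hypothetical and do not come from any of the four datasets.

```python
from collections import Counter


def degree_cumulative_frequency(triplets, max_degree=9):
    """Share of entities whose degree is <= d, for d = 1 .. max_degree."""
    degree = Counter()
    for head, _, tail in triplets:
        degree[head] += 1  # each triplet is one edge at its head ...
        degree[tail] += 1  # ... and one edge at its tail
    histogram = Counter(degree.values())
    total = len(degree)
    cumulative, running = {}, 0
    for d in range(1, max_degree + 1):
        running += histogram.get(d, 0)
        cumulative[d] = running / total
    return cumulative


# Toy triplets with abstract names, for illustration only.
triplets = [
    ("A", "r1", "B"),
    ("A", "r2", "C"),
    ("D", "r1", "E"),
]
freq = degree_cumulative_frequency(triplets, max_degree=3)
print(freq)
```

Here four of the five entities have degree 1, so the cumulative frequency at degree 1 is already 0.8, the kind of link-sparse profile reported for GeoDBpedia.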

Figure 5. Cumulative frequency of entity degree 1 to 9 of different datasets (x-axis: degree; y-axis: cumulative frequency; series: GeoDBpedia, GADM-KG-FRA, WN18, FB15K).

Geospatial Distance Restriction Hiding in GeoKG
The GeoKG supports abundant geospatial information, which will become the crucial additional information to optimize the representations of geographic knowledge. Each geo-entity of GeoKG must refer to a precise or vague geospatial location. Then, the symbolic geo-relations between two geo-entities can be regarded as the reflection of the geospatial distance of these two geo-entities. That is, the geo-relations which can be used in describing two geo-entities are restricted by their geospatial distance.
We assume that each geo-relation can only be used to describe geo-entities which are located within a certain distance range of each other. Figure 6 shows the cumulative frequency distributions of the Euclidean distances between the head geo-entity and tail geo-entity of fact triplets. Figure 6a shows the result for the fact triplets of five geo-relations in GeoDBpedia, and Figure 6b shows that for the fact triplets of nine geo-relations in GADM-KG-FRA. Here, the distance between two geo-entities is the geospatial distance between their center points. The distance distributions clearly differ between the fact triplets of different geo-relations. As shown in Figure 6a, a clearly larger share of "parent mountain peak" triplets than of "mouth country" triplets have a short distance. This means that, for two geo-entities separated by a short distance, "parent mountain peak" is a more likely geo-relation between them than "mouth country", and the reverse holds for a long distance. Inspired by these statistical results, we introduce the geospatial distance restriction into a translation KE model.
Figure 6. Cumulative frequency distributions of the distances between the head geo-entity and tail geo-entity of fact triplets: (a) GeoDBpedia (parent mountain peak: a peak's parent as a particular peak in the higher terrain connected to the peak; crosses: where the bridge crosses a river; inflow: a source of the water in the body of water; located in area: where the entity is located in a place; mouth country: where the body of water flows into a country); (b) GADM-KG-FRA.

Model Optimization with Geospatial Distance Restriction
As mentioned in Section 2, the objective functions of translation KE models aim to widen the gap between the scores of positive triplets and negative triplets, so that training separates the two kinds of triplets by score. Our method therefore introduces the geospatial distance restriction into a translation KE model by modifying the objective function.
First, we add a geospatial weight to the objective function to ease this gap if the distance between the two geo-entities of a negative triplet is unreasonable. The geospatial weight $w_{geo}$ is computed from dis(h, t), the distance between the head geo-entity and tail geo-entity, together with a compensation term θ that prevents the denominator from being zero. For simplicity, the locations of the head geo-entity and tail geo-entity are both abstracted as points, and dis(h, t) is measured as:

$$dis(h, t) = \sqrt{(h_x - t_x)^2 + (h_y - t_y)^2} \qquad (4)$$

where $h_x$ and $h_y$ are the longitude and latitude of the head geo-entity, and $t_x$ and $t_y$ are the longitude and latitude of the tail geo-entity. Next, the objective function becomes:

$$L = \sum_{(h, r, t)} \sum_{(h', r, t')} \max\left(0,\; f_r(h, t) + \gamma - w_{geo}\, f_r(h', t')\right) \qquad (5)$$

where (h, r, t) is a positive triplet, (h′, r, t′) is a negative triplet, and γ is the margin. Specifically, the effect of the two functions above is as follows: if the distance between h′ and t′ of a negative triplet is greater or less than the distance between h and t of the positive triplet, $w_{geo}$ is less than 1, and the final score $w_{geo} f_r(h', t')$ of the negative triplet is lowered. The model therefore has to produce a lower $f_r(h, t)$ or a higher $f_r(h', t')$ to achieve the same separation as before. A lower $f_r(h, t)$ means that h and t need to be closer to each other in the vector space; a higher $f_r(h', t')$ increases the distance between h′ and t′ in the vector space.
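The interplay between the distance, the geospatial weight and the margin-based objective can be illustrated with a minimal Python sketch. The distance and the weighted hinge term follow the description above; the concrete weight formula `assumed_geo_weight` (with θ = 1) is only an illustrative assumption chosen so that the weight equals 1 for equal distances and drops below 1 otherwise, and all function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def dis(h: Point, t: Point) -> float:
    """Euclidean distance between the center points of two geo-entities."""
    return math.hypot(h[0] - t[0], h[1] - t[1])


def assumed_geo_weight(d_pos: float, d_neg: float, theta: float = 1.0) -> float:
    """ASSUMPTION: an illustrative weight, not necessarily the paper's formula.

    Equals 1 when the negative triplet has the same head-tail distance as the
    positive one and falls below 1 as the two distances diverge; theta keeps
    the denominator away from zero.
    """
    return 1.0 / (abs(d_pos - d_neg) + theta)


def weighted_margin_term(f_pos: float, f_neg: float, w_geo: float, gamma: float) -> float:
    """One summand of the weighted margin-based ranking loss."""
    return max(0.0, f_pos + gamma - w_geo * f_neg)


d_pos = dis((0.0, 0.0), (3.0, 4.0))   # 5.0
d_neg = dis((0.0, 0.0), (0.0, 8.0))   # 8.0
w = assumed_geo_weight(d_pos, d_neg)  # 0.25
# Unweighted, this pair already satisfies the margin (loss 0.0) ...
print(weighted_margin_term(1.0, 4.0, 1.0, 1.0))
# ... but the weight shrinks the negative score, so the loss becomes positive.
print(weighted_margin_term(1.0, 4.0, w, 1.0))
```

With the weight applied, a pair that previously met the margin contributes a non-zero loss, which is what pushes the model toward a lower positive score or a higher negative score.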
Most objective functions of translation KE models are the same as in Equation (2), so the proposed method has the ability to optimize all these translation KE models. Next, TransR and TransD will be optimized with geospatial distance restriction in the same way.
TransR [7] builds entity and relation embeddings in separate entity and relation spaces, reflecting the fact that an entity may have multiple semantic aspects and that different relations focus on different aspects of an entity. Its scoring function is:

$$f_r(h, t) = \lVert M_r l_h + l_r - M_r l_t \rVert$$

where $l_h$, $l_r$ and $l_t$ are the vectors of h, r and t, and $M_r$ is a projection matrix from the entity space to the relation space of r.
TransD [8] uses two vectors to represent each entity (and each relation) to further recognize the different meanings of the head entity and tail entity in one triplet. Its scoring function is:

$$f_r(h, t) = \lVert (w_r w_h^{\top} + I)\, l_h + l_r - (w_r w_t^{\top} + I)\, l_t \rVert$$

where $w_h$, $w_t$ and $w_r$ are the mapping vectors for the representations of h, t and r, and I is an identity matrix. The vectors of h and t are thus projected by $w_r w_h^{\top} + I$ and $w_r w_t^{\top} + I$, respectively.
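As a minimal sketch of the TransD projection, the product $(w_r w_e^{\top} + I)\, l_e$ can be computed without building the matrix, because it equals $w_r (w_e \cdot l_e) + l_e$. The code below assumes that entity and relation vectors share the same dimension (so that I is square) and uses the L1 norm; both are simplifying assumptions, and the function names are hypothetical.

```python
from typing import List

Vector = List[float]


def project(l_e: Vector, w_e: Vector, w_r: Vector) -> Vector:
    """Apply (w_r w_e^T + I) to l_e, i.e. w_r * (w_e . l_e) + l_e."""
    dot = sum(a * b for a, b in zip(w_e, l_e))
    return [wr * dot + x for wr, x in zip(w_r, l_e)]


def transd_score(l_h: Vector, l_r: Vector, l_t: Vector,
                 w_h: Vector, w_r: Vector, w_t: Vector) -> float:
    """L1 distance between the projected head plus relation and the projected tail."""
    h_p = project(l_h, w_h, w_r)
    t_p = project(l_t, w_t, w_r)
    return sum(abs(a + b - c) for a, b, c in zip(h_p, l_r, t_p))


# With all-zero mapping vectors the projection is the identity,
# so the score reduces to the TransE distance |l_h + l_r - l_t|.
zero = [0.0, 0.0]
print(transd_score([0.0, 0.0], [1.0, 1.0], [1.0, 1.0], zero, zero, zero))
```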
The objective functions of TransR and TransD are both the same as that of TransE, so the proposed geospatial weight can be introduced into TransR and TransD through the same objective function, Equation (5).
The models obtained by optimizing TransE, TransR and TransD with the proposed method are named TransE-GDR, TransR-GDR and TransD-GDR, respectively. We implement the translation KE models based on the open-source package Fast-TransX (https://github.com/thunlp/Fast-TransX), which includes code for TransE, TransR and TransD. TransE-GDR, TransR-GDR and TransD-GDR are obtained by modifying this code.

GeoKG Completion
After obtaining the trained translation KE models, the missing entities or relations of facts will be predicted with vector operations.

Entity Prediction and Relation Prediction
Given the vectors $l_h$ and $l_r$ of the head geo-entity h and the geo-relation r, the missing tail geo-entity "?" of the triplet <h, r, ?> can be predicted by the operation $l_h + l_r$. The detailed procedure is as follows.
(1) The missing tail geo-entity "?" of the triplet <h, r, ?> is replaced in turn by each geo-entity in the known GeoKG. This generates the candidate triplets of <h, r, ?>, whose number equals the number of geo-entities in the GeoKG.
(2) The score of each candidate triplet <h, r, x> is calculated by a trained translation KE model. For example, if the translation KE model is TransE-GDR, the score of each candidate triplet is obtained by Equation (1).
(3) The candidate triplets are ranked by score in ascending order. The tail geo-entity x of the first candidate triplet becomes the missing tail geo-entity of the triplet <h, r, ?>, or the tail geo-entities of the top-n candidate triplets become the candidate missing tail geo-entities for subsequent inference with external information.
Similarly, the missing head geo-entity of a triplet <?, r, t> or the missing geo-relation of a triplet <h, ?, t> can be predicted through the above steps.
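The three steps above can be sketched in a few lines of Python. This is a minimal illustration that assumes a TransE-style score with the L1 norm; the toy embeddings and the names `transe_score` and `predict_tail` are hypothetical and are not taken from Fast-TransX.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def transe_score(l_h: Vector, l_r: Vector, l_t: Vector) -> float:
    """L1 distance |l_h + l_r - l_t|; a lower score means a more plausible triplet."""
    return sum(abs(a + b - c) for a, b, c in zip(l_h, l_r, l_t))


def predict_tail(head: str, relation: str,
                 entity_vecs: Dict[str, Vector],
                 relation_vecs: Dict[str, Vector],
                 top_n: int = 10) -> List[Tuple[str, float]]:
    """Rank every known entity as a candidate tail of <head, relation, ?>."""
    l_h, l_r = entity_vecs[head], relation_vecs[relation]
    # Step (1) and (2): build one candidate per known entity and score it.
    scored = [(x, transe_score(l_h, l_r, l_x)) for x, l_x in entity_vecs.items()]
    # Step (3): ascending order, best candidate first.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_n]


# Toy embeddings for illustration only.
entity_vecs = {"A": [0.0, 0.0], "B": [1.0, 1.0], "C": [3.0, 2.0]}
relation_vecs = {"r": [1.0, 1.0]}
print(predict_tail("A", "r", entity_vecs, relation_vecs, top_n=3))
```

Head prediction and relation prediction follow the same pattern, with the candidates drawn from all entities or all relations respectively.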

Location Prediction
Because the geo-entities of a GeoKG have geospatial coordinates, the result of entity prediction can be used to predict the location of an unknown geo-entity. If the geospatial distance restriction is successfully embedded into the semantic vector space, geo-entities that are closer to the correct geo-entity in geographic space will be ranked higher in entity prediction. So, even if entity prediction does not return the correct geo-entity, the location of the missing geo-entity can be derived from the geospatial coordinates of the predicted candidates: the coordinates of the first predicted geo-entity give a possible location of the missing geo-entity, or a polygon constructed from the coordinates of the top-n predicted geo-entities gives an area in which the missing geo-entity is located with high probability.
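Both location estimates can be sketched as follows. The text does not specify how the polygon is constructed, so the convex hull used here (Andrew's monotone chain) is an assumption; the function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def point_estimate(ranked_coords: List[Point]) -> Point:
    """Location of the top-ranked predicted geo-entity."""
    return ranked_coords[0]


def area_estimate(ranked_coords: List[Point], top_n: int) -> List[Point]:
    """ASSUMPTION: the polygon is the convex hull of the top-n coordinates."""
    return convex_hull(ranked_coords[:top_n])


# Coordinates of the predicted candidates, best-ranked first (toy values).
ranked = [(1.0, 1.0), (0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(point_estimate(ranked))
print(area_estimate(ranked, top_n=5))
```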

Experiments and Results
Two kinds of tasks are used to evaluate the performance of the models in GeoKG completion: link prediction and location prediction. The former is a common task, and the latter is a novel task that explores the models' capacity to predict the locations of unknown entities. First, we introduce the experimental datasets extracted from the above two GeoKGs. Next, we present the metrics and results of the two evaluation tasks. Finally, we analyze these results.

Experimental Datasets
We generate the experimental datasets from GeoDBpedia and GADM-KG-FRA. Because some types of geo-relations in GeoDBpedia have only a few fact triplets, which would impair training, only the geo-relations with more than 100 fact triplets are selected from GeoDBpedia.
In addition, the triplets whose relations (properties) are sensitive to geospatial distance are retained manually. As illustrated above, the geo-relation "parent mountain peak" may be applied to two mountains if these two geo-entities are geospatially close to each other. By contrast, the geo-relation "twinTown" describes a legal or social agreement between towns and is not related to their geospatial distance, so it is not retained. This dataset is named "GeoDBpedia21", and its types of geo-relations are listed in Table 1.

Table 1. Types of geo-relations in GeoDBpedia21 (excerpt).
crosses: where the bridge crosses a river
major island: which small major islands the island has
mouth country: where the body of water flows into a country
island: an island belongs to or contains the place
right tributary: a stream or river that flows into its right larger stream or main stem (or parent) river or a lake
left tributary: a stream or river that flows into its left larger stream or main stem (or parent) river or a lake

All fact triplets of GADM-KG-FRA are used as the second experimental dataset, which is named "GADM9" for simplicity.
The two experimental datasets are each divided into a training set, validation set and test set at a ratio of 8:1:1. Triplets whose head or tail geo-entity does not appear in the training set are removed from the validation set and test set. Table 2 gives the statistics of the final GeoDBpedia21 and GADM9.
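A minimal sketch of this split is given below. It assumes a random shuffle before splitting, which the text does not state, and the function name `split_and_filter` is hypothetical.

```python
import random
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (head, relation, tail)


def split_and_filter(triplets: List[Triplet], seed: int = 0):
    """Split 8:1:1, then drop validation/test triplets with unseen entities."""
    shuffled = list(triplets)
    random.Random(seed).shuffle(shuffled)  # ASSUMPTION: random assignment
    n_train = int(len(shuffled) * 0.8)
    n_valid = int(len(shuffled) * 0.1)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]

    # Entities that occur as head or tail of at least one training triplet.
    seen = {e for h, _, t in train for e in (h, t)}

    def keep(part: List[Triplet]) -> List[Triplet]:
        return [(h, r, t) for h, r, t in part if h in seen and t in seen]

    return train, keep(valid), keep(test)


# Toy ring graph of ten entities.
triplets = [(f"e{i}", "r", f"e{(i + 1) % 10}") for i in range(10)]
train, valid, test = split_and_filter(triplets)
print(len(train), len(valid), len(test))
```

Because of the filtering step, the final validation and test sets can be smaller than one tenth of the data.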

Link Prediction and Results
The purpose of link prediction is to evaluate a model's performance on entity prediction and relation prediction. Link prediction follows the GeoKG completion procedure of Section 3.3, where the translation KE model is trained on the fact triplets of the training set. For entity prediction, the head entities (or tail entities) are removed from the fact triplets of the validation set or test set, which turns these triplets into incomplete fact triplets. After the ranked candidate triplets of each incomplete fact triplet are generated through the steps of Section 3.3, the ranks of the correct head (or tail) geo-entities are obtained. Two metrics are reported: (i) MeanRank, the mean rank of the correct entities, and (ii) Hits@10, the proportion of correct entities ranked in the top 10. The result of entity prediction is the mean of the results of head entity prediction and tail entity prediction.
Relation prediction is similar to entity prediction: the relations in the validation or test triplets are removed and replaced with all relations in the training set. The metrics are MeanRank and Hits@1 (the proportion of correct relations ranked first).
Note that, for some facts of a GeoKG, the correct geo-entity or geo-relation may not be unique, so a corrupted candidate triplet may itself be a true fact. To avoid counting such candidates as errors in the validation or test phase, candidate triplets that already appear in the training set, validation set or test set are removed from the candidate list before ranking. Thus, each of the above metrics is reported in a (Raw) setting, without this removal, and a (Filter) setting, with it.
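The difference between the (Raw) and (Filter) settings, and the two metrics, can be sketched as follows. The function names are hypothetical, and scores are assumed to be distances, so a lower score ranks higher.

```python
from typing import FrozenSet, List, Tuple


def rank_of_correct(scored: List[Tuple[str, float]], correct: str,
                    known_true: FrozenSet[str] = frozenset()) -> int:
    """1-based rank of the correct candidate.

    known_true holds other candidates that also form a true triplet in the
    training, validation or test set; passing them gives the (Filter) rank,
    leaving the set empty gives the (Raw) rank.
    """
    kept = [(e, s) for e, s in scored if e == correct or e not in known_true]
    kept.sort(key=lambda pair: pair[1])
    return [e for e, _ in kept].index(correct) + 1


def mean_rank(ranks: List[int]) -> float:
    return sum(ranks) / len(ranks)


def hits_at(ranks: List[int], k: int) -> float:
    """Proportion of test cases whose correct answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


scored = [("a", 0.1), ("b", 0.2), ("c", 0.3)]
print(rank_of_correct(scored, "c"))                    # Raw rank: 3
print(rank_of_correct(scored, "c", frozenset({"a"})))  # Filter rank: 2
```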
The learning rate λ, the margin γ and the dimension k of the vector space are three important parameters of translation KE models. The learning rate λ is a parameter of the SGD algorithm that controls the rate of gradient descent when learning the representations of entities and relations. The margin γ is the hyperparameter of the margin-based ranking loss mentioned in Section 2. The dimension k is the dimension of the embedding vector space. First, we select λ for SGD among {0.1, 0.01, 0.001, 0.0001, 0.00001}, γ among {0.5, 1, 1.5, 2, 4, 6, 8, 10}, and k among {50, 100, 150, 200}. Next, the best configurations for GeoDBpedia21 and GADM9 are determined according to the MeanRank and Hits@10 of entity prediction on the validation sets. The optimal configuration of each translation KE model used on the test set is shown in Table 3.
Tables 4 and 5 display the results of entity prediction and relation prediction. The optimized models (TransE-GDR, TransR-GDR and TransD-GDR) all outperform their originals (TransE, TransR and TransD) on GeoDBpedia21: on entity prediction, MeanRank(Filter) is reduced by an average of 429.60 and Hits@10(Filter) is improved by an average of 6.41%; on relation prediction, MeanRank(Filter) is reduced by an average of 2.84 and Hits@1(Filter) is improved by an average of 46.56%. In contrast, the differences between the results of the models on GADM9 are not obvious.

Location Prediction and Results
The location prediction experiment is designed based on the results of entity prediction (Filter). The mean error distances between the correct geo-entity and the top 1, top 5 or top 10 predicted geo-entities are reported as MeanDis@1, MeanDis@5 and MeanDis@10. To simulate the application scenario of inferring the locations of unknown geo-entities that are not indexed by gazetteers (the training set), the correct geo-entities are additionally excluded from the prediction results when calculating MeanDis. Thus, each metric is further divided into a (Known) part and an (Unknown) part. Figure 7 shows the error distance results of location prediction on GeoDBpedia21 and GADM9. The error distances of all optimized models (TransE-GDR, TransR-GDR and TransD-GDR) decline on each metric. Concretely, on GeoDBpedia21, the average rates of change of the three (Known) metrics are −31.06%, −49.87% and −51.14%, and those of the three (Unknown) metrics are −43.81%, −50.65% and −51.56%. On GADM9, the average rates of change of the three (Known) metrics are −10.34%, −32.82% and −38.52%, and those of the three (Unknown) metrics are −29.56%, −36.40% and −39.93%.
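A minimal sketch of the MeanDis@k computation for one test triplet is given below. It assumes planar Euclidean distances in the units of the coordinates, whereas the reported values are in meters, and the function name is hypothetical. The reported metric would be the mean of this value over all test triplets.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def mean_dis_at_k(correct_xy: Point, ranked: List[Tuple[str, Point]], k: int,
                  exclude: Optional[str] = None) -> float:
    """Mean error distance between the correct location and the top-k candidates.

    ranked is ordered best-first. Passing the correct entity as `exclude`
    removes it from the candidates, which gives the (Unknown) variant;
    leaving it as None gives the (Known) variant.
    """
    top = [xy for name, xy in ranked if name != exclude][:k]
    if not top:
        raise ValueError("no candidates left to evaluate")
    return sum(math.hypot(x - correct_xy[0], y - correct_xy[1]) for x, y in top) / len(top)


ranked = [("c", (0.0, 0.0)), ("a", (3.0, 4.0)), ("b", (6.0, 8.0))]
print(mean_dis_at_k((0.0, 0.0), ranked, k=2))               # Known: 2.5
print(mean_dis_at_k((0.0, 0.0), ranked, k=2, exclude="c"))  # Unknown: 7.5
```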

Result Analysis
Comparing the results on GeoDBpedia21 and GADM9 (Tables 4 and 5), the differences between the models are less pronounced on GADM9 than on GeoDBpedia21. The reason is that a link-dense GeoKG such as GADM9 provides enough links between geo-entities for the models to encode the geo-entities and geo-relations into the right positions in a low-dimensional vector space. Thus, the geospatial distance restriction does not bring a substantial improvement in link prediction on GADM9. Besides, TransR and TransD perform significantly worse than TransE on the link-sparse GeoDBpedia21, whereas their performance is close to or better than that of TransE on the link-dense GADM9. Theoretically, TransR and TransD, as extensions of TransE, should achieve better prediction results than TransE [27]. Therefore, this difference between the two datasets indicates that the sparseness of links between geo-entities severely limits the performance of TransR and TransD on current GeoKGs. Meanwhile, the presented method alleviates the influence of sparseness on geographic knowledge representation by capturing the geospatial distance restriction hidden in the GeoKG. Thus, TransR-GDR and TransD-GDR achieve performance that is close to or better than that of TransE-GDR on GeoDBpedia21.
On GeoDBpedia21, the performance of location prediction is mainly determined by the entity prediction results: because the original models perform significantly worse than their optimized counterparts on entity prediction, their location prediction results are also poor. On GADM9, as shown in Figure 7c,d, the optimized models outperform the original models on location prediction even though the entity prediction results of all models are close to each other. This means that, with our method, more geo-entities near the correct location in geographic space are ranked higher in entity prediction.
In summary, the optimized models (TransE-GDR, TransR-GDR and TransD-GDR) perform well on both the link-dense GeoKG (GADM9) and the link-sparse GeoKG (GeoDBpedia21). These optimized models are therefore the better options for GeoKG completion, especially when the sparsity of the GeoKG is unknown. Among them, TransR-GDR and TransD-GDR achieve similar performance on the above tasks, and both are better than TransE-GDR; TransD-GDR can be preferred for GeoKG completion because TransR-GDR requires a longer training time.
Next, we explore the models' geo-entity prediction performance for different geo-relation types, and the impact of the training set scale on geo-entity prediction. Then, we give a specific case to illustrate the results of geo-entity prediction.

Geo-Entity Prediction by Different Geo-Relation Types
We compare the models' entity prediction performance on GeoDBpedia21 by geo-relation type. The results are shown in Table 6. Note that the result for the geo-relation "river" is not reported: the head or tail geo-entities of the "river" triplets in the test set do not appear in the training set, so all "river" triplets were removed from the test set. To ensure that the data are representative, the geo-relations with more than 50 test triplets are selected for further analysis. These geo-relations are "inflow", "mountain range", "parent mountain peak", "located in area", "source country", "mouth place" and "mouth mountain". As mentioned before, translation KE models distinguish the semantic differences between relations by mining the graph links of KGs, and the proposed method uses the geospatial distance restriction of the GeoKG to help the models distinguish these differences better. Thus, a geo-relation may achieve better prediction performance if it has (a) distinctive semantics reflected by the graph links of the GeoKG, and (b) more candidate geo-entities located within its distance range. Two measures, geo-entity similarity and distance distribution similarity, are designed to explore the differences between geo-relations in geo-entity prediction:

1. The geo-entity similarity is a similarity between geo-relations based on which geo-entities appear as the head or tail geo-entities of their fact triplets. A geo-relation is represented as a vector whose dimension is the number of head geo-entities plus the number of tail geo-entities in the training set, and whose value is 1 where a geo-entity is the head (tail) geo-entity of a triplet of that geo-relation. The higher the geo-entity similarity between two geo-relations, the more consistent the geo-entities of their fact triplets, and hence the more similar their semantics.

2. The distance distribution similarity is a similarity between geo-relations based on the frequency distributions of the distances between the head and tail geo-entities of their fact triplets. A geo-relation is represented as a vector whose dimension is the number of distance ranges, and whose value is the number of geo-entities in each distance range. The higher the distance distribution similarity between two geo-relations, the more similar the distance distributions of their geo-entities.

Figure 8 shows the two similarity results for the above seven geo-relations. It can be seen that:
(1) the geo-relations "inflow" and "mountain range", whose geo-entity prediction is significantly improved by the proposed method, have more other geo-relations with similar distance distributions (similarity ≥ 0.8) and no other geo-relations with the same geo-entities (similarity ≥ 0.8);
(2) the geo-relations "parent mountain peak", "located in area" and "source country", whose geo-entity prediction is slightly improved by the proposed method, have fewer other geo-relations with similar distance distributions and no other geo-relations with the same geo-entities;
(3) the geo-relations "mouth place" and "mouth mountain", which achieve worse prediction performance, have fewer other geo-relations with similar distance distributions but one other geo-relation with the same geo-entities.
To summarize, if a geo-relation can obtain more candidate geo-entities located within its distance range from the GeoKG, and these candidate geo-entities differ from the known geo-entities of its fact triplets, the prediction performance for this geo-relation will be better.
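The two similarity measures can be sketched as follows. The text does not name the similarity function, so cosine similarity is used here as an assumption; the histogram counts triplets per distance range as a simplification, and all function names and bin edges are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Triplet = Tuple[str, str, str]  # (head, relation, tail)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """ASSUMPTION: cosine similarity; the measure is not specified in the text."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def entity_vector(triplets: List[Triplet], relation: str,
                  heads: List[str], tails: List[str]) -> List[int]:
    """Binary vector over all head entities followed by all tail entities."""
    rel_heads = {h for h, r, _ in triplets if r == relation}
    rel_tails = {t for _, r, t in triplets if r == relation}
    return [int(h in rel_heads) for h in heads] + [int(t in rel_tails) for t in tails]


def distance_histogram(distances: List[float], bin_edges: List[float]) -> List[int]:
    """Counts per distance range [edge_i, edge_{i+1})."""
    counts = [0] * (len(bin_edges) - 1)
    for d in distances:
        for i in range(len(counts)):
            if bin_edges[i] <= d < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


triplets = [("a", "r1", "b"), ("c", "r2", "b")]
v1 = entity_vector(triplets, "r1", heads=["a", "c"], tails=["b"])
v2 = entity_vector(triplets, "r2", heads=["a", "c"], tails=["b"])
print(cosine(v1, v2))  # geo-entity similarity of r1 and r2
print(distance_histogram([1.0, 5.0, 12.0], [0.0, 10.0, 20.0]))
```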

Impact of Training Set Scale on Geo-Entity Prediction
We extract sub-datasets from GeoDBpedia21 at proportions of 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100% to analyze the impact of the training set scale on geo-entity prediction performance. The training sets of the sub-datasets contain 13,990, 18,654, 23,326, 27,987, 32,651, 37,317, 41,984 and 46,657 triplets, respectively. To keep the results comparable, all sub-datasets share the same validation set (810 triplets) and test set (785 triplets). Figure 9 shows the geo-entity prediction results of TransD and TransD-GDR on each sub-dataset. The orange line and values represent the improvement of TransD-GDR over TransD. It can be seen that, as the training set scale increases, the improvement in Hits@10(Filter) shows an overall downward trend. To analyze the reason for this result, we calculate the node degrees, in the training set of each sub-dataset, of the geo-entities that appear in the test set. Figure 10 shows the cumulative frequency of degrees 1 to 9 for each sub-dataset. As the training set scale increases, the node degree of the geo-entities also increases. More concretely, a geo-entity is connected to more other geo-entities in a training set with a larger number of triplets. The original model TransD then obtains enough training triplets to refine the representations of geo-entities and geo-relations, so the improvement of TransD-GDR decreases. Thus, the impact of the training set scale on the performance of the proposed method depends on whether the training set supplies enough triplets to increase the connections between geo-entities.

Figure 9. The performance of geo-entity prediction on different sub-datasets.

Figure 10. Cumulative frequency of entity degree 1 to 9 of sub-datasets.
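The node degree analysis can be sketched as follows. The degree of a geo-entity is taken here as the number of training triplets in which it occurs as head or tail, which is an assumption about how the degrees were counted; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Triplet = Tuple[str, str, str]  # (head, relation, tail)


def node_degrees(train_triplets: Iterable[Triplet]) -> Counter:
    """Number of training triplets each entity takes part in."""
    degree: Counter = Counter()
    for h, _, t in train_triplets:
        degree[h] += 1
        degree[t] += 1
    return degree


def cumulative_degree_frequency(train_triplets: List[Triplet],
                                test_entities: List[str],
                                max_degree: int = 9) -> List[float]:
    """Share of test entities whose training degree is at most d, for d = 1..max_degree."""
    degree = node_degrees(train_triplets)
    n = len(test_entities)
    return [sum(1 for e in test_entities if degree[e] <= d) / n
            for d in range(1, max_degree + 1)]


train = [("a", "r", "b"), ("a", "r", "c"), ("a", "r", "d")]
print(cumulative_degree_frequency(train, ["a", "b"], max_degree=3))  # [0.5, 0.5, 1.0]
```

A curve that rises more slowly for a larger sub-dataset indicates that fewer test entities have a low degree, that is, the graph around them is denser.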

Geo-Entity Prediction Case
We use a specific case to demonstrate the detailed results of entity prediction: "which mountain range does Monte Acero belong to". This fact is not supported by the DBpedia entity "Monte Acero" (http://dbpedia.org/page/Monte_Acero). The case is converted into an incomplete triplet in the form of DBpedia: <http://dbpedia.org/resource/Monte_Acero, http://mappings.dbpedia.org/server/ontology/classes/Mountain, ?> (abbreviated as <Monte Acero, mountain range, ?> for simplicity). The missing tail geo-entity "?" is predicted through the steps of Section 3.3, using TransD and TransD-GDR trained on GeoDBpedia21. Table 7 shows the top 10 tail geo-entities predicted by TransD and TransD-GDR. The Apennine Mountains are confirmed to be the correct tail geo-entity of <Monte Acero, mountain range, ?> by a dedicated website (https://www.mountain-forecast.com/peaks/Monte-Acero/forecasts/736). The results indicate that TransD-GDR ranks the correct geo-entity higher than TransD does. Thus, compared with the original translation KE model, the presented method increases the chance of correctly predicting the missing geo-entities.

Table 7. Tail geo-entity prediction results of the incomplete triplet <Monte Acero, mountain range, ?>.

Discussion
Current translation KE models assume that an entity should have distinct representations for different relations to improve the rationality of knowledge embedding. However, because the geo-entities in a GeoKG are connected to only a few other geo-entities, and each type of geo-relation contains a limited number of fact triplets, these models perform poorly on such a link-sparse GeoKG. We address this shortcoming by introducing the geospatial distance restriction hidden in the GeoKG. Through the proposed method, a translation KE model can capture the geospatial distance restriction and embed it, together with the semantic information of the GeoKG, into a vector space. In this vector space, geo-entities that are the same distance apart in geographic space maintain a similar distance from each other. The optimized model then outputs refined representations of geo-entities and geo-relations, which improves the completion performance on the link-sparse GeoKG. Concretely, on the one hand, the geospatial distance restriction drives the geo-entities near the correct location to higher ranking positions in entity prediction. The correct geo-entity, as one of the geo-entities near the correct location, then also appears higher in the ranking, which increases the possibility of predicting the missing entity correctly. On the other hand, the geospatial distance restriction is merged into the semantic vector spaces of the optimized models, which helps relation prediction to find the best geo-relation for two geo-entities based on their geospatial distance. Therefore, the presented method alleviates the influence of link sparseness on geographic knowledge representation by capturing the geospatial distance restriction hidden in the GeoKG.
However, there are still some issues that require further investigation.
(1) In this study, the geospatial distance between two geo-entities is simplified to the distance between their center points. As a result, the geospatial distance between two geo-entities linked by certain relation types, such as the adjacency of two administrative divisions or a bridge crossing a river, is not equal to zero. Although this simplification is not fully rigorous from a geographic perspective, it is still reasonable. For example, the geospatial distance between two adjacent geo-entities is not zero under this simplification, but it is bounded: if the adjacency relation holds between two administrative divisions, their geospatial distance will not exceed the maximum length of all divisions. Thus, the geospatial distance remains a valuable restriction for these relation types. Because high-quality geometric data for point, linear and area geo-entities are still scattered across different datasets (for example, GADM only contains the polygons of administrative divisions), in subsequent research we will attempt to introduce the distance, topology and orientation restrictions of geo-entities with different geometric types into translation KE models, after fusing current GeoKGs and geographical databases with alignment techniques [28,29].
(2) The translation KE model is a data-driven model for representing the geo-entities and geo-relations of GeoKGs. On the one hand, the model learns the semantic differences between geo-relations by mining the graph links of a concrete GeoKG rather than from the conceptual definitions of the geo-relations. So, if the fact triplets of two geo-relations have the same geo-entities in the training set, the proposed method will treat these two geo-relations as the same geo-relation, because their semantics and their geospatial distance restrictions are the same, as with the geo-relations "mouth place" and "mouth mountain" discussed in Section 4.4.1. On the other hand, the error distance of location prediction is influenced by the distribution of geo-entities in the GeoKG. For example, the average error distance of location prediction is 369,725.24 m (MeanDis@1(Unknown)) on GeoDBpedia21, because the geospatial information of GeoDBpedia21 is sparse at a global scale, so the geospatial distances between geo-entities are large. Thus, aligning different GeoKGs and geographical databases to supply more fact triplets is a feasible way for the translation KE model to further distinguish the semantics of geo-relations and to reduce the error distance of location prediction. Besides, merging geo-relations with similar graph links before model training can also improve the prediction performance of the models.
(3) The translation KE model is a tool that takes full advantage of the geo-entities and geo-relations already present in a GeoKG to complete that GeoKG. However, it cannot supply missing geo-entities or geo-relations that have never appeared in the GeoKG. Meanwhile, information extraction is an effective way to obtain missing, unknown facts from external structured, semi-structured or unstructured texts [30,31]. Thus, integrating the results of different completion approaches may be worth researching as a way to improve the quality of GeoKG completion.

Conclusions
In this paper, a method with geospatial distance restriction is presented to optimize knowledge embedding for supplying the missing geographic facts of a link-sparse GeoKG. Specifically, by adding the geospatial distance restriction to the objective function of translation KE models, these models output the optimized representations of geo-entities and geo-relations of a link-sparse GeoKG. Then, the missing geo-entities or geo-relations can be predicted by a vector operation according to both the semantic relationship and the geospatial distance feature between geo-entities in the GeoKG.
The effects of the presented method are validated on a real GeoKG extracted from DBpedia. Compared with the results of the original models, the presented method improves Hits@10(Filter) by an average of 6.41% for geo-entity prediction, and Hits@1(Filter) by an average of 31.92% for geo-relation prediction. Furthermore, a set of novel experiments explored the models' capacity to predict the locations of unknown geo-entities. The results show that the geospatial distance restriction reduced the average geospatial distance error of prediction by between 54.43% and 57.24%. All results indicate that the presented method successfully captures the geospatial distance restriction hidden in the GeoKG to refine the representations of geo-entities and geo-relations in the link-sparse GeoKG. In addition, further experimental results indicate that the completion performance of the proposed method is influenced by the graph structure of the GeoKG: the proposed method improves completion performance more on (1) GeoKGs whose links between geo-entities are sparser, and (2) geo-relations that have, within the distance restriction, more candidate geo-entities that differ from the known geo-entities of their fact triplets.
Optimizing the embedding representations of geographic knowledge in the GeoKG is the next step: (1) adding the connections between geo-entities and non-geo-entities to help the models learn the differences between geo-entities; and (2) fusing multiple geospatial restrictions, which include not only distance relations but also topological relations and orientation relations. Additionally, introducing these embedding representations of geographic knowledge into other GeoKG applications, such as geographic knowledge extraction and geographic QA, is a promising direction for further research.