Intelligent Interaction with Virtual Geographical Environments Based on Geographic Knowledge Graph

Abstract: The core of intelligent virtual geographical environments (VGEs) is the formal expression of geographic knowledge. Its purpose is to transform the data, information, and scenes of a virtual geographic environment into “knowledge” that computers can recognize, so that they can understand the virtual geographic environment more easily. A geographic knowledge graph (GeoKG) is a large-scale semantic web that stores geographical knowledge in a structured form. Based on a geographic knowledge base and a geospatial database, intelligent interactions with virtual geographical environments can be realized through natural language question answering, entity linking, and so on. In this paper, a knowledge-enhanced VGE service framework is proposed. We construct a multi-level semantic parsing model and an enhanced GeoKG for structured geographic information data, such as digital maps and 3D virtual scenes, and for unstructured information data. Based on the GeoKG, we propose a bilateral LSTM-CRF (long short-term memory-conditional random field) model to achieve natural language question answering for VGEs and conduct experiments on the method. The results indicate that intelligent interaction based on the knowledge graph can narrow the distance between people and virtual environments.


Introduction
Virtual geographical environments (VGEs) were proposed in the late 1990s and have since been studied in depth with regard to visualization, geographic simulation, predictive analysis, knowledge sharing, collaborative modeling [1][2][3][4][5][6], and so on. With the development of artificial intelligence and big data, intelligent VGEs have to meet the new challenges of geographic knowledge engineering [7]. VGE knowledge engineering provides the supporting theory, methods, and technical details for knowledge-based intelligent VGEs. VGE knowledge involves an evolving process that includes knowledge generation, integration, and sharing. Knowledge-based intelligent VGEs impose higher requirements on interactivity and intelligent services, and will raise the level of intelligence of geographic knowledge services for VGEs.
The core of intelligent VGEs is the formal expression of geographic knowledge. Its purpose is to transform the data, information, and scenes of a virtual geographic environment into "knowledge" that can be recognized by computer, so that computers can understand virtual geographic environments more easily. It can be used in knowledge search, geographic reasoning [8], spatial cognitive simulation [9,10], natural language interaction, and so on.
In terms of the interaction between humans and virtual geographic environments, a lot of work has been done on immersion and interaction by many researchers. Virtual reality, computer graphics,

Related Work
The core idea of VGE knowledge search based on a GeoKG is to construct a semantic knowledge network. The framework is shown in Figure 1. A geographic knowledge base is a semantic web; in comparison, a geographic information database is based on a geospatial model such as a vector model, a raster model, or a volume model. Using a multi-level semantic conversion model, a geographic information database can be upgraded to a geographic knowledge base in order to enhance the knowledge service.

Knowledge Representation of VGE
Geographic knowledge is the higher level of geographic information. The expression and sharing of geographic knowledge are important in intelligent virtual geographic environments (VGEs). Mehdi Mekni [8] defined the notion of environment knowledge (EK) as a description (like a formal specification of a program) of the spatial concepts (geographic features) and relationships (topologic, semantic) that may exist in a geographic environment. Lin [11] divided geographic knowledge into three levels: factual knowledge, rules and control knowledge, and decision-oriented knowledge. He proposed that the knowledge in VGEs (VGE-K) is geo-information related to various geoscience questions, the placement of phenomena in a geo-context, and the extraction of geospatial rules [7]. Robert Laurini [12,13] presented a tentative conceptual framework for managing practical geographic knowledge, in which a geographic knowledge base (GKB) includes geographic objects, geographic structures, geographic relations, geographic rules, geographic ontology, a gazetteer, physico-mathematical models, and external knowledge.
For knowledge representation and reasoning in multiagent geo-simulations, Mehdi Mekni proposed informed virtual geographic environments (semantically enriched and geometrically accurate VGEs) [14]. He used GIS data to generate an informed topologic graph (ITG), where each node corresponds to one of the map's triangles, and each arc corresponds to an adjacency relation between these triangles [15]. To achieve intelligent VGEs, Lin [11] presented the knowledge engineering of VGEs (VGE-KE), which is a promising means of providing geo-knowledge.
From the perspective of geographical knowledge representation, we divide geographical knowledge into factual knowledge and procedural knowledge. Factual knowledge mainly refers to the knowledge (so-called "lightweight" knowledge) that reflects the external characteristics and connections of geographical entities, such as geographical terms, gazetteers, geographical distributions, and geographical data. Procedural knowledge refers to geographical models that describe the spatio-temporal transformation of geography, such as laws of geographical evolution and geographical prediction; it is highly specialized knowledge.

Knowledge Graph
Knowledge graphs are a widely studied topic in artificial intelligence. The knowledge graph (KG) was formally proposed in 2012 by Google to achieve a more intelligent search engine. A KG is essentially a semantic network: a knowledge base with a directed graph structure, in which the nodes represent entities or concepts, and the edges represent the various semantic relationships between entities (or concepts) [16].
Knowledge graphs are characterized by large scales, rich semantics, high quality, and friendly structures. They are mainly used in knowledge searches, intelligent question answering, and to assist decision-making. A knowledge graph is very suitable for representing factual knowledge in a virtual geographic environment.
According to their service objects, knowledge graphs can be divided into open knowledge graphs and domain knowledge graphs. A large number of knowledge graphs have been produced [17]. Examples of open knowledge graphs are WordNet, Freebase [18], DBpedia [19], YAGO [20], and Zhishi.me [21], as well as the Chinese CN-DBpedia [22]. Examples of domain knowledge graphs are KnowItAll [23], TextRunner [24], and NELL [25], among others. Although the linked open data cloud contains some multilingual knowledge graphs, most of the knowledge (including concepts, entities, triples, etc.) is still expressed in English, and Chinese knowledge graphs remain few [26].
Encyclopedia knowledge graphs are mainly extracted from encyclopedia websites and lack geographic knowledge. To meet the needs of this study, a geographic knowledge graph (GeoKG) needs to be generated by extracting geographic information from the virtual geographic environment.

Geographic Knowledge Graph
A geographic knowledge graph (GeoKG) is a knowledge graph extended with geographic information. It is a structured geo-semantic knowledge base. It organizes data in the form of geographic entities and describes geographic concepts and entities in the form of "entity-relationship-entity" triples, as shown in Figure 2.
A geographic entity refers to a geographic phenomenon that cannot be divided into similar phenomena, according to a certain standard in the geospatial world. It has temporal, spatial, geo-relational, and various other attributes.
Geographic knowledge is described in RDF (resource description framework) triples, and the form of visualization is a "point-edge" graph, as shown in Figure 2. The points represent geographical concepts, entities, and attribute values. Edges represent the various relationships, including relationships between concepts and concepts, concepts and entities, entities and entities, entities and attributes, and attributes and attribute values.
For example, as shown in Figure 2, < China, belong to, country > is a relationship between entity and concept, < China, capital, Beijing > is a relationship between entity and entity, and < Beijing, population, 20,693,000 > is a relationship between entity and attribute.
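The triple structure described above can be sketched in a few lines of Python. This is only an illustration, not the storage layer used in the paper: the `Triple` type, the `index` dictionary, and the `objects_of` helper are hypothetical names, and the data are the three triples quoted from Figure 2.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One <subject, predicate, object> statement (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


# The three example triples quoted in the text for Figure 2.
TRIPLES = [
    Triple("China", "belong to", "country"),        # entity -> concept
    Triple("China", "capital", "Beijing"),          # entity -> entity
    Triple("Beijing", "population", "20,693,000"),  # entity -> attribute value
]

# Index the graph by (subject, predicate) so that edges can be followed quickly.
index = defaultdict(list)
for t in TRIPLES:
    index[(t.subject, t.predicate)].append(t.obj)


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return list(index.get((subject, predicate), []))


# Two-hop lookup: the population of the capital of China.
capital = objects_of("China", "capital")[0]
print(objects_of(capital, "population"))
```

Following two edges in sequence, as in the last lines, is the kind of traversal that a question such as "what is the population of the capital of China" would reduce to once the entities and relations have been recognized.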
A GeoKG is constructed using a "top-down" approach and is mainly divided into a schema layer and a data layer, as shown in Figure 3. The schema layer mainly consists of the geographic ontology or concept layer, while the data layer mainly consists of geographic entities and their related attributes. The basic idea is to first construct the geographic ontology and concept layer; then, guided by the ontology, to extract geographic entities and related attributes from geographic information data (such as place name data, topographic maps, and so on); and finally to fill in relationships using entities from thematic websites and encyclopedia data.
There already exist some GeoKGs; for example, the GeoNames Ontology, OSM Semantic Network [27], LinkedGeoData, GeoWorldNet, and so on [28]. Research on the construction of GeoKGs mainly includes geographic entity extraction [29][30][31], topological and azimuthal relation extraction [32,33], and KG storage [34][35][36]. In terms of applications, there are the typical geo-semantic sharing network system (Geo-Wiki) [37] and the geographical knowledge-based dictionary (KIDGS) [38].
A knowledge graph for a VGE is mainly extracted from different VGE scenarios. For heterogeneous structured data such as maps, a semantic conversion model must be established, and geographic knowledge is acquired through geographic entity extraction, de-duplication, and alignment in order to complete and enhance the KG.

Question Answering Based on Knowledge Graph
Knowledge-based question answering methods can be divided into two categories: semantic parsing-based (SP-based) and information retrieval-based (IR-based) methods. Due to the emergence of large knowledge bases, information retrieval methods are popular. IR-based methods first obtain a series of candidates from a knowledge base using a relatively coarse method, perform feature extraction on the questions and candidates, use these features to rank the candidates, and finally select the highest-scoring candidate as the answer.
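The IR-based pipeline (coarse candidate retrieval, feature extraction, ranking, selection) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the tiny knowledge base reuses the triples quoted for Figure 2, the function names are hypothetical, and the single feature (token overlap between the question and the predicate) is far simpler than the features real systems use.

```python
import re

# Toy knowledge base; the triples are the examples quoted for Figure 2.
KB = [
    ("China", "capital", "Beijing"),
    ("China", "belong to", "country"),
    ("Beijing", "population", "20,693,000"),
]


def tokens(text):
    """Lowercase word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_candidates(question):
    """Coarse retrieval: keep triples whose subject is mentioned in the question."""
    q = tokens(question)
    return [t for t in KB if tokens(t[0]) <= q]


def score(question, triple):
    """Single feature: number of predicate tokens that also occur in the question."""
    return len(tokens(triple[1]) & tokens(question))


def answer(question):
    """Rank the candidates and return the object of the best one, or None."""
    candidates = retrieve_candidates(question)
    if not candidates:
        return None
    best = max(candidates, key=lambda t: score(question, t))
    return best[2]


print(answer("What is the capital of China?"))
print(answer("What is the population of Beijing?"))
```

The sketch makes the weakness of the approach visible: the ranking depends entirely on surface overlap, so a question phrased with a synonym of the predicate would score zero.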
With the development of deep learning, a series of natural language question answering methods based on representation learning has emerged. Representation learning is mainly directed at the representation of semantic relations [39]. It is mainly divided into two types: mapping matrix methods and mapping vector methods. Mapping matrix methods mainly include structured models [40], semantic matching energy (SME) models [41], latent factor (LF) models [42], single layer (SL) models [43], and so on. Mapping vector methods mainly include TransE models [44], TransH models [45], TransR models [46], and TransD models [47], among others. End-to-end approaches learn directly from natural language question-answer pairs and omit the intermediate steps: the mapping from text to knowledge is learned entirely from the data, without an explicit semantic analysis of the question. The accuracy of these methods is often insufficient.
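As a concrete instance of the mapping vector family, TransE treats a relation as a translation in the embedding space: for a plausible triple (h, r, t), the vector h + r should lie close to t, so the distance ||h + r - t|| serves as the score. The sketch below shows only this scoring step. The two-dimensional vectors are hand-picked for illustration and are not trained embeddings, and the function name is hypothetical.

```python
import math


def transe_distance(h, r, t):
    """L2 norm of h + r - t; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked 2-D vectors, for illustration only (not trained embeddings).
entity = {
    "China": (1.0, 0.0),
    "Beijing": (1.0, 1.0),
    "country": (3.0, 0.0),
}
relation = {"capital": (0.0, 1.0)}

# <China, capital, Beijing> should score better (lower) than <China, capital, country>.
good = transe_distance(entity["China"], relation["capital"], entity["Beijing"])
bad = transe_distance(entity["China"], relation["capital"], entity["country"])
print(good, bad)
```

In training, the embeddings are adjusted so that observed triples receive smaller distances than corrupted ones; that step is omitted here.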
Entity linking is one of the key technologies for the interactive search of VGE scenes based on a knowledge graph, and it has been studied in depth in the field of knowledge graphs. The representative methods are: (1) probabilistic generation model-based methods, which build a probabilistic model between candidate entities and entity mentions [48] to improve the efficiency of entity linking; (2) topic model-based methods, which train latent Dirichlet allocation (LDA) topic models on training datasets and realize entity disambiguation through semantic similarity [49]; (3) graph-based methods, which build a graph-based model to achieve target entity selection [50]; and (4) deep neural network model-based methods, which train entity representation models in a supervised or semi-supervised manner using deep neural networks and rank the candidates according to semantic similarity [51].
In summary, knowledge search in VGEs is mostly based on keywords, with no geographic knowledge base as support. This makes it difficult to understand the intentions of users and to achieve effective question-and-answer knowledge search. Our goal is to use semantic parsing models to enhance and complement the knowledge, and to implement natural language question answering interactions and knowledge search.

Multi-Level Semantic Parsing Model for VGE
In extracting geographic knowledge from a VGE scene and realizing the transformation of "view (graphic) → semantic", the core idea is to establish an interpretative model mapping from scene to semantics, as shown in Figure 4. The multi-level semantic parsing model is used to resolve the topological, directional, and distance relations in semantic descriptions at different scales of the virtual scene.

Resolving Spatial Topological Relations
The model mainly considers the types of geographic entities and establishes a mapping between expressions of spatial topological relations and the spatial description vocabulary (as shown in Table 1). As shown in Figure 5, geo-entities are extracted from vector data (usually for small or medium-scale scenes) and 3D models (usually for large-scale scenes), and the spatial topology between the geo-entities is calculated with a geometric calculation function. Then, the mapping vocabulary for topological relations and spatial types (as shown in Table 2) is used to convert the spatial topology into semantic relations.

For example, the description "river flowing through China" expresses the topological relationship of "line-area", is identified as "intersecting" type through topological calculations, and is mapped into the description of language relations, such as "flow through" or "flow-in". Other examples of resolving spatial topological relations are shown in Table 3.
Table 3. Examples of resolving spatial topological relations.
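The last step of this pipeline, converting a computed topological relation into relation words, can be sketched as a lookup keyed by the two geometry types and the relation type. The sketch assumes the topological relation has already been calculated. Only the first vocabulary entry comes from the text ("line-area", "intersecting", "flow through" or "flow-in"); the other entries are hypothetical placeholders and do not reproduce Table 2, and the function name is also an assumption.

```python
# (target geometry type, reference geometry type, topological relation)
#   -> candidate relation words.
# Only the first entry comes from the text; the others are hypothetical
# placeholders and do not reproduce the rows of Table 2.
VOCABULARY = {
    ("line", "area", "intersecting"): ["flow through", "flow-in"],
    ("point", "area", "within"): ["located in"],  # hypothetical
    ("area", "area", "adjacent"): ["borders"],    # hypothetical
}


def to_semantic_triples(target, reference, geom_types, topo_relation):
    """Turn one computed topological relation into <target, word, reference> triples."""
    key = (geom_types[0], geom_types[1], topo_relation)
    words = VOCABULARY.get(key, [])
    return [(target, word, reference) for word in words]


print(to_semantic_triples("river", "China", ("line", "area"), "intersecting"))
```

Keying the table on the geometry types as well as the relation reflects the point made in the text: the same "intersecting" result should yield different words for a river crossing a country than for, say, two overlapping areas.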

Resolving Spatial Orientation Relations
The semantic analysis of orientation relationships needs to consider the constraints of topological and distance relations. As shown in Figure 6, we first need to describe the types of geometric features (such as point, line, surface, and volume) and then calculate the spatial orientation separately, using an eight-direction spatial azimuth model for small-scale scenes and a three-dimensional azimuth model for large-scale scenes. The former divides a two-dimensional plane into eight parts, as shown in Figure 7 (left); the latter divides three-dimensional space into 27 parts, as shown in Figure 7 (right). Then, according to the mapping vocabulary, the semantic relations between entity A and entity B are confirmed. For example, the expression corresponding to the "southeast" orientation description vocabulary is shown in Equation (1). When the spatial orientation relation between A and B satisfies Equation (1), the target object A (dot) is located to the southeast of the reference object B (dot).
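An eight-direction model for two point objects can be sketched as follows. This is not a reconstruction of Equation (1), which is not reproduced here; it assumes a common cone-based partition in which each of the eight sectors is 45 degrees wide and centred on its compass direction, with x pointing east and y pointing north. The function name and the sector boundaries are assumptions.

```python
import math

DIRECTIONS = ["north", "northeast", "east", "southeast",
              "south", "southwest", "west", "northwest"]


def direction_of(target, reference):
    """Direction word for point `target` (A) as seen from point `reference` (B).

    Points are (x, y) with x pointing east and y pointing north (assumed).
    """
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    if dx == 0 and dy == 0:
        raise ValueError("target and reference coincide; direction undefined")
    # Azimuth measured clockwise from north, in [0, 360).
    azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
    # Shift by half a sector so that each 45-degree sector is centred
    # on its compass direction, then take the sector index.
    sector = int(((azimuth + 22.5) % 360.0) // 45.0)
    return DIRECTIONS[sector]


# A lies to the lower right of B, so A is southeast of B.
print(direction_of((1.0, -1.0), (0.0, 0.0)))
```

Under these assumptions, "southeast" corresponds to azimuths from 112.5 up to (but not including) 157.5 degrees.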

The semantic analysis of orientation relationships needs to consider the constraints of topological and distance relations. As shown in Figure 6, we first need to describe the types of geometric features (such as point, line, surface, and volume) and separately calculate the spatial orientation, eight-direction spatial azimuth model for small-scale scenes, and three-dimensional azimuth model for large-scale scenes. The former divides a two-dimensional plane into eight parts, as shown in Figure 7 (left); the latter divides three-dimensional space into 27 parts, as shown in Figure  7 (right). Then, according to the mapping vocabulary, the semantic relations between entity A and entity B are confirmed. For example, the expression corresponding to a "southeast" orientation description vocabulary is shown in Equation (1). When the spatial orientation relation between A and B satisfies Equation (1), the defined target object A (dot) is located a southeast direction from the reference object B (dot). In Equation (1), SR denotes the azimuth relation expression of the target A (dot) and reference B (dot),

260
For example, the expression corresponding to a "southeast" orientation description vocabulary 261 is shown in Equation (1). When the spatial orientation relation between A and B satisfies Equation

262
(1), the defined target object A (dot) is located a southeast direction from the reference object B (dot).

263
In Equation (1), SR denotes the azimuth relation expression of the target A (a point) and the reference B (a point), DIR denotes a directional relationship function, TOP denotes a topological relationship category, and DIS denotes a spatial distance. An example of resolving spatial orientation relations is shown in Table 4.
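The directional part of this analysis can be illustrated with a minimal sketch for two point features. It covers only the eight-direction model and ignores the topological and distance constraints. The function name, the coordinate convention (x east, y north), and the choice of 45-degree sectors centred on each direction are assumptions made for illustration, not the paper's Equation (1).

```python
import math

# Eight 45-degree sectors, listed clockwise from north (assumed convention).
DIRECTIONS = ["north", "northeast", "east", "southeast",
              "south", "southwest", "west", "northwest"]


def eight_direction(target, reference):
    """Return the direction of `target` as seen from `reference`.

    Both arguments are (x, y) points with x pointing east and y pointing north.
    """
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    if dx == 0 and dy == 0:
        raise ValueError("coincident points have no orientation")
    # Azimuth in degrees, measured clockwise from north, in [0, 360).
    azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
    # Shift by half a sector so each sector is centred on its direction.
    sector = int(((azimuth + 22.5) % 360.0) // 45.0)
    return DIRECTIONS[sector]


# Target A lies one unit east and one unit south of reference B.
print(eight_direction((1.0, -1.0), (0.0, 0.0)))
```

A 27-part three-dimensional model could follow the same pattern by also classifying the vertical offset into above, level, and below.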

Resolving Spatial Distance Relations
The semantic conversion model of distance relations mainly uses distance calculations in two- or three-dimensional space to determine the semantic relations expressed in natural language descriptions. The corresponding distance ($D_{ij}$) is shown in Equation (2). Then, the distance relation types and the semantic relation mapping vocabulary (as shown in Table 5) are used to convert the spatial distance into semantic relations.
Qualitative expressions need to be associated with quantitative distances under certain distance scale standards. Thresholds were considered together with spatial scales in order to achieve the conversion between qualitative and quantitative distances. A threshold set is defined as $\{D_i, D_{i+1}, D_{i+2}, \cdots\} \subseteq D$. Because scales differ, in practice the same qualitative distance description may correspond to totally different quantitative distances. For example (as shown in Table 6), in China, "Tiananmen Square" is 10 km away from "Olympic Park", and the two are generally described qualitatively as "far", whereas "Beijing" is about 120 km away from "Tianjin", but the two are typically described qualitatively as "near".
Table 6. Examples of resolving spatial distance relations.

Scene of VGE | Spatial Topological Representation | Semantic Graph Representation
Beijing and Tianjin | Beijing, Tianjin: 120 km | Beijing, Near, Tianjin
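The scale dependence of qualitative distance terms can be sketched as follows. The example assumes a Euclidean distance in two or three dimensions and uses hypothetical threshold values and scale names ("city", "regional"); none of these numbers come from the paper, and they are chosen only so that the two examples in the text come out as described.

```python
import math

# Hypothetical upper bounds (in km) for each qualitative term, per scale.
# Distances above the last bound are labelled "far".
THRESHOLDS_KM = {
    "city": [(2.0, "near"), (8.0, "medium")],
    "regional": [(150.0, "near"), (500.0, "medium")],
}


def euclidean_distance(p, q):
    """Euclidean distance between two points of equal dimension (2D or 3D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def qualitative_distance(distance_km, scale):
    """Map a quantitative distance to a qualitative term for a given scale."""
    for upper_bound, label in THRESHOLDS_KM[scale]:
        if distance_km <= upper_bound:
            return label
    return "far"


# Tiananmen Square to Olympic Park, judged at city scale.
print(qualitative_distance(10.0, "city"))
# Beijing to Tianjin, judged at regional scale.
print(qualitative_distance(120.0, "regional"))
```

Keeping the thresholds in a per-scale table makes the same distance value resolve to different vocabulary entries depending on the scene scale.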

Multi-Element Geographic Knowledge Extraction Model
VGE data are characterized by heterogeneity and by coming from multiple sources. Structured geographic information data include two-dimensional vector data, three-dimensional model data, and so on. Basic geographic information data are a symbolic understanding of the real world, characterized by rich knowledge and accurate spatial relationships. Thus, they are an important source for the GeoKG.
The key idea of the multi-element geographic knowledge extraction model is to design different models for a three-dimensional scene. The geographic knowledge extraction model includes: (1) extraction of geographic entities and spatial relationships; and (2) construction of a mapping from spatial relational data to RDF (Resource Description Framework) semantic triples, based on the multi-level semantic analysis model and according to the varying characteristics of different feature layers.
The process of knowledge extraction from VGE structured data is illustrated in Figure 8. According to the structure of the data at different element levels stored in the relational database, the extraction model of geographic entities and semantic relations is designed, and the extracted content is represented as geographical knowledge. The triples are stored in the graph database. The main method used is D2R structured data extraction. D2R (relational database to RDF) refers to converting the structured data in a relational database management system (RDBMS) into semantic RDF resources. We aim to establish a specific mapping based on the ontological model and thereby map relational data to RDF semantic data. The flow is as follows. Firstly, for the basic format of the structured data stored in different data layers, the content to be extracted is determined and an extraction model is established for each data layer. Secondly, a mapping specification based on the ontological model is constructed to associate the structured data with concepts and entities in the knowledge graph.
Finally, the concepts, entities, and attributes of the association mapping are stored as triples. Taking a 1:250,000 map as an example, the geographic semantic knowledge extracted from the relational database (schematic diagram) is shown in Table 7.
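The three steps of the D2R flow can be sketched with a small in-memory relational table. Everything specific here is hypothetical: the `features` table and its columns, the `http://example.org/geokg/` namespace, the `adjacentTo` predicate, and the two sample rows are illustrative only and do not reproduce the paper's schema or mapping specification.

```python
import sqlite3

BASE = "http://example.org/geokg/"  # hypothetical namespace
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def build_demo_db():
    """Create an in-memory table standing in for one feature layer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features ("
        "id INTEGER PRIMARY KEY, name TEXT, type TEXT, adjacent_to INTEGER)"
    )
    conn.executemany(
        "INSERT INTO features VALUES (?, ?, ?, ?)",
        [(1, "Zhengzhou", "City", 2), (2, "Kaifeng", "City", None)],
    )
    return conn


def rows_to_triples(conn):
    """Map each relational row to (subject, predicate, object) triples.

    Literal escaping is omitted, so names are assumed to be plain text.
    """
    triples = []
    query = "SELECT id, name, type, adjacent_to FROM features ORDER BY id"
    for fid, name, ftype, adjacent_to in conn.execute(query):
        subject = f"<{BASE}feature/{fid}>"
        # Attribute column -> literal-valued triple.
        triples.append((subject, f"<{BASE}name>", f'"{name}"'))
        # Type column -> link to a concept in the ontology.
        triples.append((subject, RDF_TYPE, f"<{BASE}class/{ftype}>"))
        # Foreign key -> spatial relation between two entities.
        if adjacent_to is not None:
            obj = f"<{BASE}feature/{adjacent_to}>"
            triples.append((subject, f"<{BASE}adjacentTo>", obj))
    return triples


for s, p, o in rows_to_triples(build_demo_db()):
    print(s, p, o, ".")
```

The printed lines follow the N-Triples layout, which a graph database loader could ingest; a real mapping would be driven by the ontology rather than hard-coded column names.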

Table 7. Semantic relationships of geographic entities extracted from a 1:250,000 map of Zhengzhou (in China).

Intelligent Interaction with VGE Based on Geographic Knowledge Graph
Due to the complexity of questions involving multiple relations in natural language question answering (such as "What are the cities near the east of Zhengzhou?"), we only consider unilateral, single-relation questions (e.g., "What are the cities near Zhengzhou?"). As shown in Figure 9, the key technologies include geographic entity identification, resource mapping, and query statement generation.
We use a semantic parsing-based (SP-based) approach to answer natural language questions. This can be divided into: (1) geographic entity identification, which identifies the geographic entities in the natural language text; and (2) geographic entity linking, in which the identified geographic entities are mapped to the geographic knowledge base in order to obtain candidate entities.

Bilateral LSTM-CRF Model for Geographic Entity Identification
An LSTM (long short-term memory) model is characterized by its ability to preserve contextual information. A CRF (conditional random field) model can take into account, at the sentence level, the influence between preceding and following annotations. Combining the advantages of the two models, we used a bilateral (bidirectional) LSTM-CRF [52] model to improve the accuracy of geographic entity identification in question answering.
The LSTM model is implemented by calculating the logarithmic probability of the entire question (see Equation (3)); a question that conforms to natural language grammatical structure obtains a higher probability.
In Equation (3), $P(W_i)$ represents the predicted probability of the i-th word in the question. A smaller $LP(S)$ implies a higher probability that the statement S conforms to the grammar. The logarithmic probability obtained by a basic LSTM model only considers the position and contextual information of candidate entities and does not use any information about the candidate entities themselves. We extracted two additional features to improve the LSTM model: the length L (in words) of the candidate geographic entity and the IDF (inverse document frequency) value of the geographic entity. A linear CRF layer is added after the hidden layer of the bilateral LSTM network. The model structure is shown in Figure 10.
By introducing a state transition matrix A and letting the matrix P be the output of the bilateral LSTM network for the input observation sequence X, the predicted output for the annotation sequence $y = (y_1, y_2, \cdots, y_n)$ is given by Equation (4), where $A_{i,j}$ represents the probability of transition from the i-th state to the j-th state in the time series and $P_{i,j}$ represents the probability that the i-th word in the input observation sequence is the j-th label.
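How the CRF layer combines the matrices A and P can be illustrated with a small decoding sketch. It assumes the formulation common in the LSTM-CRF literature, in which the score of a label sequence is the sum of its per-word label scores and its label-to-label transition scores, and it omits start and end transitions for brevity. This may differ in detail from Equation (4). The matrices below are made-up integer scores over three hypothetical labels (0 = O, 1 = B, 2 = I), not trained values.

```python
def sequence_score(emissions, transitions, labels):
    """Sum of emission scores P[i][y_i] and transition scores A[y_i][y_i+1]."""
    score = sum(emissions[i][y] for i, y in enumerate(labels))
    score += sum(transitions[a][b] for a, b in zip(labels, labels[1:]))
    return score


def viterbi_decode(emissions, transitions):
    """Return (best_score, best_labels) maximizing sequence_score."""
    n_labels = len(transitions)
    # best[j]: best score of any path ending in label j at the current word.
    best = list(emissions[0])
    backpointers = []
    for row in emissions[1:]:
        step_best, step_back = [], []
        for j in range(n_labels):
            prev = max(range(n_labels),
                       key=lambda i: best[i] + transitions[i][j])
            step_back.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + row[j])
        best = step_best
        backpointers.append(step_back)
    # Follow the backpointers from the best final label to recover the path.
    last = max(range(n_labels), key=lambda j: best[j])
    path = [last]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    path.reverse()
    return best[last], path


# Rows: words of a three-word question; columns: labels O, B, I.
P = [[2, 1, 0], [0, 3, 1], [0, 1, 2]]
# A[i][j]: score of moving from label i to label j (O -> I is penalized).
A = [[1, 0, -5], [0, -2, 2], [0, -1, 1]]
print(viterbi_decode(P, A))
```

The transition matrix is what lets the CRF layer reject label sequences that the per-word scores alone would allow, such as an inside tag directly after an outside tag.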

Multi-feature Logistic Model for Geographic Entity Linking
Geographic entity linking is used to calculate the semantic similarities between candidate geographic entities and geographic referential entities. We apply multiple features in a logistic regression model. So that multiple feature values can better represent the semantic similarity of entities, after completing entity identification and matching the set of candidate entities, it is necessary to calculate the feature values for all entity pairs in order to evaluate the possibility of a link between them, as shown in Figure 11. The weighted feature value F between the entity m and a candidate entity $e \in E_m$ is calculated by Equation (5). In the matching candidate entity set, the entity with the highest weighted feature value becomes the final match for entity m.
Specifically, the proposed method selects the geographic knowledge base as the training data set, extracts the entity set $L = \{(m_i, e_i)\}$ whose linking information is sufficient (that is, entities with a large number of links) as the training data, and then uses the logistic regression model to calculate the weight of each feature value. For an entity m and a corresponding candidate entity e, the feature weights must satisfy a relationship expressed in terms of the score $s(m, e)$, where $\omega = (\omega_1, \omega_2, \omega_3, \ldots, \omega_n)$ and $f = (f_1, f_2, f_3, \ldots, f_n)$. The probability that $e_1$, rather than $e_2$, is linked to the entity m can be calculated using the sigmoid function:
$$P((e_1 \succ e_2) = \text{true}) = \frac{1}{1 + e^{-(s(m, e_1) - s(m, e_2))}}$$
If $s(m, e_1) > s(m, e_2)$, then $P((e_1 \succ e_2) = \text{true}) > 0.5$; otherwise, $P((e_1 \succ e_2) = \text{true}) < 0.5$. The final weights are determined by maximum likelihood estimation with the logistic regression model. Then, the weights $\omega$ are substituted into Equation (5) to compute the weighted feature values, in order to discover and predict new entity links.
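The ranking step can be sketched as follows, assuming that the weighted feature value is the dot product of the weight vector and the feature vector, and that the pairwise probability is a sigmoid of the score difference. The weights, the three features per candidate, and the candidate names are hypothetical placeholders, not values learned in the paper.

```python
import math


def weighted_score(weights, features):
    """Weighted feature value, assumed here to be the dot product w . f."""
    return sum(w * f for w, f in zip(weights, features))


def preference_probability(weights, features_e1, features_e2):
    """Sigmoid of the score difference between two candidate entities."""
    diff = (weighted_score(weights, features_e1)
            - weighted_score(weights, features_e2))
    return 1.0 / (1.0 + math.exp(-diff))


def best_candidate(weights, candidates):
    """Pick the candidate name with the highest weighted feature value."""
    return max(candidates,
               key=lambda name: weighted_score(weights, candidates[name]))


# Hypothetical weights and feature vectors for one mention of "Zhengzhou".
weights = [0.5, 0.3, 0.2]
candidates = {
    "Zhengzhou (city)": [0.9, 0.8, 0.7],
    "Zhengzhou (railway station)": [0.6, 0.4, 0.9],
}
print(best_candidate(weights, candidates))
```

With these placeholder values the city candidate scores higher, so the pairwise probability that it is preferred over the station candidate is above 0.5.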

Discussion
In this section, a case study formalizing a VGE scene of Beijing was constructed using the multi-feature logistic model, and an experiment was designed to test interaction with the VGE through natural language question answering. The results were then analyzed to evaluate the ability of this method to support intelligent interaction with a virtual geographic environment based on a geographic knowledge graph.
The experimental data mainly included three parts: basic geographic information data of the VGE, corpus data, and encyclopedia knowledge base data. The basic geographic information data of the VGE included two-dimensional vector data and three-dimensional model data of the city of Beijing, China. The encyclopedia knowledge graph is CN-DBpedia, which has 80 million entities and 120 million relations.
The experimental platform was a Windows 10 system, the database was MongoDB, and the development environment was Eclipse. The resulting knowledge card is shown in Figure 12. The knowledge card includes the relationships of entities (shown as a "node-edge" graph), a summary of the entity, the infobox of the entity, and the geographic entity category.
Using the knowledge graph, in combination with three-dimensional scenes, question and answer (Q&A)-based VGE interactions were realized. The results are shown in Figure 13. The system supports two types of interaction: (1) Knowledge graph-based interaction. If an entity node is clicked in the knowledge graph visualization view, while the node has a geographic location attribute, the corresponding view on the right side of the map will dynamically jump to the corresponding geographic location. For example, if the entity node with label "Shanghai" (in Chinese) is clicked in the Knowledge Graph, the corresponding virtual scene viewpoint will dynamically jump to Shanghai.
(2) Interaction by Q&A. If "where is the capital of China" is typed into the search box, the entity "Beijing" is retrieved and its related attribute information is displayed in the lower left view. The linked map view on the right side then automatically roams across the global virtual geographic environment to the Beijing area.
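The two interaction modes described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Entity` and `SceneView` classes, the `fly_to` method, the approximate coordinates, and the keyword lookup in `answer_question` are all hypothetical. In particular, the keyword lookup is only a stand-in for the trained question-answering model and does not reproduce it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

LonLat = Tuple[float, float]


@dataclass
class Entity:
    """Hypothetical knowledge graph node."""
    label: str
    attributes: Dict[str, str] = field(default_factory=dict)
    location: Optional[LonLat] = None  # (longitude, latitude), if known


@dataclass
class SceneView:
    """Hypothetical stand-in for the 3D map view."""
    center: Optional[LonLat] = None

    def fly_to(self, location: LonLat) -> None:
        self.center = location


def on_node_click(entity: Entity, view: SceneView) -> bool:
    """Mode 1: move the viewpoint only if the node has a location attribute."""
    if entity.location is None:
        return False
    view.fly_to(entity.location)
    return True


def answer_question(question, triples, entities, view):
    """Mode 2: toy keyword lookup standing in for the Q&A model.

    Returns the answer entity (or None) and moves the linked view to it.
    """
    text = question.lower()
    for (subject, relation), obj in triples.items():
        if subject.lower() in text and relation.lower() in text:
            entity = entities.get(obj)
            if entity is not None:
                on_node_click(entity, view)
            return entity
    return None


# Approximate coordinates, for illustration only.
ENTITIES = {
    "Beijing": Entity("Beijing", {"type": "city"}, (116.40, 39.90)),
    "Shanghai": Entity("Shanghai", {"type": "city"}, (121.47, 31.23)),
    "Capital": Entity("Capital", {"type": "concept"}),  # no location
}
TRIPLES = {("China", "capital"): "Beijing"}

if __name__ == "__main__":
    view = SceneView()
    answer = answer_question("where is the capital of China", TRIPLES, ENTITIES, view)
    print(answer.label, view.center)
```

Routing the Q&A result through the same handler as a node click keeps the two modes consistent: in both cases the viewpoint changes only when the resolved entity carries a location.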
In summary, users can interact with a VGE through human-computer conversation in natural language. The method of intelligent interaction based on the GeoKG can bridge the distance between people and virtual environments. However, our work still has many deficiencies, and further improvements are needed in the following areas:
(1) Semantic transformation of large-scale scene models (such as BIM (building information modeling) models) to support semantic search of and interaction with interior spaces.
(2) Semantic search and question answering over larger geographic areas. We only used geographic data of the Beijing urban area in our experiment, and this data volume is too small for a large-scale knowledge graph. Global crowdsourcing techniques are needed to achieve the semantic extraction and transformation of massive data for global geospatial information.
(3) Natural language question answering needs to be strengthened through training on geographic language. The training data in this study came from a general-purpose corpus; a dedicated geographic corpus therefore needs to be constructed for geo-entity recognition and phrase segmentation, in order to train Q&A models and improve the accuracy of geographic question answering.
(4) The accuracy of natural language question answering needs further improvement. We only used the relevant models to implement Q&A and did not carry out an experimental comparison across different corpora and training models. The next step is a detailed analysis of model training accuracy, in conjunction with a specific geo-corpus, to further improve the accuracy of entity linking and scene interaction.
In short, intelligent interaction with virtual geographic environments could become a research hotspot in this field, and future research could continue to focus on formalized representation models, intelligent services, and applications based on the GeoKG.

Conclusions
VGE knowledge engineering is currently in its infancy. At present, there are few Chinese geographical knowledge graphs, and encyclopedic knowledge graphs lack geographical knowledge. In this paper, geographic knowledge is extracted from the geographic information data of a virtual geographic environment, and virtual geographic environment scenes are formally expressed as spatial knowledge that can be recognized by a computer. Natural language processing technology is used to realize scene interaction with the virtual geographic environment through natural language question answering.
Specifically, with the aim of developing intelligent services for VGEs, we constructed a "spatial-to-semantic" conversion model, proposed a method for extracting geographic knowledge from structured geospatial information, enhanced Chinese geographic knowledge, and proposed a bilateral LSTM-CRF model and a multiple feature-influenced logistic model to implement intelligent interactions with a VGE.
Using geospatial information data, language data, and open knowledge base data, experiments were conducted to build a large-scale VGE knowledge graph. Semantic querying of geographic knowledge can be implemented, and related attribute information of entities can be displayed in the form of knowledge cards. Location information is associated with geo-entities. Furthermore, observation viewpoints of virtual scenes can change with Q&A interaction. This provides a new means for intelligent interaction with VGEs.