3.1. Article Outputs
Geographic information ontology was developed in the 1990s, when ontology was introduced into the field of geographic information science [3]. However, the total number of articles published before 2000 was very small. The situation changed fundamentally after 2001, and persistent explosive growth in the number of papers related to geographic information ontology occurred from 2001 to 2009. Compared to 2001–2009, the growth rate declined slightly in 2010–2016, but the number of papers was larger and relatively stable (Figure 1). According to the growth curve shown in Figure 1 and the characteristics and development of geographic ontology research, the research can be divided into the following three stages.
Embryonic stage (before 2000): During this time, some computer scientists and geographers had already used ontology technology in geographic information science research, such as Smith and Mark [24] and Fonseca and Egenhofer [25]. The growth in the number of papers was slow, and most of these papers were published in comprehensive journals.
Rapid development stage (2001–2009): During this period, 692 papers were published, and 54 countries and regions were involved in the research. The US contributed the largest share of the published papers. The 238 papers from the US accounted for 35.69% of the papers published in this period, followed by China (119 papers, 17.19%), the UK (117 papers, 16.91%), and Germany (81 papers, 11.71%). In this period, geographic information science developed rapidly, and the means of obtaining geospatial data became more diverse. The emergence of geographic information systems, volunteered geographic information, and big data with different systems and structures hindered the communication and sharing of geographic information and made it quite difficult for users to process massive amounts of geographic data [2]. Moreover, the representations of these data focused on structure and ignored their semantics. Providing knowledge-based geographic information services became an urgent issue [12]. Geographic information ontology can remove the obstacles to sharing data across multiple sources, data structures, and classification systems, and can also produce a clear understanding of user requirements, thus providing knowledge-level smart geographic information services [3]. Therefore, studies related to geographic information ontology developed considerably in this stage.
Relatively stable stage (2010–2016): In this stage, 841 papers were published, and a total of 58 countries and regions participated in the research. Although growth slowed somewhat compared with 2001–2009, the number of papers continued to increase, and the annual number fluctuated only slightly. In general, the number of published papers in this period tended to be stable.
3.2. Subject Categories
The disciplinary composition of a specific research field reveals the degree to which the disciplines involved in that field have merged and, to some degree, their impact [29]. Each paper in the WoS is classified into one or more subject categories. In this article, we use statistical analyses of subject categories to reveal the focus of geographic information ontology research. Moreover, the relationships among these subject categories are revealed through cluster relationships based on the co-occurrence of geographic information ontology disciplines.
The 52 subject categories involved in geographic information ontology research occur 2256 times altogether. Note that some papers are classified into more than one category; when the statistical analyses are performed, these different categories are counted separately. For example, if a paper is classified as representing both computer science and physical geography, then we add 1 to the numbers of both computer science and physical geography papers. Computer science (868, 38.47%), engineering (285, 12.63%), and geography (267, 11.83%) are the top three categories in terms of number of publications, and these categories make up 70.61% of the papers published from 2001 to 2016. In addition, remote sensing (128, 5.67%) and mathematics (45, 1.99%) display the fastest growth among all of the subject categories.
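The counting rule described above (a paper with several categories adds 1 to each of them, and percentages are taken over the total number of category occurrences) can be sketched as follows. This is a minimal illustration; the paper records, the field names, and the function `category_shares` are hypothetical and are not taken from the WoS export used in the study.

```python
from collections import Counter

# Hypothetical toy records: each paper lists one or more subject categories.
papers = [
    {"id": "p1", "categories": ["Computer Science", "Physical Geography"]},
    {"id": "p2", "categories": ["Computer Science"]},
    {"id": "p3", "categories": ["Geography", "Remote Sensing"]},
    {"id": "p4", "categories": ["Computer Science", "Engineering"]},
]


def category_shares(papers):
    """Return {category: (count, percent)} using per-occurrence counting."""
    counts = Counter()
    for paper in papers:
        # set() guards against a category listed twice on the same paper.
        counts.update(set(paper["categories"]))
    total = sum(counts.values())  # total occurrences, not number of papers
    return {cat: (n, round(100 * n / total, 2)) for cat, n in counts.items()}


shares = category_shares(papers)
total_occurrences = sum(n for n, _ in shares.values())
```

In this toy data, four papers produce seven category occurrences, so the denominator of each percentage is 7 rather than 4. The same distinction applies to the figures above, where the denominator is the 2256 occurrences rather than the number of papers.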
To study the relationships among subjects in geographic information ontology research and their evolution, the CiteSpace software was used to perform a co-occurrence network cluster analysis of these subjects [19]. The time span from 2001 to 2016 was divided into four equal sections, and the 20 subject categories that occur most frequently in each section were selected to draw the subject co-occurrence network cluster graph, which is shown in Figure 2 and includes 28 nodes and 44 co-occurrence links. In this figure, the colors of the bars at the top represent the year limits, and each node represents a subject category. The colors at the centers of the nodes represent the dates of the earliest appearance of the subjects, and the circular structures that encircle the nodes represent the history since each subject’s appearance. The colors of the circles correspond to those on the time color bar, and the thicknesses of the circles vary in direct proportion to the number of papers published within the time frame. The nodes with purplish-red rings have high betweenness centrality. Betweenness centrality measures how often a node lies on the shortest paths between pairs of other nodes, and it is an index of the importance of a node in the network [19]. The links between nodes represent the co-occurrence relations between pairs of subjects, the thicknesses of the connecting lines are proportional to the number of co-occurrences of each pair of subjects, and the colors of the links correspond to the color of the time bar when both subjects occur together for the first time.
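To make the two quantities behind the figure concrete, the sketch below counts category co-occurrences and computes normalized betweenness centrality with Brandes' algorithm on an unweighted, undirected graph. It is a generic illustration under stated assumptions: the toy category lists and the function names are hypothetical, and CiteSpace's own computation may differ in details such as weighting or scaling.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations


def cooccurrence_links(papers):
    """Count how often each unordered pair of categories shares a paper."""
    links = Counter()
    for cats in papers:
        for a, b in combinations(sorted(set(cats)), 2):
            links[(a, b)] += 1
    return links


def betweenness(adj):
    """Normalized betweenness centrality (Brandes) for {node: set(neighbors)}."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], defaultdict(list)
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each unordered pair was visited twice; divide by (n-1)(n-2) to get
    # the fraction of the (n-1)(n-2)/2 possible pairs.
    scale = 1 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: b * scale for v, b in bc.items()}


# Hypothetical toy data: category lists of four papers.
papers = [
    ["Computer Science", "Geography"],
    ["Computer Science", "Mathematics"],
    ["Mathematics", "Remote Sensing"],
    ["Computer Science", "Geography"],
]
links = cooccurrence_links(papers)
adj = defaultdict(set)
for a, b in links:
    adj[a].add(b)
    adj[b].add(a)
centrality = betweenness(adj)
```

In this toy network, Mathematics appears on only two papers yet lies on every shortest path that reaches Remote Sensing, so its centrality equals that of Computer Science. A node with few papers can therefore still carry a thick centrality ring.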
The figure shows that the subjects are widely interrelated and coordinated. The diversity of these subjects suggests that the study of geographic information ontology is not limited to geographical science and computer-related specialties; instead, it extends to other related subjects. Figure 2 is divided into approximately four clusters, of which Cluster #3 is the largest. This cluster consists of six main subjects, whose published articles are mainly modeling and data processing-related studies. The published articles in Cluster #1 are mainly related to studies of cyber-infrastructure. The published articles in Cluster #2 have economics and energy as their topics. Finally, the main areas of study in Cluster #4 are geography and computer science. The subjects in these four clusters are interconnected, implying that the subjects become connected to one another during development and then progress continuously by learning from other subjects.
In Figure 2, computer science (842) is the subject with the largest number of published papers. The position of computer science at the center of the graph suggests that it is closely related to other branches of study in geographic information ontology; thus, it is one of the most important subjects in this field. Geographic information ontologies are established to permit mutual understanding between computers and humans or between computer systems; intelligent man–machine interaction; interoperation between computer systems; and knowledge expression, sharing, and reuse in computer systems [14]. In recent years, geographic information ontology has been introduced into artificial intelligence and used in studies of deep machine learning to enable computers to serve mankind intelligently [9]. Mathematics (27) has a small number of published papers but is encircled by a very thick purple ring, indicating that this subject has high centrality. Moreover, the mathematics node is the critical node connecting Clusters #2 and #3 (geoscience, economics, geology, and remote sensing), suggesting that the study of geographic information ontology is no exception to the rule that any natural science must become pragmatic using mathematical methods [30]. In the study of geographic information ontology, semantic analysis comes first so that mathematical formalisms can be used to formalize the language in the database. Current ontology studies show insufficient detail and a low degree of formalization, and most of them provide only natural-language or quasi-natural-language terminology and definition databases instead of enabling interaction between computers and humans or between computers. Additional mathematical studies of geographic information ontology are required in the future [2
3.3. Characteristics of Journal Co-Citation
As mentioned above, co-occurrence clustering analysis of subject categories depends on the subject categories of the journals in which the papers were published. The same statement holds for journal co-citation analysis, which depends on the journals that published the cited papers. Journal co-citation refers to the phenomenon in which two journals are cited in one paper, reflecting the connectivity between journals and disciplines. The knowledge base distribution of a field can be revealed through journal co-citation [31
We divide the time from 2001 to 2016 into four periods. In each period, the top 20 journals cited with the greatest frequency by papers about geographic information ontology are selected for analysis, and the Pathfinder algorithm is used for pruning. Moreover, we checked the cited documents corresponding to the cited journals in the network. If a journal's high citation count came from only one or two articles, we removed that journal from the network. After pruning, the network includes forty nodes that represent journals with the highest citation frequencies and fifty-four links representing co-citations. Next, on the basis of topic, we perform a clustering analysis of the papers that cite these journals. The results are shown in Figure 3. This figure contains three obvious clusters. (a) In Cluster #1, the citing articles are mainly papers concerned with geographic information ontology and computers and include frontier terms such as “spatial”, “systems”, “modeling”, and “semantic”. (b) In Cluster #2, the citing articles primarily describe research into geographic information ontology and engineering and include frontier terms such as “knowledge acquisition”, “artificial intelligence”, “fuzzy”, and “flexible”. (c) Cluster #3 is the largest cluster and contains eleven journals, in which the citing articles focus on geographic information ontology, geography, philosophy, and other related fields of study. Here, the frontier terms are “geography”, “difference”, and “post-Marxism”. The three journals with the highest citation frequency are the International Journal of Geographical Information Science, Transactions in GIS, and Knowledge Acquisition (Table 1). These journals are the main sources of the cited documents and represent potential venues for contributions related to geographic information ontology.
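As a rough illustration of journal co-citation counting combined with the manual check described above, the sketch below drops journals whose citations come from too few distinct cited articles and then counts co-cited journal pairs. The data, the journal labels, the function name, and the threshold `min_articles=3` are assumptions made for this example; Pathfinder pruning and the clustering step are not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy input: for each citing paper, the (journal, cited article)
# pairs found in its reference list.
references = {
    "citing1": [("Journal A", "a1"), ("Journal B", "b1"), ("Journal C", "c1")],
    "citing2": [("Journal A", "a2"), ("Journal B", "b2"), ("Journal C", "c1")],
    "citing3": [("Journal A", "a3"), ("Journal B", "b3")],
    "citing4": [("Journal A", "a1"), ("Journal B", "b1"), ("Journal C", "c1")],
}


def cocitation_counts(references, min_articles=3):
    """Return (kept journals, co-citation counts between kept journals)."""
    # Collect the distinct cited articles behind each journal.
    articles = defaultdict(set)
    for refs in references.values():
        for journal, article in refs:
            articles[journal].add(article)
    # Assumed reading of the check: drop a journal whose citations come from
    # only one or two distinct articles.
    kept = {j for j, arts in articles.items() if len(arts) >= min_articles}
    # Two journals are co-cited once for every paper that cites both.
    pairs = Counter()
    for refs in references.values():
        journals = sorted({j for j, _ in refs if j in kept})
        pairs.update(combinations(journals, 2))
    return kept, pairs


kept, pairs = cocitation_counts(references)
```

Here Journal C is cited by three papers, but always through the same article, so it is removed before the pair counts are taken. Journals A and B remain and are co-cited by all four papers.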
More importantly, some of the nodes in Figure 3 are surrounded by purple rings, indicating that the cited publications have high betweenness centrality. The width of the purple ring is proportional to the degree of centrality. These journals have strong co-citation relationships with others in the co-citation network (Table 1). The International Journal of Geographical Information Science has the highest centrality (0.91), followed by the Annals of the Association of American Geographers (0.71), Environment and Planning A (0.47), Knowledge Acquisition (0.15), Communications of the ACM (0.13), and Antipode (0.13).
3.4. Characteristics of Author Co-Citation
Author co-citation refers to the phenomenon in which two authors are cited together in other documents. By computing author co-citation relationships, the interconnections between academic communities and authors within a research field can be revealed [33]. Author co-citation analysis has become a productive method that can be used to uncover the current status of scientific structures as well as their changes, and it can also be used to carry out frontier analyses, domain analyses, and scientific assessments. Figure 4 is a network cluster map that was pruned using the Pathfinder algorithm. This network contains the twenty most-cited authors in each time slice (the time from 2001 to 2016 is divided into four equal time slices) and consists of 53 co-citation nodes and 60 links.
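The node selection described above (equal time slices, then the most-cited items in each slice) can be sketched as follows. The citation records, author labels, and function names are hypothetical, and the sketch covers only the selection step, not the pruning or clustering.

```python
from collections import Counter


def time_slices(start, end, n_slices):
    """Split the inclusive year range [start, end] into n equal slices."""
    span = end - start + 1
    if span % n_slices:
        raise ValueError("year range does not divide evenly")
    width = span // n_slices
    return [(start + i * width, start + (i + 1) * width - 1)
            for i in range(n_slices)]


def top_per_slice(citations, slices, top_n):
    """Union of the top_n most-cited authors of every slice.

    citations is a list of (year of the citing paper, cited author).
    """
    selected = set()
    for first, last in slices:
        counts = Counter(a for year, a in citations if first <= year <= last)
        selected.update(author for author, _ in counts.most_common(top_n))
    return selected


slices = time_slices(2001, 2016, 4)

# Hypothetical toy citation records.
citations = [
    (2001, "Author A"), (2002, "Author A"), (2003, "Author B"),
    (2006, "Author C"), (2006, "Author C"), (2007, "Author A"),
    (2014, "Author D"),
]
nodes = top_per_slice(citations, slices, top_n=1)
```

Because the selection is a union over slices, an author who ranks among the most cited in several slices appears only once. With four slices and twenty authors per slice, the network can therefore hold at most 80 nodes and usually holds fewer.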
In Figure 4, each node corresponds to an author. The sizes of the nodes represent the total numbers of citations received by the authors. The color at the center of a node indicates the earliest time the author was cited. The ring around the node reflects the history of his/her citations, and the color of the ring is consistent with the time color bar. For example, in Figure 4, GUARINO N was first cited by articles on geographic information ontology in WoS between 2001 and 2016 in 2001 (the color of the node center is blue). Since then, his citations have grown steadily (the colors of the rings enclosing the node run from sky-blue through green to yellow). By observing the colors, we can see the citation histories not only of earlier published articles but also of recently published ones. The outermost purple rings enclosing the nodes indicate high betweenness centrality, which means that the scholar is key in connecting different academic clusters. Note that frequently cited authors do not necessarily have high betweenness centrality. If an author with high betweenness centrality has been cited many times, he (or she) may have had a fundamental influence on the development of geographic information ontology research. In addition, the red inner rings surrounding some nodes indicate that the number of citations of those authors’ papers changed (increased or decreased suddenly) over a short period of time. These changes are usually associated with fundamental changes in the study of geographic information ontology.
With respect to the content of the citing articles shown in Figure 4, Cluster #1 focuses on applications of geographic information ontology in urban studies; Cluster #2 studies the theory of geographic information ontology in terms of philosophy; Cluster #3 is primarily composed of semantic normalization studies; Cluster #4 includes several different topics, such as the recognition of geographic information ontologies and artificial intelligence (AI) ontology; and Cluster #5 concentrates on spatial analysis, semantic analysis, etc.
Figure 4 shows that Gruber TR and Guarino have the highest numbers of citations as well as high centrality (Table 2
). According to [37
], in some sense, an ontology is a detailed software specification [37
]. The first widely accepted definition of an ontology, which is due to [38
], states that it is an explicit specification of a conceptualization. On this basis, in 1995, Gruber gave a further definition, namely, that an ontology is a formal and explicit specification of a conceptual model shared within a domain [13]. He also listed five principles of ontology design: clarity and objectivity, coherence, extendibility, minimal encoding bias, and minimal ontological commitment. Subsequently, in 2008, Gruber attempted to comprehensively solve applied problems in social information and semantic information using collective knowledge systems [39
In 1995, Guarino added that an ontology is a logical language model. In addition, in order to differentiate this theory from others, Guarino usually uses the word “ontology” to mean the “ontology of philosophy” [40]. In contrast to epistemology, which studies the origin and essence of human knowledge, his ontology theory focuses on existence [41
] proposed that the concept classification of ontology should be based on the level of specification and domain dependence. To better classify ontology [43
] studied the concept classification of ontology thoroughly and carefully. Through analysis of the nature of concepts, the properties of concepts, and relations among concepts, he provided a theory guiding concept classification [43
]. In the opinion of Guarino, the differences among ontological concepts exist in the definition of concepts as well as their properties. Therefore, he provided further explanation of the consistency and unity of ontological concepts and concentrated on the dependence concept of ontology, which is applied to attributes [44
Cluster #4 occurs in the center of Figure 4 and stands for studies by scholars with significant impacts on geographic information ontology. Apart from Guarino, Smith B is also surrounded by a thick purple ring (the highest centrality in Table 2) with red dots in the center, suggesting that his studies have made important contributions to the development of geographic information ontology. Smith B collaborated with Mark DM in 1998, and they published a paper titled “Ontology and geographic kinds”. That paper sets out specific reasons for carrying out geographic information ontology-related research, thus marking the official position of information ontology in geographic information science [24]. They also co-led a project titled “Geographic classification: ontology investigation” in 1999. In this project, they attempted to clarify geographic objects and ontologies associated with cognitive classification. They therefore carried out a systematic study of different categories of geographic information ontology using a questionnaire, and they performed stress tests on human cognition of some humanistic concepts in different language environments [26]. In 2004, Smith defined the SNAP and SPAN ontologies to distinguish points and periods of time and applied this concept to the representation of processes and spatio-temporal reasoning in spatio-temporal ontologies [46
Kuhn is also a researcher who investigates geographic information ontology. As early as 1993, he proposed that spatial information theory should be oriented to GIS users [47]. In regard to the method of construction of geographic information ontology, Kuhn designed and implemented a construction method based on natural language. By analyzing keywords and their semantic implications in German transport norms and standards, he then extracted an ontology that could be applied to automobile navigation from traffic data [4
]. Moreover, he also proposed the concept of semantic reference systems [49
]. He then continued his work to apply semantic reference systems to semantic transformations among instance objects in different land management systems using the theory and methods of Conceptual Space [51
]. In 2005, he suggested that semantic interactions between geographic information services could be divided into two categories, specifically (1) defining the semantic connotations of metadata corresponding to every service interface for querying and searching semantic information, and (2) defining a mathematical model of semantic interoperation for the expression of semantic knowledge and reasoning using that knowledge [52
According to [53
], ontology can be used in information systems to avoid problems including potential inconsistencies among GISs. Geographic information systems contain information about objects and their attributes at specific locations, and these attributes range from surface elevation to natural land cover. Both natural and artificial elements are involved, which suggests a need for different consistency rules. As a consequence, ontology is necessary for element classification to achieve consistency. Based on the above points, Frank proposed a five-layer classification structure that applies to both GISs and other systems related to large-scale geographic entities. In this classification structure, the top-level ontology is a type of natural entity that is independent of the human mind, whereas the others are associated with epistemology, and different rules correspond to different layers [53
According to [14
], ontology is a theory that describes entities, concepts and the relationship between their features and relative functions using specific words from a specific point of view. The authors of [14
] performed many studies in ontology-based geographic information integration, and they proposed the idea of ontology-driven geographic information systems (ODGISs) using the theory of ontology-driven information systems from Guarino. According to a paper by Fonseca, in different information processing periods, real things processed using information ontology can be divided into four spatial paradigms. In addition, with respect to interoperation, that paper mainly discusses ontology applications in the design and realization of information systems. Subsequently, in 2002, on the basis of the four proposed spatial paradigms described above and the features and reality of geographic information science, Fonseca proposed four new spatial paradigms (physical space, logical space, expression space, and recognition space) to analyze the relationships among the human mind, information systems, and the real world on different levels and from different perspectives [54
In addition, Rovert G and his colleagues have made great contributions to ontology research. In a project at NASA’s JPL (Jet Propulsion Laboratory, Pasadena, CA, USA) that aimed to construct semantic network data for an Earth environment ontology, JPL developed SWEET on the basis of related concepts in the GCMD (Global Change Master Directory). SWEET contains thousands of concepts related to Earth and space science, together with definitions of the relationships among these concepts, developed under the guidance of ISO1911X, and it was published on the internet using OWL (Web Ontology Language) [56]. Pier Luigi Buttigieg and his colleagues have also made substantial progress in environmental ontology research; they have summarized the content and structure of environmental ontology [57]. Their research results also contribute to geospatial ontology research as a whole.
With respect to the topic of citing articles, the citing articles of Cluster #4 mostly focus on subjects such as ontology recognition and AI as well as ontological theory. In addition, this cluster contains many citing articles that show great concern for the semantic-based integration and interoperation of geographic information from a real-world perspective. At present, the rapid growth of AI and the demands of individual users open a new door for geographic information ontology research. Organizing and expressing data on the basis of geographic information ontology to satisfy individual demand will be an important direction of geographic information ontology research [3
From the above analysis, geographic information ontology research initially studied ontological cognition and ontology concepts. With the development of the Internet, more and more information systems have appeared online. Data integration and interoperation between multi-source heterogeneous information systems have become an important problem to be solved. However, neither data integration nor interoperation is simply a matter of connecting databases; any research related to information sharing and interoperation can be traced to semantic heterogeneity [5]. Specifically, in geographic information science, researchers often use ontology to solve this problem. The ontology-based semantic web makes Internet information more specific and semantically rich, enabling computers to perform analysis and reasoning for the intelligent processing of network data and network services. The Internet will then become a universal and powerful information integration and exchange platform [2