Centrality as a Method for the Evaluation of Semantic Resources for Disaster Risk Reduction

Čerba, Otakar; Jedlička, Karel; Čada, Václav; Charvát, Karel

doi:10.3390/ijgi6080237

by Otakar Čerba ¹,*,†, Karel Jedlička ¹,†, Václav Čada ¹,† and Karel Charvát ²,†

¹ Department of Geomatics, University of West Bohemia, Plzeň 306 14, Czech Republic
² Czech Centre for Science and Society, Praha 5, 150 00, Czech Republic
* Author to whom correspondence should be addressed.
† All authors contributed equally to this work.

ISPRS Int. J. Geo-Inf. 2017, 6(8), 237; https://doi.org/10.3390/ijgi6080237

Submission received: 25 April 2017 / Revised: 16 July 2017 / Accepted: 4 August 2017 / Published: 6 August 2017

(This article belongs to the Special Issue Smart Solutions for Disaster Risk Reduction: Big Data Concepts for Disaster Risk Reduction (DRR))

Abstract

Clear and straightforward communication is a key aspect of all human activities related to crisis management. Since crisis management activities involve professionals from various disciplines using different terminology, such communication is difficult to achieve. Semantics as a broad science can help to overcome communication difficulties. This research focuses on the evaluation of available semantic resources, including ontologies, thesauri, and controlled vocabularies, for disaster risk reduction as part of crisis management. The main idea of the study is that the most appropriate source of broadly understandable terminology is a semantic resource that is accepted by, or at least connected to, the majority of other resources. Not only the number of interconnected resources is important, but also the concrete position of the resource in the complex network of Linked Data resources. Although the choice of a resource is usually based on user experience, objective methods based on the semantic centrality of resources can be applied. This centrality can be described by the centrality methods used mainly in graph theory. This article describes the calculation of four types of centrality (Outdegree, Indegree, Closeness, and Betweenness) applied to 160 geographic concepts published as Linked Data and related to disaster risk reduction. Centralities were calculated for graph structures containing particular semantic resources as nodes and identity links as edges. The results show that (with some discussed exceptions) the datasets with high values of centrality serve as important information resources, and they also include more of the 160 preselected geographic concepts. Therefore, they could be considered the most suitable resources of terminology for making communication in the domain easier. The main research goal is to automate the evaluation of semantic resources and to apply a well-known theoretical method (centrality) to the semantic issues of Linked Data. The limits of this study should be noted: the number of tested concepts and the fact that centralities represent just one view on the evaluation of semantic resources.

Keywords:

centrality; Data Network; Linked Data resource; crisis management; semantics

1. Introduction

Disaster risk reduction activities consist of collecting, processing, and visualizing large spatial data sets [1,2,3,4], which can be created as a combination of existing data with links to other data (the Linked Data approach [5,6,7]). The Linked Data approach is one of the most efficient ways to deal with spatial data in terms of data volume, speed of processing, and intelligibility of data presentation and visualization. Linked Data, semantics (which is an integral part of Linked Data), and relevant tools (thesauri, ontologies, knowledge bases, controlled vocabularies, etc.) can contribute to one of the main tasks of disaster risk reduction and early warning activities. This task is connected with the necessity of fast communication, intelligibility, and a common understanding of essential concepts, including their machine processing and the development of advanced tools such as decision support systems [8,9].

This study focuses on geographic and geography-related concepts [10] used in the disaster risk reduction domain. This focus is motivated by the crucial role that geography and related disciplines dealing with spatial information (including geoinformatics, geomatics, and cartography) play in crisis management and disaster risk reduction [1,2,3,4,8,9]: knowledge related to localization or position is crucial for all crisis management and risk reduction activities, and geography is essential in the Linked Data space [11]. Moreover, Reference [7] notes that “geography is another factor that can often connect information from varied topical domains”. Geographical data are also a very important part of the Linking Open Data cloud diagram, which contains specific resources of spatial and geographic data (such as GeoNames.org or LinkedGeoData.org); other very important Linked Data resources (such as DBpedia, AGROVOC, or Wikidata) also include spatial components (for example, data with coordinates or geographical concepts).

The objective of this research is to analyse identity links (details in [7]) in Linked Data resources containing terms from the disaster risk reduction domain and to identify suitable semantic resources. Finding a suitable semantic resource is important not only from the communication point of view, but also from the metadata description point of view. Identity links represent the highest level of Linked Data according to the 5-star ranking system [5]. These links enable the interconnection of independent data resources and construct a network of identical objects. This approach is very important from the point of view of data sharing, understanding, common communication among subjects participating in disaster risk reduction activities, automated data processing, and the derivation of new information or consequences in crisis management (detailed information on the importance of links between Linked Data resources is published in [6,12,13]). As the quantitative criteria for the evaluation of identity links, various types of centrality [14] (details in Section 2 and Section 3) were chosen. The particular types of centrality evaluate resources based on their position in the Linked Data space. This is the main benefit of this research, because the selection of suitable semantic resources is usually driven by the subjective opinion of users, national priorities, or the number of terms published in a resource. The interconnection of resources to semantic information in other Linked Data databases can provide a complex view of the Linked Data structure and help to choose an appropriate resource for a concrete type of information or data.

The article is structured as follows. The first part introduces Linked Data and semantics, including their benefits for disaster risk reduction. This section also contains the constraints of the described research and the detailed structure of the article. The Materials and Methods section describes related works, details of the selected metrics, and ways of collecting and processing sample data. The Results section focuses on the application of particular metrics to a selected sample of geographical concepts used in the disaster risk reduction domain. The results are commented on in the Discussion section. This part contains recommendations for appropriate thesauri or other semantic resources, which is the main goal of this paper. The last part summarizes the conclusions and introduces opportunities for further studies and research.

2. Materials and Methods

The research was realized in the following steps (workflow in Figure 1):

1. Selecting sample geographic objects.
2. Downloading identical representations of geographic objects from various Linked Data resources.
3. Development of Data Networks representing particular concepts (see an example in Figure 2).
4. Application of centrality metrics for resources evaluation.
5. Summarizing information from particular Data Networks.
6. Recommendation of thesauri or other semantic resources based on the results of the quantitative evaluation.

Steps 1, 2, and 3 are described in this section. The application of the metrics to the sample data (statistical evaluation of particular Data Networks) and the summarization of the acquired data (Steps 4 and 5) are the crucial part of the article and are presented in the Results section. Recommendations (Step 6) are given in the Discussion.

Ad 1. The data for the research were selected from keywords of relevant articles focused on disaster risk reduction. The publications were chosen by a method based on Snowball sampling (details and mathematical background of this method in [15]). This method is based on depth-first search of tree or graph structures. In the case of this study, the structure is composed of important publications and their references (bibliography). As the first-level input, the publication “Three-dimensional maps for disaster management” [16] (recommended as a reference paper for the journal special issue) was selected. Three iterations of searching were used. The second level consists of publications [17,18,19,20,21,22,23,24,25,26] (the references of [16] were more numerous, but only publications with keywords were taken into consideration).
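
The expansion of the publication set over several levels of references can be illustrated with a short sketch. This is only an illustration in Python, not the authors' procedure: the `references` mapping, the publication labels, and the function name `snowball` are hypothetical, and visiting the references level by level is an assumption of the sketch.

```python
def snowball(seed, references, levels):
    """Collect publications reachable from `seed` within `levels` levels.

    `references` maps a publication label to the labels it cites
    (hypothetical toy data, not the bibliography used in the study).
    Returns one list of newly found publications per level.
    """
    seen = {seed}
    frontier = [seed]
    by_level = [[seed]]
    for _ in range(levels - 1):
        next_frontier = []
        for publication in frontier:
            for cited in references.get(publication, []):
                if cited not in seen:
                    seen.add(cited)
                    next_frontier.append(cited)
        if not next_frontier:
            break
        by_level.append(next_frontier)
        frontier = next_frontier
    return by_level


# Toy citation structure; "P16" plays the role of the first-level input.
toy_references = {
    "P16": ["P17", "P18"],
    "P17": ["P18", "P30"],
    "P18": ["P31"],
    "P30": ["P40"],
}
sample = snowball("P16", toy_references, levels=3)
```

With three levels, `P40` is not collected because it would only appear at a fourth level; the keywords of the collected publications would then form the candidate terms.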

Finally, a set of 160 items related to disaster risk reduction and its interconnection with geoinformatics, geomatics, cartography, and similar disciplines was created. These items are divided into concepts (from very general items such as collaboration or usability to specific issues such as Web Map Service or participatory GIS) and concrete objects and terms (such as Germany, M. F. Goodchild, Rhinopithecus bieti, Twitter, or Oder). Originally, the set of concepts and objects selected from the keywords was larger (about 350 terms), but the items on the list which were not represented in DBpedia (see below) or did not contain any identical links were replaced.

Ad 2. The search for identity links and semantic resources containing equivalent (or very similar) representations of the same object or concept was performed by a script developed by the authors. The script is written in Bash. It uses the XSLT language for data transformations and open software components: Saxon (XSLT processor), wget (file retrieval and downloading), grep (text processing), xmlstarlet (transformation between the CSV (comma-separated values) and XML (Extensible Markup Language) formats), and Graphviz (export of graphic schemas generated in the DOT graph description language). The input is a table containing the name of each data item and the identifier of the representation of the concept or object in the DBpedia knowledge base. DBpedia was used as the starting point of all searching processes because DBpedia is the crucial central point of the Linked Data space (see the Linking Open Data cloud diagram; http://lod-cloud.net/). The script produces an XML file for each item. This file contains all identity links between particular representations, including acronyms of the subject and object of the relation (in the terminology of RDF (Resource Description Framework) triples), the type of the relation, and a possible error affecting the object of the relation. Moreover, it produces a graphic schema for each object and concept (Figure 2). The Linked Data network is searched using the “Follow Your Nose” approach (mentioned, for example, in [7] or [27]), which is based on sequentially scanning standardized identity links.
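
The sequential scanning of identity links can be sketched as a simple graph traversal. The sketch below is written in Python for illustration and is not the authors' Bash/XSLT script: the function `follow_your_nose`, the callable `fetch_links`, and all identifiers in `toy_web` are made up, and an in-memory dictionary stands in for dereferencing URIs over the network.

```python
from collections import deque


def follow_your_nose(start, fetch_links):
    """Traverse identity links from `start` and return the edges found.

    `fetch_links(uri)` is a hypothetical callable returning the URIs that
    `uri` declares as identical, or None when `uri` cannot be processed.
    """
    edges = []
    failed = set()
    visited = {start}
    queue = deque([start])
    while queue:
        uri = queue.popleft()
        targets = fetch_links(uri)
        if targets is None:
            # The resource is kept as a node but cannot be expanded further.
            failed.add(uri)
            continue
        for target in targets:
            edges.append((uri, target))
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return edges, failed


# Toy stand-in for the Linked Data space (made-up identifiers, no network).
toy_web = {
    "dbpedia:Flood": ["wikidata:flood-item", "yago:Flood"],
    "wikidata:flood-item": ["dbpedia:Flood", "dnb:flood-record"],
    "yago:Flood": ["dbpedia:Flood"],
}
edges, failed = follow_your_nose("dbpedia:Flood", toy_web.get)
```

Here `dnb:flood-record` ends up in `failed`: it is still the object of one link, but no further links can be followed from it.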

The script collected 1171 identical links, which were divided into 3 groups (Figure 3):

Links leading to correct nodes (Linked Data resources);
Links directed to data resources influenced by a semantic error (e.g., HTML view on data instead of real RDF data);
Links targeting data resources containing a technical error (usually a non-working URI, uniform resource identifier).

Further processing concerns only correct links and resources as well as links and resources affected by a semantic error (777 links in total). Although resources in the latter category could not be used to find any other interconnected resources, they are taken into consideration because they can provide interesting new information, which is the reason for using semantic resources.
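
The selection of links for further processing amounts to a filter on the link category; the reported totals imply that 1171 − 777 = 394 links fell into the technical-error group. A minimal Python sketch follows, in which the record layout, the status names, and the resource labels `ResourceX` and `ResourceY` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical link records: (subject, object, status). The labels and the
# status names are assumptions of this sketch, not the script's real output.
links = [
    ("DBpedia", "Wikidata", "correct"),
    ("DBpedia", "Yago", "correct"),
    ("Wikidata", "ResourceX", "semantic_error"),
    ("DBpedia", "ResourceY", "technical_error"),
]

# Correct links and links affected by a semantic error are processed further.
KEPT_STATUSES = {"correct", "semantic_error"}


def split_links(records):
    """Return (kept, dropped) link lists according to their status."""
    kept = [r for r in records if r[2] in KEPT_STATUSES]
    dropped = [r for r in records if r[2] not in KEPT_STATUSES]
    return kept, dropped


kept, dropped = split_links(links)
status_counts = Counter(status for _, _, status in links)
```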

Ad 3. The development of the Data Network [13] (alternatively SameAs Network, e.g., in [28]) is handled by a script written in R using the igraph library. The script transforms the input CSV file containing particular identity links (derived from the XML file generated in the previous step) into a directed graph. Then, the script processes the Data Network and computes the quantitative metrics based on centrality described in the Results section.
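
The transformation of a link list into a directed graph can be sketched as follows. The authors' script uses R with igraph; this Python version only illustrates the idea, and the two-column `source,target` CSV layout, the function name `read_data_network`, and the toy rows are assumptions.

```python
import csv
import io


def read_data_network(csv_text):
    """Build a directed graph {source: set(targets)} from 'source,target' rows."""
    graph = {}
    for source, target in csv.reader(io.StringIO(csv_text)):
        graph.setdefault(source, set()).add(target)
        # Keep resources that only receive links as nodes of the graph.
        graph.setdefault(target, set())
    return graph


# Toy link list for one concept (made-up links between resource acronyms).
toy_csv = "DBpedia,Wikidata\nWikidata,DBpedia\nDBpedia,Yago\nWikidata,DNB\n"
network = read_data_network(toy_csv)
```

Each node of `network` is a semantic resource and each edge is an identity link, which matches the structure on which the centralities are computed.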

The authors realize that centrality is just one method supporting the selection of relevant semantic resources for disaster risk reduction activities. The research will continue by comparing the explicit semantics contained in particular resources (together with domain experts; principles are mentioned in [29,30]) or by testing metrics for the whole network or for edges. The achieved results could be improved by processing a larger number of concepts and objects.

3. Results

Ad 4. Centrality can in general be described as the importance of the position of a node in a graph [13,31,32]. Therefore, this approach can be used to find the most relevant semantic resources for selected concepts and objects. Examples illustrating the application of centrality in the domain of data semantics are available in [13] (which mentions the degree, closeness, and betweenness centrality) and [13,33,34,35,36] (which deal with the indegree, closeness, and betweenness centrality). References [14,37] mention the history of graph centrality research.

Four types of centrality (outdegree, indegree, closeness, and betweenness) are computed for each Data Network representing a particular tested concept selected from the disaster risk reduction domain. The following mathematical formulas (adopted from [37]) define the particular types of centrality of a vertex v in a directed graph G = (V, E), where V is a set of nodes (vertices) v and E is a set of edges e (for Linked Data purposes the weight assigned to each edge is 1; the graph is not weighted).

Outdegree centrality (a part of degree centrality, which is described in [37,38,39,40,41,42,43]) of the vertex v is measured as the number of edges leading from the node v. In the network of semantic resources built on the basis of identity links, the value of the outdegree centrality is the number of other semantic resources that are linked from the resource represented by the vertex v.

$$\mathrm{Outdegree}(v) = \deg^{\mathrm{out}}(v)$$

Indegree centrality is analogous to outdegree centrality: it is computed as the number of edges leading to the node v from other nodes of the graph G. In the described case, this type of centrality shows how many semantic data resources refer to the resource represented by the vertex v.

$$\mathrm{Indegree}(v) = \deg^{\mathrm{in}}(v)$$
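
Both degree-based centralities reduce to counting edges. A minimal Python sketch on a toy Data Network is given here for illustration; the graph, its links, and the function name `degree_centralities` are made up, and the values are raw (not normalized) counts.

```python
def degree_centralities(graph):
    """Return (outdegree, indegree) dictionaries for a directed graph.

    `graph` maps each node to the set of nodes it links to.
    """
    outdegree = {node: len(targets) for node, targets in graph.items()}
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    return outdegree, indegree


# Toy Data Network for one concept (made-up links between resources).
toy_network = {
    "DBpedia": {"Wikidata", "Yago"},
    "Wikidata": {"DBpedia", "DNB"},
    "Yago": {"DBpedia"},
    "DNB": set(),
}
out_deg, in_deg = degree_centralities(toy_network)
```

In this toy network `DNB` is referenced once but links to nothing, so its indegree is 1 and its outdegree is 0.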

Closeness centrality [37,38,39,40,41,42,43,44] is based on the shortest path lengths between a particular vertex v and the other nodes in the graph G; it is defined as the reciprocal of the sum of these lengths. High values of closeness centrality in the case described in this text mean that the concrete semantic resource is close to other resources. This makes it easy to move through the network of resources and to acquire new information.

$$\mathrm{Closeness}(v) = \frac{1}{\sum_{y} d(y, v)}$$

where d(y, v) is the length of the shortest path between nodes y and v in the graph G.
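
The closeness formula can be evaluated with a breadth-first search, since all edges have weight 1. The Python sketch below is illustrative only: the toy network is made up, distances are measured from y to v as in the formula, and skipping nodes that cannot reach v (and returning 0.0 when no node can) is an assumption of the sketch, not a documented choice of the study.

```python
from collections import deque


def closeness(graph, v):
    """Closeness(v) = 1 / sum_y d(y, v) on an unweighted directed graph.

    Assumption of this sketch: nodes that cannot reach v are skipped,
    and 0.0 is returned when no other node can reach v.
    """
    # Reverse the edges so that a BFS from v yields the distances d(y, v).
    reverse = {node: set() for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].add(source)
    dist = {v: 0}
    queue = deque([v])
    while queue:
        node = queue.popleft()
        for previous in reverse[node]:
            if previous not in dist:
                dist[previous] = dist[node] + 1
                queue.append(previous)
    total = sum(dist.values())
    return 1.0 / total if total else 0.0


# Toy Data Network for one concept (made-up links between resources).
toy_network = {
    "DBpedia": {"Wikidata", "Yago"},
    "Wikidata": {"DBpedia", "DNB"},
    "Yago": {"DBpedia"},
    "DNB": set(),
}
closeness_dbpedia = closeness(toy_network, "DBpedia")
closeness_dnb = closeness(toy_network, "DNB")
```

`DBpedia` is reached in one step from both `Wikidata` and `Yago`, so the sum of distances is 2; `DNB` is reached in 1, 2, and 3 steps, so the sum is 6 and its closeness is lower.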

Betweenness centrality [13,31,37,39,40,41,43,44] is defined in terms of how “inbetween” a vertex is among the other vertices in the graph [14]. High values of the betweenness centrality in the network of semantic resources mean that the node could be a “bridge” among several independent (not directly interconnected) parts of the network.

$$\mathrm{Betweenness}(v) = \sum_{s \neq t \neq v} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where σ_st is the total number of shortest paths in the graph G from node s to node t and σ_st(v) is the number of those shortest paths that pass through the vertex v.
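
The betweenness formula can be evaluated directly by counting shortest paths with a breadth-first search from every node. The Python sketch below favours readability over speed and is not the igraph implementation used by the authors; the toy network and the function names are made up. It relies on the fact that a shortest path from s to t passes through v exactly when d(s, v) + d(v, t) = d(s, t), in which case σ_st(v) = σ_sv · σ_vt.

```python
from collections import deque


def shortest_path_counts(graph, source):
    """BFS from `source`: distance and number of shortest paths to each node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                sigma[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                sigma[nxt] += sigma[node]
    return dist, sigma


def betweenness(graph, v):
    """Sum of sigma_st(v) / sigma_st over ordered pairs s != t, both != v."""
    info = {s: shortest_path_counts(graph, s) for s in graph}
    dist_v, sigma_v = info[v]
    total = 0.0
    for s in graph:
        dist_s, sigma_s = info[s]
        if s == v or v not in dist_s:
            continue
        for t in graph:
            if t in (s, v) or t not in dist_s or t not in dist_v:
                continue
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


# Toy Data Network for one concept (made-up links between resources).
toy_network = {
    "DBpedia": {"Wikidata", "Yago"},
    "Wikidata": {"DBpedia", "DNB"},
    "Yago": {"DBpedia"},
    "DNB": set(),
}
scores = {node: betweenness(toy_network, node) for node in toy_network}
```

In this toy network every path into `DNB` runs through `Wikidata`, which is what gives `Wikidata` a non-zero betweenness, while `DNB` itself lies on no shortest path between other nodes.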

An optimal semantic resource from the view of centrality has the following properties:

It is connected to many other resources.
It is referenced from many other resources.
It is close to other resources.
It interconnects independent subgraphs of the network.

Table 1 summarizes the properties of particular types of centrality. It is not possible to rank the described centralities against each other, because they are not variants of one method; they are complementary, each expressing a different kind of position of the node in a graph.

Both the centrality values and the Data Network are computed in R using the igraph library. The normalization is performed by multiplying the raw values by n − 1, where n is the number of vertices in the graph.

Ad 5. Centrality values for particular Data Networks representing the occurrence of concepts and objects are summarized by computing the average of each centrality value. This step is implemented with XSLT (Extensible Stylesheet Language Transformations) templates, which find the relevant values for each semantic resource and compute the averages.
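
The averaging step can be sketched in a few lines. The authors implement it with XSLT templates; the Python version below is only an illustration with made-up values, and it assumes that a resource is averaged over the Data Networks in which it occurs, which the text does not state explicitly.

```python
from collections import defaultdict


def average_by_resource(per_concept):
    """Average one centrality type per resource across Data Networks.

    `per_concept` maps a concept to {resource: centrality value}.
    Assumption: each resource is averaged over the networks it occurs in.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for values in per_concept.values():
        for resource, value in values.items():
            sums[resource] += value
            counts[resource] += 1
    return {resource: sums[resource] / counts[resource] for resource in sums}


# Made-up indegree values for two concepts, not results of the study.
toy_indegree = {
    "Flood": {"DBpedia": 2.0, "Wikidata": 1.0, "Yago": 1.0},
    "Oder": {"DBpedia": 4.0, "Wikidata": 2.0},
}
averages = average_by_resource(toy_indegree)
```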

Table 2 shows the results of the centrality computation. Particular columns contain average values of the four used types of centrality calculated for each selected term related to disaster risk reduction. In the first column, there are acronyms of semantic resources. The highest values in each category (type of centrality) are emphasized in Table 2. These results are discussed in the next section.

4. Discussion

The results published in the previous section indicate the following about suitable semantic resources for disaster risk reduction:

Disaster risk reduction is a very large and multi-disciplinary field. Therefore, the portfolio of tested terms (keywords) is very heterogeneous. It contains specific terms (e.g., disaster response), general terms (e.g., accessibility, attention), geographical or personal names, and many concepts from other domains (information technologies, cartography, economics).
The automated searching process found 30 relevant semantic resources using the Linked Data approach (Figure 4 and Table 2). However, the average resource contains just 24 of the 160 tested concepts or objects. Only eight resources have better-than-average values of occurrence of the tested keywords.
Only two semantic resources contain all tested terms (Figure 4). In the case of DBpedia, this follows from the selected method of searching the Linked Data network (see Materials and Methods). Wikidata is the second most important resource from the view of occurrence of concepts or objects related to disaster risk reduction. This information shows that the role of Wikidata in the world of Linked Data is much more significant and that it competes with DBpedia [45]. Both resources (DBpedia and Wikidata) represent the most complex semantic knowledge bases for disaster risk reduction purposes. This is evident not only from Figure 4, but also from Table 2, where Wikidata and DBpedia have the highest values in all types of centrality.
Because all values in Table 2 are normalized, a simple sum can be used as the overall indicator. In addition to DBpedia and Wikidata mentioned above, there are other interesting semantic resources: Biblioteca Nazionale Centrale di Firenze, Yago, Deutsche Nationalbibliothek, Library of Congress Name Authority File, and NDL (National Diet Library). In addition to high centrality values (especially closeness centrality), these resources have better-than-average occurrence of tested concepts. It is also interesting to note that all of these resources (except Yago) come from the domain of libraries.
There are important data sets missing in the set of semantic resources, such as AGROVOC, EuroVoc, GEMET (GEneral Multilingual Environmental Thesaurus), NAL (National Agricultural Library), or STW (Standard-Thesaurus Wirtschaft) Thesaurus for Economics. This is caused by the selected method of data exploitation, because none of them is connected to DBpedia or to other resources related to DBpedia. This isolation of the group of the above-mentioned thesauri or ontologies is also evident from other research (e.g., [29]). The authors tested a searching process starting in AGROVOC, but the results were not satisfactory due to the low number of tested terms contained in AGROVOC.
Geographical concepts [10] and objects represent a specific case of disaster risk reduction terms. In addition to the above-mentioned semantic resources, they are contained in specific thesauri, ontologies, or gazetteers such as GeoNames.org, LinkedGeoData (a Linked Data version of OpenStreetMap), or the FAO Geopolitical Ontology (the latter is not covered in this research).
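
The overall indicator mentioned in the list above, a simple sum of the normalized centrality values of each resource, can be sketched as follows. This Python fragment is illustrative only: the resource names and all numbers are made up and are not the values of Table 2.

```python
def overall_indicator(table):
    """Sum the normalized centrality values of each resource."""
    return {resource: sum(values.values()) for resource, values in table.items()}


# Made-up normalized values, NOT the figures from Table 2.
toy_table = {
    "ResourceA": {"outdegree": 0.5, "indegree": 0.25,
                  "closeness": 0.5, "betweenness": 0.25},
    "ResourceB": {"outdegree": 0.25, "indegree": 0.5,
                  "closeness": 0.25, "betweenness": 0.0},
}
scores = overall_indicator(toy_table)
# Resources ordered from the highest to the lowest overall indicator.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Such a sum weights the four centralities equally, which is a simplification, since each of them expresses a different kind of position in the network.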

5. Conclusions

Linked Data are very important for all disciplines related to spatial data and geographic concepts. Linked Data in general (through the explicit semantics quite often provided by identity links between various semantic resources) support better and more intelligible communication. Fast and clear communication is very important for disaster risk reduction and early warning activities to prevent risk situations or minimize the impact of a risk situation. Therefore, the presented research is focused on evaluating identity links between semantic resources in the Linked Data space to find the most suitable resources for disaster risk reduction purposes.

As quantitative criteria for the evaluation of identity links, various types of centrality (indegree, outdegree, closeness, and betweenness centrality) were chosen. Centrality is able to find a node (representing a semantic resource) in a graph (a Data Network in the case of this study) with the most advantageous position with regard to the other vertices of the graph. The developed scripts, coded in the R language and XSLT, search for identity links of relevant concepts and objects connected to disaster risk reduction, compute values of centrality for particular concepts and objects, and summarize these values for semantic resources. The authors found more than 350 concepts and objects from keywords of essential publications dealing with the topic domain of this article; 160 relevant concepts and objects were selected and processed by the above-mentioned scripts.

The wide scope of the disaster risk reduction domain includes not only specific terms, but also concepts from information technologies, management, demography, and geomorphology, as well as geographical, biological, and personal names.

There are four essential conclusions following from this study:

DBpedia and Wikidata (as the most important resources in the Linked Data space) are the most relevant resources for the studied domain as well. Wikidata plays the role of a hub (a resource interlinked to other resources) and a bridge (a component connecting not-interlinked groups of resources). These conclusions follow from the values of the outdegree and betweenness centrality. DBpedia represents an authority among Linked Data resources in the field of disaster risk reduction (derived from the indegree centrality values). Based on the closeness centrality, DBpedia is also a central node of the Linked Data space in the case of the domain processed in this article.
There are several interesting resources (e.g., Biblioteca Nazionale Centrale di Firenze, Deutsche Nationalbibliothek, or Library of Congress Name Authority File), usually coming from library science.
Many interesting semantic resources related to agriculture or environmental protection (e.g., AGROVOC or GEMET) contain several disaster risk reduction concepts, but they are not linked to DBpedia.
There are several specific semantic resources for geographical objects, such as GeoNames.org or LinkedGeoData.

Information from Linked Data is undoubtedly useful. However, low reliability was identified (e.g., missing identity links between identical objects, technical errors of semantic resources, and missing explicit semantics such as definitions and descriptions). This fact should be interpreted not as a problem of the Linked Data approach, but as an opportunity for domain experts to participate in Linked Data initiatives and improve shared information as well as awareness of their domain.

Acknowledgments

This publication was supported by the project LO1506 of the Czech Ministry of Education, Youth and Sports.

Author Contributions

The research was designed by Otakar Čerba. Karel Jedlička, Václav Čada and Karel Charvát proposed the method of selection of tested terms and concepts. They also processed the impact of the research and results. Otakar Čerba and Karel Jedlička developed the evaluation method for identity links. All authors cooperated in centrality computation (experimental phase of the research). The article was written by Otakar Čerba and revised by Karel Jedlička, Karel Charvát and Václav Čada.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zerger, A.; Smith, D.I. Impediments to using GIS for real-time disaster decision support. Comput. Environ. Urban Syst. 2003, 27, 123–141.
2. Schmidt-Thomé, P. The spatial effects and management of natural and technological hazards in Europe. In Final Report of the European Spatial Planning and Observation Network (ESPON) Project; Geological Survey of Finland: Espoo, Finland, 2005; pp. 1–197.
3. Tran, P.; Shaw, R.; Chantry, G.; Norton, J. GIS and local knowledge in disaster management: A case study of flood risk mapping in Viet Nam. Disasters 2009, 33, 152–169.
4. Konecny, M.; Zlatanova, S.; Bandrova, T.L. Geographic Information and Cartography for Risk and Crisis Management; Springer: Heidelberg, Germany, 2010.
5. Berners-Lee, T. Linked Data-Design Issues. 2006. Available online: http://www.w3.org/DesignIssues/LinkedData.Html (accessed on 3 August 2017).
6. Bizer, C.; Heath, T.; Berners-Lee, T. Linked data-the story so far. In Semantic Services, Interoperability and Web Applications; Emerging Concepts; Springer: Heidelberg, Germany, 2009; pp. 205–227.
7. Heath, T.; Bizer, C. Linked data: Evolving the web into a global data space. In Synthesis Lectures on the Semantic Web: Theory and Technology; Morgan & Claypool Publishers: London, UK, 2011; pp. 1–136.
8. Reznik, T.; Horakova, B.; Szturc, R. Geographic Information for Command and Control Systems Demonstration of Emergency Support System. In Intelligent Systems for Crisis Management: Geo-Information for Disaster Management (GI4DM) 2012; Lecture Notes in Geoinformation and Cartography, 1st ed.; Zlatanova, S., Dilo, A., Peters, R., Scholten, H., Eds.; Springer-Verlag: Berlin, Germany, 2013; pp. 263–275.
9. Reznik, T.; Horakova, B.; Szturc, R. Advanced methods of cell phone localization for crisis and emergency management applications. Int. J. Digit. Earth 2003, 8, 259–272.
10. Kavouras, M.; Kokla, M. Theories of Geographic Concepts: Ontological Approaches to Semantic Integration; CRC Press, Taylor & Francis Group: Boca Raton, FL, USA, 2007; pp. 1–352.
11. Hart, G.; Dolbear, C. Linked Data: A Geographic Perspective; CRC Press, Taylor & Francis Group: Boca Raton, FL, USA, 2013.
12. Wood, D.; Zaidman, M.; Ruth, L.; Hausenblas, M. Linked Data; Manning Publications Co.: Shelter Island, NY, USA, 2014; p. 336.
13. Guéret, C.; Groth, P.; Stadler, C.; Lehmann, J. Assessing Linked Data Mappings Using Network Measures. In Extended Semantic Web Conference; Springer: Heidelberg, Germany, 2012; pp. 87–102.
14. Freeman, L.C. A set of measures of centrality based on betweenness. Sociometry 1977, 40, 35–41.
15. Goodman, L.A. Snowball sampling. Ann. Math. Stat. 1961, 32, 148–170.
16. Bandrova, T.; Zlatanova, S.; Konecny, M. Three-dimensional maps for disaster management. In ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences; Copernicus GmbH: Göttingen, Germany, 2012.
17. Bandrova, T.; Konecny, M. Mapping of Nature Risks and Disasters for Educational Purposes. Kartografija i Geoinformacije 2006, 5, 4–12.
18. Snoeren, G.; Zlatanova, S.; Crompvoets, J.; Scholten, H. Spatial Data Infrastructure for emergency management: The view of the users; Vrije Universiteit Amsterdam: Amsterdam, The Netherlands, 2007.
19. Charvat, K.; Kubicek, P.; Talhofer, V.; Konecny, M.; Jezek, J. Spatial data infrastructure and geovisualization in emergency management. In Resilience of Cities to Terrorist and Other Threats; Springer: Heidelberg, Germany, 2008; pp. 443–473.
20. Zlatanova, S. SII for Emergency Response: The 3D Challenges; ISPRS Archives–Volume XXXVII Part B4; Chen, J., Jiang, J., Nayak, S., Eds.; Copernicus GmbH: Beijing, China, 2008; pp. 1631–1637.
21. Goodchild, M.F.; Glennon, J.A. Crowdsourcing geographic information for disaster response: A research frontier. Int. J. Digit. Earth 2010, 3, 231–241.
22. Kozel, J.; Štampach, R. Practical experience with a contextual map service. In Geographic Information and Cartography for Risk and Crisis Management; Springer: Heidelberg, Germany, 2010; pp. 305–316.
23. Zlatanova, S.; Dilo, A. A Data Model for Operational and Situational Information in Emergency Response: The Dutch Case; ISPRS: Torino, Italy, 2010.
24. Bell, R.; Glade, T. Multi-hazard analysis in natural risk assessments. WIT Trans. State Art Sci. Eng. 2011, 1, 1–371.
25. Konecny, M.; Kubicek, P.; Stachon, Z.; Sasinka, C. The usability of selected base maps for crises management: Users’ perspectives. Appl. Geomat. 2011, 3, 189–198.
26. Reznik, T.; Horáková, B.; Janiurek, D. Emergency support system: Actionable real-time intelligence with fusion capabilities and cartographic displays. Adv. Mil. Technol. 2011, 6, 83–97.
27. Hausenblas, M. Exploiting linked data to build web applications. IEEE Int. Comput. 2009, 13, 68–73.
28. Ding, L.; Shinavier, J.; Shangguan, Z.; McGuinness, D.L. SameAs networks and beyond: Analysing deployment status and implications of owl: SameAs in linked data. In Proceedings of the International Semantic Web Conference, Shanghai, China, 7–11 November 2010.
29. Cerba, O.; Jedlicka, K. Geomatic Concepts in Agriculture Thesauri. AGRIS On-Line Papers Econ. Inform. 2015, 7, 33.
30. Cerba, O.; Jedlicka, K. Linked Forests: Semantic similarity of geographical concepts “forest”. Open Geosci. 2016, 8, 556–566.
31. Guéret, C.; Groth, P.; Van Harmelen, F.; Schlobach, S. Finding the achilles heel of the web of data: Using network analysis for link-recommendation. In International Semantic Web Conference; Springer: Heidelberg, Germany, 2010; pp. 289–304.
32. Coursey, K.; Mihalcea, R. Topic identification using Wikipedia graph centrality. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume: Short Papers; Association for Computational Linguistics: Stroudsburg, PA, USA, 2009; pp. 117–120.
33. Hakimov, S.; Oto, S.A.; Dogdu, E. Named entity recognition and disambiguation using linked data and graph-based centrality scoring. In Proceedings of the 4th International Workshop on Semantic Web Information Management, Scottsdale, AZ, USA, 20 May 2012.
34. Zaveri, A.; Rula, A.; Maurino, A.; Pietrobon, R.; Lehmann, J.; Auer, S. Quality Assessment for Linked Data: A Survey; Semantic Web, IOS Press: Amsterdam, The Netherlands, 2016; pp. 63–93.
35. Cheng, G.; Tran, T.; Qu, Y. RELIN: Relatedness and Informativeness-Based Centrality for Entity Summarization; Springer: Heidelberg, Germany, 2011; pp. 114–129.
36. Sinha, R.; Mihalcea, R. Unsupervised Graph-Based Word Sense Disambiguation Using Measures of Word Semantic Similarity. In Proceedings of the IEEE International Conference Semantic Computing (ICSC), Irvine, CA, USA, 17–19 September 2007.
37. Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. 1978, 1, 215–239.
38. Haythornthwaite, C. Social network analysis: An approach and technique for the study of information exchange. Libr. Inf. Sci. Res. 1996, 18, 323–342.
39. Cimenler, O.; Reeves, K.A.; Skvoretz, J. A regression analysis of researchers’ social network metrics on their citation performance in a college of engineering. J. Inf. 2014, 8, 667–682.
40. Freeman, L. The Development of Social Network Analysis. A Study in the Sociology of Science; BookSurge, LLC: North Charleston, SC, USA, 2004.
41. Varlamis, I.; Eirinaki, M.; Louta, M. A study on social network metrics and their application in trust networks. In Proceedings of the IEEE International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2010), University of Southern Denmark, Odense, Denmark, 9–11 August 2010; pp. 168–175.
42. Zimmermann, T.; Nagappan, N. Predicting defects using network analysis on dependency graphs. In Proceedings of the 2008 ACM/IEEE 30th International Conference on Software Engineering (ICSE’08), Leipzig, Germany, 10–18 May 2008.
43. Borgatti, S.P.; Everett, M.G. A graph-theoretic perspective on centrality. Soc. Netw. 2006, 28, 466–484.
Thung, F.; Lo, D.; Osman, M.H.; Chaudron, M.R. Condensing class diagrams by analyzing design and network metrics using optimistic classification. In Proceedings of the 22nd International Conference on Program Comprehension; ACM: New York, NY, USA, 2014; pp. 110–121. [Google Scholar]
Macura, J. Porovnání projektů Wikidata a DBpedia Jako Zdrojů Prostorových dat: Comparison of Wikidata and DBpedia Projects as Spatial Data Sources, 2016. Available online: https://otik.uk.zcu.cz/xmlui/handle/11025/23748?show=full (accessed on 3 August 2017).

Figure 1. The workflow of the research.

Figure 2. Data Network of the term “M. F. Goodchild”.

Figure 3. Structure of identical links.

Figure 4. Occurrence of the tested concepts and objects in semantic resources.

Table 1. Types of centrality.

Centrality	Properties
Indegree	The normalized number of nodes that link to the node for which the centrality is computed (incoming edges). In this article, a high indegree centrality means that the resource is directly referenced by many other resources.
Outdegree	The normalized number of nodes that the node for which the centrality is computed links to by a directed edge (outgoing edges). In this article, a high outdegree centrality means that the resource contains many links to other resources.
Closeness	Shows how close the node is to the other nodes in the graph. In the case of Linked Data it does not play a very important role, because the data networks are not very large (tens of nodes).
Betweenness	Identifies weak points of the graph: nodes (resources) that form bridges between otherwise independent parts of the data network.
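
The four measures in Table 1 can be illustrated on a toy directed graph. The following is a minimal sketch, not the authors' implementation: the node names (`a`, `b`, `hub`, `c`) and the function names are hypothetical, degree values are normalized by n - 1, closeness is assumed to follow outgoing links and to be scaled by the share of reachable nodes, and betweenness is assumed to be computed with Brandes' accumulation and normalized by (n - 1)(n - 2). The article does not state which variants were used, so the numbers are only indicative.

```python
from collections import deque

def shortest_paths(graph, source):
    """BFS over outgoing edges: distances, path counts, predecessors, visit order."""
    dist = {source: 0}
    sigma = {v: 0 for v in graph}   # number of shortest paths from source
    sigma[source] = 1
    preds = {v: [] for v in graph}  # predecessors on shortest paths
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order

def centralities(graph):
    """graph maps every node to the set of nodes it links to (needs >= 2 nodes)."""
    nodes = list(graph)
    n = len(nodes)
    indeg = {v: 0 for v in nodes}
    for v in nodes:
        for w in graph[v]:
            indeg[w] += 1
    between = {v: 0.0 for v in nodes}
    result = {}
    for s in nodes:
        dist, sigma, preds, order = shortest_paths(graph, s)
        reachable = len(dist) - 1
        total = sum(dist.values())
        # Assumed closeness variant: scaled by the fraction of reachable nodes.
        closeness = (reachable / (n - 1)) * (reachable / total) if total else 0.0
        result[s] = {
            "outdegree": len(graph[s]) / (n - 1),
            "indegree": indeg[s] / (n - 1),
            "closeness": closeness,
        }
        # Brandes' back-propagation of pair dependencies.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                between[w] += delta[w]
    scale = (n - 1) * (n - 2)
    for v in nodes:
        result[v]["betweenness"] = between[v] / scale if scale else 0.0
    return result

# Hypothetical toy network: two resources link to "hub", which links to "c".
toy = {"a": {"hub"}, "b": {"hub"}, "hub": {"c"}, "c": set()}
scores = centralities(toy)
for name, values in scores.items():
    print(name, {k: round(x, 3) for k, x in values.items()})
```

In this toy network `hub` has the highest indegree (two of the three other nodes reference it) and is the only node with non-zero betweenness, because every path from `a` or `b` to `c` passes through it.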

Table 2. Average centrality values of semantic resources.

Acronym	Outdegree	Indegree	Closeness	Betweenness
AA	0.0317	0.0016	0.0242	0.0001
AU	0.0006	0	0.0005	0
AV	0.0102	0	0.0088	0
BC	0.0457	0.1081	0.1260	0.0331
BE	0.0016	0	0.0015	0
BF	0.0388	0	0.0271	0
DB	0.0598	0.6853	0.7801	0.0477
DN	0.1041	0	0.0752	0
EI	0.0003	0	0.0003	0
ES	0.0025	0	0.0009	0
FA	0.0032	0.0050	0.0116	0
GA	0.0022	0	0.0020	0
GN	0.0132	0.0052	0.0156	0.0011
IE	0.0007	0	0.0006	0
IR	0.0035	0.0018	0.0022	0.0001
IS	0.0028	0	0.0020	0
LA	0.0653	0.0047	0.0418	0
LG	0.0027	0.0042	0.0096	0
LI	0.0036	0	0.0031	0
LW	0.0018	0.0024	0.0052	0
MB	0.0015	0	0.0014	0
ND	0.0605	0	0.0478	0
NI	0.0006	0	0.0006	0
NK	0.0030	0	0.0027	0
OE	0.0009	0	0.0009	0
TI	0.0019	0.0038	0.0075	0.0004
VI	0.0162	0.0364	0.0412	0.0080
WB	0.0009	0.0038	0.0075	0
WD	0.4966	0.2950	0.4921	0.1425
YA	0.1808	0	0.1109	0

Explanation of acronyms in Table 2 (the same acronyms are used throughout the article): AA—Getty Art & Architecture Thesaurus, AU—Libraries Australia, AV—AGROVOC, BC—Biblioteca Nazionale Centrale di Firenze, BE—Biblioteca Nacional de España, BF—Bibliothèque Nationale de France, DB—DBpedia, DN—Deutsche Nationalbibliothek, EI—Eionet, ES—Eurostat Linked Statistics, FA—FAST Linked Data, GA—GADM, GN—GeoNames.org, IE—Institut National de la Statistique et des Études Économiques, IR—Identifiants et Référentiels, IS—International Standard Name Identifier, LA—Library of Congress Name Authority File, LG—LinkedGeoData, LI—LIUC Thesauro di Economia e Scienze Sociali, LW—Linked Web APIs, MB—MusicBrainz, ND—National Diet Library, NI—National Library of Israel, NK—Databáze Národní knihovny ČR, OE—OpenEI, TI—Transparency International, VI—Virtual International Authority File, WB—World Bank Linked Data, WD—Wikidata, YA—Yago.
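
The averages in Table 2 can be ranked per measure to see which resources lead on each one. The short sketch below uses only a subset of rows copied from Table 2; the names `table2`, `measures`, and `rank` are hypothetical helpers introduced for illustration.

```python
# Subset of rows copied from Table 2:
# (outdegree, indegree, closeness, betweenness)
table2 = {
    "BC": (0.0457, 0.1081, 0.1260, 0.0331),
    "DB": (0.0598, 0.6853, 0.7801, 0.0477),
    "DN": (0.1041, 0.0, 0.0752, 0.0),
    "VI": (0.0162, 0.0364, 0.0412, 0.0080),
    "WD": (0.4966, 0.2950, 0.4921, 0.1425),
    "YA": (0.1808, 0.0, 0.1109, 0.0),
}
measures = ("outdegree", "indegree", "closeness", "betweenness")

def rank(table, measure):
    """Return resource acronyms sorted by the given measure, highest first."""
    i = measures.index(measure)
    return sorted(table, key=lambda name: table[name][i], reverse=True)

# Leading resource for each centrality measure within this subset.
top = {m: rank(table2, m)[0] for m in measures}
print(top)
```

For these rows, Wikidata (WD) leads on outdegree and betweenness, while DBpedia (DB) leads on indegree and closeness.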

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

