Article
Peer-Review Record

Systematic Construction of Knowledge Graphs for Research-Performing Organizations

Information 2022, 13(12), 562; https://doi.org/10.3390/info13120562
by David Chaves-Fraga 1,*, Oscar Corcho 1, Francisco Yedro 1, Roberto Moreno 2, Juan Olías 2 and Alejandro De La Azuela 2
Submission received: 14 October 2022 / Revised: 17 November 2022 / Accepted: 24 November 2022 / Published: 30 November 2022
(This article belongs to the Special Issue Knowledge Graph Technology and Its Applications)

Round 1

Reviewer 1 Report

The paper proposes a methodology for the construction of knowledge graphs from relational data sources. The authors apply declarative mapping rules to populate the target ontology. Templates of the rules are generated on the basis of the ontology and refined afterwards. The templates are converted into complete mapping rules on the basis of the structure of the relevant source database tables. After that, the rules are validated, first automatically (using an R2RML engine) and then manually by knowledge engineers and domain experts.

The aim of the authors is to support Spanish-speaking research organizations worldwide. The methodology is applied to the integration of knowledge from the databases of several Spanish organizations. But it seems the methodology is independent of the language, so “Spain” should be removed from the title. Moreover, the methodology seems to be independent of the target ontology and applicable not only to research organizations. Maybe the title should be generalized in this way too.

I consider the topic of the paper in general (knowledge graph population) to be timely, and the application of declarative mapping rules is the right way to address it.

Nevertheless, several improvements can be recommended. The methodology includes many manual activities, and these should be commented on in more detail:

1. The step of refining the mapping templates (lines 227-232) should be commented on in more detail. A list of the types of potential inconsistencies should be given, accompanied by examples. Examples of “detailed comments on the mapping” would also be interesting.

2. The step of systematically filling the mapping rules should be commented on in more detail. It should be illustrated with examples, and the various kinds of filling should be distinguished if required. What are the explicit conditions for applying SQL scripts for preliminary data transformation?

3. What does “if the generated RDF is the expected one” mean, formally, in the initial testing and validation process [line 268]?

4. The step of validating the mappings with experts should be commented on in more detail. The different kinds of expert activities should be distinguished, if possible, and illustrated with examples.

Additional comments:

1. Is it possible to apply schema matching tools for “preliminary analysis of the database to ensure that the most relevant concepts of the ontology were also represented in the tables”?

2. In “Sustainable procedures” [lines 385-399], the authors mention the important issue of updating the target ontology (knowledge graph). It seems that this is not part of the proposed methodology. Is it future work to include this issue in the methodology?

Author Response

Comment1: It seems the methodology is independent of the language, so “Spain” should be removed from the title.

Answer: We have modified the title, removing the country constraint.

 

Comment2: The methodology seems to be independent of the target ontology and applicable not only to research organizations. Maybe the title should be generalized in this way too.

Answer: We already state several times in the paper that the proposed methodology can be reused or extended for other use cases with other ontologies and input data sources (line 103). We have added a sentence to the conclusions to remark on the independence of the workflow.

 

Comment3: The step of refining the mapping templates (lines 227-232) should be commented on in more detail. A list of the types of potential inconsistencies should be given, accompanied by examples. Examples of “detailed comments on the mapping” would also be interesting.

Answer: We have extended the paragraph with more examples and details on how this process is performed.

 

Comment4: The step of systematically filling the mapping rules should be commented on in more detail. It should be illustrated with examples, and the various kinds of filling should be distinguished if required. What are the explicit conditions for applying SQL scripts for preliminary data transformation?

Answer: The description of the filling step has been improved, giving some examples and explicitly mentioning the cases in which SQL views are needed.
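For illustration only, a minimal sketch of the kind of SQL view that could precede the mapping step; the table and column names (researcher, project_member, project) are hypothetical and are not taken from the paper:

-- Hypothetical example: a view that flattens a join and normalizes a legacy
-- date column so that a single mapping rule can use it as its logical table.
CREATE VIEW v_researcher_project AS
SELECT r.researcher_id,
       r.orcid,
       p.project_id,
       TO_CHAR(pm.start_date, 'YYYY-MM-DD') AS start_date_iso
FROM   researcher r
JOIN   project_member pm ON pm.researcher_id = r.researcher_id
JOIN   project p ON p.project_id = pm.project_id
WHERE  pm.start_date IS NOT NULL;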

 

Comment5: What does “if the generated RDF is the expected one” mean, formally, in the initial testing and validation process [line 268]?

Answer: It means that we perform an initial validation step in which we check whether the RDF has been generated according to the mapping rules, i.e., we run simple SPARQL queries asking for all resources of each class and their main properties. We have added these details to the text, as well as an example.
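As an illustration of the kind of check described above (the class and property IRIs below are placeholders, not the project's actual ontology terms):

# Hypothetical validation query: list instances of one class together with
# their main properties, to confirm the mapping rule produced the expected RDF.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://example.org/ontology#>

SELECT ?resource ?property ?value
WHERE {
  ?resource rdf:type ex:Researcher ;
            ?property ?value .
}
LIMIT 100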

 

Comment6: The step of validating the mappings with experts should be commented on in more detail. The different kinds of expert activities should be distinguished, if possible, and illustrated with examples.

Answer: We provide more details about the activities performed in the validation of the mappings by the experts, defining a set of tasks that must be taken into account in this kind of process.

 

Comment7: Is it possible to apply schema matching tools for “preliminary analysis of the database to ensure that the most relevant concepts of the ontology were also represented in the tables”?

Answer: It would be possible; in our approach we decided to do it manually, as we want to ensure the quality of the KG from the beginning, but tools such as MIRROR [1] or D2RQ [2], which rely on the relational database schema, could be applied. Another benefit of our approach is that the solution is independent of the source as well, i.e., if the input data is in a schema-less format (CSV, XML, or JSON), the same process can still be followed.

 

Comment8: In “Sustainable procedures” [lines 385-399], the authors mention the important issue of updating the target ontology (knowledge graph). It seems that this is not part of the proposed methodology. Is it future work to include this issue in the methodology?

Answer: We partially address this issue by using a compact syntax (YARRRML) for declaring the mappings, but of course, new steps can be added to enhance it. For example, it would be interesting to develop a tool able to analyze the different versions of the mapping templates (extracted from each ontology release) to see the differences, so that updating the mappings would be easier. We have added these details to the future work in Section 7.
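For illustration, a minimal YARRRML-style fragment of the kind whose successive versions could be compared across ontology releases; the prefix, class, property, and column names are placeholders, and source/connection details are omitted:

# Hypothetical YARRRML fragment (illustrative names only; sources omitted).
prefixes:
  ex: http://example.org/ontology#

mappings:
  researcher:
    s: http://example.org/resource/researcher/$(researcher_id)
    po:
      - [a, ex:Researcher]
      - [ex:orcid, $(orcid)]
      - [ex:name, $(full_name)]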

 

[1] Medeiros, L. F. D., Priyatna, F., & Corcho, O. (2015, June). MIRROR: Automatic R2RML mapping generation from relational databases. In International Conference on Web Engineering (pp. 326-343). Springer, Cham.

[2] Bizer, C., & Seaborne, A. (2004, November). D2RQ: Treating non-RDF databases as virtual RDF graphs. In Proceedings of the 3rd International Semantic Web Conference (ISWC 2004) (Vol. 2004). Hiroshima: Springer.

Reviewer 2 Report

It is an interesting paper presenting a model of research laboratory production. When reading the title of this paper, the reviewer was thinking about modeling the knowledge of each researcher in order to facilitate his/her daily work. Rather, it was dedicated to research administrators and lab evaluators.

Apparently, there are no similar projects throughout the world since there is no state-of-the-art review; please confirm.
Second, what would be the possibilities for modeling this activity using something other than knowledge graphs?
Could you give examples of navigation in the KG and of reasoning?

Two final remarks:
1 - State more precisely that your project is dedicated to research administrators, not to researchers themselves.
2 - Regarding knowledge capitalization for researchers, please have a look at the small article published in the USF-AWB newsletter, November issue (note: not awb-usf, which is a different entity).

Author Response

Comment1: Apparently, there are no similar projects throughout the world since there is no state-of-the-art review; please confirm.

Answer: As we described in the related work section, there are similar projects (where mappings were used to make the transformations) in other domains, but to the best of our knowledge, there isn’t any project or paper in this specific domain.

 

Comment2: What would be the possibilities for modeling this activity using something other than knowledge graphs?

Answer: A global view that represents the whole domain could be implemented using other approaches such as RDBs, JSON Schema, etc. Similarly, for the mapping rules, other proprietary approaches (e.g., the Stardog mapping language) could be used. In our case, we opted to use standard technologies (i.e., R2RML), as there are many tools beyond Oracle that parse these mappings, which gives more flexibility to the proposed workflow.

More info about the Stardog mapping language: https://docs.stardog.com/archive/7.5.0/virtual-graphs/mapping-data-sources.html

More info about R2RML engines: https://rml.io/r2rml-implementation-report
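To make the contrast with proprietary syntaxes concrete, a minimal sketch of a triples map using only the standard W3C R2RML vocabulary; the table, column, and ontology terms are placeholders and do not come from the project's actual mappings:

# Hypothetical R2RML triples map (standard vocabulary, illustrative names).
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.org/ontology#> .

<#ResearcherMap>
  rr:logicalTable [ rr:tableName "RESEARCHER" ] ;
  rr:subjectMap [
    rr:template "http://example.org/resource/researcher/{RESEARCHER_ID}" ;
    rr:class ex:Researcher
  ] ;
  rr:predicateObjectMap [
    rr:predicate ex:orcid ;
    rr:objectMap [ rr:column "ORCID" ]
  ] .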

Comment3: Could you give examples of navigation in the KG and of reasoning?

Answer: In Section 5.1 we have included a link to a set of official SPARQL queries that can be run over the KG and that exploit both navigation and reasoning.
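A sketch of the kind of query such a set could contain; the terms ex:memberOf, ex:participatesIn, and ex:Project are illustrative only, not the actual vocabulary used in the paper:

# Hypothetical navigation query: walk from organizations to the projects of
# their researchers. Under RDFS/OWL entailment, a query over a superclass
# would also return instances asserted only for its subclasses.
PREFIX ex: <http://example.org/ontology#>

SELECT ?organization ?researcher ?project
WHERE {
  ?researcher ex:memberOf ?organization ;
              ex:participatesIn ?project .
  ?project a ex:Project .
}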

 

Comment4: State more precisely that your project is dedicated to research administrators, not to researchers themselves.

Answer: We think that the concept of “research-performing organizations” is clear enough, as it is also used in official documents by the European Commission, for example. In the abstract, we already mention that we focus on integrating the information systems of these organizations, not specific data from the researchers.

Example of official EU documentation using this concept: https://ec.europa.eu/info/funding-tenders/opportunities/docs/2021-2027/horizon/guidance/guideline-for-promoting-research-integrity-in-research-performing-organisations_horizon_en.pdf

 

Comment5: Regarding knowledge capitalization for researchers, please have a look at the small article published in the USF-AWB newsletter, November issue (note: not awb-usf, which is a different entity).

Answer: We have read the document and included some reflections in the conclusions.

Round 2

Reviewer 1 Report

The authors have revised the paper with respect to the remarks and have added explanations and comments for most of them:

- Title correction: fixed

- Explanations for the step of refining the mapping templates: examples of inconsistencies are added, examples of comments are listed

- Explicit conditions for applying SQL scripts for preliminary data transformation: added

- Explanations on “if the generated RDF is the expected one”: added

- Explanations on the step of validating the mappings with experts: a list of activities is added

- Schema matching tools applicability: commented

- Updating the target ontology: commented

Reviewer 2 Report

No
