Article
Peer-Review Record

Domain Knowledge Graph Question Answering Based on Semantic Analysis and Data Augmentation

Appl. Sci. 2023, 13(15), 8838; https://doi.org/10.3390/app13158838
by Shulin Hu, Huajun Zhang * and Wanying Zhang
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 10 July 2023 / Revised: 28 July 2023 / Accepted: 28 July 2023 / Published: 31 July 2023
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)

Round 1

Reviewer 1 Report


 

Title: Domain Knowledge Graph Question Answering Based on Semantic Analysis and Data Augmentation

 

Journal: Applied Sciences

 

Section: Computing and Artificial Intelligence

 

Special Issue: Natural Language Processing (NLP) and Applications

 

Abstract:

 

The article addresses the domain of question-answering (QA) systems, specifically focusing on information retrieval-based question-answering (IRQA) and knowledge-based question-answering (KBQA) systems. While the IRQA system extracts answers from relevant text with some randomness, the KBQA system retrieves answers from structured data, resulting in higher accuracy. In certain domains, such as policy and regulations concerning household registration, precise and rigorous answers are crucial. To meet this need, the authors propose a QA system based on the household registration knowledge graph, aiming to provide accurate and rigorous answers for related household registration inquiries.

 

The proposed QA system adopts a semantic analysis-based approach to simplify each question into a simple problem consisting of a single event entity and a single intention relationship. This enables the system to quickly generate accurate answers by searching within the household registration knowledge graph.
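As an illustration of the approach summarized above, here is a minimal sketch of reducing a question to one event entity plus one intention relationship and then looking the pair up in a knowledge graph. It is not the authors' implementation: the toy graph, the keyword tables, and the function names (`simplify`, `answer`) are all hypothetical, the graph contents are made-up placeholders rather than real household registration policy, and plain keyword matching stands in for the trained semantic analysis described in the article.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy knowledge graph: (event entity, intention relation) -> answer.
# The entries are invented placeholders, not actual policy text.
KG: Dict[Tuple[str, str], str] = {
    ("newborn registration", "required_materials"): "birth certificate; parents' household register",
    ("newborn registration", "processing_location"): "local police station",
    ("household transfer", "required_materials"): "transfer approval; identity card",
}

# Assumed keyword tables; the article uses learned models for this step.
ENTITY_KEYWORDS = {
    "newborn": "newborn registration",
    "transfer": "household transfer",
}
INTENT_KEYWORDS = {
    "materials": "required_materials",
    "documents": "required_materials",
    "where": "processing_location",
}


def match(question: str, table: Dict[str, str]) -> Optional[str]:
    """Return the label of the first keyword found in the question, if any."""
    lowered = question.lower()
    for keyword, label in table.items():
        if keyword in lowered:
            return label
    return None


def simplify(question: str) -> Tuple[Optional[str], Optional[str]]:
    """Reduce a question to a single (event entity, intention relation) pair."""
    return match(question, ENTITY_KEYWORDS), match(question, INTENT_KEYWORDS)


def answer(question: str) -> Optional[str]:
    """Look up the simplified question in the toy graph; None if unresolved."""
    entity, intent = simplify(question)
    if entity is None or intent is None:
        return None  # a real system could ask the user to rephrase here
    return KG.get((entity, intent))
```

Because each question collapses to a single key, answering is one dictionary lookup; a question whose entity or intention cannot be identified, or whose pair is absent from the graph, yields `None` rather than a guessed answer.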

 

One concern is the clarity of Figures 1, 2, and 3, which need further explanation to be understood. Some grammatical improvements are also needed throughout the article. With these minor revisions, the article can be approved for publication.

 

A noteworthy aspect of the study is its handling of the scarcity and imbalance of QA corpus data in the household registration domain. The authors employ GPT-3.5 for data augmentation to address this issue and explore its impact on the QA system's performance. The experimental results show that the accuracy of the QA system reaches 93% with the augmented dataset, a 6% improvement over the baseline.

 

In conclusion, the article is well-written, and with some necessary adjustments, it can be recommended for publication.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The research is well presented for the most part (I have a couple of minor suggestions).  There is one question though that I find unanswered.  You limited your experiments to the policies of just one city.  How sure are you that your approach will translate well to policies of other regions?  It would have been useful to see at least one other city's policies tested to answer this question.  The other issue I have is minor and it involves the introduction.  I am completely unfamiliar with household registration policies.  You provide a few example types of data that might be provided in a household registration, but you never explicitly state what the registration is used for or what a policy might entail.  I suggest you add a sentence or two in the introduction to clarify this for any reader who does not know.

The rest of my comments are in the next box, listing the typos, grammar errors, formatting errors, etc. that I spotted. I suggest one more draft to fix these issues and then another proofreading of the paper to make sure your fixes do not introduce any further errors.

 

Most of my comments deal with minor spelling/grammar/punctuation/formatting issues.  In a few cases, I suggest better words or phrases.  You have several instances where you either do not spell out an acronym or you spell it out later in the paper. 

Intro:  You might want to spell out QA, IRQA and KBQA in the intro as a reader may skip the abstract.
Throughout your paper you typically do not have a blank space after ")". Usually you would have a space when it is used for references or enumerated lists.
36-37:  grammatical error in this sentence ("provides" is wrong).  I think "construct...semantic query templates" is worded incorrectly.
48:  you say "randomness"; in what way?  Because the documents used for training the LLM are randomly collected or are of different topics?
You throw around the word "understand" in this introduction.  I think it is the wrong word.  It is not proven that any of these NLU systems "understand".  What they do is classify with high accuracy.  I suggest you change "understand" to "classify", "recognize", or some related word.
87:  probably should be "knowledge graphs" (plural)
89:  spell out RMBA here
107: "map the question and..." is not correctly worded, perhaps "to map questions to answers" or "match a question to one or more templates or rules to generate the answer"
112:  Spell out KG here.  It would be useful if you give an example of a "complex question".
115:  I believe it should be "candidate results"
Page 3:  I wonder if the order of the last three paragraphs should be reorganized:  You first talk about query graphs and then knowledge graphs and then explain what knowledge graphs are.  Shouldn't you explain KGs, then talk about BASEBALL/LUNAR and then have the paragraph on query graphs?  On line 130 you say "in existence" implying that the systems may have been written in the 50s.  I think you should say "were developed" instead of "already in existence"
171:  you capitalize each word of section 2 but not section 3
186:  there should probably be a blank line after this line and before figure 1 (same appears to go with many of your figures, such as after line 229)
188:  in the caption, "chart" should be capitalized or the rest of the words lower cased
200:  "Guidelines lists" --> "list"
Figure 2 is a bit blurry. It would be helpful to enlarge the graph on the right-hand side; perhaps you could redraw the figure so that the table and corpus are on top and the arrows point down to the extracted data, which would leave room to enlarge it.
233:  remove the '-' in rela-tionships.
237-238:  no need to subscript the m and n later in both lines
Figure 3:  as with Figure 2, the graph is hard to read, especially the text in the links; try to enlarge this a little
267:  Modify the caption!
Figure 4:  second box has "IntentIon" instead of "Intention".  Shouldn't the "Ask Again" box loop back up to the Intention Classification box?
298:  Are you using a single neural network?  It might be useful to know more about it.  How many hidden layers, how deep, what is the input (how many input nodes), etc?
302:  Shouldn't parsing be Parsing?  And add a blank space after the period in 4.2.
309:  Add a space after Table 2.  In this table, add a space after "Class" and before the number.  Class 5's "processing" should be capitalized to match the other lines.
313:  Spell out RoBERTa here rather than on line 322 and also spell out BERT.
314:  Spell out BiLSTM here rather than on the next page.
316:  Spell out CLS
389:  Start with a capital letter ("We")
404:  need a reference for LTP developed by Harbin University of Technology
Figure 6:  other figures were left justified, this one looks centered
460:  this sentence should probably not be indented
475:  Same (not indented)
504/505:  again, inconsistencies with these headings compared to other sections (capitalization and also font type for 5.1)
506:  "Data" instead of "data"
537/538: again, font issues on the section headers
577:  move to the top of the next page
Table 6:  some typos here, you have "Attetion" instead of "Attention"
624:  this caption should be on the same page as the table

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Please see the attached report.

Comments for author File: Comments.pdf

Minor editing of English language required

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

All my comments are addressed. 

Minor spell check required.
