Peer-Review Record

Methods, Models and Tools for Improving the Quality of Textual Annotations

Modelling 2022, 3(2), 224-242; https://doi.org/10.3390/modelling3020015
by Maria Teresa Artese and Isabella Gagliardi *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 31 January 2022 / Revised: 1 April 2022 / Accepted: 8 April 2022 / Published: 12 April 2022

Round 1

Reviewer 1 Report

The main focus of this paper is on textual features models and tools that can help to improve the quality of keywords, using natural language processing with machine learning and deep learning approaches.

The authors have explained that, in this paper, different steps of the pipeline will be addressed and different solutions will be analysed, implemented, evaluated, and compared using statistical methods, machine learning, and neural networks, as appropriate.

The authors have also mentioned that the models are trained on different datasets, found on the Internet or created ad hoc, that share some characteristics with the starting dataset, and that the results are presented, discussed, and compared with the state of the art.

Here are my comments, which might help to improve the quality of the paper:

  • Some of the sentences in the abstract are repeated in the Introduction section. This is not very professional, and I suggest that the authors modify the repeated sentences in the Introduction section.
  • Section 2 (Related works) has covered the methods implemented by other researchers, but there should be more information about their results and achievements. Additional discussion comparing the methods and models and their strengths and weaknesses would be beneficial. This will help the reader to compare the efficiency of the existing methods and models and gain a better understanding of the knowledge gap in the field.
  • In Section 3, "Data" is a subtitle and the numbering format should be applied. More clarification is required on why the pre-processing step is not required in the Data section, and it is not clear whether pre-processing has been done in this research. Moreover, it is explained that lemmatization and POS tagging can be done optionally, but it is not clear whether they have been done in this research. More explanation is required on this.
  • The dataset has been described in Section 4, which is devoted to results and discussion. The data-gathering approach should be described in the methodology, not in the results.
  • The machine learning and deep learning approaches have been explained in the results section. The structure of the machine learning and deep learning models, and how they have been implemented, should be described in the methodology, not in the results.
  • For the evaluation of the machine learning models, accuracy percentages and a confusion matrix have been presented. I strongly suggest that the authors consider other evaluation metrics, such as the F1 score, as well. A new table can be created from the presented confusion matrix to discuss the F1 score, recall, and precision metrics.
  • More detail is required on the voting system that has been defined, which can associate a degree of confidence with each term (suggested correction).
  • The paper has discussion subsections in different parts of Section 4. In fact, Section 4 is a mixture of methodology, results, and discussion. These should be separated, and the structure of the paper needs significant improvement.
  • The language needs improvement. There are structural and grammatical errors in some of the sentences.

Author Response

Please see the attached file; the response is clearer there.

Methods, Models and Tools for Improving the Quality of Short Texts *

Maria Teresa Artese and Isabella Gagliardi

Manuscript ID: modelling-1600564

* Original Title: Methods, Models and Tools for Improving the Quality of Textual Features

The title has been changed to make the paper more consistent and more easily understandable.

Reviewer #1

We are thankful to the editor and anonymous reviewers for reviewing our article and for the valuable comments and suggestions.

The current version of the paper has been carefully and extensively revised following the reviewers' suggestions. We have taken the reviewers' comments into account and have added paragraphs and new information to respond to their requests, as described in detail below.

Line numbers refer to the PDF file of the revised paper.

Some of the sentences in the abstract are repeated in the Introduction section. This is not very professional, and I suggest that the authors modify the repeated sentences in the Introduction section.

We thank the reviewer for the helpful notes. We have revised the paper, and the text has been changed accordingly.

Section 2 (Related works) has covered the methods implemented by other researchers, but there should be more information about their results and achievements. Additional discussion comparing the methods and models and their strengths and weaknesses would be beneficial. This will help the reader to compare the efficiency of the existing methods and models and gain a better understanding of the knowledge gap in the field.

At several points in Section 2, we have described the results, strengths, and weaknesses of the related works in more detail: specifications have been added at lines 72-75, 77, 78, 82-84, 88-89, and 92-94.

In Section 3, "Data" is a subtitle and the numbering format should be applied.

Thank you for your suggestion. We have applied the numbering format to Sections 3 and 4.

More clarification is required on why the pre-processing step is not required in the Data section, and it is not clear whether pre-processing has been done in this research. Moreover, it is explained that lemmatization and POS tagging can be done optionally, but it is not clear whether they have been done in this research. More explanation is required on this.

The preprocessing section has been revised and described in more detail, both in the Data section (lines 118-123) and at the beginning of Section 4 (lines 170-174).

The dataset has been described in Section 4, which is devoted to results and discussion. The data-gathering approach should be described in the methodology, not in the results.

The machine learning and deep learning approaches have been explained in the results section. The structure of the machine learning and deep learning models, and how they have been implemented, should be described in the methodology, not in the results.

The paper has discussion subsections in different parts of Section 4. In fact, Section 4 is a mixture of methodology, results, and discussion. These should be separated, and the structure of the paper needs significant improvement.

After revising the paper, we decided to maintain its structure: the first part gives a more theoretical view of the problem, while the section "Our Approach: Results and Discussions" describes in more detail the proposed solution and the tools and methods devised for the experiments, together with the results obtained and some comments and evaluations.

For the evaluation of the machine learning models, accuracy percentages and a confusion matrix have been presented. I strongly suggest that the authors consider other evaluation metrics, such as the F1 score, as well. A new table can be created from the presented confusion matrix to discuss the F1 score, recall, and precision metrics.

Thank you for your suggestion. We have added two tables with recall, precision, and F1 score and have briefly discussed these metrics with respect to the confusion matrix (Tables 3a and 3b, lines 281-293).
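For concreteness, a minimal sketch of how per-class precision, recall, and F1 follow from a confusion matrix; the class labels and counts below are invented for illustration and are not the figures reported in the paper:

```python
import numpy as np

# Illustrative 3-class confusion matrix: rows are true labels, columns are predictions.
# Class names and counts are invented for this example, not taken from the paper.
labels = ["it", "en", "fr"]
cm = np.array([
    [90,  5,  5],
    [ 4, 88,  8],
    [ 6,  7, 87],
])

for i, label in enumerate(labels):
    tp = cm[i, i]                     # true positives for this class
    precision = tp / cm[:, i].sum()   # column sum: everything predicted as this class
    recall = tp / cm[i, :].sum()      # row sum: everything truly of this class
    f1 = 2 * precision * recall / (precision + recall)
    print(f"{label}: precision={precision:.2f}  recall={recall:.2f}  F1={f1:.2f}")
```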

More detail is required on the voting system that has been defined, which can associate a degree of confidence with each term (suggested correction).

We have rewritten and added more details about the voting system (in Task 1, "Off-the-shelf standard packages", lines 253-254, and in the "Statistical approach" task, lines 365-367).
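As an illustration of this kind of scheme (a sketch, not the implementation used in the paper), each term can be labelled by several detectors, with the share of agreeing votes serving as the degree of confidence; the detector rules below are toy stand-ins for real packages:

```python
from collections import Counter

def vote_language(term, detectors):
    """Majority vote over several detectors; the confidence attached to the
    winning label is the fraction of detectors that agree on it."""
    votes = [detect(term) for detect in detectors]
    winner, count = Counter(votes).most_common(1)[0]
    return winner, count / len(votes)

# Hypothetical rule-based stand-ins for off-the-shelf language detectors.
detectors = [
    lambda t: "it" if t.endswith("zione") else "en",
    lambda t: "it" if "gli" in t else "en",
    lambda t: "it" if t.endswith("e") else "en",
]
print(vote_language("tradizione", detectors))  # -> ('it', 0.666...)
```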

The language needs improvement. There are structural and grammatical errors in some of the sentences.

Thank you. We have revised the paper: the grammatical errors have been corrected and the English has been improved.

Author Response File: Author Response.pdf

Reviewer 2 Report

The article presents and evaluates various approaches for assessing and improving the quality of textual tags in, for instance, a library setting.

It is well-structured; however, the authors should simplify their sentences (and correct them in the process), as they are often too long, e.g. "The need for tools to make the best use of tags, already available and associated to data, in order to improve navigation and search in multilingual text archives, is a pressing request.", or "This paper focuses on textual features models and tools, which, based on natural language processing with machine learning and deep learning approaches, allow, in a supervised and unsupervised way, to improve the quality of keywords." Some sentences should be reformulated: "This obviously makes that the model succeeds to discriminate very well the various languages." Therefore, please correct all grammatical and spelling problems.

With respect to the methodology, the paper can be improved. First, it does not become clear what you regard as "FEATURES". Sometimes you refer to them as "single words (1-gram) and character n-grams". Then again, you write: "using char 3-grams with 400, 800 and 1200 features" and "Extracting the n most frequent elements, with n=1200, 800, and 400, as the number of n increases, the number of features extracted from the data increases". How do you represent 3-grams as inputs to your chosen algorithms?

Therefore, please state clearly what a "feature" (this term is even part of your title) in your understanding is and use this meaning consistently in the article. The same applies to the used data sets: Please refer to them in a consistent fashion.

The neural network in figure 3 is not explained regarding its architecture. Why did you select this architecture? (Tensorflow screenshots are not sufficient!). At the beginning of section 4 you should explain in more detail what "tags" have been used and how many of them and from which languages for the following evaluations. In the experimental results, explain the difference between the datasets "Train", "Test" and others. In the result tables, please highlight the best results and interpret them in your own words.

In task 4 "semantic relatedness": what is the influence of homonyms?

The paper (and also any section) should not end with a bullet list.

In its current form, the article cannot be accepted. However, it has potential and can be accepted once all problems have been resolved.

Author Response

Please see the attached file; the response is clearer there.

Methods, Models and Tools for Improving the Quality of Short Texts *

Maria Teresa Artese and Isabella Gagliardi

Manuscript ID: modelling-1600564

* Original Title: Methods, Models and Tools for Improving the Quality of Textual Features

The title has been changed to make the paper more consistent and more easily understandable.

Reviewer #2

We are thankful to the editor and anonymous reviewers for reviewing our article and for the valuable comments and suggestions.

The current version of the paper has been carefully and extensively revised following the reviewers' suggestions. We have taken the reviewers' comments into account and have added paragraphs and new information to respond to their requests, as described in detail below.

Line numbers refer to the PDF file of the revised paper.

It is well-structured; however, the authors should simplify their sentences (and correct them in the process), as they are often too long, e.g. "The need for tools to make the best use of tags, already available and associated to data, in order to improve navigation and search in multilingual text archives, is a pressing request.", or "This paper focuses on textual features models and tools, which, based on natural language processing with machine learning and deep learning approaches, allow, in a supervised and unsupervised way, to improve the quality of keywords." Some sentences should be reformulated: "This obviously makes that the model succeeds to discriminate very well the various languages." Therefore, please correct all grammatical and spelling problems.

Thank you. We have revised the paper: the grammatical errors have been corrected and the English has been improved.

With respect to the methodology, the paper can be improved. First, it does not become clear what you regard as "FEATURES". Sometimes you refer to them as "single words (1-gram) and character n-grams". Then again, you write: "using char 3-grams with 400, 800 and 1200 features" and "Extracting the n most frequent elements, with n=1200, 800, and 400, as the number of n increases, the number of features extracted from the data increases". How do you represent 3-grams as inputs to your chosen algorithms?

Therefore, please state clearly what a "feature" (this term is even part of your title) in your understanding is and use this meaning consistently in the article. The same applies to the used data sets: Please refer to them in a consistent fashion.

Thank you for your comment. Indeed, the term "feature" was used inconsistently, both to indicate words or terms, as in the title, and to indicate the characteristics used by the machine learning and ANN algorithms. To improve understanding, we have therefore changed the title of the paper, using "short texts" instead of "textual features", and have replaced "features" with "words", "keywords", or "tags" wherever the former meaning was intended. In addition, we have added an explanation of "features" at the beginning of Section 4.1. The use and naming of the datasets in the different experiments have also been made more consistent.

The char 3-gram input has been vectorized using the feature extraction methods of the Python scikit-learn library (lines 261-262).
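As an illustration of this step, a minimal scikit-learn sketch; the sample tags are invented, and the 400-feature cap simply mirrors one of the sizes quoted by the reviewer:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Character 3-grams; keep only the most frequent ones as features
# (capped at 400 here, one of the sizes quoted by the reviewer).
vectorizer = CountVectorizer(analyzer="char", ngram_range=(3, 3), max_features=400)

tags = ["folk dance", "danza popolare", "traditional music"]  # illustrative tags
X = vectorizer.fit_transform(tags)  # sparse matrix: one row per tag, one column per 3-gram

print(X.shape)                                  # (3, number of distinct 3-grams, up to 400)
print(vectorizer.get_feature_names_out()[:10])  # a few of the extracted 3-gram features
```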

The neural network in figure 3 is not explained regarding its architecture. Why did you select this architecture? (Tensorflow screenshots are not sufficient)

The ANN used for language identification is a simple RNN composed of three fully connected layers.

The figure, which did not aid comprehension, has been removed, and a short description of the architecture and the motivation for this architectural choice have been added (lines 299-304).
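Since the response gives only a high-level description, the following Keras sketch is merely an assumption of what a three-layer fully connected language classifier over the n-gram count vectors could look like; the layer sizes, activations, and number of target languages are illustrative, not taken from the paper:

```python
import tensorflow as tf

NUM_FEATURES = 400   # e.g. the char 3-gram vector size from the previous step (assumed)
NUM_LANGUAGES = 4    # illustrative number of target languages

# Three fully connected layers: two hidden layers and a softmax output over languages.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_LANGUAGES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```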

At the beginning of section 4 you should explain in more detail what "tags" have been used and how many of them and from which languages for the following evaluations.

The description of the datasets has been improved, and the composition of the datasets used has been specified (lines 205-208 and 211-214).

In the experimental results, explain the difference between the datasets "Train", "Test" and others. In the result tables, please highlight the best results and interpret them in your own words.

More detail has been added on the train/test split and the other datasets to improve clarity (lines 215-216, 275-276, and 396-400).

In task 4 "semantic relatedness": what is the influence of homonyms?

The models used to measure the semantic relatedness of two terms, i.e., word embedding models based on word2vec, cannot handle homonyms. We therefore plan to experiment in the future with BERT or ELMo, which can handle cases of homonymy based on the usage context of a word. This enhancement has been added to the future work (lines 476-477).
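A minimal gensim sketch makes the limitation concrete: word2vec learns a single vector per surface form, so both senses of a homonym such as "bank" collapse into one embedding, and its similarity scores mix the two usages (the toy corpus below is invented):

```python
from gensim.models import Word2Vec

# Toy corpus mixing two senses of "bank"; word2vec gives both senses one vector.
sentences = [
    ["deposit", "money", "at", "the", "bank"],
    ["the", "bank", "approved", "the", "loan"],
    ["we", "sat", "on", "the", "river", "bank"],
    ["the", "river", "bank", "was", "muddy"],
]
model = Word2Vec(sentences, vector_size=50, min_count=1, window=3, seed=1)

# One embedding regardless of sense; similarity scores blend both usages.
print(model.wv.similarity("bank", "money"))
print(model.wv.similarity("bank", "river"))
```

Contextual models such as BERT or ELMo instead produce a different vector for each occurrence of a word, which is why they can separate the senses.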

The paper (and also any section) should not end with a bullet list. 

The structure of the sections ending with bullet lists has been revised, and closing sentences have been added so that no section ends with a bullet list.

In its current form, the article cannot be accepted. However, it has potential and can be accepted once all problems have been resolved.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors have addressed my comments appropriately, and the paper has been improved.

Author Response

We are thankful to the anonymous reviewer for reviewing our article and for the kind words of appreciation.

Reviewer 2 Report

Thank you for taking into account the previous comments. Your work really has improved. Yet, there are still some remarks:

  • Your new title makes no sense. You do not improve the quality of short texts, but of annotations or tags (see your own sentence on line 105). Please think about it and fix the respective occurrences in the main text as well.
  • Before describing your methodology in section 4, please explain clearly how the given tags are used/incorporated in your test cases. Please elaborate clearly after line 169. Maybe describe it as an extension of figure 1.
  • You should add a section before the conclusion in which you discuss your major findings again and maybe elaborate on possible problems you encountered during your experiments.

Other than that, the paper can be accepted when you thoroughly review it.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
