Article
Peer-Review Record

Attentional Extractive Summarization

Appl. Sci. 2023, 13(3), 1458; https://doi.org/10.3390/app13031458
by José Ángel González, Encarna Segarra, Fernando García-Granada *,‡, Emilio Sanchis and Lluís-F. Hurtado
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 22 December 2022 / Revised: 10 January 2023 / Accepted: 17 January 2023 / Published: 22 January 2023

Round 1

Reviewer 1 Report

This paper describes an attentional extractive summarization approach. Overall, the paper is well written. The following suggestions can improve its quality:

- The authors should mention the main advantages and disadvantages of the ROUGE metric. In an extractive summary, what difference in the metric can be considered a significant difference in practice?
- Is it possible to consider a corpus of scientific papers?
- Related works should mention related strategies for extractive summarization, including those based on network and graph analysis, e.g., doi: 10.1016/j.physa.2018.03.013, doi: 10.1109/ASRU.2009.5373486, doi: 10.1016/j.physa.2011.10.015. Other related approaches based on statistical analysis could also be considered.
- It would also be interesting to see a discussion on how the quality of the results could affect other NLP tasks. For example, if you consider only summaries for text classification, how important is it to achieve improved summarization results?

Author Response

Response to Reviewer 1 Comments
- The authors should mention the main advantages and disadvantages of the ROUGE metric. In an extractive summary, what difference in the metric can be considered a significant difference in practice?
We have included the following sentences in Section 8 (Evaluation):
"It should be noted that ROUGE measure is based on ngrams overlapping. Therefore it is adequate when reference and generated summaries are extractive, however, it is no longer as suitable when the reference or the generated summaries are abstractive."
- Is it possible to consider a corpus of scientific papers?
Of course, this kind of corpus is a good application domain; however, scientific papers are much longer. Therefore, a study and adaptation of the summarization systems would be needed.
- Related works should mention related strategies for extractive summarization, including those based on network and graph analysis, e.g., doi: 10.1016/j.physa.2018.03.013, doi: 10.1109/ASRU.2009.5373486, doi: 10.1016/j.physa.2011.10.015. Other related approaches based on statistical analysis could also be considered.
We have added the following sentences in Section 1 (Introduction):
"Some successful approaches for extractive summarization are based on graph representations of the documents. This is the case of LexRank [10], TextRank [27], [1], [11], and [37], among others. "
- It would also be interesting to see a discussion on how the quality of the results could affect other NLP tasks. For example, if you consider only summaries for text classification, how important is it to achieve improved summarization results?
We have added the following sentence to the future work:
"Due to the similarity between the classification and summarization objectives, in the sense that they look for relevant segments of a text, it could be very interesting to study a strategy to approach a text classification system based on the output of a summarization system that provide the selected sentences."

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper presents a formalization of a general framework for extractive summarization that does not fall under the umbrella of traditional extractive systems (based on suboptimal oracles or on Reinforcement Learning to optimize ROUGE). The main mechanism is based on the interpretation of the attention model of hierarchical neural networks, which compute document-level representations of documents and summaries from sentence-level representations, which, in turn, are computed from word-level representations. However, some points of the presentation are not clear enough, as follows.
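To make the hierarchy described above concrete, here is a minimal numpy sketch of hierarchical attention pooling: word vectors are pooled into sentence vectors and sentence vectors into a document vector, and the sentence-level attention weights can be read off as extractive scores. All names, shapes, and the dot-product attention form are illustrative assumptions; this is not the authors' implementation.

```python
# Illustrative hierarchical attention pooling (not the authors' code).
import numpy as np


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def attend(vectors, query):
    """Dot-product attention pooling over a set of vectors: (n, d) -> (d,)."""
    weights = softmax(vectors @ query)   # attention distribution over the n items
    return weights @ vectors, weights


def sentence_scores(doc_word_vecs, word_query, sent_query):
    """doc_word_vecs: list of (n_words_i, d) arrays, one per sentence."""
    sent_vecs = np.stack([attend(w, word_query)[0] for w in doc_word_vecs])
    _, sent_weights = attend(sent_vecs, sent_query)  # document-level pooling
    return sent_weights  # higher weight -> sentence judged more relevant


# Toy usage with random embeddings.
rng = np.random.default_rng(0)
d = 8
doc = [rng.normal(size=(5, d)), rng.normal(size=(7, d)), rng.normal(size=(4, d))]
print(sentence_scores(doc, rng.normal(size=d), rng.normal(size=d)))
```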
1) Although different optimization strategies were applied, the experimental results (Tables 2-4) show that the performance of the proposed models is inferior to that of the compared state-of-the-art models, so some reasonable explanation is required, since pursuing the best performance is the primary target;

2) The authors have published several closely related papers in recent years, which are also cited in this paper (such as references 9, 10, and 11); however, it is not possible to distinguish what the differences, achievements, or progress of the present paper are, so supplementary explanation is required on this aspect.

Author Response

Response to Reviewer 2 Comments
Point 1: Although different optimization strategies were applied, the experimental results (Tables 2-4) show that the performance of the proposed models is inferior to that of the compared state-of-the-art models, so some reasonable explanation is required, since pursuing the best performance is the primary target.

We tried to explain the interest of our approach in comparison with other systems in the following paragraphs/sentences:
“Regarding the CNN/DailyMail corpus, our systems obtain results similar to those of Pointer-Gen+Cov, CopyCat, and SummaRunner. The obtained results are worse in comparison to other systems that use oracles, especially in the case of BertSumEXT. This is possibly because BertSumEXT starts from a very powerful contextualized pre-trained language model. It is interesting to observe that the results obtained by our systems are better than those obtained by some Reinforcement Learning based systems, such as DQN, and similar to those of Refresh. Therefore, our extractive summarization framework could be used as an alternative to Reinforcement Learning approaches and oracle-based systems.”
“Regarding the NewsRoom corpus, our systems outperformed most of the compared systems, except in the case of the ExConSumm system. It should be noted that this system is capable of generating variable length summaries based on the input text, unlike our systems.”

Point 2: The authors have published several closely related papers in recent years, which are also cited in this paper (such as references 9, 10, and 11); however, it is not possible to distinguish what the differences, achievements, or progress of the present paper are, so supplementary explanation is required on this aspect.


There are two paragraphs in the Introduction Section related to this subject. In paragraphs 4 and 5, the following extensions of the cited works are mentioned:
- The Attentional Extractive Summarization framework is proposed with the aim of generalizing our previous proposals in extractive summarization and boosting future works and improvements under this framework.
- Therefore, the proposed framework also allows the development of novel summarization systems, in addition to those presented in this work. For example, a system based on Hierarchical Convolutional Neural Networks, trained to distinguish correct summaries for documents, that relies on statistics such as the norm of the activations in order to compute sentence scores, would also fall under the umbrella of our framework.
- In this work, our systems are evaluated and studied on the CNN/DailyMail corpus and (in addition to the previous works) on the NewsRoom corpus, comparing them with more recent systems based on diverse strategies (extractive and mixed summarization systems based on oracles or reinforcement learning).
- Also new in this work is a detailed analysis of the results, including the convergence of our models and the word-length distribution of the system-generated summaries.
- Additionally, several examples are provided to illustrate the generated summaries and the attention weights used to score the sentences.

Author Response File: Author Response.pdf

Reviewer 3 Report

Dear authors

You will succeed if you brush up the abstract to make it a little more English. Second, do not use "WE" but use "the authors".

Before giving an acronym, the authors need to give the full name of that notion: line 41 Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The same thing for BERT.

The reviewer has not appreciated the authors' idea of hiding the sources and using only numbers instead, e.g., "Recently, Reinforcement Learning strategies have been extensively applied [28], [39], [7], [38] in order to dispense with the sentence labeling and optimizing directly the ROUGE evaluation metric [20]." Why not mention the names of the researchers? Otherwise, we have to turn pages each time we see [X] to check the source and refer to it.

lines 100-102 The sentence needs revising, especially after "due to" -  Although it seems naive, it is especially robust when it is applied to articles of newspapers, generally due to, in this domain, the first sentences are dedicated to condense the information of all the document and they are used to get the reader’s attention. 

line 125  In [28] (Refresh) - please specify who is going to refresh what?

line 128. In [39], the authors also discussed about the suboptimal nature... (how could the authors discuss in [39] or [23] if it is not their research?)

Please bear in mind that the manuscript authors should refer to the source, providing either the name(s) of the authors of the study or the title of the study.

line 144 ... them to the reference summaries e.g. compressing or paraphrasing (consider a comma in this part of the sentence)

line 169 (and line 178) "suboptimal oracle algorithms due to they require a binary sentence" (we suppose the authors need to reconsider or revise the use of "due to" to correctly use it in the entire text).

line 206 Finally, it is required an interpretable mechanism to compute (change the passive voice to the active voice)

Readers would be thankful if the authors change the outline of Figure 3 so that the sentence before it is not disrupted.

line 476 The authors state: "Specifically, these categories are four." But they mention five; the same five categories appear in Table 2 and Table 3.

 

Author Response

Response to Reviewer 3 Comments
We have tried to follow all your suggestions.
Thanks to your comments, we have improved the writing of the work.
- You will succeed if you brush up the abstract to make it a little more English. Second, do not use "WE" but use "the authors".


We have revised the wording of the abstract and of the paragraphs that included many uses of the pronoun "we", especially in Section 1 (Introduction).

- Before giving an acronym, the authors need to give the full name of that notion: line 41 Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The same thing for BERT.

Done.

- The reviewer has not appreciated the authors' idea of hiding the sources and using only numbers instead, e.g., "Recently, Reinforcement Learning strategies have been extensively applied [28], [39], [7], [38] in order to dispense with the sentence labeling and optimizing directly the ROUGE evaluation metric [20]." Why not mention the names of the researchers? Otherwise, we have to turn pages each time we see [X] to check the source and refer to it.


Done in most cases. Where there is a long enumeration of references, we have not included the authors' names.


- lines 100-102 The sentence needs revising, especially after "due to" - Although it seems naive, it is especially robust when it is applied to articles of newspapers, generally due to, in this domain, the first sentences are dedicated to condense the information of all the document and they are used to get the reader’s attention.
We have revised the wording of the work and we have replaced the use of "due to" with other more appropriate forms.
- line 125 In [28] (Refresh) - please specify who is going to refresh what?
We have removed the name of the system and left only the reference in the paper.
- line 128. In [39], the authors also discussed about the suboptimal nature... (how could the authors discuss in [39] or [23] if it is not their research?)
We have revised the wording to make it clearer.
- Please bear in mind that the manuscript authors refer to the source, providing either the name(s) of the study or the title of the study.
We have revised the wording to make it clearer.
- line 144 ... them to the reference summaries e.g. compressing or paraphrasing (consider a comma in this part of the sentence)
Done.
- line 169 (and line 178) "suboptimal oracle algorithms due to they require a binary sentence" (we suppose the authors need to reconsider or revise the use of "due to" to correctly use it in the entire text.
We have revised the wording of the work and we have replaced the use of "due to" with other more appropriate forms.
- line 206 Finally, it is required an interpretable mechanism to compute (change the passive voice to the active voice)
Done.
- Readers would be thankful if the authors change the outline of Figure 3 so that the sentence before it is not disrupted.
Done.
- line 476 The authors state: "Specifically, these categories are four." But they mention five; the same five categories appear in Table 2 and Table 3.
Done.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Dear authors,

Thank you for the job done. We hope that your next manuscript will not have these flaws.
