Article
Peer-Review Record

An End-to-End Mutually Interactive Emotion–Cause Pair Extractor via Soft Sharing

Appl. Sci. 2022, 12(18), 8998; https://doi.org/10.3390/app12188998
by Beilun Wang 1,2,*, Tianyi Ma 3, Zhengxuan Lu 1 and Haoqing Xu 1
Submission received: 27 July 2022 / Revised: 5 September 2022 / Accepted: 6 September 2022 / Published: 7 September 2022
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)

Round 1

Reviewer 1 Report

1) For the works cited in the related work section, please provide their limitations.

2) Except for a few, many of the references are not recent. Try to include six to ten recent references from the last three years.

3) In some places the abbreviations are not written correctly. Example: at line number 100, for RNN.

4) Provide the reason for using activation functions such as 'Softmax' and 'ReLU' in the given architecture.

5) In Section 3.3, the authors mention the presence of fewer negative pairs but do not discuss the impact of this on the work; in reality, it must have some adverse effect on performance. If the authors could state the reason for this, it would strengthen the readability and understanding.

6) The conclusion does not capture the essence of the entire work and could be improved. The future work is also not described properly and could be written in a more elaborated way.

Author Response

Point 1: For the works cited in the related work section, please provide their limitations.

Response 1: The limitations of the related work are discussed in lines 98-99, 108-112, 115-116, 125-126, and 130-131.

 

Point 2: Except for a few, many of the references are not recent. Try to include six to ten recent references from the last three years.

Response 2: We have added several references related to the ECPE task and the methods used. They are:

  • Tang, H.; Ji, D.; Zhou, Q. Joint multi-level attentional model for emotion detection and emotion-cause pair extraction. Neurocomputing 2020, 409, 329–340.

  • Fan, C.; Yuan, C.; Gui, L.; Zhang, Y.; Xu, R. Multi-task sequence tagging for emotion-cause pair extraction via tag distribution refinement. IEEE/ACM Transactions on Audio, Speech, and Language Processing 2021, 29, 2339–2350.

  • Jia, X.; Chen, X.; Wan, Q.; Liu, J. A Novel Interactive Recurrent Attention Network for Emotion-Cause Pair Extraction. In Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence, 2020, pp. 1–9.

  • Chen, F.; Shi, Z.; Yang, Z.; Huang, Y. Recurrent synchronization network for emotion-cause pair extraction. Knowledge-Based Systems 2022, 238, 107965.

  • Yu, J.; Liu, W.; He, Y.; Zhang, C. A mutually auxiliary multitask model with self-distillation for emotion-cause pair extraction. IEEE Access 2021, 9, 26811–26821.

  • Li, C.; Hu, J.; Li, T.; Du, S.; Teng, F. An effective multi-task learning model for end-to-end emotion-cause pair extraction. Applied Intelligence 2022, pp. 1–11.

  • Yang, X.; Yang, Y. Emotion-Type-Based Global Attention Neural Network for Emotion-Cause Pair Extraction. In Proceedings of the International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery. Springer, 2021, pp. 546–557.

 

Point 3: In some places the abbreviations are not written correctly. Example: at line number 100, for RNN.

Response 3: We have corrected the abbreviations for RNNs on lines 105 and 107, and the abbreviation for Emiece on line 282.

 

Point 4: Provide the reason for using activation functions such as 'Softmax' and 'ReLU' in the given architecture.

Response 4: A variety of activation functions can be used in the encoder of Emiece; Softmax and ReLU are the most commonly used in network layers. Softmax maps real-valued scores to a normalized distribution, so the output can be interpreted as probabilities. ReLU makes deeper networks easier to train, and its zero output for negative inputs keeps computation sparse and efficient, which also helps alleviate overfitting.
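The two properties mentioned in this response can be sketched in a few lines of numpy (an illustrative sketch using the standard definitions, not the authors' Emiece code):

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # the result is non-negative and sums to 1, i.e. a probability vector.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def relu(z):
    # Zeroes out negative activations, giving sparse outputs.
    return np.maximum(0.0, z)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)                     # sums to 1, largest score -> largest probability
sparse = relu(np.array([-1.5, 0.0, 2.3]))   # negative inputs mapped to 0
```

The sparsity of ReLU (exact zeros for negative inputs) is what keeps the computation cheap in deeper networks.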

 

Point 5: In Section 3.3, the authors mention the presence of fewer negative pairs but do not discuss the impact of this on the work; in reality, it must have some adverse effect on performance. If the authors could state the reason for this, it would strengthen the readability and understanding.

Response 5: This was a writing error. The sentence was intended to express that λ- is relatively small, since the number of negative pairs is much larger than the number of positive ones (see line 187; the changes are highlighted in purple).
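The idea of down-weighting the far more numerous negative pairs can be illustrated with a weighted binary cross-entropy (a minimal sketch; the function name and weights are ours, not the authors' actual loss):

```python
import numpy as np

def weighted_pair_loss(probs, labels, lam_pos=1.0, lam_neg=0.2):
    """Binary cross-entropy over candidate clause pairs, where the
    (far more numerous) negative pairs are down-weighted by lam_neg,
    so they do not dominate the gradient."""
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    w = np.where(labels == 1, lam_pos, lam_neg)
    ce = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return float(np.mean(w * ce))

labels = np.array([1, 0, 0, 0, 0])          # one positive among many negatives
probs = np.array([0.8, 0.3, 0.2, 0.1, 0.4])
loss = weighted_pair_loss(probs, labels)
```

With lam_neg < lam_pos, the many negative terms contribute proportionally less to the total loss than they would under uniform weighting.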

 

Point 6: The conclusion does not capture the essence of the entire work and could be improved. The future work is also not described properly and could be written in a more elaborated way.

Response 6: The conclusions and future work have been revised as follows.

In this paper, we propose an end-to-end model that mutually transfers information via soft sharing between the emotion and cause extraction tasks. By using weighted representations of emotion and cause to filter out nonsensical clauses, we improve the efficiency of emotion-cause pairing. The end-to-end model considers emotion and cause extraction together with emotion-cause pairing, thereby greatly reducing the cumulative error. In experiments on the standard ECPE dataset, Emiece achieves significant improvements in emotion-cause pair extraction over the original two-step ECPE model and other end-to-end models.

In the future, 1) we will examine the reasons for the model's mistakes by performing an interpretability analysis of possible wrong predictions of the current model; 2) we will attempt to reduce the network depth for emotion-cause pair extraction, to address the difficulty of training a clause-level encoder with numerous parameters; and 3) since constructing pairs with the Cartesian product is relatively complex, we will use a more efficient module for pairing prediction.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper is about an end-to-end approach to solving emotion cause pair extraction. The main contributions of the paper are:

1. A parallel model is used instead of a two-stage pipeline to avoid error propagation.

2. Emotion clause encoder and cause clause encoder are used in parallel and the learned weights are soft-shared among them in the first layer. This allows mutual information transfer between them.

3. Weighted representations of the emotion and cause extractors are used to filter out meaningless clauses.

The idea is intuitive and simple and it seems to work. 

About the value of λ_sf, Figure 4 does not seem very convincing, as the values go up and down. At 0.75, recall is relatively. It seems to me the value does not make much of a difference, except at certain points like 0.3 and 0.5. The F-score does go up slightly, however.

The phrase, 'mutual information transfer' can be confusing. Perhaps it can be rephrased as 'mutual transfer of information'.

In the captions for figure-2 and figure-3, it would be better to mention what the symbols denote. Although it is mentioned in the text, it would still make it easier for the reader.

Also, any intuition about why sharing all layers worsens performance considerably?

It would be a good idea to discuss the cases, if any, where the proposed model makes a mistake but one of the baselines gets it right, or to report if there are no such cases.

Author Response

Point 1: About the value of λ_sf, Figure 4 does not seem very convincing, as the values go up and down. At 0.75, recall is relatively. It seems to me the value does not make much of a difference, except at certain points like 0.3 and 0.5. The F-score does go up slightly, however.

Response 1: Several experiments have shown that changing λ_sf does not have a significant effect on the model's performance. Therefore, as mentioned in lines 249-250 of the article, we chose the value at which the model performed best overall as the final value of λ_sf.

 

Point 2: The phrase, 'mutual information transfer' can be confusing. Perhaps it can be rephrased as 'mutual transfer of information'. 

Response 2: We have made the appropriate changes.

 

Point 3: In the captions for figure-2 and figure-3, it would be better to mention what the symbols denote. Although it is mentioned in the text, it would still make it easier for the reader.

Response 3: We have added explanations of the notation; see Figure 2 and Figure 3 for details.

 

Point 4: Also, any intuition about why sharing all layers worsens performance considerably?

Response 4: The reasons for the poor results due to sharing all layers have been added to the Ablation study section in lines 313 to 319.

Belinkov et al. [1] found that in a seq2seq machine translation model, the low-level layer of the RNN unit (i.e., the first layer of the encoder) represents word structure, while the high-level layers focus on semantic meaning. Since the semantic information of emotion clauses and cause clauses differs considerably, sharing the high-level layer alone blurs the features of the text, and sharing all layers additionally confuses word-structure information with semantic information, reducing performance.
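The soft sharing restricted to the first layer can be sketched as a regularization term that pulls the two encoders' first-layer weights toward each other without forcing them to be identical (an illustrative sketch under our own naming; not the authors' implementation):

```python
import numpy as np

def soft_sharing_penalty(w_emotion, w_cause, lam_sf=0.5):
    """Squared L2 distance between the first-layer weights of the
    emotion encoder and the cause encoder, scaled by lam_sf. Adding
    this term to the task losses keeps the two first layers close
    (soft sharing) while leaving higher layers task-specific."""
    return lam_sf * float(np.sum((w_emotion - w_cause) ** 2))

rng = np.random.default_rng(0)
w_e = rng.normal(size=(4, 4))   # toy first-layer weights, emotion encoder
w_c = w_e + 0.1                 # cause encoder drifted slightly away
pen = soft_sharing_penalty(w_e, w_c, lam_sf=0.5)
```

Because only the first-layer weights enter the penalty, the word-structure features are shared while the semantically specialized higher layers remain independent, which matches the intuition above.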

 

Point 5: It would be a good idea to discuss the cases, if any, where the proposed model makes a mistake but one of the baselines gets it right, or to report if there are no such cases.

Response 5: Based on our observations in the experiments, we indeed cannot avoid such situations, and such errors are also common in other work. Therefore, we mention in the future work that a future study will examine the reasons for the model's mistakes by performing an interpretability analysis of possible wrong predictions of the current model.

 

 

Author Response File: Author Response.pdf
