Sensors
  • Article
  • Open Access

5 May 2023

Adapting Static and Contextual Representations for Policy Gradient-Based Summarization

1 Master Program of Digital Innovation, Tunghai University, Taichung 40704, Taiwan
2 Department of Computer Science, Tunghai University, Taichung 40704, Taiwan
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Deep Learning for Semantic Segmentation and Explainable AI Based on Sensing Technology

Abstract

Considering the ever-growing volume of electronic documents made available in our daily lives, the need for an efficient tool to capture their gist increases as well. Automatic text summarization, which is a process of shortening long text and extracting valuable information, has been of great interest for decades. Due to the difficulties of semantic understanding and the requirement of large training data, the development of this research field is still challenging and worth investigating. In this paper, we propose an automated text summarization approach with the adaptation of static and contextual representations based on an extractive approach to address the research gaps. To better obtain the semantic expression of the given text, we explore the combination of static embeddings from GloVe (Global Vectors) and contextual embeddings from BERT (Bidirectional Encoder Representations from Transformers)- and GPT (Generative Pre-trained Transformer)-based models. In order to reduce human annotation costs, we employ policy gradient reinforcement learning to perform unsupervised training. We conduct empirical studies on the public Gigaword dataset. The experimental results show that our approach achieves promising performance and is competitive with various state-of-the-art approaches.

1. Introduction

Automatic text summarization is a technique used to produce a concise version of a source document that preserves its essence while removing redundant information []. It has long been an important research task in the natural language processing community. Due to the ever-growing volume of electronic documents made available in our daily lives, the need for an efficient summarization tool and reliable approach to save readers’ time is also increasing.
In general, automatic text summarization is categorized into extractive and abstractive summarization. The extraction-based approach usually composes a summary by selecting salient parts from the source text and concatenating them to create the final result [], while the abstraction-based approach often rewrites and synthesizes the text content in order to produce a summary, instead of directly extracting the crucial text from the source document []. An example comparison can be found in Table 1. The input document is taken from a CNN news article, where the extractive summarization identifies important sentences by applying BERT (Bidirectional Encoder Representations from Transformers) embeddings and performing K-Means clustering [], and the abstractive summarization trains a sequence-to-sequence model for the gap-sentences generation task [].
Table 1. A CNN news article (https://edition.cnn.com/2015/04/14/americas/chile-same-sex-civil-unions/index.html) (accessed on 1 February 2023) is used to demonstrate the extractive and abstractive summarization.
How to represent text in a numerical form for the machine to process is an important step of NLP pipelines and also has a significant impact on summarization performance []. Although the traditional bag-of-words (BOW) model is simple and commonly used, it is not able to properly capture semantic relationships when two sentences have no words in common but share similar semantics []. For example, consider the following two sentences: “Brian left Beach Boys” and “Wilson abandoned the band”; they do not share any common words but have the same semantics. Most recently, word embeddings have gained much attention because of their ability to convert words into low-dimensional, dense representations with the aid of neural network language modeling. GloVe (Global Vectors) is an unsupervised method that learns word representations from word–word co-occurrence statistics, and the released model is trained on Wikipedia and Gigaword corpora []. BERT (Bidirectional Encoder Representations from Transformers) [] is a pre-trained language model based on the Transformer architecture [] that learns the semantic information of the input text. It is trained with two learning objectives, Masked Language Modeling (MLM) and Next Sentence Prediction (NSP), and has been widely adopted in NLP models, achieving state-of-the-art results on 11 NLP tasks. GPT-2 (Generative Pre-trained Transformer 2) is also a pre-trained language model but is based on the decoder of the Transformer architecture []. Unlike BERT, which learns words in a bidirectional manner, GPT-2 is trained in an autoregressive fashion on the task of predicting the next word. The word embeddings of GloVe are static, where each word has a fixed vector representation. On the other hand, BERT and GPT-2 produce contextualized representations, where the word embeddings are context-sensitive [].
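To make this distinction concrete, the following sketch retrieves a static GloVe vector and two contextual BERT vectors for the same word in different sentences; the static vector is identical in both contexts, while the contextual vectors differ. This is only a minimal illustration assuming the gensim and Hugging Face transformers packages and the publicly released glove-wiki-gigaword-100 and bert-base-uncased models, not part of the proposed method.

```python
import torch
import gensim.downloader as api
from transformers import BertModel, BertTokenizer

# Static embeddings: one fixed vector per word, independent of context.
glove = api.load("glove-wiki-gigaword-100")   # downloads pre-trained GloVe vectors
static_bank = glove["bank"]                   # the same vector in every sentence

# Contextual embeddings: the vector for "bank" depends on the surrounding words.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def contextual_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the BERT hidden state of the first occurrence of `word`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]          # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return hidden[tokens.index(word)]

v_river = contextual_vector("he sat on the bank of the river", "bank")
v_money = contextual_vector("she deposited the cheque at the bank", "bank")

# The two contextual vectors differ, reflecting the two senses of "bank".
print(torch.cosine_similarity(v_river, v_money, dim=0).item())
```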
When learning text summarization with supervised training methods, labeled training data are usually not available and are often difficult to annotate on a large scale. Unsupervised learning is therefore becoming popular because it is more applicable in real-world practice []. Reinforcement learning (RL)-based summarization models that maximize a reward function [] and adversarial training methods for text generation [] are two mainstream approaches to unsupervised extractive summarization. There have also been efforts on automatically generating training data [] and on better learning text representations in a self-supervised fashion [].
In this paper, we employ a policy gradient reinforcement learning model for unsupervised extractive summarization. Since discrete text generation in neural network models is non-differentiable, reinforcement learning methods devise rewards to overcome this challenge []. Moreover, to better encode the semantics of words, we investigate the effectiveness of adapting and combining various embedding schemes to represent words. We formulate the extraction of important tokens from the given text as a sequence labeling task with binary labels to find an optimal assignment. The mechanism of our method is analogous to Generative Adversarial Networks (GAN) [], where the RL agent plays the role of the ‘generator’ by performing binary sequence labeling and the reward function plays the role of the ‘discriminator’ by shaping the agent’s behavior. The major difference is that our reward functions do not accept any real samples and are not trainable networks.
The primary contributions of this paper are two-fold:
  • We propose an automated model for unsupervised extractive text summarization based on a policy gradient reinforcement learning approach. The semantic representations of the text are extracted from static and contextual embeddings, including GloVe-, BERT- and GPT-based models.
  • Empirical studies on the Gigaword dataset illustrate that the proposed solution is capable of creating reasonable summaries and is comparable with other state-of-the-art algorithms.
The rest of this paper is organized in the following manner. In Section 2, several research efforts and techniques related to the topic of this paper are reviewed. We then discuss the proposed approach and detailed components in Section 3. Experimental studies are reported to justify the effectiveness and analyze the performance of the proposed method in Section 4. Finally, we draw some conclusions and also provide several possible future research paths.

3. Proposed Method

In this section, we will present the proposed network, which contains a policy gradient reinforcement learning architecture, representation learning with various embeddings and optimization strategies (Figure 1).
Figure 1. The architecture of the proposed policy gradient reinforcement learning model for unsupervised extractive summarization.

3.1. The Policy Gradient Reinforcement Learning Architecture

In this work, we formulate the task of extractive summarization as the selection of important tokens from the document and solve the problem with a reinforcement learning algorithm. The input of the algorithm is a document $d_i$ with $t$ tokens (i.e., $x_1, x_2, \dots, x_t$), and the goal is to predict a binary label for each token $x_j$ to determine whether to include $x_j$ in the final summary. The output is the concatenation of the chosen tokens.
Reinforcement learning is a popular approach to solving planning problems and has been widely used for learning in highly dynamic and complex environments. In general, the learning agent repeatedly follows three steps with a trial-and-error mechanism to learn the optimal policy. First, the agent observes the environment state $s$. Second, based on a policy, the agent takes an action $a$ from the given state $s$. Third, the agent receives a reward $r$ for taking action $a$ in state $s$, and the environment subsequently provides the next state $s'$. Ultimately, the general goal of the agent is to maximize the accumulated rewards.
We describe how to incorporate our summarization approach into the reinforcement learning setting as follows:
  • State: A document $d_i = (x_1, x_2, \dots, x_t)$ is considered as a state $s$. Each token $x_j$ is represented by the concatenation of its GloVe ($h_1$), BERT ($h_2$) and GPT-2 ($h_3$) embeddings. GloVe provides static embeddings, whereas BERT provides contextual embeddings trained with masked language modeling and GPT-2 provides contextual embeddings trained with autoregressive language modeling. We then pass the concatenated embeddings into an LSTM layer to encode the sequential information and capture contextual features. The resulting state representation is denoted as $f = (f_1, f_2, \dots, f_t)$.
  • Action: The selection of important words from a sentence is treated as a sequence labeling problem in this work. Given the state representation $f$, the algorithm performs the action of producing a sequence of binary labels $y = (y_1, y_2, \dots, y_t)$ that indicates whether each token is selected for the summary.
  • Reward: We apply three commonly used reward functions to measure the quality of the extracted text: fluency ($R_{flu}$), similarity ($R_{sim}$) and compression ($R_{com}$) [,,]. The fluency reward judges whether the generated text is grammatically sound and semantically coherent; its score is computed from the perplexity of the generated text under a language model. The similarity reward measures the semantic similarity between the generated summary and the source document in order to ensure content preservation; we adopt the cosine similarity between the embeddings of the generated summary and the source document. The compression reward encourages the agent to generate summaries close to a predefined length. Following prior work [], the compression score is calculated as $R_{com}(y, L) = \exp\left(-\frac{|y - L|}{\sigma_L}\right)$, where $y$ is the length of the generated summary, $L$ is the target summary length and $\sigma_L$ is a hyper-parameter. A sketch of these reward computations is shown after this list.
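As noted above, the following is a minimal sketch of how the three rewards could be computed, assuming the Hugging Face transformers package, GPT-2 as the fluency language model and mean-pooled BERT embeddings for similarity; the exact models, pooling and normalization used in our experiments may differ, so this is an illustration rather than the reference implementation.

```python
import math
import torch
from transformers import (BertModel, BertTokenizer,
                          GPT2LMHeadModel, GPT2TokenizerFast)

gpt2_tok = GPT2TokenizerFast.from_pretrained("gpt2")
gpt2_lm = GPT2LMHeadModel.from_pretrained("gpt2")
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def fluency_reward(summary: str) -> float:
    """R_flu: inverse perplexity of the summary under a language model."""
    enc = gpt2_tok(summary, return_tensors="pt")
    with torch.no_grad():
        loss = gpt2_lm(**enc, labels=enc["input_ids"]).loss   # mean token cross-entropy
    return 1.0 / math.exp(loss.item())                        # higher means more fluent

def _embed(text: str) -> torch.Tensor:
    """Mean-pooled BERT embedding (one simple pooling choice)."""
    enc = bert_tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]
    return hidden.mean(dim=0)

def similarity_reward(summary: str, document: str) -> float:
    """R_sim: cosine similarity between summary and source document embeddings."""
    return torch.cosine_similarity(_embed(summary), _embed(document), dim=0).item()

def compression_reward(summary_len: int, target_len: int, sigma: float = 2.0) -> float:
    """R_com: exponential penalty on the deviation from the target length."""
    return math.exp(-abs(summary_len - target_len) / sigma)
```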

3.2. Training Algorithm

In this work, we use a policy gradient method to learn the summarization task. The learnable policy is implemented as a neural network parameterized by $\theta$. Given an input document $d_i$, the state representation $f$ is obtained by concatenating the GloVe, BERT and GPT-2 embeddings and subsequently passing them through the LSTM layer. The policy is defined as the product of the probabilities of each token being included in the summary:
$\pi_\theta(y \mid f) = \prod_i \Pr(y_i \mid f, \theta)$  (1)
After sampling the action based on $\pi_\theta$, the reward $r_\pi$ is given to evaluate the selected tokens. The goal of learning is to maximize the expected reward, denoted as $J(\theta)$, and the gradient can be derived as shown in Equation (2) by applying the policy gradient theorem []:
$\nabla_\theta J(\theta) = r_\pi \nabla_\theta \log \pi_\theta(y \mid f)$  (2)
We use this gradient to update the parameters $\theta$ with learning rate $\alpha$ as follows:
$\theta \leftarrow \theta + \alpha \nabla_\theta J(\theta)$  (3)
Algorithm 1 shows the overall algorithm to train our proposed model.
Algorithm 1: Policy Gradient-Based Summarization Model
Parameters: $\theta$ for the policy network $\pi$.
  • for each training iteration:
  •   Sample $m$ examples of input documents $d_1, d_2, \dots, d_m$
  •   for each $d_i$ with $t$ tokens $x_1, x_2, \dots, x_t$ as a state $s$:
  •     Convert $x_1, x_2, \dots, x_t$ to $f_1, f_2, \dots, f_t$
  •     Perform an action by producing a binary probability distribution based on Equation (1) for each $f_j$ to obtain $y_1, y_2, \dots, y_t$
  •     Calculate the rewards $R_{flu}$, $R_{sim}$ and $R_{com}$
  •     Compute the gradient using Equation (2)
  •     Update $\theta$ using Equation (3)
  •   end for
  • end for
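The sketch below illustrates Algorithm 1 as a minimal REINFORCE-style PyTorch implementation under our own simplifying assumptions: the concatenated GloVe/BERT/GPT-2 token embeddings are pre-computed and passed in as a tensor, the three rewards are combined by a caller-supplied reward_fn, and the hidden size and optimizer are placeholders rather than the settings in Table 3.

```python
import torch
import torch.nn as nn

class PolicyNetwork(nn.Module):
    """LSTM over concatenated token embeddings followed by a per-token
    inclusion probability; Equation (1) factorizes over these tokens."""
    def __init__(self, embed_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, t, embed_dim) -> inclusion probs: (batch, t)
        states, _ = self.lstm(token_embeddings)        # f_1, ..., f_t
        return torch.sigmoid(self.head(states)).squeeze(-1)

def train_step(policy, optimizer, token_embeddings, reward_fn):
    """One policy gradient update for a mini-batch of documents."""
    probs = policy(token_embeddings)                   # per-token action distribution
    dist = torch.distributions.Bernoulli(probs=probs)
    actions = dist.sample()                            # y_1, ..., y_t in {0, 1}
    log_prob = dist.log_prob(actions).sum(dim=1)       # log pi_theta(y | f)

    with torch.no_grad():                              # rewards are not back-propagated
        rewards = reward_fn(actions)                   # e.g. R_flu + R_sim + R_com, shape (batch,)

    loss = -(rewards * log_prob).mean()                # ascend r * grad log pi (Eqs. (2)-(3))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```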

4. Experiments and Results

In this section, we present empirical studies to illustrate our method with static and contextual embeddings for the summary generation. These studies consist of (1) the introduction of the dataset; (2) the definition of evaluation indices; (3) comparisons with other published approaches; and (4) ablation studies to analyze the effect of each embedding scheme.

4.1. Dataset

We apply our proposed framework to a public dataset, Gigaword. The training, validation and testing set sizes are 1 M, 189 K and 1951 examples, respectively []. The annotation data are stored in JSON format, where each instance contains the ID, the text and the corresponding summary. The average lengths of the source documents and summaries are shown in Table 2.
Table 2. Data statistics for the Gigaword dataset, where AvgInputLen is the average length of the input document and AvgSummaryLen is the average summary length.
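For illustration, a minimal loader for such JSON annotations is sketched below; the field names (id, text, summary) and a one-instance-per-line layout are assumptions inferred from the description above rather than a documented schema.

```python
import json

def load_instances(path: str):
    """Read JSON-lines annotations; each line holds one instance with an
    identifier, the source text and its reference summary (assumed field names)."""
    instances = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            record = json.loads(line)
            instances.append((record["id"], record["text"], record["summary"]))
    return instances

# Example usage (hypothetical file name):
# data = load_instances("gigaword_train.json")
# print(len(data), data[0])
```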

4.2. Evaluation Metric

To automatically measure the summarization performance of the proposed network, we employ Recall-Oriented Understudy for Gisting Evaluation (ROUGE), the most widely used evaluation measure in the relevant research []. ROUGE measures the co-occurrence statistics between the machine-generated summary and the ground-truth summary. Three ROUGE variants are popular for the summarization task: ROUGE-1 (R-1), ROUGE-2 (R-2) and ROUGE-L (R-L). R-N computes a similarity score based on the overlap of N-grams; thus, R-1 measures unigram overlap and R-2 measures bigram overlap. R-L is based on the longest common subsequence between the generated and ground-truth summaries and serves as an indicator of fluency.
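For reference, the sketch below computes R-1, R-2 and R-L with the open-source rouge-score package; this is one of several ROUGE implementations and is shown only as an example, with illustrative strings rather than Gigaword data.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

reference = "five foreign ministers meet to discuss regional security"
generated = "foreign ministers meet on regional security"

# Each entry holds precision, recall and F-measure; summarization papers
# typically report the F-measure.
scores = scorer.score(reference, generated)
print(scores["rouge1"].fmeasure, scores["rouge2"].fmeasure, scores["rougeL"].fmeasure)
```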

4.3. Experimental Results

In order to validate the effectiveness of our method, we conduct an experimental investigation to compare its results against those of other methods, which are introduced as follows:
  • Lead-8: This approach is a simple baseline which directly selects the first eight words in the source document to assemble a summary.
  • Contextual Match []: This research work introduces two language models to create the summary while maintaining output fluency. A generic pre-trained language model is used to perform contextual matching, and another, target-domain-specific language model is used to guide the generation fluency.
  • AdvREGAN []: The method uses cycle-consistency to encode the input text representation and applies an adversarial reinforcement-based GAN to generate human-like text.
  • HC_title_8 []: This work extracts words from the input text with a hill-climbing discrete optimization algorithm, producing summaries of about eight words.
  • HC_title_10 []: The model is identical to the HC_title_8 but with a summary length of about 10 words.
  • SCRL_L8 []: The approach models sentence compression and fine-tunes BERT within a reinforcement learning setup.
The operating environment of this experimental comparison is the Windows 10 OS equipped with an Intel Core i9 central processing unit (CPU) and 128 GB of memory. The graphic processing unit (GPU) is an NVIDIA GeForce RTX 3090 with 24 GB of memory. Our hyper-parameter setting is displayed in Table 3.
Table 3. Hyper-parameter configuration.
Based on the comparison results shown in Table 4, where the baseline results are taken directly from the corresponding papers, our model obtains the best R-1 (29.75%) and R-L (26.98%) scores, indicating its ability to produce informative and coherent summaries. Although our model is ranked second best on R-2 (10.26%), the value is very close to that of the best method (10.27%). Reinforcement learning-based approaches (SCRL_L8 and our model) with multiple reward functions yield better performance on R-1 and R-2.
Table 4. Performance comparison (%).
In addition, we provide several examples to illustrate our results in Table 5, including the input (INPUT_X), the ground-truth summary (GOLD SUMMARY) and the system-generated summary (GEN SUMMARY). As presented, our algorithm is able to extract crucial items and produce fair summaries. Nevertheless, there are still some mistakes to overcome for further improvement. First, because the summary is formed by partial selection from the source sentence, the output is not always as grammatical as it should be (displayed in red text). Devising another reward function to incorporate grammatical constraints as background knowledge could be a possible way to ensure grammaticality. Second, in some cases, the model fails to generate factually consistent summaries (displayed in blue text), reducing the faithfulness of the output and causing misunderstanding for readers. One potential avenue for future research is leveraging textual entailment models to enhance factual consistency [].
Table 5. Four example outputs produced by our model.

4.4. Ablation Experiments

In order to verify the feasibility of our proposals, we conduct ablation studies on the Gigaword dataset to investigate the benefits of our approach and quantify the effect of each embedding scheme. As the results in Table 6 show, adapting static and contextual embeddings in our proposed method exceeds the results of training with GloVe, BERT and GPT-2 independently in terms of R-1, R-2 and R-L. Therefore, we believe that our method can further enhance the performance of producing summaries by learning complex features across different embedding schemes.
Table 6. Ablation experiment results (%).

5. Conclusions

In this paper, we propose an unsupervised policy gradient reinforcement method based on the combination of various embeddings to resolve the text summarization problem. We conduct empirical studies on the Gigaword dataset, and the results are satisfactory. Compared to other existing methods, our model captures better representations of each token and yields the best performance in terms of ROUGE-1 and ROUGE-L.
Since the ROUGE-2 score of our method is ranked second (10.26%) and is only 0.01% lower than the top-performing method, future directions will focus on designing reward functions to increase bigram overlap. As our approach is capable of selecting important tokens, the investigation of combining chosen tokens into comprehensible phrases could be a possible solution.
Another challenge of our work will be the practical application in resource-constrained environments such as mobile computing due to the large embedding size. We are interested in knowledge distillation and low-dimensional representation for the purpose of deploying the proposed model in resource-restricted settings.

Author Contributions

Supervision, J.-S.J.; methodology, J.-S.J., C.-S.L. and C.-H.L.; investigation, C.-S.L.; writing—review and editing, C.-S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is financially supported by the National Science and Technology Council (NSTC) of Taiwan under Grant 111-2221-E-029-019-.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Luhn, H.P. The automatic creation of literature abstracts. IBM J. Res. Dev. 1958, 2, 159–165. [Google Scholar] [CrossRef]
  2. Ferreira, R.; de Souza Cabral, L.; Lins, R.D.; e Silva, G.P.; Freitas, F.; Cavalcanti, G.D.; Lima, R.; Simske, S.J.; Favaro, L. Assessing sentence scoring techniques for extractive text summarization. Expert Syst. Appl. 2013, 40, 5755–5764. [Google Scholar] [CrossRef]
  3. Nallapati, R.; Zhou, B.; Gulcehre, C.; Xiang, B. Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv 2016, arXiv:1602.06023. [Google Scholar]
  4. Miller, D. Leveraging BERT for extractive text summarization on lectures. arXiv 2019, arXiv:1906.04165. [Google Scholar]
  5. Zhang, J.; Zhao, Y.; Saleh, M.; Liu, P. Pegasus: Pre-training with extracted gap-sentences for abstractive summarization. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; PMLR: New York, NY, USA, 2020; pp. 11328–11339. [Google Scholar]
  6. Rossiello, G.; Basile, P.; Semeraro, G. Centroid-based text summarization through compositionality of word embeddings. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres, Valencia, Spain, 3 April 2017; pp. 12–21. [Google Scholar]
  7. Radev, D.R.; Jing, H.; Styś, M.; Tam, D. Centroid-based summarization of multiple documents. Inf. Process. Manag. 2004, 40, 919–938. [Google Scholar] [CrossRef]
  8. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 26–28 October 2014; pp. 1532–1543. [Google Scholar]
  9. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar]
  10. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  11. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog. 2019, 1, 9. [Google Scholar]
  12. Ethayarajh, K. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 55–65. [Google Scholar]
  13. Amplayo, R.K.; Angelidis, S.; Lapata, M. Unsupervised opinion summarization with content planning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2–9 February 2021; Volume 35, pp. 12489–12497. [Google Scholar]
  14. Hyun, D.; Wang, X.; Park, C.; Xie, X.; Yu, H. Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization. arXiv 2022, arXiv:2212.10843. [Google Scholar]
  15. Wang, Y.; Lee, H.Y. Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 4187–4195. [Google Scholar]
  16. Pasunuru, R.; Celikyilmaz, A.; Galley, M.; Xiong, C.; Zhang, Y.; Bansal, M.; Gao, J. Data augmentation for abstractive query-focused multi-document summarization. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, Canada, 2–9 February 2021; Volume 35, pp. 13666–13674. [Google Scholar]
  17. Wang, H.; Wang, X.; Xiong, W.; Yu, M.; Guo, X.; Chang, S.; Wang, W.Y. Self-supervised learning for contextualized extractive summarization. arXiv 2019, arXiv:1906.04466. [Google Scholar]
  18. Liu, L.; Lu, Y.; Yang, M.; Qu, Q.; Zhu, J.; Li, H. Generative adversarial network for abstractive text summarization. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  19. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  20. Sharma, G.; Sharma, D. Automatic Text Summarization Methods: A Comprehensive Review. SN Comput. Sci. 2022, 4, 33. [Google Scholar] [CrossRef]
  21. Sharma, G.; Gupta, S.; Sharma, D. Extractive text summarization using feature-based unsupervised RBM Method. In Cyber Security, Privacy and Networking: Proceedings of ICSPN 2021; Springer Nature Singapore: Singapore, 2022; pp. 105–115. [Google Scholar]
  22. Liu, Y. Fine-tune BERT for extractive summarization. arXiv 2019, arXiv:1903.10318. [Google Scholar]
  23. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 2014, 27. [Google Scholar]
  24. Ma, X.; Keung, J.W.; Yu, X.; Zou, H.; Zhang, J.; Li, Y. AttSum: A Deep Attention-Based Summarization Model for Bug Report Title Generation. IEEE Trans. Reliab. 2023; early access. [Google Scholar]
  25. Mendes, A.; Narayan, S.; Miranda, S.; Marinho, Z.; Martins, A.F.; Cohen, S.B. Jointly Extracting and Compressing Documents with Summary State Representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1, pp. 3955–3966. [Google Scholar]
  26. Lebanoff, L.; Song, K.; Dernoncourt, F.; Kim, D.S.; Kim, S.; Chang, W.; Liu, F. Scoring Sentence Singletons and Pairs for Abstractive Summarization. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 2175–2189. [Google Scholar]
  27. See, A.; Liu, P.J.; Manning, C.D. Get To The Point: Summarization with Pointer-Generator Networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, 30 July–4 August 2017; Volume 1, pp. 1073–1083. [Google Scholar]
  28. Li, Y. Deep reinforcement learning: An overview. arXiv 2017, arXiv:1701.07274. [Google Scholar]
  29. Alomari, A.; Idris, N.; Sabri AQ, M.; Alsmadi, I. Deep reinforcement and transfer learning for abstractive text summarization: A review. Comput. Speech Lang. 2022, 71, 101276. [Google Scholar] [CrossRef]
  30. Bian, J.; Huang, X.; Zhou, H.; Zhu, S. GoSum: Extractive Summarization of Long Documents by Reinforcement Learning and Graph Organized discourse state. arXiv 2022, arXiv:2211.10247. [Google Scholar]
  31. Wu, Y.; Hu, B. Learning to extract coherent summary via deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  32. Hu, B.; Lu, Z.; Li, H.; Chen, Q. Convolutional neural network architectures for matching natural language sentences. Adv. Neural Inf. Process. Syst. 2014, 2, 2042–2050. [Google Scholar]
  33. Liu, Y.; Liu, P.; Radev, D.; Neubig, G. BRIO: Bringing Order to Abstractive Summarization. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 22–27 May 2022; Volume 1, pp. 2890–2903. [Google Scholar]
  34. Xiao, L.; He, H.; Jin, Y. FusionSum: Abstractive summarization with sentence fusion and cooperative reinforcement learning. Knowl. Based Syst. 2022, 243, 108483. [Google Scholar] [CrossRef]
  35. Zhang, X.; Lapata, M. Sentence Simplification with Deep Reinforcement Learning. In Proceedings of the EMNLP 2017: Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Copenhagen, Denmark, 7–11 September 2017; pp. 584–594. [Google Scholar]
  36. Ghalandari, D.G.; Hokamp, C.; Ifrim, G. Efficient Unsupervised Sentence Compression by Fine-tuning Transformers with Reinforcement Learning. arXiv 2022, arXiv:2205.08221. [Google Scholar]
  37. Schumann, R.; Mou, L.; Lu, Y.; Vechtomova, O.; Markert, K. Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 5032–5042. [Google Scholar]
  38. Sutton, R.S.; McAllester, D.; Singh, S.; Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. Adv. Neural Inf. Process. Syst. 1999, 12. [Google Scholar]
  39. Lin, C.Y.; Hovy, E. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, Canada, 27 May–1 June 2003; pp. 150–157. [Google Scholar]
  40. Zhou, J.; Rush, A.M. Simple Unsupervised Summarization by Contextual Matching. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 5101–5106. [Google Scholar]
  41. Aharoni, R.; Narayan, S.; Maynez, J.; Herzig, J.; Clark, E.; Lapata, M. mFACE: Multilingual Summarization with Factual Consistency Evaluation. arXiv 2022, arXiv:2212.10622. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
