CTGGAN: Controllable Text Generation with Generative Adversarial Network
Abstract
1. Introduction
- We propose to use GANs to combine the language model with various discriminators. Through the adversarial training process, the rewards derived from the discriminators are integrated into the language model's decoding process to guide generation under the given constraints.
- We are the first to propose the hypothesis that the weights of the various constraints should vary with the preceding context, and we design an algorithm to learn these dynamic weights from training samples.
- We perform text-completion tasks on 15 English prefixes and also evaluate a customer service dialogue scenario requiring soothing words to verify our algorithm. Both demonstrate the effectiveness of the proposed method.
2. Related Work
- Conditional Language Models: Conditional language models (CLMs) can be realized through different implementations. For instance:
- Decoding-time Methods: Different from CLMs, decoding-time methods require no additional training of the pre-trained language model.
3. Materials and Methods
3.1. Introduction to Language Models
3.2. Introduction to GANs
3.3. Proposed Method
- Generator and Optimization: For the Generator part, the probability of the output text can be expressed as:
- Discriminator and Optimization: The Discriminator scores the quality of generated texts by combining various constraint conditions. In this paper, for simplicity, two constraints are used: positive sentiment and fluency. The former can be implemented with a discriminative model such as BERT [26], which takes a piece of text as input and outputs the probability that it is positive. The latter is measured by perplexity (PPL), i.e., the fluency of the generated text evaluated with a language model. The language model can be either the Generator's original language model (without the logits bias in Equation (7)) or another well-trained language model. In this paper, for convenience, we choose the Generator's original language model for the PPL calculation:
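As a rough illustration of these two constraint signals (not the authors' released implementation; the checkpoint names, the positive-label index, and the simple weighted combination below are our own assumptions):

```python
# Illustrative sketch: sentiment probability from a frozen BERT-style classifier and
# fluency as perplexity under a frozen GPT-2 language model. Checkpoints, the
# positive-label index, and the weighting are assumptions, not the paper's exact setup.
import torch
from transformers import (AutoModelForCausalLM, AutoModelForSequenceClassification,
                          AutoTokenizer)

clf_tok = AutoTokenizer.from_pretrained("textattack/bert-base-uncased-yelp-polarity")
clf = AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-yelp-polarity").eval()
lm_tok = AutoTokenizer.from_pretrained("gpt2-large")
lm = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()

@torch.no_grad()
def positive_prob(text: str) -> float:
    """P(positive | text) from the frozen sentiment classifier (label 1 = positive, assumed)."""
    logits = clf(**clf_tok(text, return_tensors="pt")).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

@torch.no_grad()
def perplexity(text: str) -> float:
    """PPL of the text under the frozen language model (the Generator without logits bias)."""
    ids = lm_tok(text, return_tensors="pt").input_ids
    return torch.exp(lm(ids, labels=ids).loss).item()

def discriminator_reward(text: str, w_sent: float = 1.0, w_flu: float = 1.0) -> float:
    """Combined reward: a higher positive probability and a lower log-perplexity are better."""
    return w_sent * positive_prob(text) - w_flu * float(torch.log(torch.tensor(perplexity(text))))
```

In the paper's method, the relative weights of the constraints are learned and context-dependent rather than fixed as in this sketch.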
4. Experiments and Results
4.1. Experiments on the Text-Completion Task
- Training data: We obtain 3000 samples from ChatGPT [27] as the positive-sentiment controlled corpus and select 1000 samples with high positive probabilities as the final training dataset, as shown in Table 1. Taking the prefix “The year is 1910” as an example, the prompt we use is “Continue to write after ‘The year is 1910’ to express a positive sentiment”. The generated corpus expresses positive sentiment with reasonably high quality; however, to quantify the positive probability, we filter out low-probability samples with the positive sentiment discriminator provided by BOLT [22] (see the filtering sketch below). This filtering is crucial because the only sentiment controller in our setting is the positive sentiment discriminator, and samples with low positive probability would mislead it.
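A minimal sketch of this filtering step (the classifier checkpoint, label convention, and threshold are illustrative assumptions, not the exact BOLT discriminator):

```python
# Keep only ChatGPT continuations that the sentiment classifier scores as clearly
# positive; the paper keeps 1000 of the 3000 generated samples.
from transformers import pipeline

clf = pipeline("text-classification",
               model="textattack/bert-base-uncased-yelp-polarity")

def filter_positive(samples, threshold=0.9):
    kept = []
    for text in samples:
        pred = clf(text)[0]                      # e.g. {"label": "LABEL_1", "score": 0.98}
        p_pos = pred["score"] if pred["label"].endswith("1") else 1.0 - pred["score"]
        if p_pos >= threshold:
            kept.append((text, p_pos))
    return kept
```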
- Model's setting: Similar to BOLT [22], we utilize GPT2-large (https://huggingface.co/gpt2-large, accessed on 5 December 2023) as the basis of the Generator and also as the fluency measurement model, where fluency is calculated by perplexity (PPL). For positive sentiment judgment, we employ a BERT classifier trained on the Yelp Polarity dataset (https://www.yelp.com/dataset, accessed on 5 December 2023). During GAN training, the parameters of both models (i.e., GPT2-large and the BERT classifier) are frozen. We only train the parameters of the bias network (Equation (8)), the embedding layers, and the other linked parts (Equation (14)); a sketch of this setup follows.
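A minimal sketch of this parameter setup, assuming a simple feed-forward bias network and a Yelp-polarity BERT checkpoint (the paper's exact modules in Equations (8) and (14) may differ):

```python
# Freeze GPT2-large (Generator backbone and PPL model) and the BERT sentiment classifier;
# train only a small bias network that produces a per-token logits bias.
# BiasNetwork is a placeholder design, not the paper's exact module.
import torch
import torch.nn as nn
from transformers import BertForSequenceClassification, GPT2LMHeadModel

generator_lm = GPT2LMHeadModel.from_pretrained("gpt2-large")
sentiment_clf = BertForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-yelp-polarity")

for p in list(generator_lm.parameters()) + list(sentiment_clf.parameters()):
    p.requires_grad = False                      # both pretrained models stay frozen

class BiasNetwork(nn.Module):
    """Maps GPT2-large hidden states (1280-d) to a bias over the 50257-token vocabulary."""
    def __init__(self, hidden_size: int = 1280, vocab_size: int = 50257):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Tanh(),
                                 nn.Linear(hidden_size, vocab_size))

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.net(hidden_states)           # added to the frozen LM logits

bias_net = BiasNetwork()
optimizer = torch.optim.Adam(bias_net.parameters(), lr=1e-4)  # only these weights update
```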
- Evaluation metrics: As Table 2 shows, we evaluate the generated texts from four aspects: the internal positive classifier score (Int.Clsf) measures positive sentiment as the positive probability produced by our BERT classifier (Equation (15)); the external positive classifier score (Ext.Clsf) also calculates the positive probability, but with an external classifier obtained from Hugging Face (https://huggingface.co/VictorSanh/roberta-base-finetuned-yelp-polarity, accessed on 5 December 2023); perplexity (PPL) evaluates the fluency of the generated text, calculated with the GPT2-XL (https://huggingface.co/gpt2-xl, accessed on 5 December 2023) model according to Equation (12); and the average number of distinct trigrams (Dist-3) measures the diversity of the generations, computed with the NLTK (https://www.nltk.org/, accessed on 5 December 2023) package, as sketched below.
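For reference, a common way to compute Dist-3 with NLTK (the paper's exact normalization may differ; this version averages the distinct-trigram ratio over generations):

```python
# Dist-3: average ratio of distinct trigrams to total trigrams per generation.
# Requires the punkt tokenizer data: nltk.download("punkt").
from nltk import ngrams, word_tokenize

def dist_n(generations, n=3):
    ratios = []
    for text in generations:
        grams = list(ngrams(word_tokenize(text.lower()), n))
        if grams:
            ratios.append(len(set(grams)) / len(grams))
    return sum(ratios) / max(len(ratios), 1)

# Example: dist_n(["the cat sat on the mat", "the dog lay on the rug"]) -> 1.0
```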
- Results: As can be seen from Table 2, our method surpasses the traditional methods (each of which obtains its generated results directly from the corresponding decoding strategy on the 15 prefixes mentioned above) on the PPL and Dist-3 metrics. Specifically, the PPL value is better than the other methods by an average of 14.58, and Dist-3 is better by an average of 0.35. In addition, the probability from the external classifier (Ext.Clsf) reaches a level comparable to the traditional methods. Our method is weaker than methods such as MuCola, Mix&Match, and BOLT in terms of the internal classifier probability (Int.Clsf). This is because our method considers not only positive sentiment but also the fluency of the sentence, so it must learn the weights of these two constraints from the positive training samples, unlike the traditional methods that only pursue a high positive sentiment probability. Moreover, there are gaps between the training samples we employ and “perfect” samples (whose positive probabilities reach 1); thus, from the training perspective, the positive-score ceiling of our method is lower than that of the other baselines.
- Ablation study: We train only the Generator part on the positive samples above, using the language model loss (Equation (11)). The results are shown in Figure 2. Without the Discriminator providing rewards to optimize the Generator network and guarantee the quality of the generated results, the evaluation metrics decline to varying degrees: Int.Clsf, Ext.Clsf, and Dist-3 deteriorate by 4.6%, 6.8%, and 8.5%, respectively, and PPL increases by 0.61, indicating worse fluency. Hence, the Discriminator plays an important role in guiding the Generator to produce high-quality constrained texts. This also shows that controllable text generation (CTG) cannot be ensured by language model fine-tuning alone.
4.2. Experiments on Chinese Customer Service Dialogue
- Soothing reply discriminator: We harness the Chinese BERT model (https://github.com/ymcui/Chinese-BERT-wwm, accessed on 5 December 2023) as the basis of the soothing-word discriminator. The positive samples, a total of 5000 higher-quality sentences, are selected from the customer service staff's responses in the customer service dialogue materials. We modify some of the sentences, for example, replacing the plain Chinese word “你 (you)” with the honorific “您 (you)”. To obtain negative samples, i.e., replies without soothing words, we use the ChatGLM-3 [28] large language model to rewrite the positive samples. The rewriting prompt is “将下列文本换成简短精炼的说法,意思保持不变,不要出现谦辞敬辞。(Change the following text into a short and concise statement, keep the meaning unchanged, and do not use polite or honorific expressions.)”. Table 3 compares several cases before and after ChatGLM-3's modification; the rewritten expressions are colder in tone and lack a soothing style, and replying in such a manner would aggravate the user's dissatisfaction. A sketch of fine-tuning this discriminator is given below.
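A minimal fine-tuning sketch for this binary discriminator, assuming the hfl/chinese-bert-wwm checkpoint on Hugging Face and standard hyperparameters (the checkpoint name, hyperparameters, and example sentences are our assumptions):

```python
# Soothing-reply discriminator: Chinese BERT-wwm fine-tuned to separate soothing
# customer-service replies (label 1) from ChatGLM-3-rewritten, non-soothing ones (label 0).
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tok = BertTokenizer.from_pretrained("hfl/chinese-bert-wwm")
model = BertForSequenceClassification.from_pretrained("hfl/chinese-bert-wwm", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def train_step(texts, labels):
    """One gradient step on a batch of (reply, label) pairs."""
    batch = tok(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
    out = model(**batch, labels=torch.tensor(labels))
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()

# Illustrative example pair (soothing vs. terse):
# train_step(["非常抱歉给您带来不便,我马上为您处理。", "马上处理。"], [1, 0])
```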
- Customer service dialogue training: We utilize the Chinese GPT-2 (https://github.com/Morizeyao/GPT2-Chinese, accessed on 5 December 2023) model as the basis of the Generator. This model is also used as the fluency discriminator, i.e., to calculate the PPL of the generated text. During the training of the Generator, tokens irrelevant to the customer service replies are marked with a special MASK notation; in this way, when calculating the language model loss in Equation (11), the cross-entropy over these tokens is ignored and only tokens belonging to the customer service reply are counted, as sketched below.
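A sketch of this label-masking trick using the standard ignore index of PyTorch's cross-entropy loss (how the reply boundaries are detected is left out here and would depend on the dialogue format):

```python
# Tokens outside the customer-service reply get label -100, which
# torch.nn.CrossEntropyLoss (and hence the Hugging Face LM loss) ignores,
# so only reply tokens contribute to the loss in Equation (11).
import torch

IGNORE_INDEX = -100

def masked_lm_labels(input_ids: torch.Tensor, reply_mask: torch.Tensor) -> torch.Tensor:
    """reply_mask is 1 for tokens belonging to the customer-service reply, 0 otherwise."""
    labels = input_ids.clone()
    labels[reply_mask == 0] = IGNORE_INDEX
    return labels

# loss = gpt2_chinese(input_ids, labels=masked_lm_labels(input_ids, reply_mask)).loss
```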
- Results: Table 4 lists several cases generated under constraints, where W/.DIS refers to results generated with our method and W/O.DIS refers to those obtained by training only the Generator (analogous to the ablation study in Section 4.1). Our method can, by and large, generate replies that fit the customer service scenario. Specifically, the response logic is basically correct (for example, the reply “canceled your subscription” corresponds to “I have canceled this package” in the context of case 5), and replies with soothing words are generally produced (for example, “Thank you” in case 4 and “anything I can do for you” in case 5). In contrast, the text generated by W/O.DIS tends to simply retell the preceding context, e.g., “spend one yuan to get a 50-min call package” in case 3. Therefore, when using a lightweight language model like GPT-2 for dialogue training, our method can generate relatively controllable responses.
5. Discussion and Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Prabhumoye, S.; Black, A.W.; Salakhutdinov, R. Exploring Controllable Text Generation Techniques. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, 8–13 December 2020; pp. 1–14.
- Zhang, H.; Song, H.; Li, S.; Zhou, M.; Song, D. A Survey of Controllable Text Generation Using Transformer-based Pre-trained Language Models. ACM Comput. Surv. 2022, 56, 64.
- Zhu, L.; Xu, Y.; Zhu, Z.; Bao, Y.; Kong, X. Fine-Grained Sentiment-Controlled Text Generation Approach Based on Pre-Trained Language Model. Appl. Sci. 2023, 13, 264.
- Gururangan, S.; Marasović, A.; Swayamdipta, S.; Lo, K.; Beltagy, I.; Downey, D.; Smith, N.A. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 8342–8360.
- Huang, Y.; Wu, X.; Hu, W.; Feng, J.; Deng, C. State-Aware Adversarial Training for Utterance-Level Dialogue Generation. In Proceedings of the Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems (SereTOD), Abu Dhabi, United Arab Emirates; Beijing, China, 7 December 2022; pp. 62–74.
- Gan, Y.; Lu, G.; Su, Z.; Wang, L.; Zhou, J.; Jiang, J.; Chen, D. A Joint Domain-Specific Pre-Training Method Based on Data Enhancement. Appl. Sci. 2023, 13, 4115.
- Keskar, N.S.; McCann, B.; Varshney, L.R.; Xiong, C.; Socher, R. CTRL: A Conditional Transformer Language Model for Controllable Generation. arXiv 2019, arXiv:1909.05858.
- Dou, Z.Y.; Liu, P.; Hayashi, H.; Jiang, Z.; Neubig, G. GSum: A General Framework for Guided Neural Abstractive Summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 4830–4842.
- Krause, B.; Gotmare, A.D.; McCann, B.; Keskar, N.S.; Joty, S.; Socher, R.; Rajani, N.F. GeDi: Generative Discriminator Guided Sequence Generation. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 4929–4952.
- Yu, D.; Yu, Z.; Sagae, K. Attribute Alignment: Controlling Text Generation from Pre-trained Language Models. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 2251–2268.
- Qian, J.; Dong, L.; Shen, Y.; Wei, F.; Chen, W. Controllable Natural Language Generation with Contrastive Prefixes. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 2912–2924.
- Liu, G.; Feng, Z.; Gao, Y.; Yang, Z.; Liang, X.; Bao, J.; He, X.; Cui, S.; Li, Z.; Hu, Z. Composable Text Controls in Latent Space with ODEs. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 16543–16570.
- Ding, H.; Pang, L.; Wei, Z.; Shen, H.; Cheng, X.; Chua, T.S. MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; pp. 4424–4436.
- Gu, Y.; Feng, X.; Ma, S.; Zhang, L.; Gong, H.; Zhong, W.; Qin, B. Controllable Text Generation via Probability Density Estimation in the Latent Space. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 12590–12616.
- Dathathri, S.; Madotto, A.; Lan, J.; Hung, J.; Frank, E.; Molino, P.; Yosinski, J.; Liu, R. Plug and Play Language Models: A Simple Approach to Controlled Text Generation. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020.
- Yang, K.; Klein, D. FUDGE: Controlled Text Generation with Future Discriminators. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 3511–3535.
- Liu, A.; Sap, M.; Lu, X.; Swayamdipta, S.; Bhagavatula, C.; Smith, N.A.; Choi, Y. DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 6691–6706.
- Liu, G.; Li, Y.; Guo, Y.; Luo, X.; Wang, B. Multi-Attribute Controlled Text Generation with Contrastive-Generator and External-Discriminator. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022; pp. 5904–5913.
- Mireshghallah, F.; Goyal, K.; Berg-Kirkpatrick, T. Mix and Match: Learning-free Controllable Text Generation using Energy Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 401–415.
- Kumar, S.; Paria, B.; Tsvetkov, Y. Gradient-based Constrained Sampling from Language Models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 2251–2277.
- Qin, L.; Welleck, S.; Khashabi, D.; Choi, Y. COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics. In Advances in Neural Information Processing Systems; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 9538–9551.
- Liu, X.; Khalifa, M.; Wang, L. BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 186–200.
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.C.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. OpenAI Blog 2019, 1, 9.
- Wang, J.; Yu, L.; Zhang, W.; Gong, Y.; Xu, Y.; Wang, B.; Zhang, P.; Zhang, D. IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, Tokyo, Japan, 7–11 August 2017.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
- Du, Z.; Qian, Y.; Liu, X.; Ding, M.; Qiu, J.; Yang, Z.; Tang, J. GLM: General Language Model Pretraining with Autoregressive Blank Infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 320–335.
Prefix | ChatGPT Generation | Positive Probability |
---|---|---|
The year is 1910 | The year is 1910, and a collective passion for learning and knowledge fuels a relentless pursuit of truth and enlightenment. | 0.9767 (keep) |
Once upon a time | Once upon a time, a genuine smile from a stranger brightened someone’s day, spreading positivity far and wide. | 0.9961 (keep) |
The country | The country had a strong emphasis on education and intellectual growth, fostering a culture of learning and knowledge. | 0.9466 (keep) |
The movie | The movie we attended the premiere of left us star-struck, walking the red carpet and witnessing the magic of cinema unfold before our eyes. | 0.2231 (deleted) |
Model | Int.Clsf (↑) | Ext.Clsf (↑) | PPL (↓) | Dist-3 (↑) |
---|---|---|---|---|
COLD [21] | 61.46 | 55.10 | 9.09 | 0.30 |
MuCola [20] | 93.22 | 86.55 | 11.36 | 0.55 |
Mix&Match [19] | 96.09 | 84.98 | 66.75 | 0.82 |
BOLT [22] | 95.78 | 80.12 | 8.12 | 0.65 |
Our Model | 81.56 | 80.73 | 9.25 | 0.93 |
Scenario | Soothing Expression | ChatGLM-3 Generated |
---|---|---|
Change the mobile data | 直接给您改不了,因为您套餐一百零八送的带宽是多少兆,它就是多少兆的. (I am sorry that I can’t change it for you directly, because it is equal to the number of megabytes of bandwidth which is given by your package of one hundred and eight.) | 直接改不了,因为套餐一百零八送的带宽就是多少兆. (I can’t change it directly, because the bandwidth of the package of one hundred and eight is so many megabytes.) |
Reset the password | 那没有关系,我把重置密码的方法给您发过来,您可以发送短信重置密码. (That’s okay. I will send you the method to reset your password, and you can send a text message to reset it.) | 没问题,我将重置密码的方法告诉你,你发短信即可重新设置密码. (No problem. I’ll show you how to reset your password, and you can reset it by texting it.) |
Cancel the account | 销户的话,您需要到营业厅办理,首先要看看这个号码上有没有活动,如果没有活动,现场销户,如果有活动,您需要和营业厅协商中断违约金,才可以中断. (If you want to cancel your account, you need to go to the business office to handle it. First, it is advisable for you to check if there is any activity on this number. If there is no activity, you can cancel the account on the spot. If there is activity, you need to negotiate with the business office for liquidated damages before you can cancel.) | 如果你要销户,你需要到营业厅办理。首先查看号码是否有活动,无活动则现场销户,有活动则需与营业厅协商中断违约金. (Cancelling the account, you need to go to the business hall to do it. First check whether there is any activity on the number. If there is no activity, the account will be canceled on the spot. If there is activity, you need to negotiate with the business office for a penalty for interruption.) |
Prompt | W/O.DIS | W/.DIS |
---|---|---|
用户:喂你好,帮我查下我流量呢? (hello, please check the mobile data for me.) | 号码是您的吗?(Is the phone number yours?) | 号码已经给您办理了,请稍后联系我们查询。(The number has been processed for you, you can contact us later.) |
用户:我宽带用不了了,电视也不能看。(I can’t use my broadband, and I can’t watch TV) | 号码是您的吗?(Is the phone number yours?) | 局部网络故障,无法使用 (Localized network failure and unavailability) |
用户:你好,我的通话用超了,是不是可以花一块钱办一个五十分钟的通话包?(I have overused my calls, can I get a 50-min call package with one yuan?) | 先生,您的通话包是五十分钟,是不是可以花一块钱办一个五十分钟的通话包. (Sir, your call package is 50 min, is it possible to spend one yuan to get a 50-min call package?) | 先生,请问还有其他可以帮助您的吗?您可以通过以下方式进行咨询 (Sir, is there anything else that can help you? You can make an inquiry in the following ways) |
用户:喂我问一下我现在这个WIFI宽带是没用了哇?(Hey, I’m here to ask if the WIFI broadband is useless.) | 您的wifi是有用的吗?(Is your wifi still in use?) | 话务员给您开通一下WIFI,然后您点开一下WIFI就可以使用了。感谢您。(The operator will turn on the WIFI for you, and then you can use it by clicking on the WIFI. Thank you.) |
用户:把我这个套餐都取消了,怎么又变成九十多了呢?(I have canceled this package, why did it become more than 90 again?) 客服:这个是扣月租,但是我们这边显示您并没用更改过套餐,您这个是六月十二号办的,这个八十九块钱套餐,七月一号生效的。(This is a deduction of monthly rent, but it shows that you haven’t changed your plan. You applied 89 yuan package on June 12th, and it took effect on July 1st.) 用户:你给我取消。(Cancel the plan for me.) | 来电,您这个套餐是六月十二号办的。(Caller ID shows your package is applied on June 12th.) | 来电显示您已经取消订购了,请问还有其他需要帮助的吗?(Caller ID shows that you have canceled your subscription. Is there anything I can do for you?) |