Research on the Identification of a Winter Olympic Multi-Intent Chinese Problem Based on Multi-Model Fusion
Abstract
1. Introduction
- To address the scarcity of corpora for Winter-Olympics-related questions, crawler technology is used to collect Winter Olympics information, extracting details such as the basic information of the athletes being inquired about, the achievements they have received, their careers, and their competitions. A user question dataset for the Winter Olympics domain is then generated automatically from customized templates, containing single-, two-, and three-intent question data.
- The Chinese pretrained language model Bert-base-chinese is used to obtain dynamic text semantic vector representations that carry richer semantic information, improving the text's semantic representation ability.
- To address the limited feature expression of a single-head attention mechanism, a multi-head attention mechanism is introduced so that the model can capture information about the question text from different perspectives, improving its feature expression ability.
- An improved multi-intent recognition model, BCNBLMATT, based on BERT and the multi-head attention mechanism is proposed. The question text is encoded by Bert-base-chinese to obtain a dynamic text semantic vector representation; the local feature extraction of TextCNN is combined with the context-dependent feature extraction of BiLSTM with multi-head attention to obtain the local and contextual feature information of the question text. Fusing these two kinds of features resolves the problem of incomplete feature extraction, and the model's superiority in multi-intent recognition is verified through comparative analysis against other models on the Winter Olympics Chinese question dataset and the MixATIS question dataset.
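The fused architecture described above can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' implementation: a plain embedding layer stands in for Bert-base-chinese so the snippet is self-contained, mean pooling after attention is an assumed choice, and the toy dimensions follow the parameter table later in the paper (kernel sizes 3/4/5, 100 kernels, 12 heads, hidden size 768).

```python
import torch
import torch.nn as nn

class BCNBLMATT(nn.Module):
    """Sketch: encoder -> (TextCNN || BiLSTM + multi-head attention)
    -> concatenate -> per-label sigmoid for multi-intent classification."""

    def __init__(self, vocab_size=100, hidden=768, n_labels=5,
                 kernel_sizes=(3, 4, 5), n_kernels=100, heads=12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)  # placeholder for BERT
        # TextCNN branch: one Conv1d per kernel size, max-pooled over time
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, n_kernels, k) for k in kernel_sizes)
        # BiLSTM + multi-head attention branch for contextual features
        self.bilstm = nn.LSTM(hidden, hidden // 2, batch_first=True,
                              bidirectional=True)
        self.mha = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.drop = nn.Dropout(0.5)
        fused = n_kernels * len(kernel_sizes) + hidden
        self.classifier = nn.Linear(fused, n_labels)

    def forward(self, ids):
        x = self.embed(ids)                              # (B, T, H)
        # local n-gram features from the TextCNN branch
        c = torch.cat([conv(x.transpose(1, 2)).relu().amax(dim=2)
                       for conv in self.convs], dim=1)   # (B, 300)
        # contextual features, re-weighted by self-attention, then pooled
        h, _ = self.bilstm(x)                            # (B, T, H)
        a, _ = self.mha(h, h, h)                         # (B, T, H)
        g = a.mean(dim=1)                                # (B, H)
        logits = self.classifier(self.drop(torch.cat([c, g], dim=1)))
        return torch.sigmoid(logits)  # one probability per intent label

model = BCNBLMATT()
probs = model(torch.randint(0, 100, (2, 16)))  # batch of 2 toy "questions"
print(probs.shape)  # torch.Size([2, 5])
```

Because each label gets an independent sigmoid, any number of intents can be predicted per question, which is what distinguishes this multi-label setup from a softmax single-intent classifier.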
2. Related Research
2.1. Winter Olympics Field
2.2. Multi-Intention Recognition Task
2.3. Multi-Label Text Classification Task Based on Deep Learning Technology
3. BCNBLMATT Model Building
3.1. Overall Architecture of the Model
3.2. Text Representation Layer
3.3. TextCNN Layer
3.4. BiLSTM-Multi-Heads Attention Layer
3.5. Feature Fusion and Intention Classification
4. Experiment and Result Analysis
4.1. Dataset
4.2. Experimental Parameters
4.3. Evaluation Indicators
4.4. The Influence of the Number of Attention Heads and the Size of Convolutional Kernels on the Model’s Intention Recognition Performance
4.5. Comparative Experiment
- (1) Bert: the BERT pretrained language model alone was used for feature extraction and intent classification.
- (2) Bert-textcnn: BERT was used as the pretrained language model, with TextCNN performing local feature extraction and intent classification.
- (3) Bert-blmatt: to verify the effectiveness of this model's components, a partial decomposition of the model was evaluated, with Bert-blmatt as a comparative experiment. BERT was used as the pretrained language model; global semantic features were extracted by BiLSTM, noise was reduced by multi-head attention, and the resulting key features were used for intent recognition.
- (4) Bert-cnn+blatt: to verify the effectiveness of multi-head attention, multi-head attention was replaced with single-head attention in this comparative experiment. BERT was used as the pretrained language model, and the local features of the text and the key features of the global semantic information were extracted by TextCNN and BiLSTM-attention, respectively. Finally, the two were concatenated and fused for intent recognition.
- (5) BCNBLMATT: using BERT as the pretrained language model, the question text was transformed into text semantic features, from which TextCNN and BiLSTM with multi-head attention extracted the local features of the text and the key features of the global semantic information. Finally, the two were concatenated and fused for intent recognition.
4.6. Experimental Results and Analysis
5. Conclusions and Outlook
Author Contributions
Funding
Conflicts of Interest
References
- Luo, L.; Li, S.; He, Q.; Yang, C.; Chen, T. Winter Olympic Q & A system based on knowledge map, TF-IDF and BERT model. CAAI Trans. Intell. Syst. 2021, 16, 819–826. [Google Scholar]
- Xu, P.Y.; Sarikaya, R. Exploiting shared information for multi-intent natural language sentence classification. In Proceedings of the 14th Annual Conference of the International Speech Communication Association, Lyon, France, 25–29 August 2013; pp. 3785–3789. [Google Scholar]
- Kim, B.; Ryu, S.; Gary, G.L. Two-stage multi-intent detection for spoken language understanding. Multimed. Tools Appl. 2017, 76, 11377–11390. [Google Scholar] [CrossRef]
- Yang, C.; Feng, C. Multi-intention recognition model with combination of syntactic feature and convolution neural network. J. Comput. Appl. 2018, 38, 1839–1845+1852. [Google Scholar]
- Liu, J.; Li, Y.L.; Lin, M. Research of short text multi-intent detection with capsule network. J. Front. Comput. Sci. Technol. 2020, 14, 1735–1743. [Google Scholar]
- Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. arXiv 2017, arXiv:1710.09829. [Google Scholar]
- Weld, H.; Huang, X.; Long, S.; Poon, J.; Han, S.C. A survey of joint intent detection and slot-filling models in natural language understanding. ACM Comput. Surv. (CSUR) 2021, 55, 1–38. [Google Scholar] [CrossRef]
- Li, S.; Sun, Z.P. Bidirectional Interaction Model for Joint Multiple Intent Detection and Slot Filling. Comput. Eng. Appl. 2023. Available online: http://kns.cnki.net/kcms/detail/11.2127.TP.20230321.0934.004.html (accessed on 4 October 2023).
- Li, D.; Yang, Y.; Meng, X.; Zhang, X.; Song, C.; Zhao, Y. Review on Multi-Label Classification. J. Front. Comput. Sci. Technol. 2023. Available online: http://kns.cnki.net/kcms/detail/11.5602.TP.20230627.1225.003.html (accessed on 4 October 2023).
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
- Kim, Y. Convolutional neural networks for sentence classification. arXiv 2014, arXiv:1408.5882. [Google Scholar]
- Baker, S.; Korhonen, A. Initializing neural networks for hierarchical multi-label text classification. In BioNLP 2017; Association for Computational Linguistics: Vancouver, BC, Canada, 2017; pp. 307–315. [Google Scholar]
- Zheng, C.; Wang, X.; Wang, T. Multi-label classification for medical text based on ALBERT-TextCNN model. J. Shandong Univ. (Nat. Sci.) 2022, 57, 21–29. [Google Scholar]
- Lipton, Z.C.; Berkowitz, J.; Elkan, C. A critical review of recurrent neural networks for sequence learning. arXiv 2015, arXiv:1506.00019. [Google Scholar]
- Yang, P.; Sun, X.; Li, W.; Ma, S.; Wu, W.; Wang, H. SGM: Sequence generation model for multi-label classification. arXiv 2018, arXiv:1806.04822. [Google Scholar]
- Liu, P.; Qiu, X.; Chen, X.; Wu, S.; Huang, X.J. Multi-timescale long short-term memory neural network for modelling sentences and documents. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015. [Google Scholar]
- Hu, J.; Kang, X.; Nishide, S.; Ren, F. Text multi-label sentiment analysis based on Bi-LSTM. In Proceedings of the 2019 IEEE 6th International Conference on Cloud Computing and Intelligence Systems, Singapore, 25–27 September 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 16–20. [Google Scholar]
- Zhou, Y.; Xu, J.; Cao, J.; Xu, B.; Li, C. Hybrid attention networks for Chinese short text classification. Comput. Sist. 2017, 21, 759–769. [Google Scholar] [CrossRef]
- She, X.; Zhang, D. Text classification based on hybrid CNN-LSTM hybrid model. In Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design, Hangzhou, China, 8–9 December 2018; Volume 2, pp. 185–189. [Google Scholar]
- Chaofan, L.; Kai, M. Electronic Medical Record Text Classification Based on Attention Mechanism Combined with CNN-BiLSTM. Sci. Technol. Eng. 2022, 22, 2363–2370. [Google Scholar]
- Xu, J.; Cai, Y.; Wu, X.; Lei, X.; Huang, Q.; Leung, H.F.; Li, Q. Incorporating context-relevant concepts into convolutional neural networks for short text classification. Neurocomputing 2020, 386, 42–53. [Google Scholar] [CrossRef]
- Yang, X.R.; Zhao, S.W.; Zhang, R.X.; Yang, X.J.; Tang, Y.H. BiLSTM_CNN Classification Model Based on Self-Attention and Residual Network. Comput. Eng. Appl. 2022, 58, 172–180. [Google Scholar]
- Song, Z.S.; Niu, Y.; Zheng, L.; Tie, J.; Jiang, H. Multiscale double-layer convolution and global feature text classification model. Comput. Eng. Appl. 2023. Available online: http://kns.cnki.net/kcms/detail/11.2127.TP.20230214.1508.046.html (accessed on 4 October 2023).
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Duan, D.D.; Tang, J.S.; Wen, Y.; Yuan, K.H. Chinese short text classification algorithm based on BERT model. Comput. Eng. 2021, 47, 79–86. [Google Scholar]
- Liu, B.; Pu, Y. BERT-base approach for long document classification. J. Sichuan Univ. (Nat. Sci. Ed.) 2023, 60, 81–88. [Google Scholar]
- Lee, J.S.; Hsiang, J. Patent classification by fine-tuning BERT Language model. World Pat. Inf. 2020, 61, 101965. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. arXiv 2017, arXiv:1706.03762. [Google Scholar]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Qin, L.; Xu, X.; Che, W.; Liu, T. AGIF: An Adaptive Graph-Interactive Framework for Joint Multiple Intent Detection and Slot Filling. Findings of the Association for Computational Linguistics: EMNLP 2020. arXiv 2020, arXiv:2004.10087. [Google Scholar]
Split | Winter Olympics Problem Dataset | MixATIS Dataset
---|---|---
train | 3003 | 13,056
dev | 987 | 896
test | 987 | 768
Question | Label |
---|---|
哪位运动员在平昌冬奥会的女子大回转的比赛中取得了第1名的成绩? | 运动员姓名 |
(Which athlete achieved first place in the women’s giant slalom at the Pyeongchang Winter Olympics?) | (Athlete name)
韩雨桐在第24届冬奥会中获得的名次是,他获得的成就有哪些? | 运动员名次/运动员所获成就
(What was Han Yutong’s ranking and achievements at the 24th Winter Olympics?) | (Athlete rankings/athlete achievement) |
埃琳娜·朗格迪尔的年龄? 他来自哪个国家? 他的教练是谁? | 运动员年龄/运动员国家/教练 |
(What is the age of Elena Langedier? Which country does she come from? Who is her coach?) | (Athlete age/athlete country/athlete coach)
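Multi-intent questions like the examples above can be produced by concatenating single-intent templates filled with crawled slot values. The sketch below illustrates that template-based generation; the template strings, slot fillers, and intent names are illustrative stand-ins, not the paper's actual template set.

```python
import random

# Toy slot fillers; the paper crawls these from Winter Olympics pages.
ATHLETES = ["韩雨桐", "武大靖"]
EVENTS = ["短道速滑", "女子大回转"]

# One question template per intent label (illustrative, not the real set).
TEMPLATES = {
    "athlete_rank": "{athlete}在{event}中获得的名次是?",       # athlete ranking
    "athlete_age": "{athlete}的年龄是多少?",                   # athlete age
    "athlete_country": "{athlete}来自哪个国家?",               # athlete country
}

def generate(n_intents, athlete, event):
    """Concatenate 1-3 single-intent templates into one multi-intent
    question, returning (question text, list of intent labels)."""
    intents = random.sample(list(TEMPLATES), n_intents)
    question = " ".join(
        TEMPLATES[i].format(athlete=athlete, event=event) for i in intents)
    return question, intents

random.seed(0)
q, labels = generate(2, "韩雨桐", "短道速滑")
print(q, labels)
```

Drawing `n_intents` from {1, 2, 3} over the whole slot inventory yields the single-, two-, and three-intent splits described in the contributions.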
Question | Label |
---|---|
给我看看从西雅图到明尼阿波利斯的票价? | 机票价格 |
(Show me a fare from Seattle to Minneapolis?) | (Airfares) |
加拿大国际航空公司为哪些城市提供服务, 以及所有可用的餐点是什么? | 城市/餐点 |
(What cities does Air Canada International serve and what are all the available meals?) | (City/Meal) |
告诉我华盛顿特区附近的机场, 从波士顿机场到波士顿市中心的地面交通是什么,以及1765年大陆航空公司从波士顿到旧金山有多少站? | 机场/地面服务/航班数量 |
(Tell me about the airports near Washington DC, what is the ground transportation from Boston Airport to downtown Boston, and how many stops does Continental flight 1765 from Boston to San Francisco make?) | (Airports/ground services/number of flights)
Parameter | Value
---|---
kernel sizes | 3, 4, 5
number of kernels | 100
attention heads | 12
hidden size | 768
word vector size | 768
activation function | ReLU
learning rate | 1 × 10⁻⁵
optimizer | Adam
dropout | 0.5
batch size | 21
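The parameter table translates into a training setup along the following lines. This is a hedged sketch: a stand-in linear layer replaces the full model so the snippet runs on its own, and binary cross-entropy with per-label sigmoids is the standard multi-label loss, assumed here since the paper's table does not name the loss function.

```python
import torch
import torch.nn as nn

# Stand-in for the full model: dropout 0.5 and a 768 -> 5-label head.
model = nn.Sequential(nn.Dropout(0.5), nn.Linear(768, 5))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # Adam, lr 1e-5
criterion = nn.BCEWithLogitsLoss()  # multi-label: one sigmoid per intent

features = torch.randn(21, 768)  # batch size 21, hidden size 768
targets = torch.zeros(21, 5)
targets[:, 0] = 1.0              # toy single-intent labels

# One optimization step.
optimizer.zero_grad()
loss = criterion(model(features), targets)
loss.backward()
optimizer.step()
```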
Real Category | Predicted Positive | Predicted Negative
---|---|---
Positive | TP | FN
Negative | FP | TN
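In the multi-label setting, the TP/FP/FN counts above are accumulated over every (question, label) cell, and precision, recall, and F1 follow from them. The sketch below uses micro-averaging, which is one common convention; whether the paper micro- or sample-averages is an assumption here.

```python
def micro_prf(y_true, y_pred):
    """Micro-averaged precision/recall/F1 for multi-label predictions,
    built from the TP/FP/FN cells of the confusion matrix above.
    Inputs are lists of 0/1 label vectors, one vector per question."""
    tp = fp = fn = 0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            tp += t and p                # predicted and present
            fp += (not t) and p          # predicted but absent
            fn += t and (not p)          # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Two questions, three intent labels each (1 = intent present).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(micro_prf(y_true, y_pred))  # (1.0, 0.666..., 0.8)
```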
Dataset | Model | P % | R % | F1 %
---|---|---|---|---
Winter Olympics Problem | Bert | 98.40 | 94.44 | 95.51
Winter Olympics Problem | Bert-textcnn | 97.87 | 95.68 | 96.47
Winter Olympics Problem | Bert-blmatt | 97.74 | 96.03 | 96.65
Winter Olympics Problem | Bert-cnn+blatt | 98.18 | 95.30 | 96.22
Winter Olympics Problem | BCNBLMATT | 99.60 | 96.96 | 98.12
MixATIS | Bert | 97.62 | 94.74 | 95.78
MixATIS | Bert-textcnn | 97.81 | 96.13 | 96.78
MixATIS | Bert-blmatt | 97.22 | 96.90 | 96.89
MixATIS | Bert-cnn+blatt | 97.61 | 95.82 | 96.51
MixATIS | BCNBLMATT | 97.70 | 97.04 | 97.20
Liu, P.; Liang, Q.; Cai, Z. Research on the Identification of a Winter Olympic Multi-Intent Chinese Problem Based on Multi-Model Fusion. Appl. Sci. 2023, 13, 11048. https://doi.org/10.3390/app131911048