Refined Answer Selection Method with Attentive Bidirectional Long Short-Term Memory Network and Self-Attention Mechanism for Intelligent Medical Service Robot
Abstract
1. Introduction
- (1) A refined answer selection method is proposed for a medical service robot to provide an intelligent question-answering service. The required knowledge-based text is constructed as background information for matching questions and answers.
- (2) Self-attention is adopted to extract global features of the input data before the data are passed to the recurrent (circulating) layer, which addresses the learning of long-distance dependencies. An attentive Bi-LSTM network is designed to measure the similarity of a QA pair more precisely by taking the background knowledge into account.
- (3) A knowledge-based QA dataset is built to verify the effectiveness of the proposed approach. On the answer selection task, the proposed method achieves an accuracy of 71.4%, an MAP of 68.8%, and a BLEU score of 3.10.
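The role of self-attention in contribution (2) can be illustrated with a minimal sketch. It assumes the scaled dot-product form of Vaswani et al. with identity query, key, and value projections, which may differ from the exact formulation used in this work; the function names and the toy input are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x is a list of token vectors (lists of floats of equal length d).
    Every output vector is a weighted average of ALL input vectors,
    so each position sees the whole sequence regardless of distance.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this token to every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


# Hypothetical 3-token input with 2-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(tokens)
print(contextual)
```

In a full model the resulting vectors would be fed to the Bi-LSTM layer; learned projection matrices are omitted here to keep the sketch short.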
2. Related Work
2.1. Natural Language Processing (NLP)
2.2. Bidirectional Long Short-Term Memory (Bi-LSTM) Network
2.3. Attention Mechanism
3. Materials and Methods
3.1. Word Embedding
3.2. Self-Attention Based Feature Extraction
3.3. Attentive Bi-LSTM Circulating
3.4. Similarity Calculation
4. Experiments and Results
4.1. Experimental Setup
4.1.1. Dataset Building
4.1.2. Data Pre-Processing
4.1.3. Implementation Details
4.2. Evaluation Metric
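This work reports accuracy and MAP for answer selection. The following is a minimal sketch of how such ranking metrics are commonly computed, assuming that the candidates of each question are already sorted by model score and labelled 1 (right) or 0 (false). The function names and the toy labels are illustrative, and the exact definitions used in the experiments may differ.

```python
def average_precision(labels):
    """Average precision for one question.

    labels: 0/1 relevance of the candidates, best-scored first.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(ranked):
    """Mean of the per-question average precision values."""
    return sum(average_precision(labels) for labels in ranked) / len(ranked)


def top1_accuracy(ranked):
    """Fraction of questions whose top-ranked candidate is right."""
    return sum(1 for labels in ranked if labels and labels[0] == 1) / len(ranked)


# Hypothetical rankings for three questions.
ranked = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]
print(mean_average_precision(ranked), top1_accuracy(ranked))
```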
4.3. Baseline Methods
- Bi-LSTM: A Bi-LSTM neural network is employed to extract features from the QA pairs. The model is one of the baselines on the knowledge-based dataset.
- Double Bi-LSTM: This model uses a double Bi-LSTM to learn the features of the given questions and the candidate answers. The model is one of the baselines on the knowledge-based dataset.
- Attentive Bi-LSTM: This model integrates an attention mechanism into the Bi-LSTM model to improve the semantic understanding of questions, and uses cosine similarity to score QA pairs. The model is one of the baselines on the knowledge-based dataset.
- Multi-scale CNN [33]: This model constructs feature maps of different sizes to extract semantic information from the text. The model is one of the baselines for medical QA.
- QA-CLWR [34]: This method proposes a collaborative learning-based answer selection model (QA-CL), in which a parallel training architecture uses a CNN and a bidirectional LSTM (Bi-LSTM) at the same time to collaboratively learn the initial word vector matrix of the sentence.
- Attentive LSTM [35]: This method combines attention and LSTM to extract contextual representations of the question and the answer separately. It serves as a baseline for accuracy.
- Multi-Level Composite CNNs (MCCNN) [36]: This method stacks CNNs of different sizes to capture multi-level semantic information. The output of each CNN layer is generated by max pooling, and the pooled outputs are concatenated for prediction. It serves as a baseline for accuracy.
- BiCNN [37]: This method applies average pooling with a bigram CNN model. It serves as a baseline for MAP.
- ABCNN [38]: The authors propose an attention-based CNN. It serves as a baseline for MAP.
- Reference [39]: The authors use LSTM-based deep learning models for non-factoid answer selection. It serves as a baseline for MAP.
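The Attentive Bi-LSTM baseline scores QA pairs with cosine similarity. A minimal sketch of ranking candidate answers this way is given below. It assumes the question and each candidate have already been encoded as fixed-length vectors; the vectors and function names are hypothetical, and the encoder itself is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_answers(question_vec, candidate_vecs):
    """Return candidate indices sorted from most to least similar to the question."""
    scored = [(cosine(question_vec, vec), idx) for idx, vec in enumerate(candidate_vecs)]
    return [idx for _, idx in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical encoded question and three encoded candidate answers.
question = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
print(rank_answers(question, candidates))
```

The candidate at the head of the returned list would be selected as the answer.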
4.4. Results and Analysis
4.4.1. Experimental Results with Different Parameter Settings
4.4.2. Experimental Results of a Knowledge-Based QA Dataset
4.4.3. Experimental Results of Different Datasets
4.5. Ablation Analysis
4.6. Analysis of Background Knowledge Information
4.7. Error Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Ahmed, S.T.; Kumar, V.; Kim, J. AITel: eHealth Augmented Intelligence based Telemedicine Resource Recommendation Framework for IoT devices in Smart cities. IEEE Internet Things J. 2023, 1.
- Ahmed, S.T.; Thouheed, S.; Kumar, V. 6G enabled federated learning for secure IoMT resource recommendation and propagation analysis. Comput. Electr. Eng. 2022, 102, 108210.
- Zhang, L.; Li, F.; Wang, P. A blockchain-assisted massive IoT data collection intelligent framework. IEEE Internet Things J. 2021, 9, 14708–14722.
- Li, F.; Liu, K.; Zhang, L. EHRChain: A Blockchain-based EHR System Using Attribute-Based and Homomorphic Cryptosystem. IEEE Trans. Serv. Comput. 2021, 15, 2755–2765.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. In Proceedings of the 28th Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada, 8–13 December 2014; Volume 27, pp. 3104–3112.
- Hou, J.Z.; Zhang, S.Y.; Yu, K. Algorithm of answer sentence selection based on Q and A interaction. Comput. Mod. 2021, 1, 120–126.
- Maulud, D.H.; Ameen, S.Y.; Omar, N. Review on natural language processing based on different techniques. Asian J. Res. Comput. Sci. 2021, 10, 1–17.
- Bengio, Y.; Ducharme, R.; Vincent, P. A neural probabilistic language model. J. Mach. Learn. Res. 2003, 3, 1137–1155.
- Collobert, R.; Weston, J.; Bottou, L. Natural language processing from scratch. J. Mach. Learn. Res. 2011, 12, 2493–2537.
- Rush, A.M.; Chopra, S. A neural attention model for abstractive sentence summarization. arXiv 2015, arXiv:1509.00685. Available online: https://arxiv.org/pdf/1509.00685.pdf (accessed on 14 March 2022).
- Xu, K.; Ba, J.; Kiros, R. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 2048–2057.
- Wang, Y.; Huang, M.; Zhao, L. Attention-based LSTM for aspect-level sentiment classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 606–615.
- Liu, Z.; Lin, W.; Shi, Y. A Robustly Optimized BERT Pre-training Approach with Post-training. In Proceedings of the China National Conference on Chinese Computational Linguistics, Hohhot, China, 12–15 August 2021; pp. 471–484.
- Özçift, A.; Akarsu, K.; Yumuk, F. Advancing natural language processing (NLP) applications of morphologically rich languages with bidirectional encoder representations from transformers (BERT): An empirical case study for Turkish. Autom. Časopis Autom. Mjer. Elektron. Računarstvo Komun. 2021, 62, 226–238.
- Neutel, S.; de Boer, M.H.T. Towards Automatic Ontology Alignment using BERT. In Proceedings of the AAAI Spring Symposium: Combining Machine Learning with Knowledge Engineering, Palo Alto, CA, USA, 22–24 March 2021.
- Grail, Q.; Perez, J.; Gaussier, E. Globalizing BERT-based transformer architectures for long document summarization. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, Kiev, Ukraine, 21–23 April 2021; pp. 1792–1810.
- Liu, D.L.; Niu, Z.D. Multi-Scale Deformable CNN for Answer Selection. IEEE Access 2019, 7, 164986–164995.
- Liu, Y.H.; Yang, B. Bi-LSTM-based Natural Language Q&A for Marriage Law. Comput. Eng. Des. 2019, 1000–7024.
- Hanifah, A.F.; Kusumaningrum, R. Non-Factoid Answer Selection in Indonesian Science Question Answering System using Long Short-Term Memory (LSTM). Procedia Comput. Sci. 2021, 179, 736–746.
- Wakchaure, M.; Kulkarni, P. A Scheme of Answer Selection in Community Question Answering Using Machine Learning Techniques. In Proceedings of the International Conference on Intelligent Computing and Control Systems (ICCS), Madurai, India, 15–17 May 2019; pp. 879–883.
- Vaswani, A.; Shazeer, N.; Parmar, N. Attention Is All You Need. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Liu, Y.; Bao, Z.; Zhang, Z. Information cascades prediction with attention neural network. Hum.-Cent. Comput. Inf. Sci. 2020, 10, 13.
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
- Bahdanau, D.; Cho, K.H.; Bengio, Y. Neural Machine Translation by Jointly Learning to Align and Translate. In Proceedings of the International Conference on Learning Representation, San Diego, CA, USA, 7–9 May 2015; Volume 1049, p. 0473. Available online: https://arxiv.org/pdf/1409.0473.pdf (accessed on 20 February 2023).
- Yu, S.; Wang, Y.B. NAIRS: A Neural Attentive Interpretable Recommendation System. In Proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM), Melbourne, Australia, 11–15 February 2019; pp. 786–789.
- Cu, J.Y. A Study of Answer Selection Ranking Based on Attention Mechanism; Northern Polytechnic University: Grande Prairie, AB, Canada, 2020.
- Xu, D.; Ji, J.H.; Huang, H.K. Gated Group Self-attention for Answer Selection. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2019, 1905, 10720.
- Chen, X.; Yang, Z.; Liang, N. Co-attention fusion based deep neural network for Chinese medical answer selection. Appl. Intell. 2021, 51, 6633–6646.
- Mikolov, T.; Sutskever, I.; Chen, K. Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; Volume 2, pp. 3111–3119.
- Srivastava, N.; Hinton, G. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Papineni, K.; Roukos, S. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7–12 July 2002; pp. 311–318.
- Zhang, S.; Zhang, X.; Wang, H.; Cheng, J.; Li, P.; Ding, Z. Chinese medical question answer matching using end-to-end character-level multiscale CNNs. Appl. Sci. 2017, 7, 767.
- Shao, T.H.; Kui, X.Y.; Zhang, P.E.; Chen, H.H. Collaborative Learning for Answer Selection in Question Answering. IEEE Access 2019, 7, 7337–7347.
- Tan, M.; dos Santos, C.N.; Xiang, B. Improved representation learning for question answer matching. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; pp. 7–12.
- Ye, D.; Zhang, S.; Wang, H. Multi-level composite neural networks for medical question answer matching. In Proceedings of the Third IEEE International Conference on Data Science in Cyberspace, Guangzhou, China, 18–21 June 2018; pp. 18–21.
- Yang, Y.; Yih, W.T.; Meek, C. WikiQA: A challenge dataset for open domain question answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal, 17–21 September 2015; pp. 2013–2018.
- Yin, W.; Schütze, H.; Xiang, B.; Zhou, B. ABCNN: Attention-based convolutional neural network for modeling sentence pairs. Trans. Assoc. Comput. Linguist. 2016, 4, 259–272.
- Tan, M.; Xiang, B.; Zhou, B. LSTM-based deep learning models for non-factoid answer selection. In Proceedings of the International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016; Available online: https://arxiv.org/pdf/1511.04108.pdf (accessed on 20 February 2023).

| Question | Candidate Answer | Label |
|---|---|---|
| What is the main reason why fog tends to form on clear nights from late fall to early spring of the following year? | Atmospheric inverse radiation is weak on clear nights, and cooling near the ground is rapid. | Right |
| | The amount of water vapor in the atmosphere is high on clear days. | False |
| | Evaporation of water vapor from the ground is strong on clear nights. | False |
| | There is less condensed nuclei material in the clear atmosphere. | False |

| Question | Candidate Answer | Label |
|---|---|---|
| After touching the small animals at the zoo yesterday, I had itchy skin and an allergic reaction. What should I do? | Hello. Your situation is a manifestation of a skin allergy. The most common treatment is anti-allergy treatment, such as taking loratadine, vitamin C, and ketotifen. You also need to drink more water and do not eat spicy food. | Right |
| | Measles and chickenpox are two different diseases. For example, if measles is prevalent in the area, vaccination is recommended. | False |

| Knowledge |
|---|
| The original Earth’s atmosphere was composed mainly of carbon dioxide, carbon monoxide, methane, and ammonia, and lacked oxygen, which was not suitable for biological survival. After a long evolutionary process, the Earth’s atmosphere was transformed into an atmosphere suitable for biological respiration, with nitrogen and oxygen as the main components. Oxygen at high altitudes in the Earth’s atmosphere is synthesized into ozone under the sun’s ultraviolet light, forming the ozone layer. |

| Stop Words |
|---|
| ! # + & a] B W R 1 3 5 9 一些 下 |
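Stop-word removal with the sample entries listed in the stop-word table can be sketched as follows. The sketch assumes the text has already been segmented into tokens; the segmentation step, the function name, and the example tokens are hypothetical, and the full stop-word list is not reproduced here.

```python
# Sample stop words copied from the stop-word table; the full list is not given.
STOP_WORDS = {"!", "#", "+", "&", "a]", "B", "W", "R", "1", "3", "5", "9", "一些", "下"}


def remove_stop_words(tokens):
    """Drop tokens found in the stop-word set, keeping the original order."""
    return [token for token in tokens if token not in STOP_WORDS]


# Hypothetical, already-segmented input ("some", "skin", "allergy", "!").
tokens = ["一些", "皮肤", "过敏", "!"]
print(remove_stop_words(tokens))
```

A set is used for the lookup so that the cost per token does not grow with the size of the stop-word list.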
| Row | Learning Rate | Hidden Size | Accuracy | BLEU |
|---|---|---|---|---|
| A | 0.7 | 90 | 63.6% | 5.70 |
| B | 0.6 | 110 | 69.4% | 4.80 |
| C | 0.3 | 120 | 70.6% | 3.00 |
| D | 0.3 | 130 | 71.4% | 3.10 |

| Row | Method | Accuracy | BLEU |
|---|---|---|---|
| A | Bi-LSTM | 59.6% | 5.35 |
| B | Double Bi-LSTM | 57.4% | 5.60 |
| C | Attentive Bi-LSTM | 62.7% | 3.22 |
| D | Multi-scale CNN | 64.7% | -- |
| E | Ours | 71.4% | 3.10 |

| Row | Method | Accuracy | BLEU |
|---|---|---|---|
| A | Ours without Bi-LSTM | 59.6% | 5.35 |
| B | Ours without attention | 61.6% | 4.20 |
| C | Ours without self-attention | 62.7% | 3.22 |
| D | Ours | 71.4% | 3.10 |
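The accuracy figures in the ablation table can be turned into per-component drops with a few lines of arithmetic. The sketch below only restates the reported numbers; the variable names are illustrative.

```python
# Accuracy (in percent) copied from the ablation table.
full_accuracy = 71.4  # "Ours"
ablated = {
    "Bi-LSTM": 59.6,          # "Ours without Bi-LSTM"
    "attention": 61.6,        # "Ours without attention"
    "self-attention": 62.7,   # "Ours without self-attention"
}

# Drop in percentage points when each component is removed.
drops = {name: round(full_accuracy - acc, 1) for name, acc in ablated.items()}

# List the components from largest to smallest drop.
for name, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"without {name}: -{drop} points")
```

Removing the Bi-LSTM costs 11.8 points, removing attention costs 9.8 points, and removing self-attention costs 8.7 points of accuracy.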
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, D.; Liang, Y.; Ma, H.; Xu, F. Refined Answer Selection Method with Attentive Bidirectional Long Short-Term Memory Network and Self-Attention Mechanism for Intelligent Medical Service Robot. Appl. Sci. 2023, 13, 3016. https://doi.org/10.3390/app13053016