Parallel Fusion of Graph and Text with Semantic Enhancement for Commonsense Question Answering
Abstract
1. Introduction
- We propose PRS-QA, a parallel fusion model based on self-attention and multi-head cross-attention that enables the question context to guide reasoning in the graph neural network layers.
- We design a semantic enhancement module that uses self-attention to highlight the key nodes in the graph and employs additive attention and pooling layers to obtain a vector representation that integrates graph structural information with textual semantic information.
- We concatenate the difference between the tail-node and head-node representations with the previous relation representation to form a new edge vector; the final edge vector is then determined by the relevance of this new edge vector to the question context (see the sketch after this list).
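As a concrete illustration of the last contribution, the following PyTorch sketch shows one plausible wiring of the edge re-embedding step. It is not the authors' released code: the module name, the layer choices, and the sigmoid gate used to combine the new edge vector with the original relation embedding are assumptions made for illustration only.

```python
import torch
import torch.nn as nn


class EdgeVectorBuilder(nn.Module):
    """Illustrative edge re-embedding: concatenate (tail - head) with the previous
    relation embedding, then weight the result by its relevance to the question
    context. The gating scheme below is an assumption, not the paper's exact rule."""

    def __init__(self, node_dim: int, rel_dim: int, ctx_dim: int):
        super().__init__()
        # projects [tail - head ; relation] back into the relation space
        self.proj = nn.Linear(node_dim + rel_dim, rel_dim)
        # scores the new edge vector against the question-context vector
        self.relevance = nn.Bilinear(rel_dim, ctx_dim, 1)

    def forward(self, head: torch.Tensor, tail: torch.Tensor,
                rel: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # head, tail: (E, node_dim); rel: (E, rel_dim); ctx: (ctx_dim,)
        new_edge = self.proj(torch.cat([tail - head, rel], dim=-1))
        ctx = ctx.expand(new_edge.size(0), -1)                # broadcast context to every edge
        gate = torch.sigmoid(self.relevance(new_edge, ctx))   # relevance score in (0, 1)
        # final edge vector: relevance-weighted blend of new and original relation embeddings
        return gate * new_edge + (1.0 - gate) * rel


# toy usage with hypothetical dimensions (200-d nodes/relations, 1024-d LM context)
builder = EdgeVectorBuilder(node_dim=200, rel_dim=200, ctx_dim=1024)
head, tail = torch.randn(8, 200), torch.randn(8, 200)
rel, ctx = torch.randn(8, 200), torch.randn(1024)
print(builder(head, tail, rel, ctx).shape)  # torch.Size([8, 200])
```

In the full model the resulting edge vectors would feed the graph attention layers; the sketch only fixes the tensor shapes involved.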
2. Related Work
3. Preliminaries
3.1. Knowledge Graph
3.2. Graph Pooling Layer
3.3. Graph Attention Networks
3.4. Task Definition
4. Methodology
4.1. Overview
4.2. Question Context Embedding Module
4.3. Relation Embedding Optimization and Parallel Fusion Module
4.3.1. Optimizing Relation Embedding Module
4.3.2. Parallel Fusion Module
4.4. Semantic Enhancement Module
4.5. Answer Prediction Module
5. Experimental Setup
5.1. Datasets
5.2. Implementation Details
5.3. Comparison Methods
6. Results and Analysis
6.1. Main Results
6.2. Ablation Studies
- With Parallel Fusion Module: this variant uses only the parallel fusion strategy and measures the accuracy improvement over the baseline model QA-GNN.
- With Parallel Fusion Module and Optimizing Relation Embedding Module: this variant adds the optimizing relation embedding module on top of the fusion module to evaluate the changes in accuracy and standard deviation.
- With Parallel Fusion Module and Semantic Enhancement Module: this variant adds the semantic enhancement module to the fusion module to assess the impact of fine-grained semantics on answer accuracy.
6.3. Impact of the Number of PRS-QA Layers
6.4. Impact of the Number of Attention Heads Involved in Graph Pooling
6.5. Analysis of Model Generalizability
6.6. Model Interpretability Analysis
6.7. Error Analysis
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Khashabi, D.; Min, S.; Khot, T.; Sabharwal, A.; Tafjord, O.; Clark, P.; Hajishirzi, H. UNIFIEDQA: Crossing Format Boundaries with a Single QA System. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16–20 November 2020; pp. 1896–1907.
- Liu, Y. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67.
- Speer, R.; Chin, J.; Havasi, C. ConceptNet 5.5: An open multilingual graph of general knowledge. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
- Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; Van Den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. In Proceedings of The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, 3–7 June 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 593–607.
- Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Wang, Y.; Zhang, H.; Liang, J.; Li, R. Dynamic heterogeneous-graph reasoning with language models and knowledge representation learning for commonsense question answering. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 14048–14063.
- Ye, Q.; Cao, B.; Chen, N.; Xu, W.; Zou, Y. FiTs: Fine-grained two-stage training for knowledge-aware question answering. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 13914–13922.
- Yang, Z.; Zhang, Y.; Cao, P.; Liu, C.; Chen, J.; Zhao, J.; Liu, K. Information bottleneck based knowledge selection for commonsense reasoning. Inf. Sci. 2024, 660, 120134.
- Lin, B.Y.; Chen, X.; Chen, J.; Ren, X. KagNet: Knowledge-aware graph networks for commonsense reasoning. arXiv 2019, arXiv:1909.02151.
- Feng, Y.; Chen, X.; Lin, B.Y.; Wang, P.; Yan, J.; Ren, X. Scalable multi-hop relational reasoning for knowledge-aware question answering. arXiv 2020, arXiv:2005.00646.
- Yasunaga, M.; Ren, H.; Bosselut, A.; Liang, P.; Leskovec, J. QA-GNN: Reasoning with language models and knowledge graphs for question answering. arXiv 2021, arXiv:2104.06378.
- Zhang, X.; Bosselut, A.; Yasunaga, M.; Ren, H.; Liang, P.; Manning, C.D.; Leskovec, J. GreaseLM: Graph reasoning enhanced language models for question answering. arXiv 2022, arXiv:2201.08860.
- Sun, Y.; Shi, Q.; Qi, L.; Zhang, Y. JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA, 10–15 July 2022; pp. 5049–5060.
- Zhang, M.; He, T.; Dong, M. Meta-path reasoning of knowledge graph for commonsense question answering. Front. Comput. Sci. 2024, 18, 181303.
- Zheng, C.; Kordjamshidi, P. Dynamic Relevance Graph Network for Knowledge-Aware Question Answering. In Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea, 12–17 October 2022.
- Chen, Y.; Dai, X.; Chen, D.; Liu, M.; Dong, X.; Yuan, L.; Liu, Z. Mobile-Former: Bridging MobileNet and Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5270–5279.
- Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. Adv. Neural Inf. Process. Syst. 2013, 26, 2787–2795.
- Zhang, M.; Dai, R.; Dong, M.; He, T. DRLK: Dynamic hierarchical reasoning with language model and knowledge graph for question answering. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 5123–5133.
- Zhang, Q.; Chen, S.; Fang, M.; Chen, X. Joint reasoning with knowledge subgraphs for Multiple Choice Question Answering. Inf. Process. Manag. 2023, 60, 103297.
- Sha, Y.; Feng, Y.; He, M.; Liu, S.; Ji, Y. Retrieval-Augmented Knowledge Graph Reasoning for Commonsense Question Answering. Mathematics 2023, 11, 3269.
- Zheng, J.; Ma, Q.; Qiu, S.; Wu, Y.; Ma, P.; Liu, J.; Feng, H.; Shang, X.; Chen, H. Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 9155–9173.
- Lee, J.; Lee, I.; Kang, J. Self-attention graph pooling. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 3734–3743.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
- Talmor, A.; Herzig, J.; Lourie, N.; Berant, J. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4149–4158.
- Mihaylov, T.; Clark, P.; Khot, T.; Sabharwal, A. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018; pp. 2381–2391.
- Jin, D.; Pan, E.; Oufattole, N.; Weng, W.H.; Fang, H.; Szolovits, P. What disease does this patient have? A large-scale open domain question answering dataset from medical exams. Appl. Sci. 2021, 11, 6421.
- Liu, L.; Jiang, H.; He, P.; Chen, W.; Liu, X.; Gao, J.; Han, J. On the variance of the adaptive learning rate and beyond. In Proceedings of the 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020.
- Clark, P.; Etzioni, O.; Khot, T.; Khashabi, D.; Mishra, B.; Richardson, K.; Sabharwal, A.; Schoenick, C.; Tafjord, O.; Tandon, N.; et al. From ‘F’ to ‘A’ on the NY Regents Science Exams: An overview of the Aristo project. AI Mag. 2020, 41, 39–53.
- Wang, X.; Kapanipathi, P.; Musa, R.; Yu, M.; Talamadupula, K.; Abdelaziz, I.; Chang, M.; Fokoue, A.; Makni, B.; Mattei, N.; et al. Improving natural language inference using external knowledge in the science questions domain. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 7208–7215.
- Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C.H.; Kang, J. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020, 36, 1234–1240.
- Liu, F.; Shareghi, E.; Meng, Z.; Basaldella, M.; Collier, N. Self-Alignment Pretraining for Biomedical Entity Representations. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 4228–4238.
Dataset | Training Set | Validation Set | Test Set | Answer Options
---|---|---|---|---
CommonsenseQA (IH) | 8500 | 1221 | 1241 | 5
OpenBookQA | 4957 | 500 | 500 | 4
MedQA-USMLE | 10,178 | 1272 | 1273 | 4
Hyperparameter | CommonsenseQA | OpenBookQA | MedQA-USMLE
---|---|---|---
Number of PRS-QA layers | 5 | 5 | 5
Number of attention heads in GAT | 2 | 2 | 2
Embedding dimension of entity nodes | 200 | 200 | 200
Node retention ratio in graph pooling | 0.92 | 0.94 | 0.94
Dropout rate | 0.2 | 0.2 | 0.2
Learning rate of parameters in LM | | |
Learning rate of parameters not in LM | | |
Batch size | 64 | 128 | 128
Number of epochs | 30 | 150 | 30
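For readability, the shared settings above could be gathered into a small per-dataset configuration, as in the hypothetical sketch below. The dataclass and field names are illustrative, not taken from the released code, and the learning rates are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class PRSQAConfig:
    """Hypothetical container mirroring the per-dataset hyperparameters listed above."""
    num_layers: int = 5            # number of PRS-QA layers
    gat_heads: int = 2             # attention heads in GAT
    node_dim: int = 200            # embedding dimension of entity nodes
    pool_keep_ratio: float = 0.92  # node retention ratio in graph pooling
    dropout: float = 0.2
    batch_size: int = 64
    epochs: int = 30


CONFIGS = {
    "CommonsenseQA": PRSQAConfig(),
    "OpenBookQA": PRSQAConfig(pool_keep_ratio=0.94, batch_size=128, epochs=150),
    "MedQA-USMLE": PRSQAConfig(pool_keep_ratio=0.94, batch_size=128, epochs=30),
}
```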
Method | IHdev-Acc. (%) | IHtest-Acc. (%) |
---|---|---
RoBERTa-large (w/o KG) | 73.07 (±0.45) | 68.69 (±0.56) |
+RGCN | 72.69 (±0.19) | 68.41 (±0.66) |
+GconAttn | 71.61 (±0.39) | 68.59 (±0.96) |
+KagNet | 73.47 (±0.22) | 69.01 (±0.76) |
+RN | 74.57 (±0.91) | 69.08 (±0.21) |
+MHGRN | 74.45 (±0.10) | 71.11 (±0.81) |
+QA-GNN | 76.54 (±0.21) | 73.41 (±0.92) |
+MRGNN | 75.3 (±0.7) | 73.6 (±0.5) |
+GreaseLM | 78.5 (±0.5) | 74.2 (±0.4) |
+JointLK | 77.88 (±0.25) | 74.43 (±0.83) |
+DRGN | 78.20 | 74.00 |
+RAKG | 76.74 | 73.51 |
+MKSQA | - | 74.53 (±0.52) |
+CET+SAFE | - | 74.54 |
+PRS-QA | 78.52 (±0.43) | 74.68 (±0.16) |
Method | RoBERTa-Large (Acc. %) | AristoRoBERTa (Acc. %)
---|---|---
Fine-tuned LMs(w/o KG) | 64.80 (±2.37) | 78.40 (±1.64) |
+RGCN | 62.45 (±1.57) | 74.60 (±2.53) |
+GconAttn | 64.75 (±1.48) | 71.80 (±1.21) |
+RN | 65.20 (±1.18) | 75.35 (±1.39) |
+MHGRN | 66.85 (±1.19) | 80.6 |
+QA-GNN | 67.80 (±2.75) | 82.77 (±1.56) |
+MRGNN | 69.5 (±1.7) | 83.6 (±0.9) |
+DRGN | 70.10 | 81.80 |
+DRLK | 70.20 | - |
+PRS-QA | 70.36 (±1.03) | 83.4 (±0.64) |
Model | Time
---|---
G is a dense graph |
L-hop MHGRN |
L-layer QA-GNN |
L-layer GreaseLM |
L-layer PRS-QA |
G is a sparse graph with maximum node degree |
L-hop MHGRN |
L-layer QA-GNN |
L-layer GreaseLM |
L-layer PRS-QA |
Method | IHdev-Acc. (%) |
---|---|
w/ Parallel Fusion Module | 77.93 |
w/ Parallel Fusion Module and Optimizing Relation Embedding Module | 78.17 |
w/ Parallel Fusion Module and Semantic Enhancement Module | 78.24 |
PRS-QA | 78.52 |
Method | Dev Acc. (%) | Test Acc. (%)
---|---|---
BioBERT-Large | 36.1 | 36.7
SapBERT-Base | - | 37.2 |
+QA-GNN | - | 38.0 |
+GreaseLM | 38.3 | 38.5 |
+DRLK | 39.1 | 40.4 |
+PRS-QA | 39.23 | 40.46 |