Fake News Detection Based on Knowledge-Guided Semantic Analysis
Abstract
1. Introduction
- We propose a dual-branch neural network with heterogeneous architectures for fake news detection based on knowledge-guided semantic analysis, which compares the news text against external knowledge to expose fake news efficiently.
- To bridge the text and external knowledge, we extract triplets and develop a fuzzy-matching-based triplet alignment technique that handles cases where equivalent elements appear in different forms.
- To capture inconsistencies between the news content and external knowledge, we develop a triplet aggregation module that produces a document-level knowledge representation to guide text semantic analysis. In addition, we use rationality with respect to general knowledge as a complementary clue, which is measured by a triplet scoring module in the knowledge embedding space.
- Finally, to leverage the complementarity between text semantics and external knowledge, we construct a text and knowledge interaction module with learnable weights that produces the final detection result from the rationality scores of the different branches.
2. Related Work
2.1. Text-Content-Based Detection Methods
2.1.1. Text-Feature-Based Detection Methods
2.1.2. Graph-Modeling-Based Detection Methods
2.2. Social-Network-Based Detection Methods
2.2.1. User-Stance-Analysis-Based Detection Methods
2.2.2. Propagation-Analysis-Based Detection Methods
2.3. External-Knowledge-Based Detection Methods
2.4. Comparison between Different Types of Detection Methods
3. Fake News Detection Framework Based on Knowledge-Guided Semantic Analysis
3.1. The Preprocessing Operations of Extracting Triplets
3.2. Triplet Alignment Module
3.2.1. Construction of the Knowledge Graph
3.2.2. A Fuzzy-Matching-Based Triplet Alignment Method
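As a minimal sketch of how fuzzy matching can align an extracted triplet with knowledge-graph elements that appear in different surface forms, the following Python example uses the standard-library `difflib.SequenceMatcher`. The function names, the normalization step, and the 0.8 threshold are illustrative assumptions, not the configuration used in this work.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, turn underscores into spaces, and collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def similarity(a, b):
    """Character-level similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def align(element, kg_vocabulary, threshold=0.8):
    """Return the closest knowledge-graph element, or None if nothing is close enough.

    `threshold` is a hypothetical cut-off chosen only for illustration.
    """
    best = max(kg_vocabulary, key=lambda cand: similarity(element, cand), default=None)
    if best is None or similarity(element, best) < threshold:
        return None
    return best


def align_triplet(triplet, entities, relations, threshold=0.8):
    """Align (head, relation, tail); return None if any element cannot be aligned."""
    head, rel, tail = triplet
    aligned = (
        align(head, entities, threshold),
        align(rel, relations, threshold),
        align(tail, entities, threshold),
    )
    return aligned if all(part is not None for part in aligned) else None


if __name__ == "__main__":
    # Toy vocabulary, invented for this example only.
    entities = ["united_states", "barack obama", "washington"]
    relations = ["president of", "located in"]
    print(align_triplet(("Barack  Obama", "President Of", "United States"), entities, relations))
```

Returning `None` for an unmatched triplet lets a caller skip it rather than force a weak match onto the knowledge graph.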
3.3. A Dual-Branch Network for Fake News Detection Based on Knowledge-Guided Semantic Analysis
3.3.1. Text Semantic Embedding Subnetwork
3.3.2. Knowledge Graph Semantic Analysis Subnetwork
1. The training process of general knowledge embedding: For a given knowledge graph, with its set of entities and its set of relations, a mapping from the elements of the knowledge graph to feature vectors is obtained by training a knowledge embedding model, such as TransE [28], which is used in this work. The dimension of the knowledge representation is set to k.
2. Triplet aggregation module based on a token and channel mixing mechanism: Instead of learning document-level external knowledge representations with a graph-based procedure, as existing methods do, we design a triplet aggregation module based on a token and channel mixing mechanism. It lets knowledge representations interact both within and across triplets, which yields a better document-level knowledge representation to guide the analysis of text semantics.
3. Triplet scoring module based on rationality measurement: To further enhance the detection capability based on text semantics and external knowledge, we construct a triplet scoring module based on rationality measurement. It measures irrationality with respect to general knowledge as a complementary clue by considering the relationship between entities and relations in the knowledge embedding space.
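The aggregation and scoring modules described above can be illustrated with a small pure-Python sketch. It assumes a TransE-style distance as the rationality measure and a single mixing block in the general MLP-Mixer pattern (layer normalization, a two-layer GELU MLP, and a residual connection). The weight shapes, the mean pooling, and all function names are hypothetical choices for illustration; the exact design used in this work is not specified in this outline.

```python
import math


def transe_score(h, r, t):
    """TransE-style distance ||h + r - t||_2; a smaller value means a more plausible triplet."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def gelu(x):
    """Exact GELU activation using the Gaussian error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(row, eps=1e-5):
    """Normalize one row to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mlp(x, w1, w2):
    """Two-layer MLP applied to every row of x: gelu(x @ w1) @ w2."""
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def mixer_block(x, tok_w1, tok_w2, ch_w1, ch_w2):
    """One token-mixing plus channel-mixing block.

    x has one row per triplet element (token) and one column per embedding
    dimension (channel). Token mixing works along the token axis, so it lets
    elements of different triplets interact; channel mixing works within a row.
    """
    normed = [layer_norm(row) for row in x]
    token_mixed = transpose(mlp(transpose(normed), tok_w1, tok_w2))
    y = [[a + b for a, b in zip(rx, rm)] for rx, rm in zip(x, token_mixed)]
    normed = [layer_norm(row) for row in y]
    channel_mixed = mlp(normed, ch_w1, ch_w2)
    return [[a + b for a, b in zip(ry, rm)] for ry, rm in zip(y, channel_mixed)]


def aggregate(x):
    """Mean-pool over tokens to obtain one document-level knowledge vector."""
    return [sum(col) / len(x) for col in zip(*x)]
```

With `T` tokens, `k` channels, and a hidden width `H`, the assumed weight shapes are `tok_w1: T×H`, `tok_w2: H×T`, `ch_w1: k×H`, and `ch_w2: H×k`. In practice these weights would be learned jointly with the rest of the network.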
3.4. The Interaction Module of Text Semantic and General Knowledge
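One simple way to combine rationality scores from two branches with learnable weights is a softmax-normalized weighted sum. The sketch below is only an assumption about the general idea: the names `fuse_scores` and `weight_logits`, the convention that a higher score means a more plausible document, and the 0.5 threshold are all hypothetical, and the actual interaction module may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(text_score, knowledge_score, weight_logits):
    """Convex combination of the two branch scores.

    `weight_logits` stands in for two learnable parameters; the softmax keeps
    the resulting weights positive and summing to one.
    """
    w_text, w_know = softmax(weight_logits)
    return w_text * text_score + w_know * knowledge_score


def is_fake(text_score, knowledge_score, weight_logits, threshold=0.5):
    """Flag a document as fake when its fused rationality falls below the threshold."""
    return fuse_scores(text_score, knowledge_score, weight_logits) < threshold


if __name__ == "__main__":
    # Equal logits give equal weight to both branches.
    print(fuse_scores(0.2, 0.8, [0.0, 0.0]))
    print(is_fake(0.2, 0.3, [0.0, 0.0]))
```

Parameterizing the weights through logits means gradient updates can never push a branch weight negative, which keeps the fused score within the range of the branch scores.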
3.5. The Design of the Loss Function
4. Experiments
4.1. Fake News Dataset
Evaluation Metrics
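The result tables report accuracy, recall, and macro-F1. The following sketch shows the standard definitions of these metrics for a labelled test set. Which class the reported recall refers to is not stated in this outline, so the `positive` argument is an assumption left to the caller.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (true positives, false positives, false negatives) for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of samples of the given class that were recovered."""
    tp, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if tp + fn else 0.0


def f1(y_true, y_pred, positive):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1(y_true, y_pred, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Toy labels: 1 = fake, 0 = real.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    print(accuracy(y_true, y_pred), recall(y_true, y_pred, positive=1), macro_f1(y_true, y_pred))
```

Macro averaging weights both classes equally, so it is less sensitive to class imbalance than accuracy.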
4.2. Experimental Settings
4.3. Comparison Experiment
4.4. Performance Analysis with Different Text Embedding Networks
1. TEN-GloVe: In the text embedding subnetwork, the BERT model is replaced by GloVe, which provides a representation vector for each word in the document; the average of these vectors is taken as the text semantic embedding of the document.
2. TEN-FastText: The same as TEN-GloVe, except that FastText is used to obtain the representation vector of each word in the document.
3. Proposed method: The detection model proposed in this work.
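The averaging step used by the TEN-GloVe and TEN-FastText variants can be sketched as follows. The tiny two-dimensional lookup table is invented for illustration, and skipping out-of-vocabulary words is an assumption; real GloVe or FastText vectors would be loaded from pretrained files.

```python
def mean_pool(tokens, word_vectors, dim):
    """Average the vectors of all in-vocabulary tokens into one document embedding.

    Tokens missing from `word_vectors` are skipped; a document with no known
    tokens maps to the zero vector of length `dim`.
    """
    vectors = [word_vectors[tok] for tok in tokens if tok in word_vectors]
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


if __name__ == "__main__":
    # Hypothetical toy embeddings, not real pretrained vectors.
    toy_vectors = {"fake": [1.0, 0.0], "news": [0.0, 1.0]}
    print(mean_pool(["fake", "news", "unknown"], toy_vectors, dim=2))
```

Because averaging discards word order and context, such a representation carries less information than a contextual encoder like BERT, which is consistent with the lower scores reported for these two variants.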
4.5. Performance Analysis with Different Knowledge Embedding Networks
1. KEN\tri_a: The triplet embedding aggregation module is removed from the knowledge embedding subnetwork. Specifically, the text semantic embedding and the document triplet scores obtained from the triplet scoring module are used to detect fake news.
2. KEN\tri_s: The triplet scoring module is removed. Specifically, the text semantic representation and the aggregated representation obtained by the triplet embedding aggregation module are concatenated into a detection feature that contains rich text semantics guided by external knowledge.
3. Non-KEN: The knowledge embedding subnetwork is removed completely; fake news detection relies on the text embedding subnetwork alone.
4. Proposed method: The proposed dual-branch model based on knowledge-guided semantic analysis.

1. When the triplet aggregation module is removed (KEN\tri_a), the detection model captures the knowledge carried by triplets across the whole document less effectively, which results in a performance drop.
2. When the triplet scoring module is removed (KEN\tri_s), the detection model cannot measure the rationality of the entire document, which degrades performance.
3. The Non-KEN model suffers a distinct performance drop because the knowledge embedding is ignored. This implies that text semantic embedding alone is insufficient to expose fake news precisely.
4.6. Performance Evaluation on Different Fake News Datasets
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Guo, B.; Ding, Y.; Yao, L.; Liang, Y.; Yu, Z. The future of false information detection on social media: New perspectives and trends. ACM Comput. Surv. 2020, 53, 1–36.
2. Zhao, J.; Zhao, Z.; Shi, L.; Kuang, Z.; Liu, Y. Collaborative mixture-of-experts model for multi-domain fake news detection. Electronics 2023, 12, 3440.
3. Gangireddy, S.C.R.; P, D.; Long, C.; Chakraborty, T. Unsupervised fake news detection: A graph-based approach. In Proceedings of the HT ’20: 31st ACM Conference on Hypertext and Social Media, Virtual Event, 13–15 July 2020; pp. 75–83.
4. Yuan, L.; Shen, H.; Shi, L.; Cheng, N.; Jiang, H. An explainable fake news analysis method with stance information. Electronics 2023, 12, 3367.
5. Silva, A.; Han, Y.; Luo, L.; Karunasekera, S.; Leckie, C. Propagation2Vec: Embedding partial propagation networks for explainable fake news early detection. Inf. Process. Manag. 2021, 58, 102618.
6. Hu, L.; Yang, T.; Zhang, L.; Zhong, W.; Tang, D.; Shi, C.; Duan, N.; Zhou, M. Compare to the knowledge: Graph neural fake news detection with external knowledge. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Virtual Event, 1–6 August 2021; pp. 754–763.
7. Hu, L.; Wei, S.; Zhao, Z.; Wu, B. Deep learning for fake news detection: A comprehensive survey. AI Open 2022, 3, 133–155.
8. Potthast, M.; Kiesel, J.; Reinartz, K.; Bevendorff, J.; Stein, B. A stylometric inquiry into hyperpartisan and fake news. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 231–240.
9. Kong, S.H.; Tan, L.M.; Gan, K.H.; Samsudin, N.H. Fake news detection using deep learning. In Proceedings of the IEEE 10th Symposium on Computer Applications & Industrial Electronics (ISCAIE), Penang, Malaysia, 18–19 April 2020; pp. 102–107.
10. Vaibhav, V.; Annasamy, R.M.; Hovy, E.H. Do sentence interactions matter? Leveraging sentence level representations for fake news classification. In Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing, Hong Kong, China, 4 November 2019; pp. 134–139.
11. Nguyen, V.; Sugiyama, K.; Nakov, P.; Kan, M. FANG: Leveraging social context for fake news detection using graph representation. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management, Virtual Event, 19–23 October 2020; pp. 1165–1174.
12. Jin, Z.; Cao, J.; Zhang, Y.; Luo, J. News verification by exploiting conflicting social viewpoints in microblogs. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2972–2978.
13. Oshikawa, R.; Qian, J.; Wang, W.Y. A survey on natural language processing for fake news detection. In Proceedings of the 12th Language Resources and Evaluation Conference, Palais du Pharo, France, 11–16 May 2020; pp. 6086–6093.
14. Wu, L.; Rao, Y.; Jin, H.; Nazir, A.; Sun, L. Different absorption from the same sharing: Sifted multi-task learning for fake news detection. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, Hong Kong, China, 3–7 November 2019; pp. 4643–4652.
15. Zhang, J.; Dong, B.; Yu, P.S. FakeDetector: Effective fake news detection with deep diffusive neural network. In Proceedings of the 36th IEEE International Conference on Data Engineering, Dallas, TX, USA, 20–24 April 2020; pp. 1826–1829.
16. Bian, T.; Xiao, X.; Xu, T.; Zhao, P.; Huang, W.; Rong, Y.; Huang, J. Rumor detection on social media with bi-directional graph convolutional networks. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 549–556.
17. Dou, Y.; Shu, K.; Xia, C.; Yu, P.S.; Sun, L. User preference-aware fake news detection. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, 11–15 July 2021; pp. 2051–2055.
18. Zhang, H.; Fang, Q.; Qian, S.; Xu, C. Multi-modal knowledge-aware event memory network for social media rumor detection. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 1942–1951.
19. Wu, K.; Yuan, X.; Ning, Y. Incorporating relational knowledge in explainable fake news detection. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Delhi, India, 11–14 May 2021; pp. 403–415.
20. Li, J.; Ni, S.; Kao, H. Meet the truth: Leverage objective facts and subjective views for interpretable rumor detection. In Proceedings of the Findings of the Association for Computational Linguistics, Online, 1–6 August 2021; pp. 705–715.
21. Cabot, P.L.H.; Navigli, R. REBEL: Relation extraction by end-to-end language generation. In Proceedings of the Findings of the Association for Computational Linguistics, Online, 1–6 August 2021; pp. 2370–2381.
22. Lewis, M.; Liu, Y.; Goyal, N.; Ghazvininejad, M.; Mohamed, A.; Levy, O.; Stoyanov, V.; Zettlemoyer, L. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 6–8 July 2019; pp. 7871–7880.
23. Ilievski, F.; Szekely, P.; Zhang, B. CSKG: The commonsense knowledge graph. In Proceedings of The Semantic Web: 18th International Conference, Virtual Event, 6–10 June 2021; pp. 680–696.
24. Zhang, H.; Li, Z.; Liu, S.; Huang, T.; Ni, Z.; Zhang, J.; Lv, Z. Do sentence-level sentiment interactions matter? Sentiment mixed heterogeneous network for fake news detection. IEEE Trans. Comput. Soc. Syst. 2023, 1–11.
25. Navarro, G. A guided tour to approximate string matching. ACM Comput. Surv. 2001, 33, 31–88.
26. Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
27. Turc, I.; Chang, M.; Lee, K.; Toutanova, K. Well-read students learn better: The impact of student initialization on knowledge distillation. arXiv 2019, arXiv:1908.08962.
28. Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; Yakhnenko, O. Translating embeddings for modeling multi-relational data. In Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 2787–2795.
29. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An all-MLP architecture for vision. In Proceedings of the Annual Conference on Neural Information Processing Systems, Online, 6–14 December 2021; pp. 24261–24272.
30. Hendrycks, D.; Gimpel, K. Gaussian error linear units (GELUs). arXiv 2019, arXiv:1911.03925.
31. Ba, L.J.; Kiros, J.R.; Hinton, G.E. Layer normalization. arXiv 2016, arXiv:1607.06450.
32. Rashkin, H.; Choi, E.; Jang, J.Y.; Volkova, S.; Choi, Y. Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 2931–2937.
33. Rubin, V.L.; Conroy, N.; Chen, Y.; Cornwell, S. Fake news or truth? Using satirical cues to detect potentially misleading news. In Proceedings of the Workshop on Computational Approaches to Deception Detection, Avignon, France, 23–27 April 2016; pp. 7–17.
34. Kim, Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1746–1751.
35. Rao, G.; Huang, W.; Feng, Z.; Cong, Q. LSTM with sentence representations for document-level sentiment classification. Neurocomputing 2018, 308, 49–57.
36. Wang, Y.; Qian, S.; Hu, J.; Fang, Q.; Xu, C. Fake news detection via knowledge-driven multimodal graph convolutional networks. In Proceedings of the International Conference on Multimedia Retrieval, Dublin, Ireland, 8–11 June 2020; pp. 540–547.
37. Yang, S.H.; Chen, C.C.; Huang, H.H.; Chen, H.H. Entity-aware dual co-attention network for fake news detection. arXiv 2023, arXiv:2302.03475.
38. Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
39. Joulin, A.; Grave, E.; Bojanowski, P.; Mikolov, T. Bag of tricks for efficient text classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, 3–7 April 2017; pp. 427–431.
40. Ma, J.; Gao, W.; Wong, K. Detect rumors in microblog posts using propagation structure via kernel learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 708–717.
41. Lu, Y.; Li, C. GCAN: Graph-aware co-attention networks for explainable fake news detection on social media. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 505–514.
Method | Information | Benefit | Drawback |
---|---|---|---|
Text-content-based method | Text | High computational efficiency | Struggles with synonyms and cannot explore irrationality beyond the text itself |
Social-network-based method | Text and social network data | Expected performance tends to be higher when user/propagation data are available | May be infeasible to deploy because of data privacy constraints |
External-knowledge-based method | Text and external knowledge | Expected performance tends to be higher when external knowledge is available | Requires a carefully designed fusion strategy between text and external knowledge |
Detection Method | Information | Accuracy (%) | Recall (%) | Macro-F1 (%) |
---|---|---|---|---|
CNN | Text | 78.93 | 78.42 | 79.04 |
LSTM | Text | 80.63 | 81.00 | 80.32 |
KMGCN | Text and external knowledge | 85.13 | 86.27 | 85.75 |
Dual-CAN | Text and external knowledge | 87.65 | 87.12 | 88.35 |
SMHN | Text and external knowledge | 89.12 | 89.54 | 89.29 |
Proposed method | Text and external knowledge | 94.45 | 95.61 | 94.38 |
Model | Accuracy (%) | Recall (%) | Macro-F1 (%) |
---|---|---|---|
TEN-GloVe | 81.67 | 80.78 | 81.63 |
TEN-FastText | 82.39 | 81.22 | 82.26 |
Proposed method | 94.45 | 95.61 | 94.38 |
Model | Accuracy (%) | Recall (%) | Macro-F1 (%) |
---|---|---|---|
KEN\tri_a | 91.06 | 93.11 | 91.05 |
KEN\tri_s | 92.22 | 94.78 | 91.19 |
Non-KEN | 86.28 | 88.35 | 87.26 |
Proposed method | 94.45 | 95.61 | 94.38 |
Dataset | Accuracy (%) | Recall (%) | Macro-F1 (%) |
---|---|---|---|
Twitter15 | 93.51 | 93.66 | 92.83 |
Twitter16 | 90.87 | 88.97 | 90.24 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, W.; He, P.; Zeng, Z.; Xu, X. Fake News Detection Based on Knowledge-Guided Semantic Analysis. Electronics 2024, 13, 259. https://doi.org/10.3390/electronics13020259