Research on Feature Fusion Method Based on Graph Convolutional Networks
Abstract
1. Introduction
- The design of the BGF model is based on the following theoretical foundations
- Specific advantages of the BGF model
2. Related Work
3. Methods
3.1. Graph Convolutional Networks Module
3.2. BERT Model
3.3. Linear Fusion
3.4. Multilayer Integration
3.5. Interactive Output Layer
3.6. Loss Function
4. Results
4.1. Experimental Data
- R8 dataset: A subset of the Reuters-21578 collection that is widely used to evaluate text-categorization algorithms. It contains news articles in eight categories, with each article labeled as belonging to one or more predefined categories. It is mainly used to test classification performance on multi-class tasks; some categories are similar to one another, but the small number of categories makes the dataset suitable for quickly verifying model performance.
- R52 dataset: Also a subset of the Reuters-21578 collection and widely used for evaluating text-categorization algorithms. It contains news articles in 52 categories, with each article labeled as belonging to one or more predefined categories. It is used to evaluate models on finer-grained multi-class tasks; the larger number of categories and the finer distinctions between them make it more challenging.
- MR dataset: The MR (Movie Reviews) collection is used for binary sentiment classification. Each review carries a positive or negative sentiment label, and the dataset contains 5331 positive and 5331 negative reviews. It is mainly used to test sentiment-analysis models on binary classification; because the two classes are perfectly balanced, it is well suited to evaluating sentiment classifiers.
4.2. Experimental Setting
4.2.1. Experimental Environment
4.2.2. Parameter Setting
4.2.3. Data Preprocessing
- Segmentation and stopword removal: The text is segmented into words, and common stopwords are removed;
- Construction of vocabulary list: A vocabulary list is created based on the corpus, mapping each word to a unique index;
- Text graph construction: A heterogeneous graph is constructed for the entire corpus, where nodes represent words and documents, and edges signify co-occurrence relationships between words, and between words and documents;
- Feature representation: Text is encoded using BERT to generate semantic embedding representations of text nodes. These embeddings are then integrated with the global structural features captured by the GCN (an illustrative sketch follows this list).
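The construction code is not part of this outline; the fragment below is a minimal, hedged sketch of the pipeline described above, assuming a TextGCN-style graph (TF-IDF weights on word-document edges and windowed co-occurrence counts on word-word edges, where TextGCN itself uses PMI) and HuggingFace Transformers for the BERT embeddings. The toy documents and all names are hypothetical and not taken from the BGF implementation.

```python
# Illustrative sketch only (not the authors' code): TextGCN-style graph plus BERT features.
# Assumes scikit-learn, PyTorch, and HuggingFace transformers are available.
from collections import Counter

import torch
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import AutoModel, AutoTokenizer

docs = ["stocks rallied after the quarterly earnings report",   # toy corpus
        "the film was a delight from start to finish"]

# Word-document edges: TF-IDF weights (segmentation + stopword removal included).
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)                  # [num_docs, vocab_size] sparse matrix
vocab = vectorizer.get_feature_names_out()              # vocabulary list -> unique indices

# Word-word edges: sliding-window co-occurrence counts (TextGCN weights these by PMI).
window = 5
cooc = Counter()
for doc in docs:
    tokens = [t for t in doc.lower().split() if t in vectorizer.vocabulary_]
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            cooc[tuple(sorted((tokens[i], tokens[j])))] += 1

# Assemble one edge list over the heterogeneous graph:
# word nodes are 0..V-1, document nodes are V..V+N-1.
V = len(vocab)
edges, weights = [], []
for d, w in zip(*tfidf.nonzero()):
    edges.append((int(w), V + int(d)))                  # word -- document edge
    weights.append(float(tfidf[d, w]))
for (w1, w2), count in cooc.items():
    edges.append((vectorizer.vocabulary_[w1], vectorizer.vocabulary_[w2]))
    weights.append(float(count))                        # word -- word edge

# Document-node features: BERT [CLS] embeddings.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    enc = tok(docs, padding=True, truncation=True, return_tensors="pt")
    doc_feats = bert(**enc).last_hidden_state[:, 0]     # [num_docs, 768] semantic embeddings
```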
4.2.4. Training Setup
- Dataset division: Randomly select 10% of the training set as the validation set and use the remaining 90% for training;
- Training strategy: Train the model on each of the datasets described above for up to 50 epochs; if the validation loss does not decrease for five consecutive epochs, training is halted (a minimal sketch of this regime follows this list);
- Model structure: The BGF model comprises 2 layers of GCN and 12 layers of BERT. The word embedding dimension is set to 200.
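A minimal skeleton of this training regime is sketched below. It assumes PyTorch-style training; `train_one_epoch` and `validation_loss` are hypothetical placeholders for the actual BGF routines, and only the 90/10 split and the 5-epoch early-stopping rule mirror the description above.

```python
# Illustrative sketch only: 90/10 split and early stopping (patience 5, up to 50 epochs).
import random

def split_train_val(num_examples, val_fraction=0.1, seed=42):
    """Randomly hold out 10% of the training indices for validation."""
    idx = list(range(num_examples))
    random.Random(seed).shuffle(idx)
    cut = int(num_examples * val_fraction)
    return idx[cut:], idx[:cut]                      # (train indices, validation indices)

def train(model, train_one_epoch, validation_loss, max_epochs=50, patience=5):
    """Stop when the validation loss has not improved for `patience` consecutive epochs."""
    best_loss, stale_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        train_one_epoch(model, epoch)                # one pass over the 90% training split
        val_loss = validation_loss(model)            # loss on the held-out 10%
        if val_loss < best_loss:
            best_loss, stale_epochs = val_loss, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break                                # five epochs without improvement
    return model
```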
4.3. Evaluation Indicators
- Data preparation: Initially, separate the training set and test set from the dataset. The training set is utilized to train the model, while the test set evaluates the model’s performance;
- Model training: Train the model using the training set to enable it to learn the features necessary for text classification;
- Model prediction: Apply the trained model to predict categories for the test set, generating predicted categories for each sample;
- Result analysis: Calculate the number of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) from the prediction results;
- Calculate accuracy: From these counts, compute the accuracy of the model on the test set using the formula given below.
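The formula referred to above is the standard accuracy definition over these four counts:

Accuracy = (TP + TN) / (TP + TN + FP + FN)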
4.4. Experimental Comparison
4.5. Experimental Analysis
4.5.1. Statistical Significance Analysis
- On the R8 dataset, the BGF model improves accuracy by 0.28% over BertGCN, yielding a p-value of 0.03, indicating statistical significance.
- On the R52 dataset, the BGF model improves accuracy by 0.9% over BertGCN, yielding a p-value of 0.045, indicating statistical significance.
- On the MR dataset, the BGF model improves accuracy by 0.03% over BertGCN, with a corresponding p-value of 0.87; the improvement therefore does not reach statistical significance. (An illustrative significance-test sketch follows this list.)
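The outline does not say which statistical test produced these p-values. Purely as an illustration, the snippet below applies SciPy's paired t-test to hypothetical per-run accuracies from repeated training of the two models; both the numbers and the choice of test are assumptions.

```python
# Illustrative only: paired t-test over per-run accuracies of two models.
# The accuracies below are hypothetical; the paper does not specify its test.
from scipy import stats

bgf_runs     = [98.41, 98.52, 98.38, 98.47, 98.50]   # hypothetical accuracies (%) per run
bertgcn_runs = [98.15, 98.21, 98.10, 98.19, 98.22]

t_stat, p_value = stats.ttest_rel(bgf_runs, bertgcn_runs)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")        # p < 0.05 -> difference is significant
```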
4.5.2. Confidence Interval Analysis
- On the R8 dataset, the 95% confidence interval for the accuracy of the BGF model is [98.10, 98.80], and, for the BertGCN model, it is [97.85, 98.49].
- On the R52 dataset, the 95% confidence interval for the accuracy of the BGF model is [93.45, 94.09], while, for the BertGCN model, it is [92.50, 93.24].
- On the MR dataset, the 95% confidence interval for accuracy is [85.89, 86.65] for the BGF model and [85.93, 86.55] for the BertGCN model. (A sketch of one way such intervals can be computed follows this list.)
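The outline likewise does not state how these intervals were obtained. One common choice, sketched below with hypothetical test-set counts, is the normal-approximation (Wald) interval for accuracy treated as a binomial proportion.

```python
# Illustrative only: 95% Wald confidence interval for test-set accuracy.
# The counts are hypothetical; the paper does not describe its interval method.
import math

def accuracy_ci(correct, total, z=1.96):
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return 100 * (p - half_width), 100 * (p + half_width)   # bounds in percent

low, high = accuracy_ci(correct=2157, total=2189)
print(f"95% CI: [{low:.2f}%, {high:.2f}%]")
```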
4.5.3. Error Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhang, J.; Yan, K.; Ma, X. Analysis of complex spam filtering algorithm based on neural network. Comput. Appl. 2022, 42, 770.
- Li, Z.; Fan, Y.; Jiang, B.; Lei, T.; Liu, W. A Survey on Sentiment Analysis and Opinion Mining for Social Multimedia. Multimed. Tools Appl. 2019, 78, 6939–6967.
- Fan, H.; Li, S.; Aihaiti, Z. The application and impact of machine learning algorithms in China’s intelligence research—A perspective based on CSSCI journal articles. Libr. Intell. Knowl. 2022, 39, 96–108.
- Yang, X.; Guo, M.; Hou, H.; Yuan, J.; Li, X.; Li, K.; Wang, W.; He, S.; Luo, Z. Improved BiLSTM-CNN+ Attention sentiment classification algorithm incorporating sentiment dictionary. Sci. Technol. Eng. 2022, 22, 8761–8770.
- Peng, F.; Schuurmans, D. Combining naive Bayes and n-gram language models for text classification. In Advances in Information Retrieval, Proceedings of the European Conference on IR Research, ECIR 2003, Pisa, Italy, 14–16 April 2003; Springer: Berlin, Germany, 2003; pp. 335–350.
- Joachims, T. Text categorization with support vector machines: Learning with many relevant features. In Machine Learning: ECML-98, Proceedings of the European Conference on Machine Learning, Chemnitz, Germany, 21–23 April 1998; Springer: Berlin, Germany, 1998; pp. 137–142.
- Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: San Francisco, CA, USA, 1993.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Tenney, I.; Das, D.; Pavlick, E. BERT rediscovers the classical NLP pipeline. arXiv 2019, arXiv:1905.05950.
- Huang, L.; Ma, D.; Li, S.; Zhang, X.; Wang, H. Text Level Graph Neural Network for Text Classification. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 3444–3450.
- Deng, C.; Zhong, G.; Wang, D. Text categorization based on attention gated graph neural network. Comput. Sci. 2022, 49, 326–334.
- Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907.
- Yao, L.; Mao, C.S.; Luo, Y. Graph Convolutional Networks for Text Classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 29–31 January 2019; pp. 7370–7377.
- Lin, Y.; Meng, Y.; Sun, X.; Han, Q.; Kuang, K.; Li, J.; Wu, F. BertGCN: Transductive Text Classification by Combining GCN and BERT. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2021; pp. 1456–1462.
- Yang, P.; Sun, X.; Li, W.; Ma, S.; Wu, W.; Wang, H. SGM: Sequence Generation Model for Multi-label Classification. In Proceedings of the International Conference on Computational Linguistics, Santa Fe, NM, USA, 20–26 August 2018; pp. 3915–3926.
- Zhang, M.L.; Zhou, Z.H. Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans. Knowl. Data Eng. 2006, 18, 1338–1351.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 2235–2249.
- Yin, W.; Kann, K.; Yu, M.; Schütze, H. Comparative study of CNN and RNN for natural language processing. arXiv 2017, arXiv:1702.01923.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 1–9.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pre-training approach. arXiv 2019, arXiv:1907.11692.
- Lan, Z.; Chen, M.; Goodman, S.; Gimpel, K.; Sharma, P.; Soricut, R. ALBERT: A lite BERT for self-supervised learning of language representations. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019; pp. 1–8.
- Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv 2019, arXiv:1910.01108.
- Sun, Y.; Wang, S.; Li, Y.; Feng, S.; Tian, H.; Wu, H.; Wang, H. Ernie 2.0: A continual pre-training framework for language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 8968–8975.
- Liu, X.; Cheng, H.; He, P.; Chen, W.; Wang, Y.; Poon, H.; Gao, J. Adversarial training for large neural language models. arXiv 2020, arXiv:2004.08994.
- Yang, Z.; Dai, Z.; Yang, Y.; Carbonell, J.; Salakhutdinov, R.R.; Le, Q.V. XLNet: Generalized autoregressive pretraining for language understanding. Adv. Neural Inf. Process. Syst. 2019, 32, 3678–3693.
- Kim, Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 25–29 October 2014; ACL Press: Doha, Qatar, 2014; pp. 1746–1751.
- Ramos, J. Using tf-idf to determine word relevance in document queries. In Proceedings of the First Instructional Conference on Machine Learning, Washington, DC, USA, December 2003; Volume 242, pp. 29–48.
- Jiang, H.; Zhang, R.; Guo, J.; Fan, Y.; Cheng, X. Comparative analysis of graph convolutional networks and self-attention mechanism on text categorization task. J. Chin. Inf. 2021, 35, 84–93.
- Pang, B.; Lee, L. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (ACL’05), Ann Arbor, MI, USA, 25–30 June 2005; pp. 115–124.
Dataset | Number of Documents | Number of Words | Number of Nodes | Number of Categories | Average Length |
---|---|---|---|---|---|
R8 | 7674 | 7688 | 15,362 | 8 | 65.72 |
R52 | 9100 | 8892 | 17,992 | 52 | 69.82 |
MR | 10,662 | 18,764 | 29,426 | 2 | 20.39 |
Experimental Environment | Configuration Information |
---|---|
Operating System | Windows 11 |
GPU | NVIDIA GeForce RTX 4080 |
GPU Memory | 16 GB |
RAM | 32 GB |
Programming Language | Python 3.6.13 |
Deep Learning Framework | PyTorch 1.8.1 |
Parameter | Numeric Value |
---|---|
Initial Learning Rate | |
Learning Rate of BERT Module | |
Dropout Rate | 0.5 |
Number of GCN Layers | 2 |
Number of BERT Layers | 12 |
Word Embedding Dimension | 200 |
Real Value \ Predicted Value | Positive (P) | Negative (N) |
---|---|---|
Positive (P) | TP | FN |
Negative (N) | FP | TN |
Models | R8 | R52 | MR |
---|---|---|---|
TextGCN | 97.07% | 93.56% | 76.74% |
BertGCN | 98.17% | 92.87% | 86.24% |
BGF | 98.45% | 93.77% | 86.27% |