Sparse Attention-Based Residual Joint Network for Aspect-Category-Based Sentiment Analysis
Abstract
1. Introduction
- We propose SPA-RJ Net to explore the effectiveness of sparse attention in ACSA. It employs two aspect-guided sparse attention mechanisms, aspect-category attention and aspect-sentiment attention, which introduce sparsity via the α-entmax function. This lets the model concentrate on a small set of aspect-relevant elements, improving both accuracy and interpretability (a minimal sparsemax sketch follows this list).
- SPA-RJ Net adopts a residual joint learning framework whose two task losses are mixed by a tunable balancing hyperparameter. The ACD module extracts aspect features via sparse aspect-category attention and transfers them to the main ACSA module through a residual pathway, which improves final sentiment classification by providing explicit guidance on the relevant aspect categories (a sketch of this joint objective also follows this list).
- Our experiments on ABSA benchmark datasets (SemEval and MAMS) validate the effectiveness and interpretability of SPA-RJ Net, highlighting the benefits of sparse attention and residual joint learning for ABSA tasks.
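To make the sparsity mechanism concrete, here is a minimal NumPy sketch of α-entmax for the α = 2 case, which reduces to sparsemax (Martins and Astudillo [12]). This is an illustrative re-derivation, not the paper's implementation; the function name and the example scores are our own. The key property is that, unlike softmax, low-scoring positions receive exactly zero attention weight.

```python
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Sparsemax (alpha-entmax with alpha = 2): Euclidean projection of the
    score vector z onto the probability simplex [12]."""
    z_sorted = np.sort(z)[::-1]               # scores in descending order
    cumsum = np.cumsum(z_sorted)              # running sums of sorted scores
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cumsum       # positions kept in the support
    k_max = k[support][-1]                    # size of the support set
    tau = (cumsum[k_max - 1] - 1) / k_max     # threshold subtracted from scores
    return np.maximum(z - tau, 0.0)           # everything below tau becomes 0

scores = np.array([1.0, 0.6, 0.1, -1.0])
print(sparsemax(scores))  # [0.7 0.3 0.  0. ] -- exact zeros, unlike softmax
```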
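Likewise, the residual joint learning idea can be summarized in a hedged PyTorch sketch: the ACD branch produces aspect-category features that re-enter the main ACSA pathway through a residual connection, and the two task losses are mixed by a balancing hyperparameter. All module and argument names (acd_proj, balance, and so on) are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualJointHead(nn.Module):
    """Illustrative ACD->ACSA residual joint head (names are assumptions)."""
    def __init__(self, hidden: int, n_categories: int, n_polarities: int,
                 balance: float = 0.7):
        super().__init__()
        self.balance = balance                     # weight between the two tasks
        self.acd_proj = nn.Linear(hidden, hidden)  # aspect-category feature extractor
        self.acd_head = nn.Linear(hidden, n_categories)
        self.acsa_head = nn.Linear(hidden, n_polarities)

    def forward(self, sent_repr, category_labels, polarity_labels):
        # sent_repr: (batch, hidden); category_labels: multi-hot float
        # (batch, n_categories); polarity_labels: long (batch,)
        acd_feat = torch.tanh(self.acd_proj(sent_repr))   # ACD branch features
        acd_logits = self.acd_head(acd_feat)
        # Residual pathway: ACD features guide the main ACSA representation.
        acsa_logits = self.acsa_head(sent_repr + acd_feat)
        loss_acd = F.binary_cross_entropy_with_logits(acd_logits, category_labels)
        loss_acsa = F.cross_entropy(acsa_logits, polarity_labels)
        loss = self.balance * loss_acsa + (1 - self.balance) * loss_acd
        return loss, acsa_logits
```

In this sketch, setting balance to 1 disables the auxiliary ACD loss entirely and isolates the ACSA task.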
2. Related Work
2.1. Attention-Based Deep Learning Models for ABSA
2.2. Attention with Sparsity Constraints
2.3. Joint Learning for ABSA
3. Background
3.1. Attention Mechanism
3.2. Sparse Attention Distribution Transformation
3.3. Residual Learning
4. Sparse-Attention-Based Residual Joint Network (SPA-RJ Net)
4.1. Input Layer for Embedding
4.2. Sparse Aspect-Category Attention
4.3. Residual Connection from ACD to ACSA
4.4. Feature Extraction
4.5. Sparse Aspect-Sentiment Attention
4.6. Loss Function for Joint Learning
5. Experiment
5.1. Dataset Description
5.2. Evaluation Metrics
5.3. Comparison Models
- ATAE-LSTM [37]. The concatenation of the aspect embedding vector with each word embedding vector of the review serves as input to an LSTM with an attention layer, allowing the model to selectively focus on aspect-relevant information (a sketch of this style of aspect-conditioned attention follows this list).
- TD-LSTM [68]. TD-LSTM employs two target-dependent LSTMs: one processes the preceding context together with the target term from left to right, and the other processes the following context with the target in the reverse direction. Their final hidden states are concatenated for classification, emphasizing words around the target.
- MemNet [38]. MemNet is a deep memory network with multiple computational layers, each of which includes an attention mechanism and explicit memory. Multi-hop attention enhances the ability to capture nuanced patterns by focusing on key context.
- IAN [41]. The interactive attention network (IAN) learns the interaction between aspect terms and context words, generating separate representations for each, which effectively captures the mutual influence between targets and contexts through two attention layers.
- AOA [42]. The attention-over-attention (AOA) network jointly models the interaction between aspects and sentences, explicitly capturing their relationship, which improves accuracy, particularly in complex multi-aspect contexts.
- ASGCN [48]. This model uses a GCN over the dependency tree to exploit syntactic information and word dependencies and to learn aspect-specific representations.
- SAGCN [49]. SAGCN utilizes a GCN over the dependency tree to capture correlations between the aspect and the sentence.
- DSMN [69]. DSMN uses multi-hop attention guided by dynamically selected context memory, then integrates the aspect information with the memory network.
- CMA-MemNet [70]. This memory network extracts the rich semantic information shared between the aspect and the sentence, using multi-head self-attention to compensate for plain memory networks' neglect of the semantics of the sequence itself.
- DualGCN [50]. This model uses two GCNs, SynGCN and SemGCN, that jointly consider the syntax structures and semantic correlations, with regularizers constraining attention scores in the SemGCN.
- MAT-2RO [16]. In a multi-task learning setting with ACD and ACSA, this LSTM-based model applies hybrid regularizations to the attention mechanism, combining aspect-level and task-level constraints to enhance performance.
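For orientation, most of the attention-based baselines above follow the same pattern: score each context word against the aspect representation, normalize the scores, and aggregate the hidden states with the resulting weights. Below is a compact PyTorch sketch of that shared pattern in the style of ATAE-LSTM [37]; tensor shapes and names are illustrative assumptions, and swapping the softmax for the sparsemax sketched earlier yields the kind of sparse variant that SPA-RJ Net builds on.

```python
import torch
import torch.nn.functional as F

def aspect_attention(hidden: torch.Tensor, aspect: torch.Tensor,
                     W: torch.Tensor) -> torch.Tensor:
    """hidden: (seq_len, d) LSTM states; aspect: (d,) aspect embedding;
    W: (2*d, 1) learned projection. Returns an aspect-aware sentence vector."""
    seq_len = hidden.size(0)
    aspect_tiled = aspect.unsqueeze(0).expand(seq_len, -1)  # repeat aspect per word
    scores = torch.cat([hidden, aspect_tiled], dim=-1) @ W  # (seq_len, 1) scores
    weights = F.softmax(scores.squeeze(-1), dim=0)          # dense attention weights
    return weights @ hidden                                 # weighted sum of states

h, a, W = torch.randn(7, 64), torch.randn(64), torch.randn(128, 1)
v = aspect_attention(h, a, W)  # 64-dim representation fed to the classifier
```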
5.4. Experiment Environment
5.5. Experiment Result
5.6. Ablation Study
5.7. Case Study
6. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Model [#Ref] | Year | Architecture Characteristics | Input Format | Source: REST 14 | Source: REST 15 | Source: REST 16 | Source: MAMS
---|---|---|---|---|---|---|---
ATAE-LSTM [37] | 2016 | LSTM over concatenated word-aspect embeddings with attention | ⟨[Word, Aspect Term]⟩ sequence | [71] | [71] | [71] | [72]
TD-LSTM [68] | 2016 | Two LSTMs (left-to-aspect, aspect-to-right) | ⟨Left Context⟩, ⟨Right Context⟩, ⟨Aspect Term⟩ | [71] | [71] | [71] | [72]
MemNet [38] | 2016 | Multiple attention model over external memory | ⟨Context Words⟩, ⟨Aspect Term⟩ | [71] | [71] | [71] | [72]
IAN [41] | 2017 | Two parallel LSTMs for aspect and context with interactive attention | ⟨Context Words⟩, ⟨Aspect Term⟩ | [71] | [71] | [71] | [72]
AOA [42] | 2018 | Attention-over-attention mechanism | ⟨Context Words⟩, ⟨Aspect Term⟩ | [71] | [71] | [71] | [72]
ASGCN [48] | 2019 | Aspect-specific graph convolutional network | ⟨Sentence⟩, ⟨Aspect Index⟩, ⟨Dependency Graph⟩ | [57] | [57] | [57] | -
SAGCN [49] | 2021 | Sparse attention-guided GCN | ⟨Sentence⟩, ⟨Aspect Term⟩, ⟨Dependency Graph⟩ | [57] | [57] | [57] | -
DSMN [69] | 2021 | Dynamic memory update with multiple reasoning hops | ⟨Context⟩, ⟨Aspect Term⟩ | [57] | [57] | [57] | -
CMA-MemNet [70] | 2020 | Dual memory modules (coarse-to-fine granularity) | ⟨Context⟩, ⟨Aspect Term⟩, ⟨Aspect Category⟩ | [57] | [57] | [57] | -
DualGCN [50] | 2022 | Dual GCNs (syntactic + semantic) fused via gating | ⟨Sentence⟩, ⟨Aspect Index⟩, ⟨Dependency Graph⟩, ⟨Semantic Graph⟩ | Original paper | Original paper | Original paper | -
MAT-2RO [16] | 2023 | Multi-level attention routing + hybrid regularization | ⟨Sentence⟩, ⟨Aspect Categories⟩ | Original paper | Original paper | - | -
References
- Zhang, W.; Li, X.; Deng, Y.; Bing, L.; Lam, W. A survey on aspect-based sentiment analysis: Tasks, methods, and challenges. IEEE Trans. Knowl. Data Eng. TKDE 2022, 35, 11019–11038. [Google Scholar] [CrossRef]
- Do, H.H.; Prasad, P.W.; Maag, A.; Alsadoon, A. Deep learning for aspect-based sentiment analysis: A comparative review. Expert Syst. Appl. 2019, 118, 272–299. [Google Scholar] [CrossRef]
- Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [Google Scholar] [CrossRef]
- Elfadel, I.M.; Wyatt, J.L., Jr. The “softmax” nonlinearity: Derivation using statistical mechanics and useful properties as a multiterminal analog circuit element. In Proceedings of the Advances in Neural Information Processing Systems, NIPS, Denver, CO, USA, 29 November–2 December 1993; Volume 6. [Google Scholar]
- Li, X.; Bing, L.; Lam, W.; Shi, B. Transformation Networks for Target-Oriented Sentiment Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL, Melbourne, Australia, 15–20 July 2018; pp. 946–956. [Google Scholar]
- Veličković, P.; Perivolaropoulos, C.; Barbero, F.; Pascanu, R. Softmax is not enough (for sharp out-of-distribution). In Proceedings of the First Workshop on System-2 Reasoning at Scale, NeurIPS, Vancouver, BC, Canada, 14 December 2024. [Google Scholar]
- Niculae, V.; Blondel, M. A regularized framework for sparse and structured neural attention. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NeurIPS, Long Beach, CA, USA, 4–9 December 2017; pp. 3340–3350. [Google Scholar]
- Liu, Y.; Liu, J.; Chen, L.; Lu, Y.; Feng, S.; Feng, Z.; Wang, H. Ernie-sparse: Learning hierarchical efficient transformer through regularized self-attention. arXiv 2022, arXiv:2203.12276. [Google Scholar]
- Li, Q. A comprehensive survey of sparse regularization: Fundamental, state-of-the-art methodologies and applications on fault diagnosis. Expert Syst. Appl. 2023, 229, 120517. [Google Scholar] [CrossRef]
- Feng, A.; Zhang, X.; Song, X. Unrestricted attention may not be all you need–masked attention mechanism focuses better on relevant parts in aspect-based sentiment analysis. IEEE Access 2022, 10, 8518–8528. [Google Scholar] [CrossRef]
- Sun, S.; Zhang, Z.; Huang, B.; Lei, P.; Su, J.; Pan, S.; Cao, J. Sparse-softmax: A simpler and faster alternative softmax transformation. arXiv 2021, arXiv:2112.12433. [Google Scholar]
- Martins, A.F.T.; Astudillo, R.F. From softmax to sparsemax: A sparse model of attention and multi-label classification. In Proceedings of the 33rd International Conference on Machine Learning, ICML, New York, NY, USA, 20–22 June 2016; pp. 1614–1623. [Google Scholar]
- Blondel, M.; Martins, A.F.; Niculae, V. Learning with Fenchel-Young losses. J. Mach. Learn. Res. JMLR 2020, 21, 1–69. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1, pp. 4171–4186. [Google Scholar]
- Hu, M.; Zhao, S.; Guo, H.; Su, Z. Hybrid regularizations for multi-aspect category sentiment analysis. IEEE Trans. Affect. Comput. 2023, 14, 3294–3304. [Google Scholar] [CrossRef]
- Kirange, D.; Deshmukh, R.R.; Kirange, M. Aspect based sentiment analysis semeval-2014 task 4. Asian J. Comput. Sci. Inf. Technol. 2014, 4, 72–75. [Google Scholar] [CrossRef]
- Pontiki, M.; Galanis, D.; Papageorgiou, H.; Manandhar, S.; Androutsopoulos, I. SemEval-2015 Task 12: Aspect Based Sentiment Analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval, Denver, CO, USA, 4–5 June 2015; pp. 486–495. [Google Scholar]
- Pontiki, M.; Galanis, D.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S.; Al-Smadi, M.; Eryiğit, G. Semeval-2016 task 5: Aspect based sentiment analysis. In Proceedings of the International Workshop on Semantic Evaluation, San Diego, CA, USA, 16–17 June 2016; pp. 19–30. [Google Scholar]
- Jiang, Q.; Chen, L.; Xu, R.; Ao, X.; Yang, M. A challenge dataset and effective models for aspect-based sentiment analysis. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, Hong Kong, China, 3–7 November 2019; pp. 6280–6285. [Google Scholar]
- Chemmanam, A.J.; Jose, B.A. Joint learning for multitasking models. In Responsible Data Science; Springer Nature: Singapore, 2021; p. 155. [Google Scholar]
- Jastrzębski, S.; Arpit, D.; Ballas, N.; Verma, V.; Che, T.; Bengio, Y. Residual connections encourage iterative inference. In Proceedings of the International Conference on Learning Representations, ICLR, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Liu, H.; Chatterjee, I.; Zhou, M.; Lu, X.S.; Busorrah, A. Aspect-based sentiment analysis: A survey of deep learning methods. IEEE Trans. Comput. Soc. Syst. 2020, 7, 1358–1375. [Google Scholar] [CrossRef]
- Vo, D.T.; Zhang, Y. Target-dependent twitter sentiment classification with rich automatic features. In Proceedings of the International Joint Conference on Artificial Intelligence, IJCAI, Buenos Aires, Argentina, 25–31 July 2015; pp. 1347–1353. [Google Scholar]
- Zhou, J.; Huang, J.X.; Chen, Q.; Hu, Q.V.; Wang, T.; He, L. Deep learning for aspect-level sentiment classification: Survey, vision, and challenges. IEEE Access 2019, 7, 78454–78483. [Google Scholar] [CrossRef]
- Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G.S.; Dean, J. Distributed representations of words and phrases and their compositionality. In Proceedings of the Advances in Neural Information Processing Systems, NIPS, Lake Tahoe, NV, USA, 5–8 December 2013; Volume 26, pp. 3111–3119. [Google Scholar]
- Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
- Dong, L.; Wei, F.; Tan, C.; Tang, D.; Zhou, M.; Xu, K. Adaptive recursive neural network for target-dependent twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL, Baltimore, MD, USA, 22–27 June 2014; Volume 2, pp. 49–54. [Google Scholar]
- De Mulder, W.; Bethard, S.; Moens, M.F. A survey on the application of recurrent neural networks to statistical language modeling. Comput. Speech Lang. 2015, 30, 61–98. [Google Scholar] [CrossRef]
- Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955. [Google Scholar] [CrossRef]
- Graves, A. Long short-term memory. In Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012; pp. 37–45. [Google Scholar]
- Zhang, S.; Zheng, D.; Hu, X.; Yang, M. Bidirectional long short-term memory networks for relation classification. In Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, Shanghai, China, 30 October–1 November 2015; pp. 73–78. [Google Scholar]
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar] [CrossRef] [PubMed]
- Hoang, M.; Bihorac, O.A.; Rouces, J. Aspect-based sentiment analysis using bert. In Proceedings of the 22nd Nordic Conference on Computational Linguistics, Turku, Finland, 30 September–2 October 2019; pp. 187–196. [Google Scholar]
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations, ICLR, San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
- Tang, D.; Qin, B.; Liu, T. Document modeling with gated recurrent neural network for sentiment classification. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP, Lisbon, Portugal, 17–21 September 2015; pp. 1422–1432. [Google Scholar]
- Wang, Y.; Huang, M.; Zhu, X.; Zhao, L. Attention-based LSTM for aspect-level sentiment classification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Austin, TX, USA, 1–4 November 2016; pp. 606–615. [Google Scholar]
- Tang, D.; Qin, B.; Liu, T. Aspect Level Sentiment Classification with Deep Memory Network. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Austin, TX, USA, 1–4 November 2016; pp. 214–224. [Google Scholar]
- Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical attention networks for document classification. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489. [Google Scholar]
- Cheng, J.; Zhao, S.; Zhang, J.; King, I.; Zhang, X.; Wang, H. Aspect-level sentiment classification with heat (hierarchical attention) network. In Proceedings of the ACM on Conference on Information and Knowledge Management, CIKM, Singapore, 6–10 November 2017; pp. 97–106. [Google Scholar]
- Ma, D.; Li, S.; Zhang, X.; Wang, H. Interactive Attention Networks for Aspect-Level Sentiment Classification. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI, Melbourne, Australia, 19–25 August 2017; pp. 4068–4074. [Google Scholar]
- Huang, B.; Ou, Y.; Carley, K.M. Aspect level sentiment classification with attention-over-attention neural networks. In Proceedings of the Social, Cultural, and Behavioral Modeling: 11th International Conference, SBP-BRiMS, Washington, DC, USA, 10–13 July 2018; Volume 11, pp. 197–206. [Google Scholar]
- Xu, Q.; Zhu, L.; Dai, T.; Yan, C. Aspect-based sentiment classification with multi-attention network. Neurocomputing 2020, 388, 135–143. [Google Scholar] [CrossRef]
- Rhanoui, M.; Mikram, M.; Yousfi, S.; Barzali, S. A CNN-BiLSTM model for document-level sentiment analysis. Mach. Learn. Knowl. Extr. 2019, 1, 832–847. [Google Scholar] [CrossRef]
- Lin, J.; Najafabadi, M.K. Aspect level sentiment analysis with CNN Bi-LSTM and attention mechanism. In Proceedings of the IEEE/ACIS 8th International Conference on Big Data, Cloud Computing, and Data Science, BCD, Ho Chi Minh City, Vietnam, 14–16 December 2023; pp. 282–288. [Google Scholar]
- Ayetiran, E.F. Attention-based aspect sentiment classification using enhanced learning through CNN-BiLSTM networks. Knowl.-Based Syst. 2022, 252, 109409. [Google Scholar] [CrossRef]
- Lu, G.; Liu, Y.; Wang, J.; Wu, H. CNN-BiLSTM-Attention: A multi-label neural classifier for short texts with a small set of labels. Inf. Process. Manag. 2023, 60, 103320. [Google Scholar] [CrossRef]
- Zhang, C.; Li, Q.; Song, D. Aspect-based sentiment classification with aspect-specific graph convolutional networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, Hong Kong, China, 3–7 November 2019; pp. 4568–4578. [Google Scholar]
- Hou, X.; Huang, J.; Wang, G.; Qi, P.; He, X.; Zhou, B. Selective Attention Based Graph Convolutional Networks for Aspect-Level Sentiment Classification. In Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15), Mexico City, Mexico, 11 June 2021; pp. 83–93. [Google Scholar]
- Li, R.; Chen, H.; Feng, F.; Ma, Z.; Wang, X.; Hovy, E. DualGCN: Exploring syntactic and semantic information for aspect-based sentiment analysis. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 7642–7656. [Google Scholar] [CrossRef]
- Brauwers, G.; Frasincar, F. A general survey on attention mechanisms in deep learning. IEEE Trans. Knowl. Data Eng. TKDE 2021, 35, 3279–3298. [Google Scholar] [CrossRef]
- Van Elteren, J.T.; Tennent, N.H.; Šelih, V.S. Multi-element quantification of ancient/historic glasses by laser ablation inductively coupled plasma mass spectrometry using sum normalization calibration. Anal. Chim. Acta 2009, 644, 1–9. [Google Scholar] [CrossRef] [PubMed]
- Memisevic, R.; Zach, C.; Pollefeys, M.; Hinton, G.E. Gated softmax classification. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 6–9 December 2010; Volume 23. [Google Scholar]
- Liu, W.; Zhang, Y.M.; Li, X.; Yu, Z.; Dai, B.; Zhao, T.; Song, L. Deep hyperspherical learning. In Proceedings of the Advances in Neural Information Processing Systems, NIPS, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Luong, T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Lisbon, Portugal, 17–21 September 2015; pp. 1412–1421. [Google Scholar]
- Laha, A.; Chemmengath, S.A.; Agrawal, P.; Khapra, M.; Sankaranarayanan, K.; Ramaswamy, H.G. On controllable sparse alternatives to softmax. Adv. Neural Inf. Process. Syst. NIPS 2018, 31, 6423–6433. [Google Scholar]
- Dhanith, P.J.; Surendiran, B.; Rohith, G.; Kanmani, S.R.; Devi, K.V. A Sparse Self-Attention Enhanced Model for Aspect-Level Sentiment Classification. Neural Process. Lett. 2024, 56, 47. [Google Scholar] [CrossRef]
- Hu, M.; Zhao, S.; Zhang, L.; Cai, K.; Su, Z.; Cheng, R.; Shen, X. CAN: Constrained Attention Networks for Multi-Aspect Sentiment Analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP, Hong Kong, China, 3–7 November 2019; pp. 4601–4610. [Google Scholar]
- Nguyen, H.; Shirai, K. A joint model of term extraction and polarity classification for aspect-based sentiment analysis. In Proceedings of the 2018 10th International Conference on Knowledge and Systems Engineering, KSE, Ho Chi Minh City, Vietnam, 1–3 November 2018; pp. 323–328. [Google Scholar]
- Zhao, G.; Luo, Y.; Chen, Q.; Qian, X. Aspect-based sentiment analysis via multitask learning for online reviews. Knowl.-Based Syst. 2023, 264, 110326. [Google Scholar] [CrossRef]
- Fan, X.; Zhang, Z. A fine-grained sentiment analysis model based on multi-task learning. In Proceedings of the 2024 4th International Symposium on Computer Technology and Information Science, ISCTIS, Xi’an, China, 12–14 July 2024; pp. 157–161. [Google Scholar]
- Schmitt, M.; Steinheber, S.; Schreiber, K.; Roth, B. Joint aspect and polarity classification for aspect-based sentiment analysis with end-to-end neural networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Brussels, Belgium, 31 October–4 November 2018; pp. 1109–1114. [Google Scholar]
- Pei, Y.; Wang, Y.; Wang, W.; Qi, J. Aspect-Based Sentiment Analysis with Multi-Task Learning. In Proceedings of the 2022 5th International Conference on Computing and Big Data, ICCBD, Shanghai, China, 16–18 December 2022; pp. 171–176. [Google Scholar]
- Wang, Y.; Chen, Q.; Wang, W. Multi-task bert for aspect-based sentiment analysis. In Proceedings of the IEEE International Conference on Smart Computing, SMARTCOMP, Irvine, CA, USA, 23–27 August 2021; pp. 383–385. [Google Scholar]
- Treumann, R.; Baumjohann, W. Beyond Gibbs-Boltzmann-Shannon: General entropies—The Gibbs-Lorentzian example. Front. Phys. 2014, 2, 49. [Google Scholar] [CrossRef]
- Anastasiadis, A. Tsallis entropy. Entropy 2012, 14, 174–176. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Tang, D.; Qin, B.; Feng, X.; Liu, T. Effective LSTMs for Target-Dependent Sentiment Classification. In Proceedings of the COLING, Osaka, Japan, 11–16 December 2016; pp. 3298–3307. [Google Scholar]
- Lin, P.; Yang, M.; Lai, J. Deep selective memory network with selective attention and inter-aspect modeling for aspect level sentiment classification. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 1093–1106. [Google Scholar] [CrossRef]
- Zhang, Y.; Xu, B.; Zhao, T. Convolutional multi-head self-attention on memory for aspect sentiment classification. IEEE/CAA J. Autom. Sin. 2020, 7, 1038–1044. [Google Scholar] [CrossRef]
- Liu, H.; Wu, Y.; Li, Q.; Lu, W.; Li, X.; Wei, J.; Feng, J. Enhancing aspect-based sentiment analysis using a dual-gated graph convolutional network via contextual affective knowledge. Neurocomputing 2023, 553, 126526. [Google Scholar] [CrossRef]
- Deng, X.; Han, D.; Jiang, P. A Context-focused Attention Evolution Model for Aspect-based Sentiment Classification. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2023, 22, 1–14. [Google Scholar] [CrossRef]
Data Set | Positive (Train) | Positive (Test) | Negative (Train) | Negative (Test) | Neutral (Train) | Neutral (Test)
---|---|---|---|---|---|---
REST 14 | 1631 | 530 | 640 | 181 | 144 | 43
REST 15 | 754 | 252 | 280 | 245 | 39 | 25
REST 16 | 1348 | 321 | 647 | 131 | 134 | 31
MAMS | 1332 | 334 | 1485 | 372 | 1508 | 377
Data Set | Food (Train) | Food (Test) | Service (Train) | Service (Test) | Ambience (Train) | Ambience (Test) | Price (Train) | Price (Test)
---|---|---|---|---|---|---|---|---
REST 14 | 1166 | 402 | 562 | 167 | 385 | 105 | 302 | 80
REST 15 | 530 | 254 | 256 | 159 | 177 | 69 | 110 | 40
REST 16 | 692 | 257 | 933 | 145 | 234 | 57 | 270 | 24
MAMS | 1158 | 267 | 1134 | 269 | 183 | 50 | 118 | 19
Model | REST 14 Acc. | REST 14 Macro-F1 | REST 15 Acc. | REST 15 Macro-F1 | REST 16 Acc. | REST 16 Macro-F1 | MAMS Acc. | MAMS Macro-F1
---|---|---|---|---|---|---|---|---
ATAE-LSTM | 77.20 | 67.02 | 78.48 | 60.53 | 83.77 | 61.71 | 70.63 | 67.02
TD-LSTM | 78.00 | 66.73 | 76.39 | 58.70 | 82.16 | 54.21 | 74.59 | 66.73
MemNet | 78.20 | 65.80 | 77.31 | 58.28 | 85.44 | 65.99 | 63.29 | 69.64
IAN | 79.26 | 70.09 | 78.50 | 52.65 | 84.74 | 55.21 | 76.60 | 70.09
AOA | 79.97 | 70.42 | 78.17 | 57.02 | 87.50 | 66.21 | 77.26 | 70.42
ASGCN | 80.77 | 72.02 | 79.89 | 61.89 | 88.99 | 67.48 | - | -
SAGCN | 77.14 | 67.16 | 76.19 | 59.73 | 88.30 | 67.40 | - | -
DSMN | 78.17 | 70.13 | 79.16 | 61.70 | 86.70 | 67.40 | - | -
CMA-MemNet | 79.14 | 71.23 | 80.90 | 62.90 | 87.13 | 68.76 | - | -
DualGCN | 84.27 | 78.08 | 81.73 | 65.05 | 89.29 | 68.08 | - | -
MAT-2RO | 84.28 | 74.45 | 77.51 | 52.78 | - | - | - | -
DA-ST Net | 85.69 | 67.51 | 80.95 | 55.19 | 92.95 | 83.65 | 74.47 | 74.10
SPA-ST Net (α = 2) | 89.64 | 76.30 | 83.53 | 65.17 | 93.38 | 85.78 | 78.00 | 77.99
DA-RJ Net | 84.73 | 66.35 | 81.15 | 55.45 | 89.74 | 82.25 | 73.06 | 72.32
SPA-RJ Net (α = 2) | 89.41 | 74.56 | 84.33 | 67.34 | 94.44 | 88.26 | 80.78 | 79.56
Data Set | Positive Precision | Positive Recall | Neutral Precision | Neutral Recall | Negative Precision | Negative Recall
---|---|---|---|---|---|---
REST 16 | 90.50 | 81.02 | 72.21 | 80.11 | 95.02 | 97.10
MAMS | 83.50 | 82.01 | 73.78 | 78.14 | 79.00 | 81.21
Model | Sparsity Parameter α | REST 14 Acc. | REST 14 Macro-F1 | REST 15 Acc. | REST 15 Macro-F1 | REST 16 Acc. | REST 16 Macro-F1 | MAMS Acc. | MAMS Macro-F1
---|---|---|---|---|---|---|---|---|---
SPA-RJ Net | 1 (=softmax) | 84.73 | 56.35 | 81.15 | 55.45 | 89.74 | 62.25 | 73.06 | 72.32
SPA-RJ Net | 1.5 | 86.97 | 62.78 | 78.18 | 53.25 | 93.59 | 86.51 | 81.00 | 80.67
SPA-RJ Net | 2 | 90.23 | 74.63 | 84.33 | 67.34 | 94.44 | 88.26 | 80.78 | 79.56
SPA-RJ Net | 4 | 88.31 | 59.03 | 83.73 | 57.22 | 90.08 | 70.90 | 78.24 | 77.91
Ablation Model | REST 14 Acc. | REST 14 Macro-F1 | REST 15 Acc. | REST 15 Macro-F1 | REST 16 Acc. | REST 16 Macro-F1 | MAMS Acc. | MAMS Macro-F1
---|---|---|---|---|---|---|---|---
SPA-RJ Net (with GloVe) | 72.90 | 52.10 | 66.07 | 50.13 | 88.89 | 77.92 | 62.69 | 62.13
SPA-RJ Net (w/o ACD) | 89.41 | 73.73 | 82.14 | 58.54 | 90.6 | 81.2 | 75.65 | 74.02
SPA-RJ Net (w/o residual connection) | 85.14 | 66.18 | 84.33 | 57.61 | 92.31 | 83.02 | 75.47 | 72.48
SPA-RJ Net (w/o balancing hyperparameter) | 86.80 | 68.94 | 82.33 | 56.96 | 90.44 | 87.02 | 76.79 | 75.88
SPA-RJ Net (complete) | 90.23 | 74.63 | 84.33 | 67.34 | 94.44 | 88.26 | 80.78 | 79.56
Aspect-Category Att. | Aspect-Sentiment Att. | REST 14 Acc. | REST 14 Macro-F1 | REST 15 Acc. | REST 15 Macro-F1 | REST 16 Acc. | REST 16 Macro-F1 | MAMS Acc. | MAMS Macro-F1
---|---|---|---|---|---|---|---|---|---
DA | DA | 84.73 | 66.35 | 81.15 | 55.45 | 89.74 | 62.25 | 73.06 | 72.32
SPA | DA | 87.90 | 68.60 | 82.74 | 56.53 | 89.96 | 76.06 | 80.66 | 79.49
DA | SPA | 87.90 | 71.42 | 82.14 | 56.03 | 92.09 | 83.37 | 79.62 | 79.03
SPA | SPA | 90.23 | 74.63 | 84.33 | 67.34 | 94.44 | 88.26 | 80.78 | 79.56
Sample Cases | Aspect/Label | Results
---|---|---
1. This happened while many dinners were enjoying their meal and it was great. | Food/Positive | Positive (✓)
2. The noise here is so bad that people entered and left without ordering and they lowered lights during the dinner. | Ambience/Negative | Negative (✓)
3. Stat by enjoying the atmosphere while having a glass of wine at the bar and we are ready to having fun. | Ambience/Positive | Positive (✓)
4. The service was rather slow but waiter was very pleasant and obliging | Service/Positive | Negative (✗)
5. The food was barely decent and out server was nowhere to be found. | Service/Negative; Food/Positive | Negative (✓); Positive (✓)
6. The atmosphere was amazing for such a reasonable price. | Ambience/Positive; Price/Positive | Positive (✓); Positive (✓)
7. The food was good. But not at all worth the price | Food/Positive; Price/Negative | Positive (✓); Positive (✗)