A Multi-Scale Feature Fusion Linear Attention Model for Movie Review Sentiment Analysis
Abstract
1. Introduction
- Existing models fail to adequately extract relevant feature information. Some models primarily focus on local features while neglecting global contextual information. For instance, CNN models excel at extracting local features but are limited by their receptive field, hindering effective modeling of long-range dependencies. Other models predominantly concentrate on global information while overlooking fine-grained details. Furthermore, these models often operate with high feature dimensions, which constrains their computational performance. For example, while standard BERT can effectively model global dependencies through self-attention mechanisms, it struggles to capture subtle patterns in local phrases, and its quadratic attention complexity makes it expensive for long sequences.
- To extract more comprehensive features, existing models typically employ enlarged convolutional kernels. However, this approach leads to a substantial increase in parameters, which directly constrains computational efficiency. Furthermore, the attention mechanisms in some current models exhibit high computational complexity, further exacerbating the computational burden. Hybrid models such as BERT-CNN and BERT-LSTM, while integrating advantages from both architectures, often suffer from overly complex structures, inefficient feature fusion, and parameter redundancy.
- Existing models are plagued by several shortcomings, including high computational complexity, excessive parameter counts, and low computational efficiency. These limitations hinder the widespread adoption of the models and make them unsuitable for real-time detection scenarios.
- We propose a novel PMFE module and an MGLFE module. These two modules can effectively capture multi-scale fine-grained information and global contextual features. This design enables the model to simultaneously benefit from both global contextual understanding and local detailed characteristics while maintaining relatively low computational complexity, which helps reduce the number of model parameters. By incorporating the PMFE module, which utilizes parallel multi-kernel dilated convolutions to extract local features at different granularities, the model precisely captures key words and phrasal patterns, thereby compensating for BERT’s limitations in local semantic understanding. Through the Multi-scale Linear Attention (MSLA) mechanism in the MGLFE module, global contextual information is modeled with approximately linear computational complexity, overcoming the inherent limitation of traditional CNNs’ restricted receptive fields.
- We propose a Multi-Scale Feature Fusion Linear Attention (MSFFLA) model, which primarily adopts a symmetric lightweight design. In the PMFE module, we employ parallel dilated convolutions that effectively reduce the number of parameters compared to traditional convolutional operations while expanding the receptive field. In the MGLFE module, we incorporate an MSLA mechanism that significantly reduces computational complexity. Through residual connections for efficient feature fusion, our design achieves a “1 + 1 > 2” effect. This architecture not only enhances computational performance but also reduces the number of model parameters, thereby substantially improving the computational efficiency of the model.
- Our model not only achieves significant improvements in both accuracy and F1-Score on three public benchmarks, but also demonstrates superior parameter efficiency and computational performance, thereby delivering an efficient and lightweight solution suitable for resource-constrained practical application scenarios.
2. Related Work
2.1. Methods Based on Traditional Manual Approaches
2.2. Approaches Founded on Traditional Machine Learning
2.3. Approaches Founded on Deep Learning
2.4. Hybrid Model-Based Methods
3. Methodology
3.1. Overall Structure
3.2. BERT Encoder BLOCK
3.3. PMFE and MGLFE BLOCK
3.3.1. PMFE BLOCK
3.3.2. MGLFE BLOCK
4. Experiments
4.1. Datasets
4.2. Experimental Setup and Evaluation Metrics
| Algorithm 1. Emotion Classification |
|---|
| Input: Raw input data |
| Output: Emotion classification result |
| Step 1 (BERT Encoding): encode the input data using BERT to obtain time-step features. |
| Step 2 (Parallel Multi-scale Feature Extraction): extract local detailed features using the PMFE module. |
| Step 3 (Global Multi-scale Linear Feature Extraction): extract global context information using the MGLFE module. |
| Step 4 (Feature Integration and Global Dependency Modeling): build global long-range dependencies. |
| Step 5 (Classification): apply average pooling and a 1 × 1 convolution; return the final emotion classification result. |
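The five steps of Algorithm 1 can be sketched end to end with placeholder components. Here random features stand in for the BERT encoder and a single padded 1-D convolution with a residual connection stands in for the PMFE/MGLFE stages; all shapes and weights are assumptions for illustration, not the model's actual parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, hidden, n_classes = 16, 32, 2

# Step 1: BERT encoding (stand-in: random time-step features, seq_len x hidden).
features = rng.standard_normal((seq_len, hidden))

# Steps 2-4 (stand-in): a zero-padded 1-D convolution (kernel size 3) mixes
# each time step with its neighbours, fused back via a residual connection.
kernel = rng.standard_normal((3, hidden)) * 0.1
padded = np.pad(features, ((1, 1), (0, 0)))
local = np.stack([(padded[t:t + 3] * kernel).sum(axis=0)
                  for t in range(seq_len)])
fused = features + local             # residual feature fusion

# Step 5: average pooling over time, then a 1 x 1 convolution, which on a
# pooled vector is just a linear map to class logits.
pooled = fused.mean(axis=0)                        # (hidden,)
w = rng.standard_normal((hidden, n_classes)) * 0.1
logits = pooled @ w                                # (n_classes,)
label = int(np.argmax(logits))                     # final emotion class index
assert label in (0, 1)
```

The real model replaces the stand-in convolution with the parallel multi-scale PMFE branch and the MSLA-based MGLFE branch, but the data flow (encode, extract, fuse, pool, classify) follows the same order.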
4.3. Results
4.4. Ablation Study
4.4.1. Effects of PMFE in the Model
4.4.2. Effects of MGLFE in the Model
4.4.3. Effects of MSLA in the Model
4.4.4. Effects of BERT in the Model
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| MSFFLA | Multi-Scale Feature Fusion Linear Attention Model |
| BERT | Bidirectional Encoder Representations from Transformers |
| PMFE | Parallel Multi-scale Feature Extraction Module |
| MGLFE | Multi-scale Global Linear Feature Extraction Module |
| MSLA | Multi-Scale Linear Attention |
| SST-2 | Stanford Sentiment Treebank—2 classes |
| MR | Movie Review |
| SC | Sentiment Classification |
| BoVW | Bag-of-Visual-Words |
| TF-IDF | Term Frequency-Inverse Document Frequency |
| UPCSim | User Profile Correlation-based Similarity |
| KNN | K-Nearest Neighbors |
| CNN | Convolutional Neural Network |
| LSTM | Long Short-Term Memory |
| BCAT | Bidirectional Context-Aware Transformer |
| IMDB | IMDb Large Movie Review Dataset |
| RNN | Recurrent Neural Network |
| ERNIE-MCBMA | Enhanced Representation through kNowledge IntEgration—Multi-Channel Bidirectional Multi-head Attention |
| SMP2020-EWECT | Social Media Processing 2020—Evaluation of Weibo Emotion Classification Task |
| SVM | Support Vector Machine |
| NLP | Natural Language Processing |
| VADER | Valence Aware Dictionary and sEntiment Reasoner |
| ABSA | Aspect-Based Sentiment Analysis |
| XGBoost | eXtreme Gradient Boosting |
| TD-BERT | Target-Dependent BERT |
| BERT-BiGRU | BERT Bidirectional Gated Recurrent Unit |
| BiLSTM | Bidirectional Long Short-Term Memory |
| GPT-3 | Generative Pre-trained Transformer 3 |
| LLaMA-2 | Large Language Model Meta AI 2 |
| CNN-TE | CNN Transformer Encoder |
| GRU | Gated Recurrent Unit |
| AM-MSFFN | Attention-fused Multi-Scale Feature Fusion Network |
| CBAM | Convolutional Block Attention Module |
| DEAP | Database for Emotion Analysis using Physiological Signals |
| SEED | SJTU Emotion EEG Dataset |
| RoBERTa-MA | Robustly Optimized BERT Pretraining Approach with Multi-head Attention |
| HOMRA-Net | Hybrid Optimized Multi-Scale Residual Attention Network |
| MELD | Multimodal EmotionLines Dataset |
| BN | Batch Normalization |
| LN | Layer Normalization |
| FFN | Feed-Forward Network |
| URL | Uniform Resource Locator |
| SGD | Stochastic Gradient Descent |
References
- Davoodi, L.; Mezei, J.; Heikkilä, M. Aspect-based sentiment classification of user reviews to understand customer satisfaction of e-commerce platforms. In Electronic Commerce Research; Springer Nature: Berlin/Heidelberg, Germany, 2025; pp. 1–43. [Google Scholar] [CrossRef]
- Prova, N. Multilingual Emotion Classification in E-Commerce Customer Reviews Using GPT and Deep Learning-Based Meta-Ensemble Model. 2025. Available online: https://ssrn.com/abstract=5161505 (accessed on 1 January 2025).
- Shi, Y. A CNN-Based Approach for Classical Music Recognition and Style Emotion Classification. IEEE Access 2025, 13, 20647–20666. [Google Scholar] [CrossRef]
- Nabiilah, G.Z. Effectiveness analysis of RoBERTa and DistilBERT in emotion classification task on social media text data. Eng. Math. Comput. Sci. J. (EMACS) 2025, 7, 45–50. [Google Scholar] [CrossRef]
- Rout, J.K.; Choo, K.K.R.; Dash, A.K.; Bakshi, S.; Jena, S.K.; Williams, K.L. A model for sentiment and emotion analysis of unstructured social media text. Electron. Commer. Res. 2018, 18, 181–199. [Google Scholar] [CrossRef]
- Sharma, N.A.; Ali, A.S.; Kabir, M.A. A review of sentiment analysis: Tasks, applications, and deep learning techniques. Int. J. Data Sci. Anal. 2025, 19, 351–388. [Google Scholar] [CrossRef]
- Wu, S.; Wu, F.; Chang, Y. Automatic construction of target-specific sentiment lexicon. Expert Syst. Appl. 2019, 116, 285–298. [Google Scholar] [CrossRef]
- Zhang, S.X.; Wei, Z.L.; Wang, Y. Sentiment analysis of Chinese micro-blog text based on extended emotion dictionary. Future Gener. Comput. Syst. 2018, 81, 395–403. [Google Scholar] [CrossRef]
- Widiyaningtyas, T.; Hidayah, I.; Adji, T.B. User profile correlation-based similarity (UPCSim) algorithm in movie recommendation system. J. Big Data 2021, 8, 52. [Google Scholar] [CrossRef]
- Pavirha, N.; Pungliya, V.; Raut, A.; Bhonsle, R.; Purohit, A.; Patel, A.; Shashidhar, R. Movie recommendation and sentiment analysis using machine learning. Glob. Transit. Proc. 2022, 3, 279–284. [Google Scholar] [CrossRef]
- Dashtipour, K.; Gogate, M.; Adeel, A.; Larijani, H.; Hussain, A. Sentiment analysis of persian movie reviews using deep learning. Entropy 2021, 23, 596. [Google Scholar] [CrossRef]
- Prasath, S.R.; Jeevitha, J.K.; Margret, S.; Krishnan, R.S.; Raj, J.R.F.; Nithila, E.E. Deep Learning Models for Understanding Audience Sentiments in Movie Tweets. In Proceedings of the 2024 5th International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 7–9 August 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1745–1753. [Google Scholar] [CrossRef]
- Acheampong, F.A.; Nunoo-Mensah, H.; Chen, W. Transformer models for text-based emotion detection: A review of BERT-based approaches. Artif. Intell. Rev. 2021, 54, 5789–5829. [Google Scholar] [CrossRef]
- Saad, T.B.; Ahmed, M.; Ahmed, B.; Sazan, S.A. A Novel Transformer Based Deep Learning Approach of Sentiment Analysis for Movie Reviews. In Proceedings of the 2024 6th International Conference on Electrical Engineering and Information & Communication Technology (ICEEICT), Dhaka, Bangladesh, 2–4 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1228–1233. [Google Scholar]
- Ruan, T.; Liu, Q.; Chang, Y. Digital media recommendation system design based on user behavior analysis and emotional feature extraction. PLoS ONE 2025, 20, e0322768. [Google Scholar] [CrossRef] [PubMed]
- Wang, X.; Liu, W. Sentiment classification method based on BERT-CondConv multi-moment state fusion. Comput. Speech Lang. 2026, 95, 101855. [Google Scholar] [CrossRef]
- Sun, Y.; Yu, Z.; Sun, Y.; Xu, Y.; Song, B. A novel approach for multiclass sentiment analysis on Chinese social media with ERNIE-MCBMA. Sci. Rep. 2025, 15, 18675. [Google Scholar] [CrossRef] [PubMed]
- Louati, A.; Louati, H.; Kariri, E.; Alaskar, F.; Alotaibi, A. Sentiment analysis of Arabic course reviews of a Saudi university using support vector machine. Appl. Sci. 2023, 13, 12539. [Google Scholar] [CrossRef]
- Hu, M.; Liu, B. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, USA, 22–25 August 2004; pp. 168–177. [Google Scholar] [CrossRef]
- Wilson, T.; Wiebe, J.; Hoffmann, P. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, Vancouver, BC, Canada, 6–8 October 2005; pp. 347–354. [Google Scholar] [CrossRef]
- Muddiman, A.; McGregor, S.C.; Stroud, N.J. (Re) claiming our expertise: Parsing large text corpora with manually validated and organic dictionaries. Political Commun. 2019, 36, 214–226. [Google Scholar] [CrossRef]
- Amsler, M. Using Lexical-Semantic Concepts for Fine-Grained Classification in the Embedding Space. Ph.D. Thesis, University of Zurich, Zürich, Switzerland, 2020. [Google Scholar] [CrossRef]
- Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), Philadelphia, PA, USA, 6–7 July 2002; pp. 79–86. [Google Scholar] [CrossRef]
- Sarhan, A.M.; Ayman, H.; Wagdi, M.; Ali, B.; Adel, A.; Osama, R. Integrating machine learning and sentiment analysis in movie recommendation systems. J. Electr. Syst. Inf. Technol. 2024, 11, 53. [Google Scholar] [CrossRef]
- Jassim, M.A.; Abd, D.H.; Omri, M.N. Machine learning-based new approach to films review. Soc. Netw. Anal. Min. 2023, 13, 40. [Google Scholar] [CrossRef]
- Gardazi, N.M.; Daud, A.; Malik, M.K.; Bukhari, A.; Alsahfi, T.; Alshemaimri, B. BERT applications in natural language processing: A review. Artif. Intell. Rev. 2025, 58, 1–49. [Google Scholar] [CrossRef]
- Chen, Z.; Qian, T. Relation-aware collaborative learning for unified aspect-based sentiment analysis. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5 July 2020; pp. 3685–3694. [Google Scholar] [CrossRef]
- He, X. Sentiment Classification of Social Media User Comments Using SVM Models. In Proceedings of the 2024 5th International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT), Nanjing, China, 29–31 March 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1755–1759. [Google Scholar] [CrossRef]
- Gao, Z.; Feng, A.; Song, X.; Wu, X. Target-dependent sentiment classification with BERT. IEEE Access 2019, 7, 154290–154299. [Google Scholar] [CrossRef]
- Almufareh, M.F.; Jhanjhi, N.Z.; Khan, N.A.; Almuayqil, S.N.; Humayun, M.; Javed, D. BertSent: Transformer-based model for sentiment analysis of penta-class tweet classification. IEEE Access 2024, 12, 196803–196817. [Google Scholar] [CrossRef]
- Rahman, B. Optimizing customer satisfaction through sentiment analysis: A BERT-based machine learning approach to extract insights. IEEE Access 2024, 12, 196803–196817. [Google Scholar] [CrossRef]
- Chen, J.; Ma, X.; Li, S.; Ma, S.; Zhang, Z.; Ma, X. A hybrid parallel computing architecture based on CNN and transformer for music genre classification. Electronics 2024, 13, 3313. [Google Scholar] [CrossRef]
- Li, P.; Duan, C.; Jiang, H.; Liu, E. Research on emotion analysis of e-commerce product reviews based on deep learning. In Proceedings of the Second International Conference on Statistics, Applied Mathematics, and Computing Science (CSAMCS 2022), Nanjing, China, 28 March 2023; SPIE: Bellingham, WA, USA, 2023; Volume 12597, pp. 960–965. [Google Scholar] [CrossRef]
- Prottasha, N.J.; Sami, A.A.; Kowsher, M.; Murad, S.A.; Bairagi, A.K.; Masud, M.; Baz, M. Transfer learning for sentiment analysis using BERT based supervised fine-tuning. Sensors 2022, 22, 4157. [Google Scholar] [CrossRef] [PubMed]
- Yang, H.; Zi, Y.; Qin, H.; Zheng, H.; Hu, Y. Advancing emotional analysis with large language models. J. Comput. Sci. Softw. Appl. 2024, 4, 8–15. [Google Scholar] [CrossRef]
- Garg, S.; Torra, V. Exploring Distribution Learning of Synthetic Data Generators for Manifolds. In Proceedings of the European Symposium on Research in Computer Security, Bydgoszcz, Poland, 16–20 September 2024; Springer Nature: Cham, Switzerland, 2024; pp. 65–76. [Google Scholar] [CrossRef]
- Zhang, W.; Deng, Y.; Liu, B.; Pan, S.; Bing, L. Sentiment analysis in the era of large language models: A reality check. In Findings of the Association for Computational Linguistics: NAACL 2024, Mexico City, Mexico, 16–21 June 2024; pp. 3881–3906. [Google Scholar] [CrossRef]
- Kumar, A.; Sharma, R.; Bedi, P. Towards optimal NLP solutions: Analyzing GPT and LLaMA-2 models across model scale, data set size, and task diversity. Eng. Technol. Appl. Sci. Res. 2024, 14, 14219–14224. [Google Scholar] [CrossRef]
- Tennakoon, N.; Senaweera, O.; Dharmarathne, H.A.S.G. Emotion-based movie recommendation system. Int. J. Adv. ICT Emerg. Reg. (ICTer) 2024, 17, 34–39. [Google Scholar] [CrossRef]
- Hossain, M.M.; Hossain, M.S.; Hossain, M.S.; Mridha, M.F.; Safran, M.; Alfarhood, S. TransNet: Deep attentional hybrid transformer for Arabic posts classification. IEEE Access 2024, 2, 111070–111096. [Google Scholar] [CrossRef]
- Jiang, Y.; Xie, S.; Xie, X.; Cui, Y.; Tang, H. Emotion recognition via multiscale feature fusion network and attention mechanism. IEEE Sens. J. 2023, 23, 10790–10800. [Google Scholar] [CrossRef]
- Jia, K. Sentiment classification of microblog: A framework based on BERT and CNN with attention mechanism. Comput. Electr. Eng. 2022, 101, 108032. [Google Scholar] [CrossRef]
- Guo, Q.; Qiu, X.; Liu, P.; Xue, X.; Zhang, Z. Multi-scale self-attention for text classification. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 7847–7854. [Google Scholar] [CrossRef]
- Ameer, I.; Bölücü, N.; Siddiqui, M.H.F.; Can, B.; Sidorov, G.; Gelbukh, A. Multi-label emotion classification in texts using transfer learning. Expert Syst. Appl. 2023, 213, 118534. [Google Scholar] [CrossRef]
- Subbaiah, B.; Murugesan, K.; Saravanan, P.; Marudhamuthu, K. An efficient multimodal sentiment analysis in social media using hybrid optimal multi-scale residual attention network. Artif. Intell. Rev. 2024, 57, 1–27. [Google Scholar] [CrossRef]
- Mirzaee, G.; Doretto, G.; Adjeroh, D. Multi-label Classification using Self-Supervised Learning: Addressing Class Inter-Dependency and Data Imbalance. In Proceedings of the 2024 International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 18–20 December 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1271–1276. [Google Scholar] [CrossRef]
- Foumani, N.M.; Tan, C.W.; Webb, G.I.; Salehi, M. Improving position encoding of transformers for multivariate time series classification. Data Min. Knowl. Discov. 2024, 38, 22–48. [Google Scholar] [CrossRef]
- Kallstenius, T.; Capusan, A.J.; Andersson, G.; Williamson, A. Comparing traditional natural language processing and large language models for mental health status classification: A multi-model evaluation. Sci. Rep. 2025, 15, 24102. [Google Scholar] [CrossRef] [PubMed]
- Liu, Y.; Zhang, Y.; Wang, Y.; Hou, F.; Yuan, J.; Tian, J.; Zhang, Y.; Shi, Z.; Fan, J.; He, Z. A survey of visual transformers. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 7478–7498. [Google Scholar] [CrossRef]
- Zhuang, L.; Wayne, L.; Ya, S.; Jun, Z. A robustly optimized BERT pre-training approach with post-training. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, Huhhot, China, 13–15 August 2021; pp. 1218–1227. Available online: https://aclanthology.org/2021.ccl-1.108/ (accessed on 1 January 2025).
- Malik, S.Z.; Iqbal, K.; Sharif, M.; Shah, Y.A.; Khalil, A.; Irfan, M.A.; Rosak-Szyrocka, J. Attention-aware with stacked embedding for sentiment analysis of student feedback through deep learning techniques. PeerJ Comput. Sci. 2024, 10, e2283. [Google Scholar] [CrossRef]
- Xu, C.; Zhu, G.; Shu, J. A combination of lie group machine learning and deep learning for remote sensing scene classification using multi-layer heterogeneous feature extraction and fusion. Remote Sens. 2022, 14, 1445. [Google Scholar] [CrossRef]
- Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366. [Google Scholar] [CrossRef]
- Wang, A.; Singh, A.; Michael, J.; Hill, F.; Levy, O.; Bowman, S. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Brussels, Belgium, 1 November 2018; pp. 353–355. [Google Scholar] [CrossRef]
- McAuley, J.; Leskovec, J. Hidden factors and hidden topics: Understanding rating dimensions with review text. In Proceedings of the 7th ACM Conference on Recommender Systems, Hong Kong, China, 12–16 October 2013; pp. 165–172. [Google Scholar] [CrossRef]
- Pang, B.; Lee, L. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. arXiv 2005, arXiv:cs/0506075. [Google Scholar] [CrossRef]
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Zheng, X. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467. [Google Scholar] [CrossRef]
- Llugsi, R.; El Yacoubi, S.; Fontaine, A.; Lupera, P. Comparison between Adam, AdaMax and Adam W optimizers to implement a Weather Forecast based on Neural Networks for the Andean city of Quito. In Proceedings of the 2021 IEEE Fifth Ecuador Technical Chapters Meeting (ETCM), Cuenca, Ecuador, 12–15 October 2021; pp. 1–6. [Google Scholar] [CrossRef]
- Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471. [Google Scholar] [CrossRef]
- Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar] [CrossRef]
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078. [Google Scholar] [CrossRef]
- Wang, Y.; Feng, L.; Liu, A.; Wang, W.; Hou, Y. Dual BIGRU-CNN-based sentiment classification method combining global and local attention. J. Supercomput. 2024, 80, 2799–2837. [Google Scholar] [CrossRef]
- Zhang, X.; Wu, Z.; Liu, K.; Zhao, Z.; Wang, J.; Wu, C. Text sentiment classification based on BERT embedding and sliced multi-head self-attention Bi-GRU. Sensors 2023, 23, 1481. [Google Scholar] [CrossRef]
- Wu, P.; Li, X.; Ling, C.; Ding, S.; Shen, S. Sentiment classification using attention mechanism and bidirectional long short-term memory network. Appl. Soft Comput. 2021, 112, 107792. [Google Scholar] [CrossRef]
- Li, W.; Qi, F.; Tang, M.; Yu, Z. Bidirectional LSTM with self-attention mechanism and multi-channel features for sentiment classification. Neurocomputing 2020, 387, 63–77. [Google Scholar] [CrossRef]
- Usama, M.; Ahmad, B.; Song, E.; Hossain, M.S.; Alrashoud, M.; Muhammad, G. Attention-based sentiment analysis using convolutional and recurrent neural network. Future Gener. Comput. Syst. 2020, 113, 571–578. [Google Scholar] [CrossRef]




| Dataset | Positive | Negative | Total | Training, Validation, Testing Ratio | Dataset Description |
|---|---|---|---|---|---|
| SST-2 | 31,120 | 38,922 | 70,042 | 80%, 10%, 10% | A movie review dataset with fine-grained, phrase-level sentiment labels for precise analysis. |
| Amazon Review | 35,150 | 34,850 | 70,000 | 80%, 10%, 10% | A labeled sentiment dataset providing large-scale, diverse user reviews with train-test splits for robust model training. |
| MR | 5331 | 5331 | 10,662 | 80%, 10%, 10% | An IMDb movie review dataset with sentence-level sentiment labels and authentic, full-length reviews for clear binary classification. |
| Dataset | Description |
|---|---|
| SST-2 | Label 0 denotes negative sentiment, while label 1 denotes positive sentiment. |
| Amazon Review | Sentiment labels are remapped from 1 (negative) and 2 (positive) to 0 (negative) and 1 (positive). |
| MR | Label 0 denotes negative sentiment, while label 1 denotes positive sentiment. |
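The Amazon Review remapping described above amounts to a one-line preprocessing transformation (a sketch; variable names are our own):

```python
# Map Amazon Review labels {1: negative, 2: positive} to {0: negative, 1: positive},
# aligning them with the SST-2 and MR label conventions.
raw_labels = [1, 2, 2, 1]            # example raw Amazon Review labels
labels = [lab - 1 for lab in raw_labels]
assert labels == [0, 1, 1, 0]
```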
| Item | Content |
|---|---|
| Processor | Intel Core i7-4700 CPU with 2.70 GHz × 12 |
| Memory | 32 GB |
| Operating system | CentOS 7.8 64 bit |
| Hard disk | 1 TB |
| GPU | Titan-X × 2 |
| Python | 3.7.2 |
| PyTorch | 1.4.0 |
| CUDA | 10.0 |
| Learning rate | 1 × 10⁻³ |
| Momentum | 0.9 |
| Weight decay | 5 × 10⁻⁴ |
| Batch size | 16 |
| Saturation | 1.5 |
| Subdivisions | 64 |
| Model | SST-2 Accuracy | SST-2 F1-Score | Amazon Review Accuracy | Amazon Review F1-Score | Parameters (M) |
|---|---|---|---|---|---|
| BERT [16] | 0.949 | 0.949 | 0.934 | 0.934 | 110.23 |
| BERT-CNN [16] | 0.951 | 0.951 | 0.933 | 0.933 | 113.54 |
| BERT-LSTM [59] | 0.944 | 0.944 | 0.931 | 0.931 | 115.56 |
| BERT-BiLSTM [60] | 0.948 | 0.948 | 0.933 | 0.933 | 117.83 |
| BERT-GRU [61] | 0.950 | 0.950 | 0.927 | 0.927 | 116.27 |
| BERT-CondConv [16] | 0.961 | 0.961 | 0.942 | 0.942 | 111.34 |
| Bi-LSTM [39] | 0.823 | 0.816 | 0.809 | 0.786 | 49.76 |
| LSTM [39] | 0.735 | 0.711 | 0.715 | 0.709 | 30.65 |
| CNN with BERT Embeddings [39] | 0.763 | 0.751 | 0.742 | 0.7292 | 110.26 |
| Our model | 0.979 | 0.965 | 0.957 | 0.945 | 23.73 |
| Model | MR Accuracy | MR F1-Score | Parameters (M) |
|---|---|---|---|
| BCAT [62] | 0.802 | 0.798 | - |
| BERT-MS-BiGRU [63] | 0.828 | 0.829 | 110.43 |
| SC-ABiLSTM [64] | 0.771 | 0.768 | 111.55 |
| SAMF-BiLSTM [65] | 0.833 | 0.832 | 112.64 |
| ATT-Pooling [66] | 0.836 | 0.836 | 111.26 |
| BERT-CondConv [16] | 0.834 | 0.833 | 110.35 |
| Our model | 0.852 | 0.839 | 23.73 |
| Configuration | Accuracy | F1-Score |
|---|---|---|
| W/O PMFE | 0.933 | 0.919 |
| Ours | 0.957 | 0.945 |
| Configuration | Accuracy | F1-Score |
|---|---|---|
| W/O MGLFE | 0.940 | 0.921 |
| Ours | 0.957 | 0.945 |
| Configuration | Accuracy | F1-Score |
|---|---|---|
| W/O MSLA | 0.926 | 0.917 |
| Ours | 0.957 | 0.945 |
| Configuration | Accuracy | F1-Score |
|---|---|---|
| W/O BERT | 0.927 | 0.913 |
| Ours | 0.957 | 0.945 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jiang, Z.; Xu, C. A Multi-Scale Feature Fusion Linear Attention Model for Movie Review Sentiment Analysis. Big Data Cogn. Comput. 2025, 9, 325. https://doi.org/10.3390/bdcc9120325