Revisiting the Role of Label Smoothing in Enhanced Text Sentiment Classification
Abstract
1. Introduction
- Through systematic evaluation with four distinct smoothing parameters (chosen separately for three-class and binary-class datasets), LS methods outperform the three baseline architectures on all eight datasets, comprising six three-class datasets and two binary-class datasets.
- Our in-depth analysis shows that, by training on soft labels, LS accelerates the training of deep models by 15–30 percent, reducing the number of epochs required to reach convergence.
- LS yields hidden representations of training examples that are easier to distinguish than those produced by the baseline method, as demonstrated through t-SNE visualization.
- We provide a comprehensive evaluation using multiple metrics (accuracy, macro-F1, precision, recall) and controlled experiments to isolate the effect of label smoothing from the choice of loss function.
2. Related Works
2.1. Text Sentiment Classification
2.2. Label Smoothing (LS)
2.3. Soft-Target Training in NLP
3. Label Smoothing Method for Text Classification
3.1. Basics of Label Smoothing
3.2. Relationship Between Cross-Entropy and KL Divergence with Label Smoothing
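As a reminder of the textbook identity that this subsection's title refers to, the following sketch uses notation introduced here as an assumption: $q'$ is the smoothed target distribution over $K$ classes and $p_\theta$ is the model's predicted distribution. It is the standard decomposition, not necessarily the exact form derived in the paper.

```latex
H(q', p_\theta)
  = -\sum_{k=1}^{K} q'_k \log p_{\theta,k}
  = \underbrace{-\sum_{k=1}^{K} q'_k \log q'_k}_{H(q')}
    + \underbrace{\sum_{k=1}^{K} q'_k \log \frac{q'_k}{p_{\theta,k}}}_{D_{\mathrm{KL}}(q' \,\|\, p_\theta)}
```

Because $H(q')$ does not depend on $\theta$, minimizing cross-entropy against smoothed labels has the same minimizer as minimizing the KL divergence. For a one-hot target, $H(q') = 0$ and the two losses coincide.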
3.3. Training with Label Smoothing
3.4. Selected Deep Learning Architectures
4. Experimental Results and Analysis
4.1. Datasets
4.2. Model Configuration
- Hardware: All experiments were performed on a server with an NVIDIA Tesla V100 GPU (32 GB), an Intel Xeon Gold 6248R CPU (48 cores), and 256 GB of RAM.
- Software Environment: Python 3.8, PyTorch 1.10, Transformers 4.15, CUDA 11.3.
- Data Preprocessing: For all datasets, we applied standard text preprocessing including lowercasing, removal of URLs and special characters, and tokenization. For BERT and RoBERTa, we used their respective tokenizers with a maximum sequence length of 128 tokens.
- Train-Validation-Test Splits: For datasets without predefined splits, we used an 80/10/10 percent training/validation/test split. For datasets with predefined splits (e.g., Sent140, RTR), we followed the original partitioning.
- Optimization Settings:
  - TextCNN: Adam optimizer with learning rate 0.001, batch size 64, 50 epochs maximum with early stopping (patience = 5).
  - BERT/RoBERTa: AdamW optimizer with learning rate 2 , batch size 32, 10 epochs maximum with early stopping (patience = 3), linear warmup for first 10 percent of steps.
- Pre-trained Checkpoints: We used bert-base-uncased for BERT and roberta-base for RoBERTa from Hugging Face Transformers.
- Stopping Criteria: Training was stopped when the validation loss had not improved for the specified number of patience epochs.
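The stopping criterion and warmup schedule listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' training code: the class `EarlyStopping`, the function `warmup_factor`, and the validation-loss trace are hypothetical, and holding the learning rate constant after warmup is an assumption, since the configuration only specifies the warmup phase.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopping:
    """Signal a stop once validation loss fails to improve for `patience` epochs."""

    patience: int
    best_loss: float = float("inf")
    bad_epochs: int = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def warmup_factor(step: int, total_steps: int, warmup_frac: float = 0.10) -> float:
    """Learning-rate multiplier that ramps linearly from 0 to 1 during warmup.

    Assumption: the multiplier stays at 1 after warmup (no decay).
    """
    warmup_steps = max(1, int(total_steps * warmup_frac))
    return min(1.0, step / warmup_steps)


# Hypothetical validation-loss trace, for illustration only.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]

stopper = EarlyStopping(patience=3)  # patience used for BERT/RoBERTa
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print("stopped at epoch:", stopped_at)
print("LR multiplier halfway through warmup:", warmup_factor(50, 1000))
```

With patience 3, the loop halts after three consecutive epochs without a new best validation loss, so the later improvement in the trace is never seen; this is the usual trade-off of patience-based stopping.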
4.3. Metrics
4.4. Results
4.5. Training Time Analysis
4.6. Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, Q.; Peng, H.; Li, J.; Xia, C.; Yang, Z.; Sun, L.; Yu, P.S.; He, L. A survey on text classification: From traditional to deep learning. ACM Trans. Intell. Syst. Technol. (TIST) 2022, 13, 31.
- Liu, Z.; Si, S.; Gu, J. Calibrating Sentiment Analysis: A Unimodal-Weighted Label Distribution Learning Approach. IEEE Access 2025, 13, 148816–148826.
- Alharbi, M.I.; Chafik, S.; Ezzini, S.; Mitkov, R.; Ranasinghe, T.; Hettiarachchi, H. AHaSIS: Shared task on sentiment analysis for Arabic dialects. In Proceedings of the Shared Task on Sentiment Analysis for Arabic Dialects; INCOMA Ltd.: Shoumen, Bulgaria, 2025; pp. 1–6.
- Zhang, X.; Zhao, J.; LeCun, Y. Character-level convolutional networks for text classification. In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2015; Volume 28.
- Zhang, Y.; Meng, J.E.; Venkatesan, R.; Wang, N.; Pratama, M. Sentiment classification using comprehensive attention recurrent models. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN); IEEE: Piscataway, NJ, USA, 2016; pp. 1562–1569.
- Chen, P.; Sun, Z.; Bing, L.; Yang, W. Recurrent attention network on memory for aspect sentiment analysis. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2017; pp. 452–461.
- Chang, W.C.; Yu, H.F.; Zhong, K.; Yang, Y.; Dhillon, I.S. Taming pretrained transformers for extreme multi-label text classification. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Association for Computing Machinery (ACM): New York, NY, USA, 2020; pp. 3163–3171.
- Jiang, T.; Wang, D.; Sun, L.; Yang, H.; Zhao, Z.; Zhuang, F. Lightxml: Transformer with dynamic negative sampling for high-performance extreme multi-label text classification. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2021; Volume 35, pp. 7987–7994.
- Jing, L.; Li, X.; Yu, M. Attention mechanism-based self-supervised multitask approach for multimodal sentiment analysis. In Proceedings of the International Conference on Electronic Information Engineering and Artificial Intelligence (EIEAI 2025); SPIE: Bellingham, WA, USA, 2026; Volume 14062, pp. 273–279.
- Liu, M.; Liu, L.; Cao, J.; Du, Q. Co-attention network with label embedding for text classification. Neurocomputing 2022, 471, 61–69.
- Liu, Y.; Li, P.; Hu, X. Combining context-relevant features with multi-stage attention network for short text classification. Comput. Speech Lang. 2022, 71, 101268.
- Zheng, W.; Han, S.; Jia, X.; Wu, E.Z.; Ding, W. GT-AGCN: Integrating Global Semantics and Local Syntax for Aspect-Based Sentiment Analysis. IEEE Trans. Comput. Soc. Syst. 2025, 13, 1293–1309.
- Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep learning–based text classification: A comprehensive review. ACM Comput. Surv. (CSUR) 2021, 54, 62.
- Zulqarnain, M.; Ghazali, R.; Hassim, Y.M.M.; Rehan, M. A comparative review on deep learning models for text classification. Indones. J. Electr. Eng. Comput. Sci. 2020, 19, 325–335.
- Müller, R.; Kornblith, S.; Hinton, G.E. When does label smoothing help? In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Red Hook, NY, USA, 2019; Volume 32.
- Lienen, J.; Hüllermeier, E. From label smoothing to label relaxation. In Proceedings of the AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2021; Volume 35, pp. 8583–8591.
- Lukasik, M.; Bhojanapalli, S.; Menon, A.; Kumar, S. Does label smoothing mitigate label noise? In Proceedings of the International Conference on Machine Learning; PMLR: Brookline, MA, USA, 2020; pp. 6448–6458.
- Gao, Y.; Wang, X.; Herold, C.; Yang, Z.; Ney, H. Towards a better understanding of label smoothing in neural machine translation. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing; Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2020; pp. 212–223.
- Chen, B.; Ziyin, L.; Wang, Z.; Liang, P.P. An investigation of how label smoothing affects generalization. arXiv 2020, arXiv:2010.12648.
- Cui, X.; Saon, G.; Nagano, T.; Suzuki, M.; Fukuda, T.; Kingsbury, B.; Kurata, G. Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. In Proceedings of the Annual Conference of the International Speech Communication Association; International Speech Communication Association (ISCA): Grenoble, France, 2022.
- Li, W.; Dasarathy, G.; Berisha, V. Regularization via Structural Label Smoothing. In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics; Proceedings of Machine Learning Research: Brookline, MA, USA, 2020; Volume 108, pp. 1453–1463.
- Chandrasegaran, K.; Tran, N.M.; Zhao, Y.; Cheung, N.M. Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing? In Proceedings of the International Conference on Machine Learning; PMLR: Brookline, MA, USA, 2022; pp. 2890–2916.
- Liu, P.; Xi, X.; Ye, W.; Zhang, S. Label Smoothing for Text Mining. In Proceedings of the 29th International Conference on Computational Linguistics; International Committee on Computational Linguistics: Gyeongju, Republic of Korea, 2022; pp. 2210–2219.
- Nordansjö, W.; Fourong, F.; Qasim, M. Financial sentiment analysis with FUNNEL: Filtered UNion for NER-based ensemble labeling. Digit. Financ. 2025, 7, 725–744.
- Haque, S.; Bansal, A.; McMillan, C. Label Smoothing Improves Neural Source Code Summarization. In Proceedings of the 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC); IEEE: Piscataway, NJ, USA, 2023; pp. 101–112.
- Pan, Y.; Chen, J.; Zhang, Y.; Zhang, Y. An efficient CNN-LSTM network with spectral normalization and label smoothing technologies for SSVEP frequency recognition. J. Neural Eng. 2022, 19, 056014.
- Pang, B.; Lee, L.; Vaithyanathan, S. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002); Association for Computational Linguistics (ACL): Stroudsburg, PA, USA, 2002; pp. 79–86.
- Onan, A. Bidirectional convolutional recurrent neural network architecture with group-wise enhancement mechanism for text sentiment classification. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 2098–2117.
- Hasib, K.M.; Towhid, N.A.; Alam, M.G.R. Online review based sentiment classification on Bangladesh airline service using supervised learning. In Proceedings of the 2021 5th International Conference on Electrical Engineering and Information Communication Technology (ICEEICT); IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
- Kowsari, K.; Jafari Meimandi, K.; Heidarysafa, M.; Mendu, S.; Barnes, L.; Brown, D. Text classification algorithms: A survey. Information 2019, 10, 150.
- Si, S.; Wang, R.; Wosik, J.; Zhang, H.; Dov, D.; Wang, G.; Carin, L. Students need more attention: Bert-based attention model for small data with application to automatic patient message triage. In Proceedings of the Machine Learning for Healthcare Conference; PMLR: Brookline, MA, USA, 2020; pp. 436–456.
- Kim, S.B.; Han, K.S.; Rim, H.C.; Myaeng, S.H. Some effective techniques for naive Bayes text classification. IEEE Trans. Knowl. Data Eng. 2006, 18, 1457–1466.
- Raschka, S. Naive Bayes and text classification I-introduction and theory. arXiv 2014, arXiv:1410.5329.
- Tong, S.; Koller, D. Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2001, 2, 45–66.
- Lilleberg, J.; Zhu, Y.; Zhang, Y. Support vector machines and word2vec for text classification with semantic features. In Proceedings of the 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC); IEEE: Piscataway, NJ, USA, 2015; pp. 136–140.
- Liu, P.; Qiu, X.; Huang, X. Recurrent Neural Network for Text Classification with Multi-Task Learning. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016; Kambhampati, S., Ed.; IJCAI/AAAI Press: Washington, DC, USA, 2016; pp. 2873–2879.
- Gao, B.B.; Xing, C.; Xie, C.W.; Wu, J.; Geng, X. Deep Label Distribution Learning With Label Ambiguity. IEEE Trans. Image Process. 2017, 26, 2825–2838.
- Si, S.; Wang, J.; Peng, J.; Xiao, J. Towards speaker age estimation with label distribution learning. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE: New York, NY, USA, 2022; pp. 4618–4622.
- Luo, Y.; Huang, Z.; Wong, L.P.; Zhan, C.; Wang, F.L.; Hao, T. An Early Prediction and Label Smoothing Alignment Strategy for User Intent Classification of Medical Queries. In Proceedings of the International Conference on Neural Computing for Advanced Applications; Springer: Cham, Switzerland, 2022; pp. 115–128.
- Luo, Z.; Xi, Y.; Mao, X.L. Smoothing with Fake Label. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management; Association for Computing Machinery (ACM): New York, NY, USA, 2021; pp. 3303–3307.
- Zhu, E.; Li, J. Boundary Smoothing for Named Entity Recognition. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 7096–7108.
- Yu, Y.; Wang, Y.; Mu, J.; Li, W.; Jiao, S.; Wang, Z.; Lv, P.; Zhu, Y. Chinese mineral named entity recognition based on BERT model. Expert Syst. Appl. 2022, 206, 117727.
- Wang, B.; Li, Y.; Li, S.; Sun, D. Sentiment Analysis Model Based on Adaptive Multi-modal Feature Fusion. In Proceedings of the 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP); IEEE: Piscataway, NJ, USA, 2022; pp. 761–766.
- Yang, Y.; Dan, S.; Roth, D.; Lee, I. In and Out-of-Domain Text Adversarial Robustness via Label Smoothing. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), ACL 2023, Toronto, Canada, 9–14 July 2023; Rogers, A., Boyd-Graber, J.L., Okazaki, N., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2023; pp. 657–669.
- Yan, Q.; Sun, Y.; Fan, S.; Zhao, L. Polarity-aware attention network for image sentiment analysis. Multimed. Syst. 2023, 29, 389–399.
- Wu, X.; Gao, C.; Lin, M.; Zang, L.; Wang, Z.; Hu, S. Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), ACL 2022, Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 871–875.
- Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. In Proceedings of the NIPS Deep Learning and Representation Learning Workshop; Curran Associates, Inc.: Red Hook, NY, USA, 2015.
- Furlanello, T.; Lipton, Z.C.; Tschannen, M.; Itti, L.; Anandkumar, A. Born again neural networks. In Proceedings of the International Conference on Machine Learning; PMLR: Brookline, MA, USA, 2018; pp. 1607–1616.
- Geng, X. Label Distribution Learning. IEEE Trans. Knowl. Data Eng. 2016, 28, 1734–1748.
- Pozzi, A.; Incremona, A.; Tessera, D.; Toti, D. Mitigating exposure bias in large language model distillation: An imitation learning approach. Neural Comput. Appl. 2025, 37, 12013–12029.
- Huang, J.; Tao, J.; Liu, B.; Lian, Z. Learning Utterance-Level Representations with Label Smoothing for Speech Emotion Recognition. In Proceedings of the Interspeech, Shanghai, China, 25–29 October 2020; pp. 4079–4083.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015; Curran Associates, Inc.: Red Hook, NY, USA, 2015.
- Kim, Y. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, Doha, Qatar, 25–29 October 2014; pp. 1746–1751.
- Wang, S.; Yilahun, H.; Hamdulla, A. Medical Intention Recognition Based on MCBERT-TextCNN Model. In Proceedings of the 2022 International Conference on Virtual Reality, Human-Computer Interaction and Artificial Intelligence (VRHCIAI); IEEE: Piscataway, NJ, USA, 2022; pp. 195–200.
- Jiang, L. Fault classification method of alarm information based on TextCNN. In Proceedings of the EEI 2022; 4th International Conference on Electronic Engineering and Informatics, Guiyang, China, 24–26 June 2022; pp. 1–5.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019, Volume 1 (Long and Short Papers); Burstein, J., Doran, C., Solorio, T., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 4171–4186.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
- Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P. Good debt or bad debt: Detecting semantic orientations in economic texts. J. Assoc. Inf. Sci. Technol. 2014, 65, 782–796.
- Go, A.; Bhayani, R.; Huang, L. Twitter sentiment classification using distant supervision. In Proceedings of the CS224N Project Report; Stanford University: Stanford, CA, USA, 2009; Volume 1, p. 2009.



| Dataset | Source | Classes | Samples | Domain |
|---|---|---|---|---|
| TFNS | Twitter API | 3 | 11,932 | Finance |
| KFS | Kaggle | 3 | – | Finance |
| TSE | Kaggle | 3 | 27,481 | Social Media |
| AS | Financial News | 3 | – | Finance |
| FP | [58] | 3 | 4840 | Finance |
| CSA | – | 3 | 10,000 | Technology |
| RTR | Rotten Tomatoes | 2 | 10,662 | Movie Reviews |
| Sent140 | [59] | 2 | 1,600,000 | Social Media |
| Model | Smoothing Para. | Smoothed Label (3-Class) | Smoothed Label (2-Class) |
|---|---|---|---|
| Baseline | – | y | y |
| CE-Soft | | | |
| LS1 | | | |
| LS2 | | | |
| LS3 | | | |
| LS4 | | | |
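To make the idea of a smoothed label concrete, here is a minimal sketch of the standard uniform label smoothing rule, $(1-\alpha)\,y + \alpha/K$, applied to a three-class and a two-class one-hot label. The smoothing value `alpha = 0.1`, the logits, and all function names are hypothetical; the smoothing parameters and exact smoothed labels used for LS1 to LS4 are not recoverable from the table, so this should not be read as the paper's configuration.

```python
import math


def smooth_labels(one_hot, alpha):
    """Uniform label smoothing: (1 - alpha) * y + alpha / K for K classes."""
    k = len(one_hot)
    return [(1.0 - alpha) * y + alpha / k for y in one_hot]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a (possibly soft) target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


alpha = 0.1                      # hypothetical smoothing parameter
y3 = [0.0, 1.0, 0.0]             # one-hot label, three-class task
y2 = [1.0, 0.0]                  # one-hot label, binary task

soft3 = smooth_labels(y3, alpha)
soft2 = smooth_labels(y2, alpha)

probs = softmax([0.2, 2.0, -0.5])          # hypothetical model logits
loss_hard = cross_entropy(y3, probs)       # baseline: one-hot target
loss_soft = cross_entropy(soft3, probs)    # label-smoothed target

print("smoothed 3-class label:", [round(v, 4) for v in soft3])
print("smoothed 2-class label:", [round(v, 4) for v in soft2])
print("hard vs. smoothed loss:", round(loss_hard, 4), round(loss_soft, 4))
```

When the model already favors the correct class, the smoothed loss is larger than the one-hot loss, because some target mass sits on classes the model rates as unlikely; this is the penalty on over-confidence that label smoothing introduces. In PyTorch 1.10 and later, the same uniform rule is available through the `label_smoothing` argument of `torch.nn.CrossEntropyLoss`.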
| Arch. | Model | TFNS | KFS | TSE | AS | FP | CSA | RTR | Sent140 |
|---|---|---|---|---|---|---|---|---|---|
| BERT | LS1 | 87.69/85.2 | 79.81/77.4 | 79.17/76.8 | 84.11/81.3 | 89.30/86.9 | 72.02/69.5 | 78.20/77.8 | 76.36/74.2 |
| BERT | LS2 | 87.44/84.9 | 79.64/77.1 | 79.12/76.5 | 83.18/80.4 | 89.55/87.2 | 71.81/69.2 | 77.80/77.4 | 76.47/74.5 |
| BERT | LS3 | 87.10/84.6 | 79.13/76.6 | 78.81/76.1 | 83.90/81.1 | 89.68/87.4 | 72.54/70.1 | 77.80/77.4 | 77.11/75.1 |
| BERT | LS4 | 87.40/84.9 | 79.56/77.0 | 78.89/76.3 | 84.21/81.5 | 89.55/87.2 | 71.81/69.2 | 78.60/78.2 | 75.94/73.8 |
| BERT | CE-Soft | 87.21/84.7 | 79.32/76.8 | 78.95/76.2 | 83.56/80.8 | 89.12/86.7 | 71.65/69.0 | 77.90/77.5 | 76.12/74.0 |
| BERT | Baseline | 86.89/84.3 | 78.96/76.2 | 78.81/75.9 | 82.56/79.8 | 88.03/85.6 | 71.49/68.8 | 77.80/77.4 | 76.26/74.1 |
| TextCNN | LS1 | 82.37/79.8 | 68.52/65.4 | 70.74/67.2 | 77.61/74.3 | 83.82/80.9 | 72.34/69.8 | 78.60/78.1 | 74.55/72.1 |
| TextCNN | LS2 | 82.33/79.7 | 68.01/64.9 | 70.29/66.8 | 77.40/74.1 | 83.57/80.6 | 72.45/70.0 | 79.20/78.7 | 75.40/73.2 |
| TextCNN | LS3 | 82.16/79.5 | 67.92/64.7 | 70.66/67.1 | 77.71/74.4 | 83.69/80.8 | 71.83/69.3 | 77.80/77.4 | 74.55/72.1 |
| TextCNN | LS4 | 82.04/79.3 | 68.43/65.2 | 70.29/66.8 | 77.30/74.0 | 83.44/80.4 | 73.38/70.9 | 78.60/78.1 | 74.44/72.0 |
| TextCNN | CE-Soft | 81.89/79.0 | 67.85/64.5 | 70.12/66.5 | 76.95/73.6 | 83.21/80.1 | 71.92/69.4 | 78.20/77.7 | 74.68/72.4 |
| TextCNN | Baseline | 81.41/78.5 | 68.51/65.3 | 70.49/67.0 | 76.47/73.2 | 83.19/80.0 | 70.90/68.2 | 79.18/78.6 | 76.26/74.1 |
| RoBERTa | LS1 | 89.15/87.2 | 83.75/81.6 | 79.46/77.1 | 86.89/84.5 | 90.31/88.3 | 81.01/78.9 | 86.80/86.4 | 85.88/84.2 |
| RoBERTa | LS2 | 89.61/87.7 | 82.63/80.4 | 80.36/78.2 | 86.17/83.7 | 90.70/88.7 | 80.29/78.1 | 86.60/86.2 | 85.24/83.5 |
| RoBERTa | LS3 | 89.57/87.6 | 82.46/80.2 | 79.99/77.8 | 86.58/84.2 | 91.08/89.1 | 81.63/79.6 | 87.20/86.8 | 86.10/84.5 |
| RoBERTa | LS4 | 89.78/87.9 | 82.98/80.7 | 80.24/78.5 | 86.38/84.0 | 90.45/88.5 | 81.22/79.1 | 87.80/87.4 | 87.06/85.5 |
| RoBERTa | CE-Soft | 89.12/87.1 | 83.21/81.0 | 79.78/77.5 | 86.45/84.0 | 90.02/87.9 | 80.85/78.7 | 86.90/86.5 | 86.12/84.5 |
| RoBERTa | Baseline | 87.69/85.6 | 83.15/80.9 | 79.91/77.6 | 85.66/83.1 | 89.94/87.8 | 80.39/78.0 | 87.20/86.8 | 87.59/86.1 |
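For readers who want to reproduce scores of this kind, the sketch below computes accuracy and macro-F1, two of the metrics the paper lists. Reading each table cell as an accuracy/macro-F1 pair is an assumption, since the column legend did not survive extraction. The function names and the toy labels are hypothetical and unrelated to the paper's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy three-class example (hypothetical labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("macro-F1:", round(macro_f1(y_true, y_pred), 4))
```

Macro-F1 weights every class equally, so it tends to fall below accuracy when a minority class is predicted poorly, which is why reporting both is informative on imbalanced sentiment data.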
| Model | Time/Epoch (s) | Epochs to Convergence | Total Time (s) |
|---|---|---|---|
| Baseline | 12.3 | 8 | 98.4 |
| CE-Soft | 12.5 | 7 | 87.5 |
| LS1 | 12.4 | 6 | 74.4 |
| LS2 | 12.4 | 6 | 74.4 |
| LS3 | 12.5 | 5 | 62.5 |
| LS4 | 12.5 | 5 | 62.5 |
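The total-time column in the table above is the product of time per epoch and epochs to convergence. The short sketch below recomputes it and derives the relative saving of each variant against the baseline; the numbers are copied from the table, while the names `runs`, `totals`, and `savings` are introduced here for illustration.

```python
# (time per epoch in seconds, epochs to convergence), copied from the table above.
runs = {
    "Baseline": (12.3, 8),
    "CE-Soft": (12.5, 7),
    "LS1": (12.4, 6),
    "LS2": (12.4, 6),
    "LS3": (12.5, 5),
    "LS4": (12.5, 5),
}

# Total training time = time per epoch * number of epochs.
totals = {name: tpe * epochs for name, (tpe, epochs) in runs.items()}

# Relative saving in total time with respect to the baseline, in percent.
baseline_total = totals["Baseline"]
savings = {
    name: 100.0 * (baseline_total - total) / baseline_total
    for name, total in totals.items()
}

for name in runs:
    print(f"{name:8s} total={totals[name]:6.1f} s  saving={savings[name]:5.1f} %")
```

Since the time per epoch is nearly identical across variants, almost all of the saving comes from the reduced number of epochs rather than from cheaper individual epochs.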
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Si, S.; Gao, Y.; Sun, H.; Zhang, Y.; Luo, H. Revisiting the Role of Label Smoothing in Enhanced Text Sentiment Classification. Electronics 2026, 15, 1984. https://doi.org/10.3390/electronics15101984

