Neural Network-Based Sentiment Analysis and Anomaly Detection in Crisis-Related Tweets
Abstract
1. Introduction
2. Related Work
2.1. Sentiment Analysis in Crisis-Related Tweets
2.2. Anomaly Detection in Crisis-Related Tweets
3. Materials and Methodology
3.1. Data Preprocessing
- English-language filtering: using pandas, a Python data analysis library, only rows whose language column had the value “en” were retained. This step was necessary to increase the reliability of the pre-trained BERT model for sentiment analysis [36]. After filtering, 189,626 of the 472,399 tweets remained as English text.
- Text lowercasing: all tweets were converted to lowercase; according to Hickman et al. [37], lowercasing tends to be beneficial because it reduces data dimensionality, thereby increasing statistical power, and usually does not reduce validity.
- Stop word removal: common English function words such as “and”, “is”, “I”, “am”, “what”, and “of” were removed using the Natural Language Toolkit (NLTK). Stop word removal reduces the size of the stored dataset and improves the overall efficiency and effectiveness of the analysis [38].
- URL removal: all URLs were removed from tweets, since the text of URL strings does not necessarily convey any relevant information [39].
- Duplicate removal: all duplicate tweets were removed to eliminate redundancy and possible skewing of the results.
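The preprocessing steps listed above can be sketched as a small pipeline. This is a minimal illustration rather than the authors' code: the paper used pandas and NLTK, whereas the sketch below relies only on the Python standard library so that it runs as-is. The field names `language` and `content`, the short `STOP_WORDS` set (a stand-in for NLTK's English list), the URL pattern, and the order in which the steps are applied are all assumptions.

```python
import re

# Illustrative subset only (assumption); the paper used NLTK's English stop word list.
STOP_WORDS = {"and", "is", "i", "am", "what", "of", "the", "a", "in", "to"}

# Simple URL matcher (assumption): anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def clean_text(text):
    """Lowercase a tweet, strip URLs, and drop stop words."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


def preprocess(rows):
    """Keep English rows, clean them, and drop duplicates (first occurrence wins).

    `rows` is a list of dicts with the hypothetical keys 'language' and 'content'.
    """
    seen = set()
    cleaned = []
    for row in rows:
        if row.get("language") != "en":
            continue  # English-language filtering
        text = clean_text(row["content"])
        if text and text not in seen:  # duplicate removal on the cleaned text
            seen.add(text)
            cleaned.append(text)
    return cleaned


sample = [
    {"language": "en", "content": "What is the status of rescue teams? https://example.com/x"},
    {"language": "tr", "content": "Deprem çok şiddetliydi"},
    {"language": "en", "content": "what is the status of rescue teams?"},
    {"language": "en", "content": "I am safe and in Ankara"},
]
result = preprocess(sample)
print(result)
```

In this sketch, deduplication is applied after cleaning, so two tweets that differ only in case or in an attached URL collapse into one entry. Whether the original study deduplicated before or after cleaning is not stated in the text above.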
3.2. Methods
4. Results
4.1. Sentiment Analysis
4.2. Anomaly Detection
4.3. Comparative and Temporal Analysis of Anomalies
4.4. Comparing Neural Network with Traditional ML Approaches
5. Discussion
- Bias: the language model used (BERT) may have inherited biases that could affect the fairness of sentiment analysis. Certain dialects or demographics might be misinterpreted, potentially leading to skewed results.
- Privacy: all data analyzed were public tweets. However, the authors respected user privacy by working with anonymized datasets, and by adhering to platform policies. In any operational setting, safeguards must ensure that individual personal data is not misused, and that analyses remain aggregated and focused on public information needs.
- False alarms: the authors caution that this anomaly system could trigger false positives. For instance, a surge in negative sentiment might be due to misinformation or social panic that does not correspond to a real situation on the ground. Any alert should be verified and complemented by human analysis.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Tamer, Z.; Demir, G.; Darıcı, S.; Pamučar, D. Understanding twitter in crisis: A roadmap for public sector decision makers with multi-criteria decision making. Environ. Dev. Sustain. 2025, 1–37.
- Noor, N.; Okhai, R.; Jamal, T.B.; Kapucu, N.; Ge, Y.G.; Hasan, S. Social-media-based crisis communication: Assessing the engagement of local agencies in Twitter during Hurricane Irma. Int. J. Inf. Manag. Data Insights 2024, 4, 100236.
- Karimiziarani, M.; Moradkhani, H. Social response and Disaster management: Insights from twitter data Assimilation on Hurricane Ian. Int. J. Disaster Risk Reduct. 2023, 95, 103865.
- Kanungo, S.; Jain, S. Hybrid Deep Neural Network G-LSTM for Sentiment Analysis on Twitter: A Novel Approach to Disaster Management. Ingénierie Des Systèmes D’information 2023, 28, 1565–1575.
- Kumar, S.; Khan, M.B.; Hasanat, M.H.; Saudagar, A.K.; Al Tameem, A.; Al Khathami, M. An anomaly detection framework for twitter data. Appl. Sci. 2022, 12, 11059.
- Liu, D.; Zhao, Y.; Xu, H.; Sun, Y.; Pei, D.; Luo, J.; Jing, X.; Feng, M. Opprentice: Towards practical and automatic anomaly detection through machine learning. In Proceedings of the 2015 Internet Measurement Conference, Tokyo, Japan, 28–30 October 2015; pp. 211–224.
- Elmrabit, N.; Zhou, F.; Li, F.; Zhou, H. Evaluation of Machine Learning Algorithms for Anomaly Detection. In Proceedings of the 2020 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), Dublin, Ireland, 15–19 June 2020; pp. 1–8.
- Rahman, M.S.; Halder, S.; Uddin, M.A.; Acharjee, U.K. An efficient hybrid system for anomaly detection in social networks. Cybersecurity 2021, 4, 10.
- Steuber, F.; Schneider, S.; Schneider, J.A.; Rodosek, G.D. Real-Time Anomaly Detection and Popularity Prediction for Emerging Events on Twitter. In Proceedings of the International Conference on Advances in Social Networks Analysis and Mining, Kusadasi, Turkiye, 6–9 November 2023; pp. 300–304.
- Sufi, F.K.; Alsulami, M. Automated Multidimensional Analysis of Global Events With Entity Detection, Sentiment Analysis and Anomaly Detection. IEEE Access 2021, 9, 152449–152460.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the NAACL: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Koroteev, M.V. BERT: A review of applications in natural language processing and understanding. arXiv 2021.
- Lee, M.C.; Lin, J.C.; Gran, E.G. SALAD: Self-Adaptive Lightweight Anomaly Detection for Real-time Recurrent Time Series. In Proceedings of the 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid, Spain, 12–16 July 2021; pp. 344–349.
- Atkinson, O.; Bhardwaj, A.; Englert, C.; Ngairangbam, V.S.; Spannowsky, M. Anomaly detection with convolutional Graph Neural Networks. J. High Energy Phys. 2021, 2021, 80.
- Das, R.; Singh, T.D. Multimodal sentiment analysis: A survey of methods, trends, and challenges. ACM Comput. Surv. 2023, 55, 270.
- Katalinić, J.; Dunđer, I.; Seljan, S. Unraveling the Nuclear Debate: Insights Through Clustering of Tweets. Electronics 2024, 13, 4159.
- Katalinić, J.; Dunđer, I.; Seljan, S. Polarizing Topics on Twitter in the 2022 United States Elections. Information 2023, 14, 609.
- Wankhade, M.; Rao, A.C.S.; Kulkarni, C. A survey on sentiment analysis methods, applications, and challenges. Artif. Intell. Rev. 2022, 55, 5731–5780.
- Shetty, N.P.; Bijalwan, Y.; Chaudhari, P.; Shetty, J.; Muniyal, B. Disaster assessment from social media using multimodal deep learning. Multimed. Tools Appl. 2024, 83, 17–23.
- Fernandez, G.; Suresh-Babu, S.; Vito, D. Mapping Infodemic Responses: A Geospatial Analysis of COVID-19 Discourse on Twitter in Italy. Int. J. Environ. Res. Public Health 2025, 22, 668.
- Suhasini, M.; Srinivasu, B. Emotion Detection Framework for Twitter Data Using Supervised Classifiers. Adv. Intell. Syst. Comput. 2020, 1079, 565–576.
- Jayakody, J.P.U.S.D.; Kumara, B.T.G.S. Sentiment analysis on product reviews on twitter using Machine Learning Approaches. In Proceedings of the 2021 International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 7–8 December 2021; pp. 1056–1061.
- Tan, K.L.; Lee, C.P.; Anbananthen, K.S.M.; Lim, K.M. RoBERTa-LSTM: A Hybrid Model for Sentiment Analysis With Transformer and Recurrent Neural Network. IEEE Access 2022, 10, 21517–21525.
- Stojanovski, D.; Strezoski, G.; Madjarov, G.; Dimitrovski, I.; Chorbev, I. Deep neural network architecture for sentiment analysis and emotion identification of Twitter messages. Multimed. Tools Appl. 2018, 77, 32213–32242.
- Myint, P.Y.; Lo, S.L.; Zhang, Y. Unveiling the dynamics of crisis events: Sentiment and emotion analysis via multi-task learning with attention mechanism and subject-based intent prediction. Inf. Process. Manag. 2024, 61, 103695.
- Nguyen, D.Q.; Vu, T.; Nguyen, A.T. BERTweet: A pre-trained language model for English Tweets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; pp. 9–14.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
- Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
- Peng, H.; Zhang, R.; Li, S.; Cao, Y.; Pan, S.; Yu, P.S. Reinforced, Incremental and Cross-Lingual Event Detection From Social Messages. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 980–998.
- Patel, K.; Hoeber, O.; Hamilton, H. Real-Time Sentiment-Based Anomaly Detection in Twitter Data Streams. In Proceedings of the 28th Canadian Conference on Artificial Intelligence (Canadian AI 2015), Halifax, NS, Canada, 2–5 June 2015; pp. 196–203.
- Roy, A.; Shu, J.; Li, J.; Yang, C.; Elshocht, O.; Smeets, J.; Li, P. GAD-NR: Graph Anomaly Detection via Neighborhood Reconstruction. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining (WSDM ’24), Merida, Mexico, 4–8 March 2024; pp. 576–585.
- Wong, L.; Liu, D.; Berti-Equille, L.; Alnegheimish, S.; Veeramachaneni, K. AER: Auto-encoder with regression for time series anomaly detection. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 17–20 December 2022; pp. 1152–1161.
- Do, J.S.; Kareem, A.B.; Hur, J.W. LSTM-Autoencoder for Vibration Anomaly Detection in Vertical Carousel Storage and Retrieval System (VCSRS). Sensors 2023, 23, 1009.
- Kaggle. Available online: https://www.kaggle.com/datasets/swaptr/turkey-earthquake-tweets (accessed on 8 March 2025).
- Sahoo, A.; Chanda, R.; Das, N.; Sadhukhan, B. Comparative Analysis of BERT Models for Sentiment Analysis on Twitter Data. In Proceedings of the 2023 9th International Conference on Smart Computing and Communications (ICSCC), Kochi, India, 17–19 August 2023; pp. 658–663.
- Hickman, L.; Thapa, S.; Tay, L.; Cao, M.; Srinivasan, P. Text Preprocessing for Text Mining in Organizational Research: Review and Recommendations. Organ. Res. Methods 2022, 25, 114–146.
- Al-Khafaji, H.K.; Habeeb, A.T. Efficient Algorithms for Preprocessing and Stemming of Tweets in a Sentiment Analysis System. IOSR J. Comput. Eng. (IOSR-JCE) 2017, 19, 44–50.
- Roy, D.; Mitra, M.; Ganguly, D. To Clean or Not to Clean: Document Preprocessing and Reproducibility. J. Data Inf. Qual. (JDIQ) 2018, 10, 18.
- Lakhanpal, S.; Gupta, A.; Agrawal, R. Leveraging Explainable AI to Analyze Researchers’ Aspect-Based Sentiment About ChatGPT. In Proceedings of the 15th International Conference on Intelligent Human Computer Interaction (IHCI 2023), Daegu, Republic of Korea, 8–10 November 2023; pp. 281–290.
- Hussain, Z.; Binz, M.; Mata, R.; Wulff, D.U. A tutorial on open-source large language models for behavioral science. Behav. Res. 2024, 56, 8214–8237.
- Siegel, J.W. Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces. J. Mach. Learn. Res. 2023, 24, 1–52.
- Foorthuis, R. On the nature and types of anomalies: A review of deviations in data. Int. J. Data Sci. Anal. 2021, 12, 297–331.
- Cai, Y.; Chen, H.; Cheng, K.-T. Rethinking autoencoders for medical anomaly detection from a theoretical perspective. In Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024), Marrakesh, Morocco, 6–10 October 2024; pp. 544–554.
- Wang, Y.; Du, X.; Lu, Z.; Duan, Q.; Wu, J. Improved LSTM-Based Time-Series Anomaly Detection in Rail Transit Operation Environments. IEEE Trans. Ind. Inform. 2022, 18, 9027–9036.
- Premasudha, B.G.; Rampalli, V. A Comparative Study of Logistic Regression, Support Vector Machines, and LSTM Networks for Sentiment Classification in Academic Reviews. In Proceedings of the 2024 First International Conference on Innovations in Communications, Electrical and Computer Engineering (ICICEC), Davangere, India, 24–25 October 2024; pp. 1–11.
- Jahan, I.; Islam, M.N.; Hasan, M.M.; Siddiky, M.R. Comparative analysis of machine learning algorithms for sentiment classification in social media text. World J. Adv. Res. Rev. 2024, 23, 2842–2852.
- Jamil, M.L.; Pais, S.; Cordeiro, J.; Dias, G. Detection of extreme sentiments on social networks with BERT. Soc. Netw. Anal. Min. 2022, 12, 55.
- Santoro, D.; Ciano, T.; Ferrara, M. A comparison between machine and deep learning models on high stationarity data. Sci. Rep. 2024, 14, 19409–19413.
- Zwetsloot, I.M.; Jones-Farmer, L.A.; Woodall, W.H. Monitoring univariate processes using control charts: Some practical issues and advice. Qual. Eng. 2024, 36, 487–499.
- Rezk, M.; Elmadany, N.; Hamad, R.K.; Badran, E.F. Categorizing Crises From Social Media Feeds via Multimodal Channel Attention. IEEE Access 2023, 11, 72037–72049.
- Alhashmi, S.M.; Khedr, A.M.; Arif, I.; El-Bannany, M. Using a Hybrid-Classification Method to Analyze Twitter Data During Critical Events. IEEE Access 2021, 9, 141023–141035.
Class | Precision | Recall | F1 Score |
---|---|---|---|
0 (negative) | 0.88 | 0.90 | 0.89 |
1 (positive) | 0.91 | 0.85 | 0.88 |
Class | Precision | Recall | F1 Score |
---|---|---|---|
0 (negative) | 0.75 | 0.82 | 0.78 |
1 (positive) | 0.78 | 0.41 | 0.54 |
Class | Precision | Recall | F1 Score |
---|---|---|---|
0 (negative) | 0.89 | 0.86 | 0.87 |
1 (positive) | 0.88 | 0.67 | 0.76 |
Class | Precision | Recall | F1 Score |
---|---|---|---|
0 (negative) | 0.83 | 0.87 | 0.85 |
1 (positive) | 0.82 | 0.65 | 0.73 |
Class | Precision | Recall | F1 Score |
---|---|---|---|
0 (negative) | 0.81 | 0.83 | 0.82 |
1 (positive) | 0.69 | 0.64 | 0.66 |
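As a reading aid for the five classification tables above, the following sketch recomputes the F1 score from the reported precision and recall, assuming the F1 column uses the standard definition (the harmonic mean of precision and recall). The tables carry no captions here, so the "table N" labels are purely positional and hypothetical; the numbers are copied from the rows above.

```python
def f1_score(precision, recall):
    """Standard F1: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), in order of appearance in the tables above.
# Labels are positional only (assumption); they do not name the underlying models.
reported = {
    "table 1, negative": (0.88, 0.90, 0.89),
    "table 1, positive": (0.91, 0.85, 0.88),
    "table 2, negative": (0.75, 0.82, 0.78),
    "table 2, positive": (0.78, 0.41, 0.54),
    "table 3, negative": (0.89, 0.86, 0.87),
    "table 3, positive": (0.88, 0.67, 0.76),
    "table 4, negative": (0.83, 0.87, 0.85),
    "table 4, positive": (0.82, 0.65, 0.73),
    "table 5, negative": (0.81, 0.83, 0.82),
    "table 5, positive": (0.69, 0.64, 0.66),
}

max_gap = 0.0
for label, (p, r, f1) in reported.items():
    recomputed = f1_score(p, r)
    max_gap = max(max_gap, abs(recomputed - f1))
    print(f"{label}: recomputed {recomputed:.3f}, reported {f1:.2f}")

print(f"largest gap between recomputed and reported F1: {max_gap:.4f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a row with high precision but low recall, such as the positive class in the second table (0.78 and 0.41), ends up with a comparatively low F1 score.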
Component | Full Variant | Ablated Variant | t | p |
---|---|---|---|---|
BERT encoder | fine-tuned | frozen encoder | 1.66 | 0.173 |
Autoencoder depth | 3 layers deep | 1 layer deep | 0.98 | 0.385 |
LSTM attention | LSTM + attention | LSTM | 4.57 | 0.011 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Katalinić, J.; Dunđer, I. Neural Network-Based Sentiment Analysis and Anomaly Detection in Crisis-Related Tweets. Electronics 2025, 14, 2273. https://doi.org/10.3390/electronics14112273