Analyzing Global Attitudes Towards ChatGPT via Ensemble Learning on X (Twitter)
Abstract
1. Introduction
2. Related Work
3. Methodology
3.1. General Methodology Pipeline
3.1.1. Data Collection
3.1.2. Data Preprocessing
3.1.3. BERT Tokenization
3.1.4. Text Vectorization
3.1.5. Class Balancing
3.1.6. Training Base Classifiers
- Multinomial Naïve Bayes (α = 1.0), particularly effective for TF–IDF-based text features.
- Linear Support Vector Machine (SVM) (C = 1.0, max_iter = 1000), known for its strong performance in high-dimensional spaces.
- Random Forest Classifier (100 trees, Gini impurity), providing robustness and reduced sensitivity to noisy data.
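The three base classifiers listed above could be instantiated roughly as follows. This is a minimal sketch, assuming scikit-learn is the underlying library (the class names MultinomialNB, LinearSVC, CalibratedClassifierCV and RandomForestClassifier appear in the hyperparameter table of this article). The helper name `build_base_classifiers` and the dictionary keys are hypothetical, introduced only for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC


def build_base_classifiers():
    """Return the three base classifiers with the stated hyperparameters.

    The function name and dictionary keys are illustrative assumptions.
    """
    # Multinomial Naive Bayes with additive smoothing alpha = 1.0.
    nb = MultinomialNB(alpha=1.0, fit_prior=True)

    # LinearSVC has no predict_proba, so it is wrapped in a 5-fold
    # calibrator to obtain class probabilities.
    svm = CalibratedClassifierCV(
        LinearSVC(C=1.0, max_iter=1000, random_state=42), cv=5
    )

    # Random forest of 100 trees using Gini impurity as the split criterion.
    rf = RandomForestClassifier(n_estimators=100, criterion="gini")

    return {"nb": nb, "svm": svm, "rf": rf}
```

Each returned estimator follows the usual scikit-learn `fit` / `predict_proba` interface, so all three can be trained on the same TF–IDF feature matrix.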
3.1.7. Soft-Voting Ensemble
3.2. Machine Learning Algorithms
3.2.1. Naïve Bayes Classifier
3.2.2. Support Vector Machine (SVM)
3.2.3. Random Forest
- Randomly choose m features from the total p features.
- Identify the optimal feature and split point among the m chosen features.
- Divide the node into two child nodes and continue this process until the minimum node size (n_min) is reached.
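The three steps above can be sketched in plain Python for a single tree. This is an illustrative sketch rather than the authors' implementation: it assumes numeric features, uses Gini impurity to score candidate splits, and the names `gini`, `best_split` and `grow` are hypothetical. A full random forest would additionally train each tree on a bootstrap sample and aggregate the trees' predictions.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, m, rng):
    """Choose m of the p features at random and return the tuple
    (impurity, feature, threshold) with the lowest weighted Gini
    impurity, or None if no valid split exists."""
    p = len(X[0])
    best = None
    for j in rng.sample(range(p), m):
        # Every distinct value except the largest is a candidate threshold,
        # which guarantees that both child nodes are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best


def grow(X, y, m, n_min, rng):
    """Split recursively until a node is pure, has at most n_min samples,
    or cannot be split on the sampled features."""
    split = best_split(X, y, m, rng) if len(y) > n_min and gini(y) > 0 else None
    if split is None:
        return {"leaf": Counter(y).most_common(1)[0][0]}
    _, j, t = split
    left = [(row, lab) for row, lab in zip(X, y) if row[j] <= t]
    right = [(row, lab) for row, lab in zip(X, y) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [l for _, l in left], m, n_min, rng),
        "right": grow([r for r, _ in right], [l for _, l in right], m, n_min, rng),
    }
```

Sampling only m of the p features at each node is what decorrelates the trees; setting m = p would reduce the procedure to ordinary decision-tree growth.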
3.2.4. Ensemble Learning
4. Results and Discussion
- Naïve Bayes contributes probabilistic weighting, which captures global word frequency trends.
- SVM provides strong linear separation in high-dimensional space, improving margin-based discrimination.
- Random Forest captures non-linear interactions between terms and contextual dependencies missed by SVM or NB.
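To make the combination of these three contributions concrete, the sketch below shows the arithmetic of soft voting: the class-probability vectors of the base models are averaged and the class with the highest mean probability is selected. The probability values, the class names and the function name `soft_vote` are hypothetical and chosen only for illustration; equal weights are assumed unless others are passed.

```python
def soft_vote(probas, weights=None):
    """Average per-model class-probability vectors.

    Returns (index of the winning class, averaged probability vector).
    """
    weights = weights or [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__), avg


# Hypothetical class order and per-model probabilities for one tweet.
CLASSES = ["negative", "neutral", "positive"]
nb_proba = [0.50, 0.30, 0.20]   # Naive Bayes leans negative
svm_proba = [0.20, 0.50, 0.30]  # calibrated SVM leans neutral
rf_proba = [0.25, 0.45, 0.30]   # Random Forest leans neutral

winner, averaged = soft_vote([nb_proba, svm_proba, rf_proba])
print(CLASSES[winner], [round(a, 3) for a in averaged])
```

In this toy case Naïve Bayes alone would predict the negative class, but the averaged probabilities favour the neutral class, which illustrates how one model's error can be outvoted by the other two.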
5. Conclusions
6. Limitations of the Study
7. Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| AUC | Area Under the Curve |
| BERT | Bidirectional Encoder Representations from Transformers |
| GPT | Generative Pre-Trained Transformer |
| LLM | Large Language Model |
| ML | Machine Learning |
| NB | Naïve Bayes |
| NLP | Natural Language Processing |
| RF | Random Forest |
| ROC | Receiver Operating Characteristic |
| SVM | Support Vector Machine |
| TF-IDF | Term Frequency–Inverse Document Frequency |









| Classifier | Model/Component | Key Hyperparameters |
|---|---|---|
| Naïve Bayes | MultinomialNB() | alpha = 1.0, fit_prior = True |
| Support Vector Machine (SVM) | LinearSVC | C = 1.0, max_iter = 1000, random_state = 42 |
| | CalibratedClassifierCV | cv = 5 |
| Random Forest | RandomForestClassifier() | n_estimators = 100, criterion = 'gini' |

| Model | Precision (Avg, %) | Recall (Avg, %) | F1-Score (Avg, %) | Accuracy (%) |
|---|---|---|---|---|
| Naïve Bayes | 66 | 66 | 66 | 66 |
| SVM | 84 | 84 | 84 | 84 |
| Random Forest | 79 | 79 | 78 | 79 |
| Ensemble Learning | 86 | 86 | 86 | 86 |

| Model | Neutral Precision | Neutral Recall | Neutral F1-Score |
|---|---|---|---|
| Naïve Bayes | 0.62 | 0.54 | 0.58 |
| SVM | 0.82 | 0.73 | 0.77 |
| Random Forest | 0.84 | 0.66 | 0.74 |
| Ensemble Learning | 0.84 | 0.76 | 0.80 |
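
As a quick consistency check on the neutral-class figures above, the F1-score can be recomputed as the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below does this for each row; the dictionary name `neutral_scores` and the function name `f1_score` are illustrative, while the numbers are copied from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the neutral class, taken from the table.
neutral_scores = {
    "Naïve Bayes": (0.62, 0.54, 0.58),
    "SVM": (0.82, 0.73, 0.77),
    "Random Forest": (0.84, 0.66, 0.74),
    "Ensemble Learning": (0.84, 0.76, 0.80),
}

for model, (p, r, reported) in neutral_scores.items():
    print(f"{model}: computed F1 = {f1_score(p, r):.2f}, reported = {reported:.2f}")
```

For every model the recomputed value agrees with the reported F1-score to two decimal places.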
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Touhami Chahdi, Y.; Abbou, F.M.; Abdi, F.; Bouhadda, M.; Bouanane, L. Analyzing Global Attitudes Towards ChatGPT via Ensemble Learning on X (Twitter). Algorithms 2025, 18, 748. https://doi.org/10.3390/a18120748

