Exploring the Topics and Sentiments of AI-Related Public Opinions: An Advanced Machine Learning Text Analysis
Abstract
1. Introduction
2. Literature Review
2.1. Public Sentiment Toward Artificial Intelligence
2.2. Social Media as a Lens for Public Opinion
2.3. Text Analysis Methods in AI Opinion Research
2.4. Algorithmic Bias and AI Ethics in Public Discourse
2.5. Research Gaps and Study Contributions
3. Methodology
3.1. Data Collection
3.2. Data Preprocessing
3.2.1. Manual Filtering and Quality Control
3.2.2. Text Cleaning Pipeline
3.2.3. Tokenization and Normalization
3.3. Feature Extraction
3.4. Analytical Approaches
3.4.1. Sentiment Analysis
3.4.2. LIWC Analysis
3.4.3. LIWC Validation Through Regression
3.4.4. Tone Analysis
3.4.5. Topic Modeling
3.4.6. Advanced NLP Analyses
3.4.7. Network and Predictive Analysis
4. Results
4.1. Descriptive Statistics and Sentiment Distribution
4.2. Temporal Analysis
4.3. LIWC Analysis

| Metric | Value |
|---|---|
| R² | 0.049 |
| Adjusted R² | 0.049 |
| RMSE | 0.495 |
| p-value | <0.001 |
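
The identical R² and adjusted R² values in the table above are what one would expect for a large sample. As a brief worked relation (standard definition, with $n$ the number of observations and $p$ the number of predictors; neither is stated alongside this table, so both are treated as unknowns here):

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\qquad\Longrightarrow\qquad
R^2 - \bar{R}^2 = (1 - R^2)\,\frac{p}{n - p - 1}
```

If the regression was fit on all 28,819 comments (an assumption), each predictor lowers the adjusted value by only about $0.951 / 28{,}800 \approx 3 \times 10^{-5}$, so a model with a modest number of predictors would show the same value at three decimals.
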

4.4. Tone Analysis
4.5. Topic Modeling
4.6. Named Entity and Emotion Analysis
4.7. Advanced NLP Analyses
4.8. Network Analysis
4.9. Predictive Modeling
5. Discussion
5.1. Sentiment Trends and the ChatGPT Effect
5.2. LIWC Validation and Psychological Dimensions
5.3. Topic Structure and Thematic Concerns
5.4. Discourse Quality and Engagement Patterns
6. Limitations
6.1. Platform and Sample Limitations
6.2. Methodological Limitations
6.3. Analytical Limitations
6.4. Scope Boundaries
6.5. Strengths Despite Limitations
7. Implications
7.1. Theoretical Implications
7.2. Practical Implications for AI Developers
7.3. Policy Implications
7.4. Implications for Researchers
7.5. Future Research Directions
8. Conclusions
8.1. Addressing Research Questions
8.2. Methodological Contributions
8.3. Connections to Contemporary AI Discourse
8.4. Broader Significance
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest



| Metric | Value |
|---|---|
| Total Comments | 28,819 |
| Positive | 12,511 (43.4%) |
| Negative | 9625 (33.4%) |
| Neutral | 6683 (23.2%) |
| Mean Compound Score | +0.060 |
| Standard Deviation | 0.508 |
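
The class shares in the table above follow directly from the reported counts. A minimal sketch that recomputes them (the counts are copied from the table; the function name `class_shares` is illustrative and not from the paper):

```python
# Counts copied from the sentiment distribution table above.
counts = {"Positive": 12_511, "Negative": 9_625, "Neutral": 6_683}


def class_shares(class_counts):
    """Return each class's share of all comments, in percent (one decimal)."""
    n_total = sum(class_counts.values())
    return {label: round(100 * n / n_total, 1) for label, n in class_counts.items()}


total = sum(counts.values())
shares = class_shares(counts)

print(total)   # total number of comments across the three classes
print(shares)  # percentage share per class
```

The three counts sum to the reported total of 28,819 comments, and the recomputed shares agree with the percentages listed in the table.
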

| Search Topic | Avg Sentiment | % Positive | N |
|---|---|---|---|
| Face recognition + gender | +0.410 | 69.4% | 193 |
| AI + facial recognition | +0.394 | 65.7% | 481 |
| AI and error | +0.271 | 61.0% | 341 |
| Facial recognition and bias | −0.204 | 28.4% | 454 |

| Year | Comments | Avg Sentiment | % Positive | % Negative |
|---|---|---|---|---|
| 2015 | 593 | +0.320 | 61.7% | 18.0% |
| 2016 | 768 | +0.200 | 50.7% | 20.8% |
| 2018 | 3358 | +0.062 | 43.8% | 33.4% |
| 2022 | 2965 | +0.051 | 41.9% | 33.0% |
| 2024 | 7171 | +0.053 | 42.7% | 33.4% |

| Period | Comments | Avg Sentiment | % Positive | % Negative |
|---|---|---|---|---|
| Pre-ChatGPT | 14,989 | +0.073 | 44.1% | 31.9% |
| Post-ChatGPT | 13,830 | +0.045 | 42.7% | 35.1% |
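
As a consistency check on the period table above, the two period means can be pooled into an overall mean by weighting each with its comment count. A short sketch, using only the rounded values from the table (so agreement with the overall mean compound score is approximate; `pooled_mean` is an illustrative helper name):

```python
# (comment count, average compound sentiment) per period, copied from the table.
periods = {
    "Pre-ChatGPT": (14_989, 0.073),
    "Post-ChatGPT": (13_830, 0.045),
}


def pooled_mean(groups):
    """Combine per-group means into one mean, weighting by group size."""
    total_n = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * mean for n, mean in groups.values())
    return total_n, weighted_sum / total_n


total_n, overall = pooled_mean(periods)
print(total_n)            # combined number of comments
print(round(overall, 3))  # pooled mean sentiment, three decimals
```

The pooled value rounds to 0.060 over 28,819 comments, which matches the mean compound score reported for the full dataset.
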

| Category | Overall Mean % | Positive | Neutral | Negative |
|---|---|---|---|---|
| I-words | 2.37% | 2.58% | 2.14% | 2.25% |
| We-words | 0.70% | 0.71% | 0.68% | 0.69% |
| You-words | 1.83% | 1.93% | 1.52% | 1.89% |

| Category | Overall Mean % | Positive | Neutral | Negative |
|---|---|---|---|---|
| Positive Emotion | 0.51% | 1.00% | 0.04% | 0.20% |
| Negative Emotion | 0.32% | 0.12% | 0.03% | 0.78% |
| Anxiety | 0.08% | 0.07% | 0.03% | 0.14% |

| LIWC Category | β | Direction | p-Value |
|---|---|---|---|
| Negative Emotion | −0.083 | ↓ Negative | <0.001 *** |
| Positive Emotion | +0.063 | ↑ Positive | <0.001 *** |
| Certainty | +0.019 | ↑ Positive | <0.001 *** |
| Anxiety | −0.019 | ↓ Negative | <0.001 *** |

| Metric | Value |
|---|---|
| Accuracy | 61.0% |
| Precision (Positive) | 0.60 |
| Recall (Positive) | 0.92 |

| LIWC Category | Pearson r | p-Value |
|---|---|---|
| Negative Emotion | −0.160 | <0.001 *** |
| Positive Emotion | +0.123 | <0.001 *** |
| Anger | −0.075 | <0.001 *** |

| Formality Level | Count | Percentage |
|---|---|---|
| Informal (0–40) | 1712 | 5.9% |
| Neutral (40–60) | 27,037 | 93.8% |
| Formal (60–100) | 59 | 0.2% |

| Emotional Tone | Count | Percentage | Avg Sentiment |
|---|---|---|---|
| Neutral | 21,492 | 74.6% | +0.050 |
| Analytical | 5122 | 17.8% | +0.082 |
| Positive-Emotional | 1131 | 3.9% | +0.491 |
| Negative-Emotional | 1074 | 3.7% | −0.296 |

| Thinking Style | Count | Percentage | Avg Sentiment |
|---|---|---|---|
| Balanced | 17,460 | 60.6% | +0.051 |
| Analytical | 10,133 | 35.2% | +0.068 |
| Intuitive | 1226 | 4.3% | +0.121 |

| Topic | Label | Top Keywords |
|---|---|---|
| T1 | General Discussion | doesn, system, mean, possible |
| T2 | Social Media and News | social, media, study, level |
| T4 | Race and Recognition | black, white, person, faces |
| T5 | China and Politics | china, fake, trump, news |
| T8 | AI Ethics and Models | ai, human, ethics, model |
| T9 | Technology and Data | algorithm, intelligence, companies |

| Topic | Label | Top Keywords |
|---|---|---|
| T1 | AI and Ethics | ai, humans, ethics, regulate |
| T3 | Facial Recognition | facial recognition, face, china |
| T4 | Data and Bias | data, bias, training, model |
| T7 | Facebook and Fake News | facebook, news, fake, photos |
| T9 | Machine Learning | machine learning, algorithm |

| Entity | Mentions | Avg Sentiment | % Positive |
|---|---|---|---|
| Meta/Facebook | 650 | +0.101 | 48.3% |
| | 648 | +0.111 | 51.4% |
| OpenAI | 523 | +0.172 | 55.6% |
| Amazon | 117 | +0.221 | 65.8% |

| Emotion | Count | % of Comments | In Positive | In Negative |
|---|---|---|---|---|
| Concern | 2020 | 7.0% | 974 | 965 |
| Joy | 1739 | 6.0% | 1375 | 322 |
| Hope | 1399 | 4.9% | 890 | 431 |
| Anger | 1395 | 4.8% | 360 | 976 |

| Word Count | Avg Sentiment | % Positive | Count |
|---|---|---|---|
| 1–10 words | +0.043 | 29.3% | 7595 |
| 11–25 words | +0.046 | 42.8% | 8941 |
| 500+ words | +0.393 | 70.4% | 54 |

| Sarcasm Level | Count | Percentage |
|---|---|---|
| No signals | 21,411 | 74.3% |
| 2+ signals (likely) | 904 | 3.1% |
| Positive but sarcastic | 517 | 4.1% of positive |

| AI Stance | Count | Percentage | Avg Sentiment |
|---|---|---|---|
| Pro-AI | 348 | 1.2% | +0.346 |
| Cautious | 863 | 3.0% | +0.148 |
| Anti-AI | 805 | 2.8% | −0.169 |

| Pattern | Count | Avg Sentiment |
|---|---|---|
| Counterargument | 357 | +0.230 |
| Personal-Experience | 225 | +0.174 |
| Evidence-Based | 197 | +0.153 |

| Toxicity Level | Count | Percentage |
|---|---|---|
| None | 23,235 | 80.6% |
| Low | 4884 | 16.9% |
| High | 95 | 0.3% |

| Concept Pair | Co-Occurrences |
|---|---|
| Regulation + Police | 196 |
| Privacy + Regulation | 175 |
| Facial Recognition + Regulation | 161 |
| Ethics + Regulation | 161 |

| Metric | Positive | Neutral | Negative | Overall |
|---|---|---|---|---|
| Precision | 0.81 | 0.63 | 0.72 | 0.74 |
| Recall | 0.72 | 0.83 | 0.67 | 0.73 |
| Accuracy | — | — | — | 72.7% |
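
The table above reports per-class precision and recall but no F1 scores. As a rough illustration, F1 can be derived from the reported pairs as the harmonic mean of precision and recall. The values below are computed here from the rounded table entries and are not figures reported by the authors; the helper `f1` and the unweighted macro average are illustrative choices.

```python
# (precision, recall) per class, copied from the table above.
reported = {
    "Positive": (0.81, 0.72),
    "Neutral": (0.63, 0.83),
    "Negative": (0.72, 0.67),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class F1, rounded to two decimals for display.
f1_scores = {label: round(f1(p, r), 2) for label, (p, r) in reported.items()}

# Unweighted (macro) average of the unrounded per-class F1 values.
macro_f1 = sum(f1(p, r) for p, r in reported.values()) / len(reported)

print(f1_scores)
print(round(macro_f1, 3))
```

Because the inputs are rounded to two decimals, the derived scores are approximate. They suggest the negative class is the hardest of the three for the classifier, in line with its lower recall.
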
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Raghupathi, W.; Ren, J.; Kulkarni, T. Exploring the Topics and Sentiments of AI-Related Public Opinions: An Advanced Machine Learning Text Analysis. Information 2026, 17, 134. https://doi.org/10.3390/info17020134

