Mapping Public Sentiment: A Data-Driven Analysis of COVID-19 Discourse on Social Media in Italy
Abstract
1. Introduction
2. Literature Review
3. Methodology
3.1. Data Collection
3.2. Data Preprocessing
- Replacing emojis with their corresponding descriptions to retain emotional content.
- Stripping out links, mentions, and special characters to focus on the textual content.
- Cleaning hashtags embedded within sentences while preserving their meaning.
- Removing punctuation and normalizing text to create a standardized dataset for analysis.
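The four preprocessing steps listed above can be sketched as a single cleaning function. This is a minimal illustration rather than the authors' pipeline: it assumes only the Python standard library, uses Unicode character names as a stand-in for emoji descriptions, and the names `describe_emojis`, `split_hashtag`, and `preprocess` are hypothetical.

```python
import re
import unicodedata

# Patterns for the elements to strip (assumed formats, not taken from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def describe_emojis(text: str) -> str:
    """Replace each emoji-like symbol with its lowercase Unicode name."""
    # Drop variation selectors and zero-width joiners used in emoji sequences.
    text = text.replace("\ufe0f", "").replace("\u200d", "")
    out = []
    for ch in text:
        # Category "So" (Symbol, other) covers most pictographic emojis.
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            out.append(f" {name.lower()} " if name else " ")
        else:
            out.append(ch)
    return "".join(out)


def split_hashtag(match: re.Match) -> str:
    """Keep the hashtag's words: '#ZonaRossa' becomes 'Zona Rossa'."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", match.group(1))


def preprocess(tweet: str) -> str:
    """Apply the cleaning steps in order and return normalized text."""
    text = URL_RE.sub(" ", tweet)                # strip links
    text = MENTION_RE.sub(" ", text)             # strip mentions
    text = HASHTAG_RE.sub(split_hashtag, text)   # clean hashtags, keep meaning
    text = describe_emojis(text)                 # emojis -> descriptions
    # Replace punctuation and other special characters with spaces.
    text = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    # Normalize case and collapse repeated whitespace.
    return " ".join(text.lower().split())


print(preprocess("Nuova #ZonaRossa in Lombardia! 😷 @utente https://t.co/abc"))
```

Emojis are described before punctuation is removed, so the descriptive words survive the final cleanup; a dedicated emoji library could supply more natural descriptions than the raw Unicode names.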
3.3. Topic Modeling
3.4. Sentiment Analysis
3.5. Lexical and Emotion Recognition Analysis
4. Analysis
4.1. Sentiment Trends over Time
4.2. Geographical Variation in Topics
4.3. Emotional Insights
5. Conclusions
5.1. Public Health Implications
5.2. Limitations and Future Research
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
COVID-19 Coronavirus CoronavirusOutbreak coronavirusitaly racism razzismo | COVID2019 COVID19italy Flu ItalyCoronavirus Influenza | Lombardy Italyquarantine quarantineItaly Covid Amuchina | quarantena focolai zonarossa Lombardia COVID19italia Coronaviriusitalia Italiani all’estero | Covid2019italia Coronavirusitalia CoronavirusItalla Codogno Contagiati Contagio |
References
- Fernandez, G.; Maione, C.; Yang, H.; Zaballa, K.; Bonnici, N.; Carter, J.; Spitzberg, B.H.; Jin, C.; Tsou, M.-H. Social Network Analysis of COVID-19 Sentiments: 10 Metropolitan Cities in Italy. Int. J. Environ. Res. Public Health 2022, 19, 7720.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Akbik, A.; Bergmann, T.; Blythe, D.; Rasul, K.; Schweter, S.; Vollgraf, R. FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, Minneapolis, MN, USA, 2–7 June 2019; pp. 54–59.
- Lorenzoni, V.; Andreozzi, G.; Bazzani, A.; Casigliani, V.; Pirri, S.; Tavoschi, L.; Turchetti, G. How Italy Tweeted about COVID-19: Detecting Reactions to the Pandemic from Social Media. Int. J. Environ. Res. Public Health 2022, 19, 7785.
- Hutto, C.; Gilbert, E. VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text. In Proceedings of the International AAAI Conference on Web and Social Media, Ann Arbor, MI, USA, 1–4 June 2014; Volume 8, pp. 216–225.
- Barbieri, F.; Camacho-Collados, J.; Neves, L.; Espinosa-Anke, L. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 1644–1650.
- Mattei, M.; Caldarelli, G.; Squartini, T.; Saracco, F. Italian Twitter semantic network during the Covid-19 epidemic. EPJ Data Sci. 2021, 10, 47.
- Fernandez, G.; Maione, C.; Zaballa, K.; Bonnici, N.; Spitzberg, B.H.; Carter, J.; Yang, H.; McKew, J.; Bonora, F.; Ghodke, S.S.; et al. The Geography of Covid-19 Spread in Italy Using Social Media and Geospatial Data Analytics. Int. J. Intell. Secur. Public Aff. 2021, 23, 228–258.
- Zaballa, K.; Fernandez, G.; Maione, C.; Bonnici, N.; Carter, J.; Vito, D.; Tsou, M.-H. Social Response to COVID-19 SMART Dashboard: Proposal for Case Study. In International Conference on Smart Homes and Health Telematics; Springer International Publishing: Cham, Switzerland, 2022; pp. 154–165.
- Chang, J.; Boyd-Graber, J.; Wang, C.; Gerrish, S.; Blei, D.M. Reading tea leaves: How humans interpret topic models. In Advances in Neural Information Processing Systems, 23rd Annual Conference on Neural Information Processing Systems 2009, Vancouver, BC, Canada, 7–10 December 2009; NIPS; Curran Associates Inc.: Red Hook, NY, USA, 2009.
- Fernandez, G.; Suresh-Babu, S.; Vito, D. Mapping Infodemic Responses: A Geospatial Analysis of COVID-19 Discourse on Twitter in Italy. Int. J. Environ. Res. Public Health 2025, 22, 668.
Serial Number | Geography | Cities | Region | Unique Tweets | Population | Population Density (per km²) |
---|---|---|---|---|---|---|
1 | North | Venice | Veneto | 30,746 | 858,455 | 349 |
2 | North | Milan | Lombardy | 132,266 | 3,190,340 | 2026 |
3 | North | Turin | Piedmont | 40,129 | 2,293,340 | 336 |
4 | North | Bologna | Emilia Romagna | 10,078 | 1,005,831 | 271 |
5 | Center | Florence | Tuscany | 30,180 | 1,007,435 | 287 |
6 | Center | Rome | Lazio | 197,969 | 4,336,915 | 810 |
7 | South | Naples | Campania | 62,807 | 3,128,702 | 2672 |
8 | South | Bari | Apulia | 15,056 | 1,251,004 | 327 |
9 | Island | Cagliari | Sardinia | 8669 | 431,302 | 346 |
10 | Island | Palermo | Sicily | 15,451 | 1,276,525 | 255 |
COVID-19 in healthcare settings | COVID-19 statistics and data | COVID-19 restrictions and quarantine rules | Criticism of COVID-19 measures and skepticism | General COVID-19 updates and news |
Personal experiences with COVID-19 | COVID-19 safety measures in public spaces | COVID-19 outbreaks and case numbers | COVID-19 outbreaks in specific locations | COVID-19 vaccines and treatments |
COVID-19 research and medical information | COVID-19 testing and positive cases | COVID-19 skepticism and conspiracy theories | COVID-19 social and economic impacts | COVID-19 regulations enforcement |
COVID-19 public health messaging | Humor and jokes about COVID-19 | COVID-19 and public events/gatherings | COVID-19 and voting/elections | COVID-19 prevention measures (masks, distancing, etc.) |
Date | Policy |
---|---|
14 September 2020 | Schools reopen in Italy, following strict health protocols after months of closure. |
7 October 2020 | The Italian government extends the state of emergency until 31 January 2021, due to rising cases. |
13 October 2020 | Italy introduces new containment measures, including mandatory masks outdoors and tighter restrictions on social gatherings. |
24 October 2020 | New restrictions imposed, including closing restaurants and bars early, limiting sporting events, and encouraging remote work. |
3 November 2020 | Italy enacts a tiered system with regional color codes (yellow, orange, red), based on the severity of the outbreak, with varying restrictions. |
27 December 2020 | Italy launches its vaccination campaign, starting with healthcare workers and the elderly. |
6 January 2021 | The European Medicines Agency (EMA) approves the Moderna vaccine for use in the EU. |
13 January 2021 | The Italian government extends the state of emergency until 30 April 2021, due to the pandemic’s persistence. |
15 March 2021 | Italy tightens restrictions once more, enforcing a nationwide lockdown over the Easter period, treating all regions as “red zones”. |
26 April 2021 | The country begins a phased reopening, including outdoor dining and some cultural and sporting events. |
23 May 2021 | Gyms are allowed to reopen in yellow zones under strict health and safety guidelines. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fernandez, G.; Suresh-Babu, S.; Vito, D. Mapping Public Sentiment: A Data-Driven Analysis of COVID-19 Discourse on Social Media in Italy. Med. Sci. Forum 2025, 33, 3. https://doi.org/10.3390/msf2025033003