Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China
Abstract
1. Introduction
2. Data and Methods
2.1. Data and Data Pre-Processing
2.2. Method
2.2.1. Time Series Analysis
2.2.2. Topic Extraction and Classification
Chinese Word Segmentation
LDA Model
RF Algorithm
2.2.3. Kernel Density Estimation
2.2.4. Spearman Correlation
2.2.5. Evaluation of Results
3. Results
3.1. Spatial-Temporal Analysis
3.1.1. Time Series Analysis
3.1.2. Spatial Analysis
3.2. Topic Analysis
3.2.1. Topic Description
3.2.2. Temporal Trend of Topics
3.2.3. Spatial Distribution of Topics
4. Discussion
4.1. Temporal Trend
4.2. Spatial Distribution
4.3. Topic Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
Variables | Minimum Value | Maximum Value | Mean Value | Standard Deviation |
---|---|---|---|---|
Number of Weibo texts | 26 | 8257 | 2902.03 | 2270.665 |
Number of confirmed cases | 1 | 31,728 | 1256.91 | 5395.346 |
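
The two variables summarized above are the kind of paired series to which the Spearman correlation of Section 2.2.4 applies. The following is a minimal sketch of that statistic, not the authors' implementation: it ranks each series (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the six sample values are hypothetical and invented purely for illustration; they are not the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counts, invented for illustration only (not the study's data).
weibo_texts = [30, 120, 900, 2500, 4000, 3100]
confirmed_cases = [2, 15, 200, 1800, 1200, 2600]

rho = spearman(weibo_texts, confirmed_cases)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the very different scales and spreads of the two variables shown in the table.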
Metric | Topics | Sub-Topics |
---|---|---|
Precision (P) | 83% | 78% |
Recall (R) | 82% | 77% |
F1 | 82% | 76% |
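
As a reading aid for the table above, the sketch below recomputes F1 from the reported precision and recall, assuming the standard definition of F1 as their harmonic mean. The table does not say how the scores were aggregated across classes, so this is a consistency check under that assumption rather than a reproduction of the authors' evaluation; `f1_score` is a hypothetical helper name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table above.
reported = {"Topics": (0.83, 0.82), "Sub-Topics": (0.78, 0.77)}

for level, (p, r) in reported.items():
    print(f"{level}: F1 = {f1_score(p, r):.3f}")
```

For the topic level this gives about 0.825, in line with the reported 82%. For the sub-topic level it gives about 0.775, slightly above the reported 76%; a gap of this size can arise when F1 is averaged per class or computed from unrounded precision and recall, which the table does not specify.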
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Han, X.; Wang, J.; Zhang, M.; Wang, X. Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China. Int. J. Environ. Res. Public Health 2020, 17, 2788. https://doi.org/10.3390/ijerph17082788