
COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification

1 Department of Business Analytics, University of Charleston, Charleston, WV 25304, USA
2 Department of Applied Computer Science, University of Charleston, Charleston, WV 25304, USA
3 The William States Lee College of Engineering, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
4 Department of Urban and Regional Planning (URP), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
5 Department of Data Analytics, University of Charleston, Charleston, WV 25304, USA
6 Department of Education, Northeastern University, Boston, MA 02115, USA
* Authors to whom correspondence should be addressed.
Information 2020, 11(6), 314; https://doi.org/10.3390/info11060314
Received: 28 April 2020 / Revised: 9 June 2020 / Accepted: 9 June 2020 / Published: 11 June 2020
(This article belongs to the Section Information Applications)
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic, fueled by incomplete and often inaccurate information. There is therefore a pressing need to address and better understand COVID-19’s informational crisis and to gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus-specific Tweets and the R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% for short Tweets, and that both methods show relatively weaker performance for longer Tweets. This research provides insights into the progression of Coronavirus fear sentiment, and outlines associated methods, implications, limitations and opportunities.
Keywords: COVID-19; Coronavirus; machine learning; sentiment analysis; textual analytics; twitter
MDPI and ACS Style

Samuel, J.; Ali, G.G.M.N.; Rahman, M.M.; Esawi, E.; Samuel, Y. COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification. Information 2020, 11, 314.
