Open Access Article

Tweets Classification on the Base of Sentiments for US Airline Companies

by Furqan Rustam 1,†, Imran Ashraf 2,†, Arif Mehmood 1,*,†, Saleem Ullah 1 and Gyu Sang Choi 2,*
1 Department of Computer Science, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan, Punjab 64200, Pakistan
2 Department of Information & Communication Engineering, Yeungnam University, Gyeongbuk 38541, Korea
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Entropy 2019, 21(11), 1078; https://doi.org/10.3390/e21111078
Received: 19 September 2019 / Revised: 31 October 2019 / Accepted: 31 October 2019 / Published: 4 November 2019
(This article belongs to the Section Multidisciplinary Applications)
The use of data from social networks such as Twitter has increased over the last few years, supporting political campaigns, improvements to the quality of products and services, sentiment analysis, and similar applications. Classifying tweets by user sentiment is an important task for many organizations. This paper proposes a voting classifier (VC) to support sentiment analysis for such organizations. The VC combines logistic regression (LR) and a stochastic gradient descent classifier (SGDC) and uses a soft voting mechanism to make the final prediction. Tweets were classified as positive, negative, or neutral according to the sentiment they express. In addition, a variety of machine learning classifiers were evaluated using accuracy, precision, recall, and F1 score as performance metrics. The impact of three feature extraction techniques, term frequency (TF), term frequency-inverse document frequency (TF-IDF), and word2vec, on classification accuracy was investigated as well. Moreover, the performance of a deep long short-term memory (LSTM) network was analyzed on the selected dataset. The results show that the proposed VC performs better than the other classifiers, achieving an accuracy of 0.789 with TF and 0.791 with TF-IDF feature extraction. The results also show that ensemble classifiers achieve higher accuracy than non-ensemble classifiers. The experiments further showed that machine learning classifiers perform better when TF-IDF is used as the feature extraction method, and that word2vec feature extraction performs worse than both TF and TF-IDF. The LSTM achieves lower accuracy than the machine learning classifiers.
Keywords: text mining; text classification; sentiment analysis; supervised machine learning; ensemble classifier; long short-term memory network
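
The soft voting mechanism mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each base classifier outputs one probability per class and that the ensemble averages those probabilities with equal weights before picking the most likely class. The function name `soft_vote`, the class ordering, and the probability values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def soft_vote(prob_rows, weights=None):
    """Average per-class probabilities from several classifiers.

    prob_rows: one list of class probabilities per classifier.
    weights:   optional per-classifier weights (equal weights if omitted).
    Returns the index of the winning class and the averaged probabilities.
    """
    if weights is None:
        weights = [1.0] * len(prob_rows)
    total = sum(weights)
    n_classes = len(prob_rows[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Assumed class order for this illustration.
CLASSES = ["negative", "neutral", "positive"]

# Hypothetical probabilities for a single tweet (not results from the paper).
lr_probs = [0.50, 0.30, 0.20]    # LR alone would predict "negative"
sgdc_probs = [0.20, 0.25, 0.55]  # SGDC alone would predict "positive"

label_index, averaged = soft_vote([lr_probs, sgdc_probs])
predicted = CLASSES[label_index]
print(predicted, averaged)
```

In this example the two base classifiers disagree, and the averaged probabilities decide the outcome by taking each model's confidence into account. In scikit-learn, `VotingClassifier(voting="soft")` performs this kind of averaging over estimators that expose `predict_proba`; whether the authors used that class is not stated in the abstract.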

MDPI and ACS Style

Rustam, F.; Ashraf, I.; Mehmood, A.; Ullah, S.; Choi, G.S. Tweets Classification on the Base of Sentiments for US Airline Companies. Entropy 2019, 21, 1078.
