A Deep Learning Approach for Sentiment Analysis of COVID-19 Reviews

Abstract: User-generated multimedia content, such as images, text, videos, and speech, has recently become more popular on social media sites as a means for people to share their ideas and opinions. One of the most popular social media sites for gauging public sentiment towards events that occurred during the COVID-19 period is Twitter, because Twitter posts are short and constantly being generated. This paper presents a deep learning approach for sentiment analysis of Twitter data related to COVID-19 reviews. The proposed algorithm is based on an LSTM-RNN network with feature weighting enhanced by attention layers, which provide an enhanced feature transformation framework. A total of four class labels (sad, joy, fear, and anger) from publicly available Twitter data posted in the Kaggle database were used in this study. By adding attention layers to the existing LSTM-RNN approach, the proposed deep learning approach significantly improved the performance metrics compared with current approaches, with an increase of 20% in accuracy, 10% to 12% in precision, and 12–13% in recall. Out of a total of 179,108 COVID-19-related tweets, tweets with positive, neutral, and negative sentiments were found to account for 45%, 30%, and 25%, respectively. This shows that the proposed deep learning approach is efficient and practical and can be easily implemented for sentiment classification of COVID-19 reviews.


Introduction
Coronavirus disease (COVID-19) is a worldwide pandemic with negative consequences for human health [1]. It has spread to numerous countries across all continents since its initial discovery in Wuhan, China, and was declared a pandemic by the World Health Organisation (WHO) on 11 March 2020 [2].
A lot of information on COVID-19 has been posted on various social media. However, misinformation can spread on a social media platform like Twitter, and understanding people's sentiment from those textual resources is hence important [2]. Deep learning technologies can be useful for identifying sentiments from Twitter data on information regarding COVID-19. However, there remains a technological challenge as to how deep learning networks can be adopted and modified to achieve a good level of accuracy, especially when considering the complexities entailed in textual analysis.
Bhat et al. [3] find that human sentiments play a vital role in understanding personal feelings. For example, a sentiment evaluator could help organisations understand whether people feel confident in buying commodities which they think will be of importance in unprecedented times. Domenico et al. [4] reflect that understanding sentiments on social media can help organisations to re-evaluate their business architecture, as needs may change based on the information and news that consumers are exposed to. A particular source of data that can facilitate such sentiment analysis is Twitter, which is considered in this paper due to its popularity as a forum for discussions related to COVID-19. Thus, the main objectives of this paper are to: (1) understand the extent to which deep learning approaches can identify sentiments of human emotions from day to day within lockdown situations, with the tweets categorised as reflecting positive, negative, or neutral sentiments, and (2) explore various deep learning architectures, including different activation functions, to evaluate performance, an aspect that differs from previous research.
Earlier studies have presented various approaches for analysing user sentiment in social media data [5–9]. For example, Wang et al. [6] present a hybrid machine learning algorithm to classify user sentiment as negative or positive. Pham and Le [7] apply natural language processing (NLP) to understand user sentiment in a dataset of customer reviews. Aflakparast et al. [8] use a Bayesian graphical model to analyse Twitter information. However, these approaches can be computationally complex and time-consuming, and they do not necessarily provide high accuracy.
Thus, this paper presents a deep learning approach for the sentiment classification of COVID-19 related tweets. Such an analysis of individuals' comments can help identify positive and negative emotions, and indeed there is a focus in the literature on automatically finding texts with negative emotions [9,10]. The rectified linear activation function (ReLU) is commonly used due to its low training complexity and the possibility of achieving superior performance. Most importantly, ReLU is commonly used to study nonlinear dependencies. Several studies have used ReLU in their neural network models due to its ability to assist convolutional neural networks in capturing complex patterns [11]. However, when using the ReLU function, there is a possibility of dying neurons, which may limit the learning progress of a neural network. To overcome such limitations in neural networks, several architectures including Long Short-Term Memory (LSTM) have been proposed. LSTM has been regarded as an important architecture as it helps to solve time series and sequential problems with impressive results [12]. Therefore, the proposed deep learning approach implements LSTM due to its ability to learn text sequences and find word-to-word or phrase-to-phrase relationships in sentiment analysis [12]. Furthermore, it improves the semantic information of tweets and makes the learning model efficient. This results in better performance on datasets where a network would otherwise fail to pick up the right volatility trends because its activation function is not appropriate for the type of data under analysis. Similar studies have been conducted to find feature sets using machine learning with SVM [13]. Deep learning approaches with an integrated attention mechanism, as proposed in this article, hold high potential in sentiment analysis, and there is a need for further research in this respect.
This paper addresses this research gap, particularly for a COVID-19 related tweet dataset. In addition, LSTM and recurrent neural networks (LSTM-RNN) using the attention layer for feature mapping are introduced. We explore the approach on a publicly available dataset containing COVID-19 related tweets and categorise the tweets based on various sentiments. The proposed deep learning approach shows a notable performance improvement in terms of accuracy (10%) and precision (10–12%) compared to approaches including Naïve Bayes, Random Forest, Support Vector Machine (SVM), Logistic Regression, and LSTM-RNN.
The proposed deep learning approach improves the feature weights through an attention mechanism combined with semantic sequencing by LSTM which, to the best of the authors' knowledge, has not been implemented in this way by previous approaches, especially for understanding sentiments from COVID-19 related tweets. The main task is to optimise the feature weights according to the semantic relations between words, using attention learning with LSTM.
This paper is divided into six sections. Section 2 presents the related work which leads to the development of the tweet classification architecture and deep learning approach in Section 3 and Section 4 respectively. Section 5 presents the results and discussion. This is followed by the conclusion in Section 6.

Related Work
Several studies have focused on sentiment analysis using machine learning and deep learning approaches from different perspectives [7,12–19]. Pham and Le [7] apply an RNN-based multilayer architecture to capture sentiments from customer reviews of different aspects of products. Their approach is used for analysing 174,615 reviews of 1768 hotels from tripadvisor.com, revealing that it is promising for analysing sentiments and predicting ratings of the hotels [7]. Parimala et al. [14] conduct sentiment classification on tweets related to catastrophic events using an LSTM framework with a word embedding algorithm. They propose a risk assessment sentiment analysis (RASA) algorithm and show that the algorithm outperforms other approaches including XGBoost and binary classifiers [14].
Li et al. [15] use a combination of unsupervised and supervised learning approaches, with the unsupervised model using semantic features to characterise feature transformation for the supervised neural nets. Their approach performs very well in identifying emotions in a social context [15]. Xiong et al. [16] present an integrated approach that jointly conducts sentiment and topic analysis from short texts. Hassan and Mahmood [17] apply a combination of CNN and RNN frameworks for sentiment analysis. To train initial word embeddings, they use an unsupervised NLP model pre-trained on a large database; they then exploit the CNN's capability in extracting features and the RNN's ability to capture dependencies to identify sentiments across datasets [17]. The results show that the approach is capable of increasing classification efficiency significantly [17].
Jongeling et al. [19] explore whether sentiment analysis tools like SentiStrength, NLTK, Stanford, and Alchemy match manual sentiment labelling. The authors found that none of these tools corresponds well to manual labelling [19]. However, NLTK is found to perform better than SentiStrength [19]. Rani and Singh [20] conduct sentiment analysis through SVM on data collected from Twitter. With features extracted through an implementation of the TF-IDF approach, the sentiments are identified using two SVM models, and it is found that linear SVM is more accurate than kernel SVM in this respect [20]. Jagdale et al. [21] apply a machine learning algorithm to classify reviews as positive or negative. Arras et al. [22] explore improved information propagation across neural network layers for sentiment analysis.
Gupta and Joshi [23] use a hybrid approach, involving SentiWordNet (SWN)-based feature vectors and SVM classifier, for sentiment analysis on a Twitter dataset. Du et al. [24] apply hierarchical machine learning to extract sentiments from opinions about HPV vaccines on Twitter and note the hybrid technique of machine learning and lexicon-based approaches as highly efficient. Geetha et al. [25] forecast sensitive tweets on a dataset involving 280,000 tweets and utilising 23 keywords. They use auto-encoders optimised by word embedding strategies, with tweets labelled manually and the sensitivity of the tweets modelled using an RNN with the framework validated using different activation functions like softmax, sigmoid, and ReLU [25]. They note the framework with softmax and ReLU as achieving a high accuracy in recognising tweets of sensitive nature. Hosseini et al. [26] test CNN's capability in conceptualising semantics from training data and highlight some limitations of the model.
Recently, there has been a significant increase in the use of machine learning to tackle the concerns posed by the recent pandemic [27–30]. For example, robotics and drone technologies have been used to aid the healthcare system and to conduct surveillance and disinfection, among other things [30]. Ghimire et al. [27] conduct a systematic study on various AI and Internet of Things (IoT) approaches for diagnosing, predicting death rate, developing drugs and vaccines, and analysing sentiments linked to COVID-19.
Siedlikowski et al. [28] use Chloe, an AI-based digital information tool developed for COVID-19, to indicate the potential of such systems in proactive collaboration across sectors towards disease management and forming public awareness. Dhakal et al. [29] develop an intelligent voice assistant to assist individuals in self-diagnosing coronavirus symptoms. Khan et al. [30] argue that implementing smart technologies can minimise the detrimental impacts of COVID-19 infection.
Mujahid et al. [31] investigate machine learning approaches along with text processing tools for understanding individuals' sentiments about e-learning during the lockdown following COVID-19. The results reveal the random forest and SVM classifiers as potential classifiers in this respect [31]. Sawik and Płonka [32] present the applicability and potential of various data visualisation tools and applications in various COVID-19 related aspects, including location tracking, quarantine management, and travel management.
The discussion above shows that different approaches have been proposed in the literature to address COVID-19 related issues, as well as sentiment analysis of social media data. Some of these approaches have applied CNN architectures, which have limitations that can affect their prediction accuracy [26]. There are also other works which cover the potential and limitations of different machine learning approaches and deep learning architectures in various domains, in addition to works exploring attention mechanisms in deep learning [33–43]. For semantic analysis, using an LSTM-RNN approach with attention can be helpful in better conceptualising word sequences and enhancing the performance of classifiers in detecting sentiments. Thus, this paper proposes a deep learning approach for sentiment analysis of COVID-19 related tweets.

The Sentiment Classification Architecture
This section presents the sentiment classification architecture for tweet data analysis. The architecture consists of five steps. The first step involves inputting the text dataset of tweets. The second step pre-processes the tweets' text and reduces noise by removing unwanted characters and symbols. In the third step, the text is converted into numerical values based on term frequency. In the fourth step, feature mapping is performed, and the weights assigned by the LSTM-RNN to the mapped features are adjusted to reduce overlap between features; improving the weights with the attention mechanism helps to select relevant information. In the fifth step, a basic softmax classifier is applied. The performance of the proposed approach is measured in terms of its accuracy, precision, and recall. In brief, the main emphasis of the proposed approach is on feature selection or feature engineering by the LSTM-RNN function and attention layer, in which the attention layer assigns efficient weights. The proposed approach and framework thus provide efficient ways of selecting and weighting features in nonlinear space. Figure 1 shows the sentiment classification architecture.
The attention layer is useful in deep learning as it can enhance the performance of neural networks and has brought many developments to the field [39,41,42]. The LSTM model can capture "long term dependencies" within sequences of words [40]. The attention mechanism allows the layer to emphasise a specific part of the input sequence that is highly important [41]. Such a mechanism can be useful when combined with LSTM, since it can focus on the parts of sentences or documents upon which textual analysis, as in this project, is undertaken. The attention mechanism aims at dividing complex tasks into smaller areas of attention that are further processed in a sequence. The model performs visual attention in the same manner as a human brain does, by splitting a complex problem into smaller units and solving them one by one.
The use of attention mechanisms has been popular in recent years due to their ability to improve the performance of neural networks by focusing on important parts of a sequence; they are therefore widely used in applications like emotion recognition, speech recognition, sentiment analysis, and machine translation [39,41,42]. Another benefit is that attention mechanisms can be applied to complex problems flexibly and effectively [41]. By using the attention mechanism, global information is easily captured by a decoder, rather than only the information contained in a hidden state [41,42]. Such a mechanism, especially when mapping a query and pairing it with output vectors, can lead to good classification performance [42].

Methodology
This section presents the methodology adopted in this project for both assessing the potential of the proposed deep learning approach and comparing it with other classifiers in sentiment classification of COVID-19 tweets.

Dataset
COVID-19 tweets were downloaded from the Kaggle website (https://www.kaggle.com/gpreda/covid19-tweets, accessed on 26 November 2021). The dataset has over 170k tweets, which were collected and tested with the proposed algorithm. The experiment focused on sentiment analysis of tweets to capture feelings regarding the COVID-19 pandemic accurately and precisely. The dataset has different columns, including user id, location, and other details, along with the tweet and ratings from other users, such as the number of users who have tagged the tweet as a favourite. Among these, we focus on two aspects in particular: the tweet text and the ratings by users, i.e., the column labelled user favourites.
We converted the user-favourite column into sentiment classes as follows: user_favourite value of 0-100 reviews = negative; 100-2000 reviews = neutral; and 2000 and above reviews = positive. These classification labels, along with the features created, as discussed in the next phase, are the samples fed to the classification models utilised in this experiment.
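The label conversion above can be sketched as a small function. This is an illustrative sketch only: the function name is hypothetical, and because the stated ranges share the boundary values 100 and 2000, the sketch assumes that a boundary value falls into the higher class.

```python
def favourites_to_sentiment(user_favourites: int) -> str:
    """Map a user-favourites count to one of the three sentiment classes."""
    if user_favourites < 0:
        raise ValueError("user_favourites must be non-negative")
    # Assumption: the boundary values 100 and 2000 go to the higher class,
    # because the ranges given in the text overlap at exactly those points.
    if user_favourites < 100:
        return "negative"
    if user_favourites < 2000:
        return "neutral"
    return "positive"


# Example: label a handful of favourite counts.
labels = [favourites_to_sentiment(v) for v in (0, 99, 100, 1999, 2000, 50000)]
print(labels)
```

Making the boundary rule explicit in code keeps the labelling reproducible, whichever convention is eventually chosen.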

Attribute and Train/Test Dataset Formation
The first phase adopted in this experiment entails cleaning and pre-processing the text data to eliminate redundancies by applying the following measures:

• All uppercase characters are converted to lowercase
• All internet slang is removed
• Words that can be safely skipped, such as a, an, etc., are removed
• White spaces, such as blank and empty spaces between words, are removed
• Redundant terms, such as repeated words, are compressed
• The text of hashtags is kept as it is

The second phase then concerns feature extraction by (a) listing term occurrence as well as the frequency of term occurrence, ordering the terms, and disregarding grammar issues such as spelling mistakes, and (b) using machine learning classifiers to process the feature vectors. Here, the Term Frequency-Inverse Document Frequency (TF-IDF) weighting is calculated following [5,20,36]. This results in a matrix, which is vectorised and split into two datasets: the training dataset and the test dataset. We then use multiple machine learning classifiers, including deep learning, random forest, and SVM, apply hyperparameter tuning on the training dataset, and test the classifiers' performance on the test data.
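The two phases above can be illustrated with a minimal pure-Python sketch. Several details are assumptions rather than the authors' implementation: the stop-word list is a tiny hypothetical one, slang removal is omitted because it needs a lexicon that is not given here, and TF-IDF uses the common textbook variant tf(t, d) × log(N / df(t)), which may differ from the exact formula used in [5,20,36].

```python
from __future__ import annotations

import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (not the one used in the study).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}


def preprocess(tweet: str) -> list[str]:
    """Lowercase, drop stop words, squeeze whitespace, compress repeated words."""
    text = tweet.lower()
    # Keep letters, digits, and '#' so that hashtag text is retained as it is.
    text = re.sub(r"[^a-z0-9#\s]", " ", text)
    # str.split() with no argument also removes redundant whitespace.
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    # Compress immediate repetitions such as "stay stay" -> "stay".
    return [t for i, t in enumerate(tokens) if i == 0 or t != tokens[i - 1]]


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per tokenised document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


tweets = [
    "Stay SAFE and stay stay home #COVID19",
    "The vaccine is a relief #COVID19",
    "Lockdown is hard",
]
docs = [preprocess(t) for t in tweets]
weights = tf_idf(docs)
print(docs[0])
print(weights[1])
```

Terms that occur in many tweets, such as a shared hashtag, receive a lower weight than terms specific to one tweet, which is the property the classifiers rely on.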

Additional Steps in LSTM-RNN
As previously indicated, the purpose of this project is to explore a deep learning architecture involving attention layers to understand sentiment classification accuracy from tweet data. Thus, for the proposed LSTM-RNN architecture, there are some additional processing steps. First is the convolution of features. In this phase, the input COVID-19 tweets are fed into the LSTM-RNN model. This stage aims to extract high-level semantic features from the sequence of words. The LSTM-RNN model also finds the temporal relationship between features and generates feature vectors.
Additionally, we consider the semantic meaning of the tweets and create a secondary set of labels, with values assigned to the emotion expressed in each tweet, to train the model: fear = 0, sad = 1, anger = 2, and joy = 3. We acknowledge that this assignment of emotion labels is subjective; its purpose is to create a matrix layer that can be fed to the LSTM-RNN towards achieving sentiment analysis outcomes. Figure 2 shows that the RNN model takes the sequence of pixels $x = x_1, x_2, \ldots, x_n$, produces hidden states $H = H_1, H_2, \ldots, H_n$, and outputs states $O = O_1, O_2, \ldots, O_n$ [41,42].
In the RNN formulation, $W_{H_t O_t}$ represents the weight vector from the hidden unit $H_t$ to the output unit $O_t$, $H_{t-1}$ represents the hidden unit for the $t-1$ pixel sequence, $W_{H_{t-1} H_t}$ is a weight vector from the hidden unit $H_{t-1}$ to $H_t$ for the sequence time $t$, and $b_{H_t}$ and $b_t$ are biases. Figure 2 presents a graphical model for the RNN and the proposed changes, indicating the improved part of the LSTM-RNN. Furthermore, the LSTM stack can be used to learn time sequence features, where the problem comprises a single series of observations and the model is required to learn from the series of past observations to predict the next value in the sequence [12,35]. The cell state is updated as:

$$Ce_t = Ce_{t-1}\, fg_t + ig_t\, p_t \qquad (8)$$

where $ig_t$ represents the input gate, $p_t$ indicates the prediction in starting layers, $fg_t$ represents the forget gate, $H_t$ provides information on output, $b_{ig}$, $b_p$, $b_{fg}$, $b_{op}$ are bias vectors, $Ce_t$ indicates the state of the cell, and $W_{xx}$ is the weight matrix. Both RNN and LSTM models can be combined to extract semantic features from the input tweets.
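To make the cell-state update in Equation (8) concrete, the following sketch runs one LSTM time step with plain Python lists. The gate computations (sigmoid for the input, forget, and output gates, tanh for $p_t$) follow the standard LSTM formulation and are an assumption here, since only the cell-state update is given explicitly in this section; `affine`, `lstm_step`, and `zero_params` are hypothetical helper names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Compute W x + U h + b, with matrices stored as lists of rows."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + b_i
        for w_row, u_row, b_i in zip(W, U, b)
    ]


def lstm_step(x, h_prev, ce_prev, params):
    """One LSTM time step; params maps a gate name to a (W, U, b) triple."""

    def gate(name, activation):
        W, U, b = params[name]
        return [activation(z) for z in affine(W, x, U, h_prev, b)]

    ig = gate("ig", sigmoid)   # input gate
    fg = gate("fg", sigmoid)   # forget gate
    op = gate("op", sigmoid)   # output gate
    p = gate("p", math.tanh)   # candidate values, p_t in the text
    # Equation (8), applied element-wise: Ce_t = Ce_{t-1} * fg_t + ig_t * p_t
    ce = [c * f + i * g for c, f, i, g in zip(ce_prev, fg, ig, p)]
    # Standard LSTM output (assumed): H_t = op_t * tanh(Ce_t)
    h = [o * math.tanh(c) for o, c in zip(op, ce)]
    return h, ce


def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero weights, useful only as a sanity check."""
    W = [[0.0] * n_in for _ in range(n_hidden)]
    U = [[0.0] * n_hidden for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    return {name: (W, U, b) for name in ("ig", "fg", "op", "p")}


# With zero weights every sigmoid gate is 0.5 and p_t is 0,
# so the new cell state is simply half of the previous one.
h, ce = lstm_step([1.0, -1.0], [0.0], [2.0], zero_params(2, 1))
print(h, ce)
```

In practice the weights are learned and the step is applied once per token of the tweet, carrying `h` and `ce` forward through the sequence.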
Notably, in the experimentation, we apply an attention layer to improve learning from features and to improve the feature weights, as also noted in [39]. The LSTM-RNN is used for learning the sequence of sentences and generating features weighted by the attention process. Further, the use of secondary labels combined with the LSTM-RNN facilitates the increase of domain knowledge in the learning process. As indicated in the literature [41], the focus function of the attention layer tests the weight distribution and estimates an array for the different layers. Thus, with $X_i$ as input, $f(X_i, X_{i+1})$ are the features generated from the second layer and $f(X_i, X_{i+1}, \cdots, X_{i+L-1})$ those from the $L$-th layer. These feature values indicate the responses of multi-scale n-grams [23,41], i.e., the unigram $X_i$, the bigram $X_i X_{i+1}$, and the $L$-gram $X_i X_{i+1} \cdots X_{i+L-1}$. In the focus mechanism, the filtered ensemble and reweight scale are used in unison [41,43]. In addition, scale reweighting computes a softmax distribution of attention weights using the descriptors as data, and outputs weighted attribute weights for reweighting [41,42].
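The softmax reweighting described above can be sketched as follows. This is a simplified illustration, not the authors' layer: it assumes that each descriptor is scored by a dot product with a scoring vector (hypothetical name `score_vector`, learned in a real model), and that the attention weights are then used to form a weighted sum of the descriptors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(descriptors, score_vector):
    """Reweight descriptors by softmax attention and sum them."""
    # Assumption: one dot-product score per descriptor.
    scores = [
        sum(w * d for w, d in zip(score_vector, descriptor))
        for descriptor in descriptors
    ]
    weights = softmax(scores)
    dim = len(descriptors[0])
    context = [
        sum(a * descriptor[j] for a, descriptor in zip(weights, descriptors))
        for j in range(dim)
    ]
    return weights, context


# Three toy descriptors, e.g. responses at unigram, bigram, and trigram scale.
descriptors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(descriptors, [1.0, 1.0])
print(weights, context)
```

The descriptor with the highest score receives the largest share of the weight, so its features dominate the pooled vector passed on to the classifier.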
Three performance metrics, namely accuracy, precision, and recall, are used to measure the performance of the proposed deep learning approach. The novelty of the proposed approach is the improvement in feature weighting obtained by using the attention layer mechanism. The proposed approach retrieves text data from a sequence mapped by the LSTM-RNN; the LSTM generates a sequence of annotations for each input. The vectors used in this work are the concatenation of the hidden states in the encoder, and the features are then refined by the attention layer mechanism. The attention mechanism helps in feature weighting, and this is further improved by the softmax activation function.

Accuracy
Accuracy is represented by:

$$\text{Accuracy} = \frac{Po + Ne}{Po + Ne + FPo + FNe}$$

where $Po$ represents correctly labelled positive samples, $Ne$ represents correctly labelled negative samples, $FPo$ represents incorrectly categorised negative samples, and $FNe$ represents incorrectly categorised positive samples.

Precision
Precision is represented by:

$$\text{Precision} = \frac{Po}{Po + FPo}$$

Recall
Recall is represented by:

$$\text{Recall} = \frac{Po}{Po + FNe}$$
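As a worked illustration, the three metrics can be computed directly from the four counts defined under Accuracy. The sketch assumes the standard confusion-matrix definitions; the function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(po, ne, fpo, fne):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    po:  correctly labelled positive samples
    ne:  correctly labelled negative samples
    fpo: negative samples incorrectly categorised as positive
    fne: positive samples incorrectly categorised as negative
    """
    total = po + ne + fpo + fne
    accuracy = (po + ne) / total if total else 0.0
    precision = po / (po + fpo) if (po + fpo) else 0.0
    recall = po / (po + fne) if (po + fne) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
accuracy, precision, recall = classification_metrics(po=40, ne=45, fpo=5, fne=10)
print(f"accuracy={accuracy:.2f} precision={precision:.3f} recall={recall:.2f}")
```

For a multi-class task such as the one in this study, these quantities are typically computed per class and then averaged; the averaging scheme is not specified here.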

Results and Discussion
This study considers three classification techniques: SVM, random forest, and the proposed approach using the attention layer. The parameters of the classifiers are tuned using grid search.
Table 1 shows the parameter tuning outcomes of SVM-RBF, which has two central hyperparameters: C and gamma. These parameters balance generalisation and overfitting. Gamma controls the curvature of the decision boundary: a lower gamma gives a smoother boundary, whereas a higher gamma fits the training data more closely and increases the risk of overfitting. C sets the penalty for misclassified training samples and is tuned together with gamma to limit overfitting. Table 1 shows that low gamma and high C generated better performance compared to the other settings. The SVM hyperparameter tuning shows that at C = 10 and gamma = 0.01, the accuracy is 62.13%, precision is 63.12%, recall is 62%, and F-score is 63%, which is the best model for this classifier for the considered parameters.
Table 2 shows hyperparameter tuning of random forest. The hyperparameters considered are max depth, estimators, and split. These parameters denote the depth of trees, the maximum number of trees generated during the model development, and the minimum number of data points in nodes before a split occurs. As noted in the table, when the minimum split is 5, the maximum depth is 10, and the number of estimators is 600, the model's accuracy is 60%, precision is 61%, recall is 62%, and F-score is 60%. This is the best model for the variation of parameters considered.
Table 3 similarly shows hyperparameter tuning for attention layers with Leaky ReLU as the activation function and with different numbers of attention layers. In these experiments, the number of CNN layers was fixed because CNN layers increase resource usage exponentially. The number of attention layers varies from 1 to 8. As noted in the table, the model's performance increases when the number of attention layers increases from 1 to 4, after which the performance degrades. With four attention layers, the model's accuracy is 86.12%, precision is 84.23%, recall is 85.23%, and F-score is 85.12%. This performance is better than the other observed results. Table 3 thus shows that the best deep learning approach is based on the network with four attention layers, which provides the best accuracy, precision, recall, and F-score.
Table 4 shows the activation function results, which show that four attention layers with the Leaky ReLU activation function provided higher accuracy, recall, and F-score than the other activation functions. For Leaky ReLU, the accuracy is 85.12%, precision is 82.12%, recall is 84.13%, and F-score is 84.12%; for ReLU, the accuracy is 84.56%, precision is 82.34%, recall is 82.12%, and F-score is 81.23%. Leaky ReLU thus has better accuracy, recall, and F-score than the other activation functions considered (TANH, sigmoid, and ReLU), while its precision is slightly lower than that of ReLU. Leaky ReLU is an activation function that influences forward and backward training in LSTM-RNN and controls error using a backpropagation approach. This mechanism potentially has a direct impact on the proposed model's classification performance and error reduction.
We further tuned the number of training epochs for the proposed approach with four attention layers and Leaky ReLU. Table 5 and Figure 3 show that epoch 18 generated the best results.
In Table 6, we compare the proposed approach with other machine learning and deep learning approaches, including naïve Bayes, logistic regression, and LSTM-RNN with the base hyperparameters, and SVM and random forest with the best-identified hyperparameters. In comparison, the proposed deep learning approach was found to perform better than the other existing approaches, with an accuracy of 84.56%, precision of 82.34%, recall of 82.12%, and F-score value of 81.23%.
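The difference between ReLU and Leaky ReLU that underlies the activation-function comparison can be seen in a few lines. The sketch below assumes a negative-side slope of 0.01 for Leaky ReLU, a common default; the slope used in the experiments is not stated in this section.

```python
def relu(z):
    return max(0.0, z)


def relu_grad(z):
    # The gradient is zero for negative inputs, so such a unit stops updating.
    return 1.0 if z > 0 else 0.0


def leaky_relu(z, alpha=0.01):
    # alpha is an assumed slope for the negative side.
    return z if z > 0 else alpha * z


def leaky_relu_grad(z, alpha=0.01):
    # A small non-zero gradient keeps the unit trainable for negative inputs.
    return 1.0 if z > 0 else alpha


for z in (-2.0, 0.5):
    print(z, relu(z), relu_grad(z), leaky_relu(z), leaky_relu_grad(z))
```

Because the Leaky ReLU gradient never vanishes entirely, backpropagation can still adjust units whose pre-activations are negative, which is the usual explanation for its robustness against dying neurons.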

Conclusions
Over the years, several approaches have been developed for sentiment analysis of social media data. This sentiment analysis process is usually complex and time-consuming due to the huge amount of data and the requirement to achieve a high level of accuracy. Thus, this paper presents a deep learning approach for sentiment analysis of Twitter data on COVID-19 reviews. The algorithm is based on an LSTM-RNN network with feature weighting enhanced by an attention layer, which provides an enhanced feature transformation framework. A total of four class labels (sad, joy, fear, and anger) from publicly available Twitter data posted in the Kaggle database were used in this study. In comparison with current approaches, the proposed deep learning approach significantly improved the performance metrics, with increases of 20% in accuracy, 10% to 12% in precision, and 12–13% in recall. Out of a total of 179,108 COVID-19-related tweets, tweets with positive, neutral, and negative sentiments were found to account for 45%, 30%, and 25%, respectively. Overall, the proposed deep learning approach is found to be efficient and practical and can be easily implemented for sentiment classification of COVID-19 reviews.
This study provides theoretical and practical implications. In terms of theoretical implications, this study applies a deep learning approach to the sentiment analysis of individuals from Twitter data on information regarding COVID-19. This deep learning approach can be further applied to sentiment analysis for general decision-making problems in various sectors such as marketing, government, services, and academia. The relevance of this study is also evident from the recent COVID-19 situation, in which information expressed over social media has affected public sentiments [44]. In terms of practical implications, this study suggests that the proposed deep learning approach can be adopted and modified to achieve a good level of accuracy, especially when considering the complexities entailed in textual analysis.
This study is not free from limitations. Feature weighting and feature mapping were applied to the original dataset, and other features that are noisy, as well as a combination of these factors, may affect the classification outcomes. In future work, the deep learning approach can be designed to optimise the features in an iterative process. It can also be enhanced to work so that topic detection and sentiment classification are performed simultaneously.