Modified Aquila Optimizer with Stacked Deep Learning-Based Sentiment Analysis of COVID-19 Tweets

Abstract: In recent times, global cities have been transforming from traditional cities to sustainable smart cities. These cities face critical issues, namely urban traffic management, urban living quality, urban information security, urban energy usage, and urban safety. Artificial intelligence (AI)-based applications, including text sentiment analysis (SA), play important roles in dealing with these crucial challenges. In such scenarios, the classification of COVID-19-related tweets for text SA involves natural language processing (NLP) and machine learning methodologies to classify tweet datasets based on their content. This assists in disseminating relevant information, understanding public sentiment, and promoting sustainable practices in urban areas during the pandemic. This article introduces a modified aquila optimizer with a stacked deep learning-based COVID-19 tweet classification (MAOSDL-TC) technique for text SA. The presented MAOSDL-TC technique incorporates FastText, an effective and powerful text representation approach used for the generation of word embeddings. Furthermore, the MAOSDL-TC technique utilizes an attention-based stacked bidirectional long short-term memory (ASBiLSTM) model for the classification of sentiments that exist in tweets. To improve the detection results of the ASBiLSTM model, the MAO algorithm is applied for the hyperparameter tuning process. The presented MAOSDL-TC technique is validated on a benchmark tweets dataset. The experimental outcomes implied the promising results of the MAOSDL-TC technique compared to recent models in terms of different measures. The MAOSDL-TC technique improves both the accuracy and the interpretability of sentiment prediction.


Introduction
Social media platforms play an important part during extreme crises, as individuals use these communications media to share feedback, sentiments, thoughts, and reactions with other people to manage and respond to crises [1]. Thus, this study focuses on exploring collective reactions to events expressed on social platforms [2]. Special consideration will be given to analyzing the public's responses to worldwide medical events, particularly the pandemic, described through Twitter's social network, due to its widespread reputation and ease of access using the application programming interface (API) [3]. Sentiment analysis (SA) is a kind of technique employed to represent, separate, or define personal data, such as ideas communicated in a given content, depending on natural language processing (NLP) and computational methods [4]. The major goal of SA is to define the author's feelings as negative, positive, or neutral regarding different subjects [5]. To evaluate the effects of social media information relevant to the COVID-19 pandemic, research associated with people's opinions on medical information and applications gained major significance [6]. In particular, text analysis of Twitter information has been the emphasis of several reviews, allowing researchers to analyze massive instances of user-generated content to find views, which can inform decision-making and early reaction mechanisms [7]. The Twitter platform has been undergoing a large infusion of data relevant to COVID-19 problems [8]. For SA, researchers have been using different kinds of textual documents such as Facebook posts and tweets [9].
Several research works on SA using social media data are available in the literature [10]. Identification of such sentiments from social media can help respondents comprehend network dynamics, for example, panics, users' important problems, and emotional impacts on members [11]. This study aims to examine the application of deep learning (DL) methods and natural language processing (NLP) approaches, namely SA, to help policymakers and communities avoid the growth of misleading information, incitement of insurrection, and fake news [12]. SA, or public view mining, can be described as a way of employing machine learning (ML) and NLP for the classification of sentiments and subjective data [13]. SA is among the most common research fields in the domain of NLP, as it provides the ability to study and analyze sentiments that are expressed by various individuals [14].
This article introduces a modified aquila optimizer with stacked deep learning-based tweet classification (MAOSDL-TC) technique for text SA. The presented MAOSDL-TC technique incorporates FastText, an effective and powerful text representation approach used for the generation of word embeddings. Furthermore, the MAOSDL-TC technique utilizes an attention-based stacked bidirectional long short-term memory (ASBiLSTM) model for the classification of sentiments that exist on Twitter. To improve the detection results of the ASBiLSTM model, the MAO algorithm is applied for the hyperparameter tuning process. The presented MAOSDL-TC technique is validated on a benchmark COVID-19 tweets dataset.

Related Works
Qorib et al. [15] downloaded public tweets daily from Twitter using the Twitter API and pre-processed and labelled them. Vocabulary normalization was based mainly on stemming and lemmatization. The NRCLexicon method was used to transform tweets into 10 different classes. A t-test was deployed to check the statistical significance of the relationship between the sentiments. Lastly, neural networks including bidirectional encoder representations from transformers (BERT), a 1-dimensional convolutional neural network (1DCNN), long short-term memory (LSTM), and a multilayer perceptron (MLP) were trained and tested. In [16], an approach was introduced that was designed to provide an ensemble module where the advantages of automatic feature extraction and handcrafted features were linked through ML and DL algorithms. Before training the ML techniques, unstructured information was obtained, pre-processed, and annotated using VADER and TextBlob. Sunagar et al. [17] implemented tweet classification of COVID-19 datasets via DL approaches. The algorithm was executed using two word embedding methods, Word2Vec and Global Vectors for Word Representation (GloVe).
In [18], the researchers presented an NLP technique based on the bidirectional LSTM (BiLSTM) method to implement sentiment classification and detect several problems related to public sentiment on COVID-19. BiLSTM is an enhanced version of the classical LSTM that generates outputs from the right and left contexts at every time step. This enabled authorized institutions utilizing this model to alleviate the effect of negative messages and to understand people's concerns. Tatineni et al. [19] presented a technique to evaluate the emotion of live tweets. The technique comprised a dashboard with different functionalities. The central dashboard had a clickable map of India that illustrated statewide data visualization and had country-wide data visualization of the emotion drawn from Twitter. Live emotion prediction of tweets was accomplished using DL techniques, with dynamic tweet fetching to obtain new data automatically. Vaddadi et al. [20] developed a technique that used automation to extract details regarding COVID-19 from up-to-date tweet data. The SA uses LSTM, a kind of recurrent neural network (RNN), applied to Twitter's COVID-19 hashtags to see people's reactions to the outbreak. The tweet datasets are then categorized and labelled as positive, negative, or neutral, and the results visualized.
Chakraborty et al. [21] presented SA on a collection of tweets gathered on COVID-19. First, they analyzed the trends of public sentiments related to COVID-19 using n-gram analysis and evolutionary classification. Next, a sentiment rating was calculated on the gathered tweets based on the class. Lastly, the LSTM model was trained on two classes of rated tweets to forecast sentiment on the COVID-19 dataset. Tawfik and Makhlouf [22] analyzed public opinions on the program of vaccination against COVID-19. To achieve this, an ensemble mechanism based on DL was established, which fused LSTM and a bidirectional gated recurrent unit (BiGRU). The accuracy of the presented algorithm was compared with five different ML techniques and two DL algorithms using advanced approaches.
Raheja and Asthana [23] implemented an SA of tweets during lockdown utilizing a multinomial logistic regression approach. The presented methodology followed pre-processing, polarity scoring, and feature extraction before executing the ML approach. In [24], a novel algorithm was presented for automatic sentiment classification of COVID-19 tweets utilizing ANFIS approaches. Jain et al. [25] aimed to analyze the performance of many classification techniques that take an input value and identify the resultant class to which it belongs. Six ML approaches, two ensemble systems, and four DL methods were utilized for this work. In [26], the R programming language was used to conduct an investigation of Twitter data. In this case, the authors proposed a method named Hybrid Heterogeneous SVM (H-SVM), carried out the sentiment classification, and categorized tweets as negative, neutral, or positive.

The Proposed Model
This article concentrates on the development of the MAOSDL-TC technique for text SA. The MAOSDL-TC technique mainly concentrates on the recognition and categorization of different kinds of sentiments in COVID-19 tweets. In the presented MAOSDL-TC technique, the following set of processes is involved, namely pre-processing, FastText, ASBiLSTM-based classification, and MAO-based parameter selection. Figure 1 depicts the workflow of the MAOSDL-TC algorithm.

Data Pre-Processing and Word Embedding
Text preprocessing is the technique used to clean the original text data. A robust text pre-processing technique is crucial for NLP applications. After preprocessing, the resulting text components act as the key elements of input that are fed into the processing of textual data. Preprocessing consists of different approaches for transforming the original texts using a well-defined method: removal of special characters or symbols, lemmatization, elimination of stopwords, and lexical analysis (ignoring case sensitivity, word tokenization, and removal of punctuation). Afterwards, the FastText method is employed for word embedding. FastText is a widely used text representation method that generates word embeddings, which are dense vector representations of words. This embedding captures the semantic meaning of an individual word along with its subword information and morphological structure. In particular, this makes FastText more effective in handling out-of-vocabulary words and capturing the relationship between words with related prefixes or suffixes. FastText works by treating a word as a mixture of subword units (character n-grams). This enables it to create embeddings for known and unknown words by leveraging the subword components.
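As a concrete illustration of the subword idea (a sketch, not the paper's implementation), the following function enumerates the character n-grams that FastText would associate with a word; the boundary markers "<" and ">" and the 3-6 n-gram range follow FastText's documented defaults, and the function name is ours.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word, with boundary markers."""
    marked = f"<{word}>"  # "<" and ">" distinguish prefixes/suffixes
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams

# The embedding of "covid" is built from the vectors learned for these
# subword units, so an unseen word such as "covidiot" still shares many
# n-grams with in-vocabulary words.
print(char_ngrams("covid", 3, 4))
```

Because the word vector is assembled from these shared n-gram vectors, two words with related prefixes or suffixes end up close in the embedding space even if one of them never appeared in the training corpus.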

Tweet Data Classification Using ASBiLSTM Model
Once the tweets are preprocessed, classification takes place using the ASBiLSTM model. In this study, we used the ASBiLSTM model as an essential element of the presented method, which has the benefit of simultaneously extracting temporal features of time series [27]. The BiLSTM is an augmentation of the LSTM. The LSTM is a kind of RNN that overcomes the vanishing gradient problem of the RNN through the inclusion of a gating module. In comparison with the RNN, the LSTM is composed of memory cells and forget, input, and output gates, in which the cell memory is responsible for storing a summary of the historical input series, and the gate modules control the flow of information between the input and output. The LSTM aids efficient learning of long-term temporal dependency relationships thanks to its well-developed structure.
Consider c_{t−1} as the memory cell state of the prior time step t−1, x_t as the input vector at time step t, and h_{t−1} as the hidden layer of the prior time step t−1. f_t, i_t, and o_t denote the gate vectors that control how much data is to be forgotten, updated, and output from the memory cell, correspondingly. The operation of the LSTM is formulated by the following expressions:

f_t = σ(W_f · [h_{t−1}, x_t] + b_f)
i_t = σ(W_i · [h_{t−1}, x_t] + b_i)
o_t = σ(W_o · [h_{t−1}, x_t] + b_o)
c̃_t = Tanh(W_c · [h_{t−1}, x_t] + b_c)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t
h_t = o_t ⊙ Tanh(c_t)

From these expressions, the Tanh function ensures that the value of the hidden layer remains in the interval [−1, 1]. σ(·) indicates the sigmoid function, and the symbol ⊙ shows pointwise multiplication. The learnable parameters W and b are the weights and biases during model training, respectively.
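The gating computation can be sketched for a single one-dimensional cell as follows; the weight dictionary W and bias dictionary b are hypothetical placeholders rather than trained values, and each gate sees the pair (h_prev, x_t) in place of the concatenated vector.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step for a 1-D cell. W[k] = (weight on h_prev, weight on x_t)
    and b[k] is the bias, for k in {"f", "i", "o", "g"} (forget, input,
    output, candidate). These are illustrative, untrained parameters."""
    f_t = sigmoid(W["f"][0] * h_prev + W["f"][1] * x_t + b["f"])    # forget gate
    i_t = sigmoid(W["i"][0] * h_prev + W["i"][1] * x_t + b["i"])    # input gate
    o_t = sigmoid(W["o"][0] * h_prev + W["o"][1] * x_t + b["o"])    # output gate
    g_t = math.tanh(W["g"][0] * h_prev + W["g"][1] * x_t + b["g"])  # candidate
    c_t = f_t * c_prev + i_t * g_t   # memory cell: keep old + write new
    h_t = o_t * math.tanh(c_t)       # hidden state, bounded in [-1, 1]
    return h_t, c_t
```

Running the step over a sequence (feeding h_t and c_t back in) is what lets the cell carry a summary of the historical inputs forward while the gates decide what to keep, update, and emit.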
The BiLSTM incorporates a bidirectional structure into the LSTM that exploits forward and backward LSTMs for feature extraction and concatenates the respective hidden features to extract bidirectional patterns. Accordingly, the BiLSTM attains context data from the previous observations for the entire input. This bidirectional extraction over the time series simplifies the capture of backward and forward temporal attributes in the sequence data, considering their variation patterns. With this context feature, the BiLSTM allows a hybrid model to attain feature extraction capabilities and better representation, enabling more accurate and efficient prediction of future observations by leveraging past observations.
In particular, the BiLSTM trains its parameters in backward and forward paths to realize the context. In the backward layer, the LSTM estimates the derivative of the transmission errors of the forward layer. The LSTM updates the parameters in the conventional way in the forward layer. Considering an input of length T, the operational procedure is shown below:

→h_t = LSTM(x_t, →h_{t−1})
←h_t = LSTM(x_t, ←h_{t+1})
H_t = [→h_t ; ←h_t]

where H_t indicates the hidden layer (HL) of the BiLSTM at time step t, and →h_t and ←h_t signify the HLs in the forward and backward layers at time step t.
In the ASBiLSTM, an attention module is used to optimize the prediction outcomes. Figure 2 signifies the framework of the ASBiLSTM. The attention module is a weighting mechanism over sequences that allocates greater weight to targets with higher correlation. The attention module minimizes the loss of prior data and extracts relevant information by highlighting the contribution of the most powerful and useful parts of the input to the outputs. In the DL technique, the attention module allocates weights to the output of the BiLSTM by mapping the weights, so that the learning parameter matrix can be focused on the input that contributes to the outputs. As shown in Equations (3) to (5), a series of outputs H_1, H_2, . . ., H_t from the HL of the BiLSTM is fed as input to the attention model, and the distribution of attention weights is attained. Equation (3) defines the computation of the similarities or correlations between the input and output features. Equation (4) shows the computation of the attention weights by normalizing the scores. Equation (5) indicates the computation of the last state of the attention mechanism.
where V_e and W signify the weighted coefficients of the parameters learned during model training, e_i indicates the distribution probability at the ith time step, and b shows the bias.
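A minimal sketch of the score-normalize-combine pattern of Equations (3) to (5), assuming an additive tanh scoring function (the exact score in the paper is not reproduced here); the parameters v, W, and b are illustrative scalars, and the hidden states are treated as scalars for readability.

```python
import math

def attention(H, v, W, b):
    """Attention over BiLSTM hidden states H_1..H_t (given as scalars).
    Scores e_i are softmax-normalized into weights that sum to 1, and the
    final state is the weighted sum of the hidden states."""
    scores = [v * math.tanh(W * h + b) for h in H]   # Eq. (3): similarity scores
    m = max(scores)                                  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]              # Eq. (4): normalized weights
    context = sum(a * h for a, h in zip(weights, H)) # Eq. (5): final attention state
    return weights, context
```

Because the weights form a probability distribution, the context is a convex combination of the hidden states, so the most correlated time steps dominate the output without the others being discarded entirely.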


Hyperparameter Tuning Using MAO Algorithm
The MAO algorithm can be applied in this work for the hyperparameter tuning of the ASBiLSTM module.The AO mainly depends upon the prey-grabbing nature of the Aquila.AO is a population-based algorithm which exhibits its effectiveness in the field of complex and nonlinear optimization in a short period of time.The classical AO principally focuses on five significant steps namely initialization, expanded exploration, narrowed exploration, expanded exploitation and narrowed exploitation.
The MAO was introduced in this study [28] by modifying the search control factor (SCF) of the IAO to make further amendments to the AO. The convergence properties of the SCF decelerate the accuracy over the epochs in the IAO, and these properties may be responsible for certain challenges in searching for an optimum result. To overcome these challenges, a modified version of the IAO was introduced that integrates a modified search control factor (MSCF), particularly adapted to the 2nd and 3rd search processes. The subsequent section provides a detailed description of the MAO technique, highlighting the modifications that were made and their effects on the optimization technique. The MSCF is used to control the search range, which reduces the movement of the Aquila over the epochs. Accordingly, compared to the prior SCF, the search space is considerably narrower. Furthermore, the optimum solution is found considerably more quickly than in the prior technique. The MSCF is defined as follows: where t denotes the current iteration and T shows the maximum iteration. The parameter r is a random number ranging from zero to one, and dir indicates the direction control factor. These factors play a major role in controlling the flight direction of the Aquila.
The MSCF function aims to attain fast convergence by restricting the movement of the Aquila. Furthermore, it decreases optimization latency. The modified technique needs less time to recognize the optimum solution set than the original AO. Both optimization approaches were performed with a population size of 250 and 250 epochs.
With the incorporation of the MSCF function, the presented technique includes four different search stages, discussed in the following. Step 1: Vertical Dive Attack (S_1). The Aquila begins its hunt by identifying the target region and selecting the optimum hunting position by swooping high in the air. These attacks are called vertical dive attacks and are expressed as follows: In Equation (8), S_1(t + 1) denotes the solution candidate at epoch (t + 1), r shows a random number in the interval [0, 1], and S_best(t) shows the best solution attained up to the tth generation. (1 − t/T) is used for controlling the search region. S̄(t) denotes the mean value of the existing solutions at the tth epoch.
Step 2: Modified Full Search with a Short Glide Attack (MS_2). Before attacking the prey, the Aquila comprehensively searches the solution space via different directions and speeds, in what is called a full search with shorter glide attacks, which can be shown as follows: In Equation (9), x and y correspond to the positions or coordinates of the point forming the spiral shape during the search step, r indicates a random number within [0, 1], and MSCF(t) denotes the modified search control factor. Rather than applying the Levy flight (LF) distribution, we integrated the MSCF to eliminate the problem of getting stuck in a locally optimal solution.
Step 3: Modified Search Around Prey and Attack (MS_3). The prey's region is located accurately after the MS_2 search step. The Aquila thoroughly explores around the target and, with pseudo attacks, recognizes the prey's reaction in what is called a search around prey and attack.
In Equation (10), S_R(j) denotes a random set of solutions and MS_3(i, j) indicates the existing solution at epoch t.
Step 4: Walk and Grab Attack (S_4). Finally, the Aquila attacks from above based on the prey's movement in the 4th search approach. This search process can be denoted as "Walk and Grab Prey", where S_4(t + 1) represents the solution attained so far, and lev(D) shows the Levy distribution for the D-dimensional range. QF indicates the quality function for balancing the search process, G_1 denotes each kind of movement of the Aquila during the hunt, and G_2 shows the flight slope while hunting.
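The four-stage search can be summarized, very loosely, by the following minimization sketch. The actual update rules of Equations (8)-(10), the Levy flight, and the MSCF formula are not reproduced here; the shrinking step factor below merely imitates the narrowing search range that the MSCF provides, and all names are ours.

```python
import random

def mao_sketch(f, dim, bounds, pop=20, iters=100, seed=0):
    """Highly simplified AO/MAO-style loop (minimization). Early iterations
    explore widely around the best solution; later iterations exploit with
    small local perturbations, with the step size shrinking over epochs."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    best = min(X, key=f)
    for t in range(iters):
        shrink = 1.0 - t / iters  # narrowing factor, stand-in for the MSCF
        for i, x in enumerate(X):
            if t < iters * 2 // 3:  # exploration stages (wide moves near best)
                cand = [b + rng.uniform(-1, 1) * shrink * (hi - lo) for b in best]
            else:                   # exploitation stages (small local moves)
                cand = [xi + rng.uniform(-1, 1) * shrink * 0.1 * (hi - lo) for xi in x]
            cand = [min(hi, max(lo, c)) for c in cand]  # keep inside bounds
            if f(cand) < f(x):      # greedy acceptance
                X[i] = cand
        best = min(X + [best], key=f)
    return best
```

In the actual technique, each solution candidate would encode a hyperparameter configuration of the ASBiLSTM and f would be the validation fitness described below.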
The fitness choice is a key component of the MAO method. Classifier performance is applied to measure the quality of a solution candidate; the performance value is the foremost criterion applied to develop the fitness function (FF).
where TP and FP indicate the true and false positive values.
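A minimal sketch of such a fitness function built from the quantities named above; TP/(TP + FP) is the standard precision formed from true and false positives, and the paper's exact FF formula is assumed rather than quoted.

```python
def fitness(tp, fp):
    """Precision-style fitness value: the hyperparameter tuner prefers
    candidates that produce fewer false positives. tp and fp are the true
    and false positive counts of a candidate model."""
    return tp / (tp + fp) if (tp + fp) else 0.0
```

During tuning, each MAO solution candidate would be scored by training/evaluating the ASBiLSTM with its hyperparameters and feeding the resulting counts into this function.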

Results and Discussion
The performance validation of the MAOSDL-TC method on the sentiment classification of COVID-19 tweets takes place using the Kaggle dataset [29], which holds 2750 samples with 11 classes, as portrayed in Table 1.

Result Analysis
A brief result of using the MAOSDL-TC technique on COVID-19 tweet classification is illustrated in Table 2 and Figure 4. The obtained results state that the MAOSDL-TC technique properly recognized all classes. On 70% of the TR set, the MAOSDL-TC technique provided an average accuracy of 99.19%, precision of 95.63%, recall of 95.55%, F-score of 95.54%, and JI of 91.49%. In addition, on 30% of the TS set, the MAOSDL-TC approach attained an average accuracy of 99.45%, precision of 97.15%, recall of 96.89%, F-score of 96.99%, and JI of 94.18%.
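The averaged measures reported above can be computed from a confusion matrix as follows; this is a standard macro-averaging sketch (per-class metrics averaged over the 11 classes), not the authors' evaluation code, and the exact averaging used in the paper is assumed.

```python
def macro_metrics(conf):
    """Macro-averaged precision, recall, F-score, and Jaccard index (JI)
    from a confusion matrix where conf[i][j] counts class-i samples that
    were predicted as class j."""
    k = len(conf)
    prec = rec = f1 = ji = 0.0
    for c in range(k):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(k)) - tp  # predicted c, wrongly
        fn = sum(conf[c][r] for r in range(k)) - tp  # actual c, missed
        p = tp / (tp + fp) if tp + fp else 0.0
        r_ = tp / (tp + fn) if tp + fn else 0.0
        prec += p
        rec += r_
        f1 += 2 * p * r_ / (p + r_) if p + r_ else 0.0
        ji += tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {name: v / k for name, v in
            [("precision", prec), ("recall", rec), ("f_score", f1), ("ji", ji)]}
```

Note that the JI is always the strictest of these measures (it penalizes both false positives and false negatives in one ratio), which is consistent with the JI values above being the lowest of the reported averages.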

Conclusions
This article has concentrated on the development of the MAOSDL-TC method for the classification of text sentiments in COVID-19 tweets. The MAOSDL-TC technique mainly concentrates on the recognition and categorization of different kinds of sentiments in COVID-19-related tweets. In the presented MAOSDL-TC technique, the following set of processes was involved, namely pre-processing, FastText, ASBiLSTM-based classification, and MAO-based parameter selection. In this work, the ASBiLSTM model was used for the classification of sentiments existing in the tweets. Lastly, the MAO system was applied for the hyperparameter tuning process, which aids in improving the detection results of the ASBiLSTM model. The presented MAOSDL-TC method was validated on the benchmark tweets dataset. The experimental outcomes, with a maximum accuracy of 99.45%, suggested the promising results of the MAOSDL-TC technique compared to recent models. The MAOSDL-TC technique not only improves accuracy but also enhances the interpretability of sentiment prediction. These results confirmed that the MAOSDL-TC technique exhibits enhanced performance over recent models.


Figure 2 .
Figure 2. The architecture of the ASBiLSTM technique.


Figure 3
Figure 3 represents the classifier performance of the MAOSDL-TC technique on the test database. Figure 3a,b shows the confusion matrices achieved by the MAOSDL-TC technique.

Figure 5
Figure 5 illustrates the training accuracy (TR_accuracy) and validation accuracy (VL_accuracy) of the MAOSDL-TC approach. The TR_accuracy is determined by evaluating the MAOSDL-TC technique on the TR dataset, whereas the VL_accuracy is computed by evaluating performance on a separate testing dataset. The results exhibit that TR_accuracy and VL_accuracy increase with the number of epochs. As a result, the performance of the MAOSDL-TC technique improves on the TR and TS datasets as the number of epochs rises.

Figure 4 .
Figure 4. Average outcome of the MAOSDL-TC approach on the 70:30 split of the TR set/TS set.


Figure 5 .
Figure 5. Accuracy curve of the MAOSDL-TC approach. In Figure 6, the TR_loss and VL_loss results of the MAOSDL-TC approach without optimization are revealed. The TR_loss defines the errors between the predicted and actual values on the TR dataset.

Figure 7 .
Figure 7. Comparative outcome of MAOSDL-TC algorithm with recent methods.


Table 1 .
Description of database.

Table 3 .
Comparative outcome of MAOSDL-TC algorithm with recent methodologies.
