Emotion AI-Driven Sentiment Analysis: A Survey, Future Research Directions, and Open Issues

Abstract: The essential use of natural language processing is to analyze the sentiment of the author via the context. This sentiment analysis (SA) is said to determine the exactness of the underlying emotion in the context. It has been used in several subject areas such as stock market prediction, social media data on product reviews, psychology, judiciary, forecasting, disease prediction, agriculture, etc. Many researchers have worked on these areas and have produced significant results. These outcomes are beneficial in their respective fields, as they help to understand the overall summary in a short time. Furthermore, SA helps in understanding actual feedback shared across different platforms such as Amazon, TripAdvisor, etc. The main objective of this thorough survey was to analyze some of the essential studies done so far and to provide an overview of SA models in the area of emotion AI-driven SA. In addition, this paper offers a review of ontology-based SA and lexicon-based SA along with machine learning models that are used to analyze the sentiment of the given context. Furthermore, this work also discusses different neural network-based approaches for analyzing sentiment. Finally, these different approaches were also analyzed with sample data collected from Twitter. Among the four approaches considered in each domain, the aspect-based ontology method produced 83% accuracy among the ontology-based SAs, the term frequency approach produced 85% accuracy in the lexicon-based analysis, and the support vector machine-based approach achieved 90% accuracy among the other machine learning-based approaches.


Introduction
Sentiment analysis (SA) refers to uncovering the human emotion that is conveyed within a context. It makes it possible to predict the emotion, attitude, or even the personality of a person, as expressed through different aspects. Sentiment analysis identifies the human emotion underlying the context, which enables machines to understand these emotions accurately. Initially, knowledge or opinions were shared in person among family members, neighbors, friends, relatives, etc. Now, with the evolution of technology, most of these exchanges happen online, where SA plays a significant role. Technology has provided a platform for one to be exposed to thousands of opinions in minutes [1]. For example, a person can post their views on a social issue or on a product they have recently bought.

Step 1: Data Collection
A dataset needs to be collected. For example, tweets can be gathered as a dataset by using the Twitter API; ROAuth is required to authorize the application [10]. The dataset could comprise more than 5000 tweets, varying in size according to the data needed. It should include all three types of data: structured data, which are held in an organized format in the repository [11]; semi-structured data, which are formatted in the form of structured data; and unstructured data, which are not organized and do not follow any pre-defined model [12]. The types of data are depicted in Figure 3.

Step 2: Training Dataset and Subjective Data
Two types of datasets are used to train the classifier: subjective data and objective data. The subjective dataset carries the sentiment within the context, while the objective dataset does not include the sentiment or emotion of the specific situation [13]. Emotional data contain the opinion on a particular circumstance and convey the emotion in two ways: happy or sad. An adequate number of negative and positive views (tweets) was gathered over two consecutive days to prepare the dataset for training the classifier.

Step 3: Data Pre-Processing
Preprocessing is the initial phase of sentiment analysis, and it is done before semantically examining the vocabulary [14]. Consider again the example of Twitter, a platform where individuals from different parts of the world share their perspectives as tweets in different languages. These tweets may contain noisy unstructured data, for instance, stop words, non-English words, and punctuation marks [15]. Such unstructured data are very common in tweets. In the preprocessing step, the tweet is split based on part-of-speech (POS) tags. Data preprocessing includes handling URLs, filtering, removing interrogative statements and stop words, removing special characters, excluding retweets, removing hashtags, removing emojis and pictures, removing languages other than English, and converting capitalized letters to lowercase [16].
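The preprocessing steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in [16]: the stop-word list is a hypothetical toy list, retweets are assumed to start with "RT ", interrogative statements are assumed to end with "?", and language filtering and POS tagging are left out because they need external resources.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "this"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet):
    """Return a list of cleaned tokens, or None if the tweet is excluded."""
    stripped = tweet.strip()
    # Exclude retweets (assumed to start with "RT ") and interrogative statements.
    if stripped.startswith("RT ") or stripped.endswith("?"):
        return None
    text = URL_RE.sub(" ", stripped)      # remove URLs
    text = HASHTAG_RE.sub(" ", text)      # remove hashtags
    text = text.lower()                   # convert capital letters to lowercase
    text = NON_LETTER_RE.sub(" ", text)   # drop emojis, digits, punctuation, special characters
    return [tok for tok in text.split() if tok not in STOP_WORDS]


print(preprocess("Loving the NEW phone!!! 😍 #happy http://t.co/abc"))
```

The function returns None for excluded tweets so that a caller can filter them out before classification.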

Comparison with Previous Surveys
Various approaches have been proposed for sentiment analysis. Nevertheless, to the best of the authors' knowledge, sentiment analysis is still at an early stage and continues to change. A few systematic surveys have shaped this area well, and the existing solutions work well with the current advancements. The principal difficulty is that sentiment analysis is a multi-faceted problem that comprises various sub-problems; it is not a single task [17]. Additionally, the existing surveys of sentiment analysis either concentrate on describing particular technical topics or focus primarily on a specific part of sentiment analysis [18].
Among the studies that have been proposed recently, the one by Yue et al. [19] summarized highly important research topics in the fields of SA and opinion mining (OM). This work is viewed as a reference book on sentiment analysis and the opinions that are extracted from it; [20] focuses on multimodal SA, addressed with both supervised and unsupervised models. It also detected sentiment automatically from the context and tested various machine learning (ML) approaches. Table 1 depicts the comparison of the previous surveys.

Table 1. Comparison of previous surveys.

| Reference Number | Survey Objective | Survey Outcome |
| --- | --- | --- |
| [21] | To survey the development of sentiment analysis on images, videos, blogs, etc. To discuss visualized sentiment analysis and speech sentiment analysis. | It focused on the methodology to be used to provide the sentiment for both speech and visualization. It discussed how the automatic sentiment analyzer could be proposed for both speech and visual data. |
| [22] | To focus on increasing precision and reducing the false rate of sentiment analysis. | The evaluation metrics were discussed, and they showed the experimental results. It explained how the supervised and unsupervised algorithms work. |
| [23] | To conduct a sentiment analysis for the communities. | It discussed fine-grained sentiment analysis and algorithm classification. |
| [24] | To depict the n-gram and unigram, and to focus on interpreting the sentiment in every single sentence. | It discussed the neural network-based approach and word embedding and elucidated how the recurrent neural network works. |
| [25] | To conduct a survey to implement Twitter sentiment analysis. | It presented the methodology of HybridSeg and discussed subjective data. |

The rest of this survey paper is organized as follows: Section 2 presents the survey of the SA, Section 3 presents the various methodologies of the SA, Section 4 presents the results and discussions, and Section 5 elucidates the challenges, future research directions, and open issues.

Literature Review
This chapter presents an in-depth analysis of SA which is based on four approaches, namely, ontology-, lexicon-, machine learning-, and neural network-based approaches.

Ontology-Based Sentiment Analysis: Review
In recent developments in AI, ontologies have played an essential role in showing the relationships among class hierarchies using the concept of object-oriented programming. Ontology is defined as the precise labeling and description of the various types of relationships between an object and its properties.
There are four forms of ontology: entity, which denotes an object; the relation among things; the presence of an object in a relationship; and properties that are interrelated with the object [26]. Figure 4 shows the hierarchical structure of the general ontology for the English language. There are several reasons to build an ontology as listed below:

• To examine the domain-specific knowledge;

Due to the rapid increase in the number of existing websites, information extraction has become more challenging. Most search engines are keyword based or are available as full-text search engines, which poses a challenge in extracting precise information. Thus, another technique, an information extraction and opinion mining framework that depends on type-2 fuzzy ontology, has been proposed. This framework was designed to reconstruct the consumer's full-text data into a proper, classical format for the search engine. This methodology provides the features that are extracted using the type-2 fuzzy ontology method.
In earlier days, SA followed the traditional analysis system, which made no proper or precise use of sentiment words. This system used the short-text form to share opinions or reviews in a discussion [27]. This made it challenging to find the proper sentiment. Hence, to overcome these challenges, a cross-domain SA was developed. This system enhanced the sentiment presentation from two perspectives: microscopic and macroscopic views. The fusion of these sentiments was designed with the simplicity and speed of the system in mind, and it uses simple linear interpolation to fuse images and texts.
In the fusion rule, $S_{\text{cross-media}}$ is the fusion sentiment result, $S_{\text{context}}$ is the normalized text sentiment, $S_{\text{image}}$ is the normalized image sentiment, and $\lambda$ is the balance weight (when $\lambda > 0.5$, it gives better fusion results).
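As a sketch of what a simple linear combination of the two normalized scores looks like, the fusion could be written as follows. Two things here are assumptions rather than statements from [27]: that the rule is a convex combination, and that $\lambda$ weights the text term.

```latex
S_{\text{cross-media}} = \lambda \, S_{\text{context}} + (1 - \lambda) \, S_{\text{image}},
\qquad 0 \le \lambda \le 1
```

Under this reading, $\lambda > 0.5$ means that the text sentiment contributes more to the fused result than the image sentiment.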
The need to overcome the ambiguities in the opinions expressed in Chinese online product reviews has led to a novel approach that identifies the product aspects quickly and uses the opinions related to the products to build a suitable ontology [28]. The job of SentiWord is to consider the single context and the PoS present in the statement. SentiWord assigns a score between −1 and 1, where the lowest value indicates a negative sentiment and the highest value indicates a positive one. As SentiWord considers both the word and its PoS, the statement gives a clear view of the given tweet.
Different perspectives and challenges have been identified, along with strategies to overcome them. Sasi et al. [29] demonstrated their contribution to the analysis of the negative sentiment expressed in a tweet by the consumer. To attain high customer satisfaction, they focused only on the negative opinions in the tweets related to the delivery service of the United States Postal Service. They used object properties to develop the ontology and SentiStrength to obtain the score of the statement given in the tweet.
More past ontology-based works are presented in Table 2.

Lexicon-Based Sentiment Analysis: A Review
One of the best examples of a lexicon is the shouts exchanged among the players in a match, such as "Yeah!", "Hoo!", "Hut!", "Blitz!", "Hike!", etc. Another set of examples is the words used by lawyers in court: "I object, my lord", "court adjourns", "counsel", etc. These sets of terminologies that are used among a group of people are known as lexicons. The phrases that have specific meanings are called lexemes. Different languages have different words with the same meaning; for example, water is thanni (Tamil), vellam (Malayalam), paani (Hindi), etc.
Lexicon-based SA has two approaches: a dictionary-based approach and a corpus-based approach. The dictionary-based method contains words with semantic orientation, and the corpus-based approach has words with and without sentiments that can be used for other purposes as well [39]. Figure 5 shows the architecture of the lexicon-based SA, and it shows how the view is classified and the opinion is extracted from the new data. The data are trained by the learning model, which classifies the data into three forms: positive, negative, and neutral.
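A dictionary-based classifier of the kind described above can be illustrated with a short sketch. The word list and its +1/−1 orientations are hypothetical toy values, and summing the orientations is one simple scoring choice among many; neither is taken from [39].

```python
# Hypothetical toy dictionary: word -> semantic orientation (+1 positive, -1 negative).
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "poor": -1, "hate": -1}


def classify_with_lexicon(text):
    """Sum the orientation of every known word and map the total to a class."""
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sample in ("I love this great phone",
               "poor battery and bad screen",
               "the phone arrived today"):
    print(sample, "->", classify_with_lexicon(sample))
```

Words that are missing from the dictionary contribute nothing to the score, which is why a text with no known sentiment words falls into the neutral class.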
Lexicon-based SA is used when the training data are inadequate. According to Thakkar et al. [40], unigrams were used in previous algorithms, but they did not provide satisfactory results. Hence, the authors proposed the n-gram method, in which each n-gram is formed from N unigrams, to offer better results. For negative statements in the document, the authors proposed a ratio-based approach.
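To illustrate how n-grams are formed from unigrams, the following minimal sketch builds every run of n consecutive tokens. The function name and the example sentence are hypothetical, and the sketch does not reproduce the scoring method of [40].

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive unigrams as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["not", "a", "good", "phone"]
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
```

With bigrams, the negation "not" stays attached to its neighbouring word, which is context that a unigram model cannot capture.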
Other challenges in this domain are heterogeneity and linguistic problems. The domain that is specific to the sentiment can differ from content to content; however, when it comes to language, it differs from person to person [41]. To overcome these kinds of challenges, a domain-specific algorithm should be taken up together with already existing lexicon-based domain-specific methods and with the utilization of undefined dictionary-based sentiment analysis (DBSA) and corpus-based sentiment analysis lexicons (CBSALs). A few more past lexicon-based works are presented in Table 3 below. Lexicons are built for applications such as online product reviews, blogs, Twitter, medical forums, etc., and various works related to them are presented.

Table 3. Past lexicon-based works.

| Reference | Level | Approach | Dataset | Accuracy |
| --- | --- | --- | --- | --- |
| | Sentence level | Weight scheme | Tweets | 70% |
| [48] | Phrase level | OpenDover (web service) | Tweets | 75% |
| [49] | Sentence level | Semantic-based lexicon analysis | Tweets | - |
| [50] | Word level | Word emotion lexicon | Socio-political (SP) and sports | Did not add value |

Sentiment Analysis Based on Machine Learning: A Review
Machine learning can handle complex work that humans cannot accomplish in real time. In this model, humans enable machines to think and learn by themselves using the experience they have gained. For example, when the context is extracted from feedback, the aspects and features of the review are identified. Then, the identified feature is labeled with the best-matching class. In the past, dictionaries were widely used in research to understand the views expressed in a tweet's specific context. A noteworthy issue in lexical analysis is that domain-specific words are not included in these lexicons, in which case determining the sentiment of such words is the greatest challenge. This work did not require training a classifier to determine the sentiment of domain-specific words. In this analysis, two reference words were used: incredible and poor [54]. Incredible was regarded as a highly positive context, and poor was regarded as a highly negative context. The score produced for each word was computed using a given rule. This rule helps to understand how SA is utilized, as it is human nature to want to know "how and what".
Since AI is the state of the art, sentiment plays an indispensable role in it [55]. Previously, academic researchers were found to investigate the sentiment of others. The Rule-Based Emission Model (RBEM) was used to recognize the polarity of sentences. In the study, the approach performed well and gave results that were highly scalable, transparent, and efficient. This has led to the study of a new problem: unsupervised sentiment analysis in signed social networks. Methodologically, the authors suggested combining signed social relations and sentiment signals from terms into a unified framework when sentiment labels are absent. The framework was then evaluated on two real-world signed social networks, Epinions and Slashdot. The results demonstrated that the proposed SignedSenti performs significantly better than state-of-the-art methods [56].
The aim is to provide automatic SA that uncovers the in-depth attitude held towards an entity [57]. The problems of SA and multimodal sentiment analysis (MSA) have also been discussed with respect to different types of data, for example, images, human-machine interactions, human-human interactions, videos, etc. Consider the use of three ML approaches, namely naive Bayes (NB), support vector machine (SVM), and maximum entropy, to separate sentiments into positive and negative classes. They use the bag-of-words representation to obtain the sentiment classification results. The results showed that SVM performs better than NB [58]. At the same time, when the dataset used for training and testing is small, the NB classifier gives better results.
The outcomes on the collected Twitter dataset demonstrated that the accuracy of the proposed model was 74% greater than that of the conventional supervised sentiment classifiers (SVM, random forest (RF), decision tree (DT), and some semi-supervised algorithms) [59]. Similarly, an improved NB system presented two NB variants using lemmas (nouns, verbs, adjectives, and adverbs), polarity lexicons, and multiwords as features.
More past works on machine learning are presented in Table 4 below.

Neural Network Models: A Review
In the neural models, individual words are used as the input to parse trees, which provide the syntactic and semantic information. Hence, emotion composition is derived from the best. Recurrent neural networks and convolutional neural networks (CNNs) are becoming more popular, and they do not require parse trees to extract features from the given sentences. Instead, they use word embeddings as inputs, which already encode the semantic and syntactic information. Additionally, the architectures of CNNs and recurrent neural networks help in learning the connections between the words in a statement. A recursive autoencoders network (RAN) in a semi-supervised model for sentence-level SA provided a low-dimensional vector representation [68]. In a new matrix-vector recursive neural network (MVRNN), each context is also associated with a matrix representation in the form of a tree [69].
The structure of the tree is derived from an external parser. A combination of recurrent neural network (RNN) and CNN architectures has been described for classifying the sentiment of a short context, which takes advantage of the coarse-grained features generated by a CNN [70].
An approach based on a linguistic LSTM for effective sentiment prediction incorporates the sentiment lexicon along with highly intensive and negative contexts [71]. The LSTM includes these features while analyzing the sentiment to provide a useful view of the context. A traditional CNN-LSTM model that consists of two parts, a local CNN and an LSTM, has been presented to predict the attitude illustrated in the content [72]. Table 5 presents the overall pros and cons identified from the survey.

Table 5. Pros and cons identified from the survey.

| Pros | Cons |
| --- | --- |
| Many utilized fewer parameters. | Some of the approaches require labeled data. |
| Labeling data is not necessary for feature extraction. | It does not count the absence of text. |
| Training is done easily. | Features are confused at times during analysis. |
| Some of the low-featured data can be extracted easily. | When there is more than one possible opinion, it fails to handle the situation. |
| Opinion words are categorized into different forms. | Many consider only adjectives. |
| Develops new standardized data. | Exact feature identification is complicated. |
| Implicit statements are considered in the text during extraction. | Unlabeled data are not analyzed properly. |
| Some of the lexicon approaches provide less false rates. | It does not analyze short text properly. |
| Domain-specific data achieve high accuracy in predicting sentiments. | Even though it ensures high accuracy, F-measure is also high. |
| Neural net models on predicting the sentiment achieve high accuracy with a less false rate. | Building neural net ID is highly complicated. |
| Ontology sentiment analysis provides high accuracy when it is domain-specific. | The false rate in the analysis is high in ontology when compared to the lexicon. |

Methodology
The two important methodologies used for sentiment analysis, namely the machine learning-based approach and the lexicon-based approach, are discussed in the following subsections.

Machine Learning Approaches
As discussed in the literature review, SA can be performed through various methods. Figure 4 shows the categorization of the methods in the analysis of sentiment and aggregation of opinions.
Figure 6 mentions the different approaches that are discussed here. The methods are a probabilistic classifier, linear classifier, rule-based classifier, DT classifier, and NB classifier.
The probabilistic classifier (PC) applies a probabilistic function to the set of input data. The input function $f(x)$ is applied with the probability and is mapped to the output $y$, where $f(x)$ is the input function. When conditional distributions replace the PC, it becomes the conditional classifier $\Pr(Z \mid V)$, in which a given $z \in Z$ is assigned to $v \in V$. In a linear classifier, the word vector $T = (T_1, T_2, T_3)$ holds the frequency of each word, the vector $V = (V_1, V_2, V_3)$ holds the linear input coefficients, and the scalar $S$ is the linear output coefficient. This classifier uses the margin to distinguish between the two classes [73].
In the rule-based classifier, an "if condition, then decision" approach is used to make specific rules. This rule-based classifier is also known as a multi-class classifier. Furthermore, it classifies the data into three forms: good, bad, and neither good nor bad. In addition, this unsupervised classifier mainly focuses on predicting the emotion in the context and in emoticons [74].
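The "if condition, then decision" idea can be sketched with a few hand-written rules. The cue words, the emoticons, and the three rules themselves are hypothetical examples chosen for illustration; they are not the rule set of [74].

```python
# Hypothetical cue sets; emoticons are stored in lowercase because the text is lowercased.
POSITIVE_CUES = {"good", "great", "happy", ":)", ":-)", ":d"}
NEGATIVE_CUES = {"bad", "awful", "sad", ":(", ":-("}


def rule_based_classify(text):
    """Apply simple if-then rules to words and emoticons."""
    tokens = text.lower().split()
    has_positive = any(tok in POSITIVE_CUES for tok in tokens)
    has_negative = any(tok in NEGATIVE_CUES for tok in tokens)
    # Rule 1: IF only positive cues are present THEN "good".
    if has_positive and not has_negative:
        return "good"
    # Rule 2: IF only negative cues are present THEN "bad".
    if has_negative and not has_positive:
        return "bad"
    # Rule 3: otherwise (no cues, or mixed cues) THEN "neither good nor bad".
    return "neither good nor bad"


print(rule_based_classify("what a great day :)"))
print(rule_based_classify("good food but bad service"))
```

No training data are needed, since every decision comes from a rule; the price is that texts with mixed cues fall into the third class.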
The DT classifier is a recursive one. A condition based on the classification is applied to the training data, dividing them so that the instances that satisfy the condition all belong to one class; the procedure continues until all the data satisfy the condition [75].
The NB approach examines whether the feature provided with the probability has the role of the label or not. Every feature is treated as independent, and a specific feature is mapped to the label that matches it with the maximum probability.
$$P(E \mid A) = \frac{P(E)\,P(A \mid E)}{P(A)} \tag{5}$$
Here, $P(E)$ is the probability of the label, and $P(A)$ is the probability of the feature. The NB classifier is mainly required to classify features such as email IDs, URLs, words, phrases, dictionaries, parse trees, etc. The NB algorithm is used solely for text; it classifies strings and not numerical data or subsets. This classifier is a class-specific unigram language model.
The likelihood assigns a probability to every single word, and from these a probability is assigned to each sentence. For the given parameters, the positive and negative values were specified, and the per-word scores were multiplied together to give the positive score and the negative score. The total positive score was found to be 0.0000005, and the overall negative score was 0.0000000010, as shown in Table 6. For the given statement, a higher probability was given to the positive class. Hence, the given statement is positive. Figure 7 shows how the SA process was executed as well as how the classification was done for the training data using the NB classifier. The training data are classified as positive, negative, or neutral. The knowledge-based method determines the number of occurrences of a word in a particular record or document.
After finding the occurrences of a word, similar words are grouped together. Then, the NB approach is used to classify the test data and predict the sentiment of the document as positive, negative, or neutral [76]. The Bayesian network model used in it is a directed acyclic graph. There is a strong relationship between the feature and the label in this model. Maximum entropy deals with probability. Initially, to set up maximum entropy modeling, the characteristics should be selected to determine the constraints. In text classification, the word count is taken to be the feature.
$$P(L \mid F) = \frac{P(L)\,P(F \mid L)}{P(F)} \tag{7}$$
where "P" refers to a function, "L" refers to the label, "F" refers to the features, $P(L)$ refers to the probability of the label, $P(F \mid L)$ is the probability of the feature categorized as the label, and $P(F)$ refers to the probability of the feature.
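To make the naive Bayes sentence scoring concrete, the following sketch implements a class-specific unigram model in which the score of a sentence is the product of its per-word probabilities. The five-word sentence and the per-word probabilities are hypothetical values, chosen only so that the products equal the totals quoted in the text (0.0000005 and 0.0000000010); they are not taken from Table 6. Equal class priors are also an assumption, which is why the prior is left out of the product.

```python
import math

# Hypothetical P(word | class) values, picked to reproduce the totals quoted in the text.
WORD_PROBS = {
    "positive": {"i": 0.1, "love": 0.1, "this": 0.01, "fun": 0.05, "film": 0.1},
    "negative": {"i": 0.2, "love": 0.001, "this": 0.01, "fun": 0.005, "film": 0.1},
}


def sentence_likelihood(tokens, label):
    """Multiply the per-word probabilities of one class (unigram assumption)."""
    probs = WORD_PROBS[label]
    return math.prod(probs[tok] for tok in tokens)


def nb_classify(tokens):
    """Return the class with the highest sentence likelihood, plus all scores."""
    scores = {label: sentence_likelihood(tokens, label) for label in WORD_PROBS}
    return max(scores, key=scores.get), scores


label, scores = nb_classify(["i", "love", "this", "fun", "film"])
print(label, scores)
```

A real implementation would usually add log-probabilities instead of multiplying, to avoid numerical underflow on long sentences, and would smooth the probabilities so that unseen words do not raise an error.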
Further, this model operates on vectors. Hence, the labeled features are converted into vectors, and a weight is allocated to each feature. The label is then predicted from the features to calculate the sentiment of the context. Here, the feature and the label were mapped in the form of vectors. The above representation [77] of maximum entropy shows that if any of the words occurs in the same class, the weight for that class will be higher. The primary advantage of maximum entropy is that it uses natural binary features.
The SVM algorithm classifies the data into classes, such as positive or negative, by using a hyperplane. No probability is applied in this model; hence, it is not a PC. The support vector machine approach is exceptionally efficient in text categorization [78] and is better than PCs. The motivation behind SVM is to find the hyperplane, represented by a vector, that separates the document vectors of one class from those of the other. So far, its accuracy is more than 90%, which is high when compared to the NB and maximum entropy methods.

$$\vec{\text{vector}} = \sum_j \alpha_j \, \text{count}_j \, \vec{\text{document}}_j, \qquad \alpha_j \ge 0 \qquad (8)$$

Here, the α_j are obtained by solving a dual optimization problem, and the document vectors with α_j > 0 are the support vectors. The classification part plays a major and versatile role in identifying the hyperplane and determining which constraints fall within the set margin [79].
The random forest (RF) approach is a tree-based classifier. Each tree in the forest votes for a class for the given input vector, and the class with the most votes is chosen. The error rate is determined by the correlation between any two trees in the forest and the strength of the individual trees. Hence, to decrease the error, the trees should be individually strong and independent of one another. The RF takes the DT as the individual predictors, which are based on the methods of randomizing outputs, boosting, and bagging. Large datasets can be easily classified using RF methods with better accuracy [80].

1. For each tree in A:
(i) Create a bootstrap sample S of size D.
(ii) Grow the tree recursively, applying the following steps at every internal node:
Step 1: Choose the sub-features f randomly from the features F.
Step 2: Select the best sub-feature f.
Step 3: Split the node.
2. Pass the test instance to the trees once the trees are created.
3. Assign the class label by a majority vote of the trees.

The neural network model encompasses three layers: an input layer, a hidden layer, and an output layer [81]. Artificial neural networks (ANNs) learn by applying a network of multiple layers. Furthermore, this makes the representation more powerful, and it is also practical with a small quantity of data and a minimum of two phases. The neural network is categorized into the recurrent neural network and the feedforward neural network. Several activation functions can be used, including ReLU, tanh, sigmoid, and leaky ReLU.
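The activation functions named above can be written down in a few lines. The following is a minimal sketch using the standard textbook definitions; the function names and the leaky ReLU slope of 0.01 are assumptions chosen for illustration, not values reported in the surveyed work.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small slope for negative inputs (slope is an assumption)."""
    return x if x > 0 else slope * x


def sigmoid(x):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes any real input into the open interval (-1, 1)."""
    return math.tanh(x)


# Apply each activation to the same sample inputs.
samples = [-2.0, 0.0, 3.0]
activations = {
    "relu": [relu(x) for x in samples],
    "leaky_relu": [leaky_relu(x) for x in samples],
    "sigmoid": [sigmoid(x) for x in samples],
    "tanh": [tanh(x) for x in samples],
}
```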
Sentiment analysis in the neural network starts with the representation of each word as a vector, using word-level embedding, character-level embedding, or sentence-level embedding, followed by training the network. Many deep learning models used in NLP require word embeddings as input features. The word embedding converts the text into a vector of continuous real numbers; e.g., for the word "Hai" (H: 0.13, a: 0.15, and i: 0.23). Experiments are carried out with several algorithms, which helps to estimate efficiency and accuracy [82]. The convolutional neural network is typically used in image classification and is also applied to the analysis of emotion. It splits the text into strings, and every separate piece is converted into a vector [83]. Figure 8 shows the flow process of convolution layers.

The LSTM approach is a versatile type of recurrent neural network which handles the dependencies. Every recurrent neural network has a chain-like structure and is applied recursively until the process reaches the optimization stage [84]. The recursive iteration in the LSTM is more complicated than in an ordinary, simple RNN because it has four layers that interact in a particular manner. The LSTM first decides which data should be discarded from the cell state at timestamp t. This is determined by a sigmoid function (σ) layer, which is called the "forget gate." The function takes h_{t-1} (the output from the previous hidden layer) and x_t (the current data) and yields a number between 0 and 1, where 1 means "completely keep" and 0 means "completely discard".
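The forget gate described above can be sketched in plain Python. The sketch assumes the common formulation f_t = σ(W_f · [h_{t-1}, x_t] + b_f); the weight matrix `W_f`, the bias `b_f`, and all numeric values are hypothetical and chosen only to make the behavior visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forget_gate(h_prev, x_t, W_f, b_f):
    """Return one value in (0, 1) per cell-state entry.

    h_prev: previous hidden state, x_t: current input (both lists of floats).
    W_f has one row per cell-state entry and len(h_prev) + len(x_t) columns.
    """
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(W_f, b_f)
    ]


# Hypothetical values for illustration only.
h_prev = [0.5, -0.5]
x_t = [1.0]
W_f = [[0.0, 0.0, 0.0],   # pre-activation 0 -> gate value 0.5
       [2.0, 0.0, 4.0]]   # pre-activation 5 -> gate value close to 1 ("keep")
b_f = [0.0, 0.0]

f_t = forget_gate(h_prev, x_t, W_f, b_f)

# The gate scales the previous cell state element by element.
c_prev = [1.0, -2.0]
c_kept = [f * c for f, c in zip(f_t, c_prev)]
```

A gate value near 1 leaves the corresponding cell-state entry almost unchanged, while a value near 0 erases it.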

Sentiment Analysis in Gram Representation
A unigram is a single word in the document; every single word taken from the document is known as a unigram. In unigram representation, each word is associated with a feature value, its term frequency (TF) in the document. Similarly, every pair of consecutive words is called a bigram, and in bigram representation the features are associated with the bigrams in the document. Further, in n-gram representation, sequences of "N" words in the document are considered [85].
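A minimal sketch of unigram, bigram, and general n-gram extraction with term-frequency counts is given below. The whitespace tokenization, the sample sentence, and the helper name `ngrams` are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, split on whitespace.
tokens = "the movie was not good but the cast was good".split()

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)

# Term frequency: how often each unigram or bigram occurs in the document.
unigram_tf = Counter(unigrams)
bigram_tf = Counter(bigrams)
```

Bigrams such as ("not", "good") keep local context that unigram features lose, which is one reason the two representations can give different results.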

Term Frequency-Inverse Document Frequency (TF-IDF) Representation
Term frequency-inverse document frequency (TF-IDF) weights each word presented in a document by "term frequency (word, document) × log (inverse document frequency (word, d))". The logarithm is computed to base 10, and d is the training data, i.e., the collection of documents [86].
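The TF-IDF weight can be sketched as follows. The sketch assumes that the inverse document frequency of a word is the number of documents divided by the number of documents containing the word, which is one common definition; the source does not spell this out. The toy documents and function names are also assumptions.

```python
import math


def term_frequency(word, doc):
    """Number of times the word occurs in one tokenized document."""
    return doc.count(word)


def inverse_document_frequency(word, docs):
    """Total documents divided by documents containing the word (assumed definition)."""
    containing = sum(1 for d in docs if word in d)
    return len(docs) / containing if containing else 0.0


def tf_idf(word, doc, docs):
    """TF x log10(IDF); words absent from the whole collection get weight 0."""
    idf = inverse_document_frequency(word, docs)
    if idf == 0.0:
        return 0.0
    return term_frequency(word, doc) * math.log10(idf)


# Hypothetical training collection d of three tokenized documents.
docs = [
    "good movie good acting".split(),
    "bad movie".split(),
    "good movie plot".split(),
]

weight_good = tf_idf("good", docs[0], docs)      # frequent here, in 2 of 3 documents
weight_movie = tf_idf("movie", docs[0], docs)    # in every document
weight_acting = tf_idf("acting", docs[0], docs)  # in only one document
```

A word that appears in every document, such as "movie" here, receives a weight of zero, while rarer words are weighted up.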

Dictionary-Based Approach
In this approach, new terms are collected manually, and a list of synonyms and antonyms of each term is then formed. Each new term is matched against the list, and words with similar meanings are grouped together. This process continues whenever a new term is found [87].
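One possible reading of this procedure is an iterative expansion of seed word lists through synonym and antonym links. The sketch below is an assumption about how such an expansion might be coded; the tiny `synonyms` and `antonyms` dictionaries stand in for a real thesaurus and are hypothetical.

```python
def expand_lexicon(pos_seeds, neg_seeds, synonyms, antonyms):
    """Grow positive and negative word lists until no new term is found."""
    positive, negative = set(pos_seeds), set(neg_seeds)
    changed = True
    while changed:
        changed = False
        for group, opposite in ((positive, negative), (negative, positive)):
            for word in list(group):
                # Synonyms share the polarity of the word.
                for s in synonyms.get(word, ()):
                    if s not in group and s not in opposite:
                        group.add(s)
                        changed = True
                # Antonyms take the opposite polarity.
                for a in antonyms.get(word, ()):
                    if a not in group and a not in opposite:
                        opposite.add(a)
                        changed = True
    return positive, negative


# Hypothetical thesaurus entries.
synonyms = {"good": ["great"], "great": ["excellent"], "bad": ["poor"]}
antonyms = {"good": ["bad"]}

positive, negative = expand_lexicon({"good"}, set(), synonyms, antonyms)
```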

Corpus-Based Approach
The corpus-based method is applied to a particular topic. It has two forms: the statistical approach and the semantic approach. The corpus-based approach is mainly used for addressing languages. The corpus data are extracted from corpora, which contain a large amount of data and reflect the patterns of the language used in day-to-day life [88].

Statistical Approach
This approach is used to find the occurrence of words. The principal goal of this approach is to determine the polarity between positive and negative words. When the positive data dominate, the entire data are considered positive, and vice versa for negative data. Cosine similarity is one of the statistical approaches utilized in determining the sentiment and the opinion that is uncovered from the context. Cosine similarity measures the similarity between two non-zero vectors and is used to determine the polarity, whether positive or negative [89].
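As a hedged sketch of how cosine similarity might be used for polarity, the example below compares a bag-of-words vector of a text with a positive and a negative reference vector and picks the closer one. The reference word lists, the sample text, and the decision rule are assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical reference vocabularies for each polarity.
positive_ref = Counter(["good", "great", "happy"])
negative_ref = Counter(["bad", "poor", "sad"])

text = Counter("the car is good and great".split())

sim_pos = cosine_similarity(text, positive_ref)
sim_neg = cosine_similarity(text, negative_ref)
polarity = "positive" if sim_pos > sim_neg else "negative"
```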

Results and Discussions
First, the evaluation metrics considered for comparing the approaches are discussed here. To assess the effectiveness of every classifier, the following measures were used: precision, support, recall, and F-measure [90].
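For orientation, the sketch below computes these measures in their standard textbook form for a binary positive/negative task, together with overall accuracy. It uses true/false positive and negative counts rather than the tweet-based notation of the following subsections, and the labels and predictions are hypothetical.

```python
def evaluate(y_true, y_pred, positive="positive"):
    """Standard precision, recall, F-measure, support, and accuracy for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "support": tp + fn,              # number of true instances of the class
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical gold labels and classifier outputs for five tweets.
y_true = ["positive", "positive", "positive", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "positive", "negative"]
metrics = evaluate(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters because a classifier can reach high accuracy on imbalanced data while retrieving few of the relevant tweets.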

Prediction Accuracy
Generally, the following rule is applied to analyze accuracy; it determines how accurately the sentiment is calculated. This is also said to be the precision measure [91].
where LTT refers to the labeled Twitter tweets and TTT refers to the total tweets.

Refined Measure on Tweet Precision
The fraction of the retrieved data that are relevant is defined as follows:

$$\text{Refined measure for positive data} = \frac{PTT}{TTP + FP}$$

$$\text{Refined measure for negative data} = \frac{NTT}{TTN + FN} \qquad (14)$$

where TTP refers to total positive tweets, and TTN refers to total negative tweets.

Recall
Recall is the fraction of the relevant data that are retrieved. The F-measure is used to evaluate the false rate.

In this article, the training dataset and testing dataset were taken from Twitter. The tweets were collected based on a movie. Table 7 depicts the total dataset used for training and testing. The movie data were tested with three different approaches: ontology-based SA, lexicon-based SA, and machine-learning-based SA. The results are given as follows:

Using Ontology-Based Sentiment Analysis
In the ontology-based SA, four primary conventional approaches were tested: specific ontology-based SA, fuzzy logic-based SA, aspect-based SA, and domain-specific SA. The aspect-based SA resulted in 83% accuracy, 84% recall, and an F-measure of 50%. This shows that its prediction accuracy was high compared to the other approaches for the considered data. The results are shown in Figure 9.

Using Lexicon-Based Sentiment Analysis
The four major approaches used to test the lexicon-based SA were the term frequency approach, word count, unigram, and bigram. The precision of the word count was 91%, the recall was 88%, and the F-measure was below 10% for the considered data. The results are shown in Figure 10.

Using Machine-Learning-Based Sentiment Analysis
In the machine learning approaches, the multinomial NB algorithm, RF algorithm, SVM, and XG Boosting algorithm were tested. The SVM resulted in a high accuracy of 96%, a recall of 66%, and an F-measure of 60% for the considered data. These results may vary when the dataset varies. The results are shown in Figure 11.

The above discussion provided quantitative results for various approaches to sentiment analysis. Recently, Twitter data analysis has been used the most to predict sentiment. The overall merits and demerits identified from this work are presented in Table 8.

Machine learning
Demerits: It is only applicable when the data are labeled. This is more costly.

Hybrid Lexicon + machine learning
Merits: It is done at the sentence level, so it is easy at the document level.
Demerits: It is complex and too noisy.

Challenges, Future Research Directions, and Open issues
Despite the several advantages of emotion AI-driven SA, significant challenges remain to be addressed. Resolving the challenges described below will make SA more efficient and effective, so that it can be applied more widely.

Mining Unstructured Data
In the broad range of social media, all kinds of users are present, from well-educated to uneducated users. To save time, some users text in an abbreviated message format; for example, to convey the message "happy with this car," the user would text "happppppppppy vth dis ". This kind of text is considered to be unstructured data. Many of the SA methods preprocess this information. Hence, extracting the sentiment and the emotion from these kinds of data is really challenging [92].

Identifying Composite Media Features
Generally, this is one of the pressing issues in opinion analysis, because comments or reviews may draw attention to any issue. Table 9 depicts user data examples [93].

User: Comment
User 1: "the amazon delivered the paste before the delivery date,"
User 2: "Paste tastes really good, and my cavity is reducing."

Extracting these kinds of data, identifying the features of the data, and then determining the opinion are immense challenges in SA, because several million users shop online, and each has a different manner of using language to express feedback [94].

Different Words with the Same Meaning
In a user's review, different words with the same meaning might be used. It is necessary to identify the similarity between such words, as some words are placed differently in sentences, which may make them appear different even though they have the same meaning [95].

• Sentiment words that do not express a sentiment: some of the words in interrogative statements may not explicitly express any sentiment. Identifying the emotion present in such statements is another challenge [96].

• Emotion identification (sarcasm): it is difficult for the machine to identify sarcastic statements. Researchers on SA work hard to identify sarcastic comments with high accuracy, as human emotions and attitudes are often ambiguous [97].
A significant challenge faced by SA is making the machine understand intense human emotions conveyed in the context. With a rise in the usage of unstructured data, human language has become highly complicated, and it is difficult to determine the opinions, viewpoints, or reviews of the customer as well as the right sentiment of the context [98]. The open issues in SA are summarized in Figure 12.

Once these challenges are addressed, applications can benefit from understanding the affinity or sentiment toward a particular phenomenon, entity, or idea. Understanding the customers' perspectives on a specific aspect of a product, brand, or advertisement is also challenging [99]. A more accurate interpretation of the sentiment expressed in an unstructured data format can then be evaluated [100]. It further involves accessing customers' feedback to measure customers' satisfaction and to support effective e-governance and crisis management [101].

Conclusions
In this paper, an overview of emotion AI-driven SA in various domains was presented. This survey also reviewed the merits, demerits, and scope of the different approaches that have been considered. A significant advantage of SA is that it identifies the emotion underlying the context. Traditional methodologies, such as machine-learning-based approaches, lexicon-based analysis, and ontology-based analysis, were considered for experimentation to compare performances. In the considered sample data, the aspect-based ontology approach, SVM, and term frequency achieved high accuracy and provided better SA results in each category. Future research directions as well as limitations were also highlighted for the benefit of future researchers. Even though the results showed higher accuracy for the sample data considered, these results may vary when the approaches are applied to other applications. Deep learning approaches can also be considered for comparing the performances as part of future work, which may bring significant changes to the results.

Figure 3. Depiction of types of data.

• To examine the domain-specific knowledge;
• To activate the domain knowledge for reuse;
• To provide clear domain supposition;
• To split the domain and functional expertise;
• To provide a way to share their knowledge with the software agents.

Figure 4. The structure of the ontology: terminologies.

Figure 6. Various methods of emotion AI-driven sentiment analysis.

Algorithm 1.
Input: A = Total number of trees; D = Training dataset; F = Features; f = Sub features
Output: Label of bagged class

Figure 11. Evaluation metrics of machine-learning-based sentiment analysis.

Figure 12. Summary of open issues.
Figure 13 demonstrates the scope of future research and open issues in detail.
Appl. Sci. 2019, 9, 5462

Figure 13. Scope of future research and open issues.

Table 1. Comparison of the previous surveys.

Table 2. Chronological view of ontology-based sentiment analysis.

Table 3. A chronological view of lexicon-based sentiment analysis.

Table 4. A chronological view of machine-learning-based sentiment analysis.

Table 5. Advantages and disadvantages of existing research work on sentiment analysis.
Figure 7. Sentiment analysis using the naive Bayes classifier.

Table 7. Statistical view of dataset.

Table 8. Merits and demerits for lexicon, machine learning and hybrid approaches.

Table 9. User data example.