Improving Sentiment Classification of Restaurant Reviews with Attention-Based Bi-GRU Neural Network

Abstract: In the era of Web 2.0, there is a huge amount of user-generated content, but the sheer volume of unstructured data makes it difficult for merchants to provide personalized services and for users to extract information efficiently, so sentiment analysis of restaurant reviews is necessary. The significant advantage of Bi-GRU is the guaranteed symmetry of the hidden layer weight update, which allows it to take the context of online restaurant reviews into account and to obtain better results with fewer parameters. We therefore combined Word2vec, Bi-GRU, and the attention method to build a sentiment analysis model for online restaurant reviews. Restaurant reviews from Dianping.com were used to train and validate the model. With an F1-score greater than 89%, we conclude that the overall performance of the Word2vec+Bi-GRU+Attention sentiment analysis model is better than that of commonly used sentiment analysis models. We applied deep learning methods to review sentiment analysis on online food ordering platforms to improve the performance of sentiment analysis in the restaurant review domain.


Introduction
The widespread adoption of Web 2.0 has provided an environment for consumers to engage in expression, creativity, communication, and sharing. Consumers are able to post reviews on online ordering platforms (e.g., Yelp, TripAdvisor, Dianping) in order to express their opinions about restaurants, vent their emotions, and engage in social activities. Merchants often encourage consumers to actively participate in reviews, and massive user-generated restaurant reviews give consumers the opportunity to fully express their needs while helping merchants provide real-time and personalized service [1,2]. According to a 2019 BrightLocal survey, approximately two-thirds of consumers have posted reviews of local establishments, with an average of nine reviews per person per year [3]. Due to the intangible and complex nature of goods and services in the restaurant industry, consumers rely heavily on reviews from other customers to evaluate service quality before spending money [4]. Restaurant reviews express consumers' emotional needs and are an important source of information that consumers can refer to [5]. In the pre-consumption information search phase, consumers tend to search for a large number of restaurant reviews from other users to reduce the perceived uncertainty and perceived risk caused by information asymmetry [6].
Due to the large amount of unstructured information available on the Web, collecting and aggregating product review information is a challenging task that requires automated methods to help researchers collect and analyze data, and many previous studies have used sentiment analysis to mine consumer attitudes [7]. The object of sentiment analysis can be speech, text, images, etc. Restaurant reviews are usually presented as text, so most papers focus on text-based sentiment analysis [8]. Consumers usually form a general perception of a restaurant by reading existing restaurant reviews in the pre-purchase information-seeking stage. However, the huge amount of restaurant review information exceeds consumers' information processing ability, while reading only a few reviews carries a higher probability of forming misperceptions [9]. The platform therefore needs an efficient way to quickly identify the emotional information contained in restaurant reviews.
There are two main categories of current classification methods. The first is based on a sentiment lexicon and mainly judges the sentiment tendency of a text by the number of sentiment words appearing in it; the second is based on machine learning, including Support Vector Machine, Naïve Bayes, the K-Nearest Neighbor algorithm, etc. [10][11][12][13]. A comparison reveals the limitations of previous studies: (1) lexicon-based and machine learning methods rely on accurate sentiment dictionaries and data preprocessing, and traditional word representation methods do not take contextual information into account, making sentiment analysis less effective [14]; (2) online ordering platform reviews have strong domain characteristics, with words such as "Service", "Comfortable", and "Enjoyable", and they contain many emoticons and meaningless words. Research using sentiment dictionaries or semantic knowledge bases relies on language-specific external resources, and this approach transfers poorly across domains. It is difficult to cover the full range of specialized vocabulary using traditional sentiment analysis methods.
To efficiently and accurately identify the sentiment in restaurant reviews, we combine the advantages of Word2vec and the Bi-directional Gated Recurrent Unit (Bi-GRU), and add an attention mechanism to the neural network. First, we preprocessed online restaurant reviews. Second, the distributed word vector representation method, Word2vec, was used to train word vectors. Finally, a restaurant review sentiment classifier was constructed using Bi-GRU. This paper makes the following two contributions.

• We used Word2vec for word vector representation and an attention mechanism in Bi-GRU for sentiment analysis, which improves the efficiency of sentiment analysis;
• We took full advantage of Bi-GRU's symmetric update to apply it to online restaurant review sentiment analysis, considering the contextual dependencies in online restaurant reviews.
The rest of the paper is organized as follows. Section 2 lists related work on restaurant review sentiment and sentiment analysis methods. Section 3 includes the research framework of this paper and the algorithms. Section 4 provides the detailed steps to construct a sentiment classifier and shows the results of experiments. Section 5 elaborates the conclusion. Section 6 discusses the limitations of this paper and future works.

Literature Review
In this paper, we combined attention mechanism and Bi-GRU for sentiment analysis of reviews on online ordering platforms. In this section, we introduce online restaurant reviews and the related works about sentiment analysis methods.

Online Restaurant Reviews
Consumers usually consider restaurant reviews when making restaurant selection decisions because they complement other information provided by merchants, such as restaurant descriptions, expert opinions, and personalized needs generated by automated recommendation systems [15]. Consumers who read restaurant reviews will rely on their previous experiences to perceive the attitudes expressed in the reviews, and by continuously reading restaurant reviews consumers will form an overall perception of the store and eventually influence their purchase behavior. Restaurant review sentiment reflects the general perceptions and attitudes of other consumers about the restaurant, and consumers often decide to go to a reputable restaurant after searching for online restaurant reviews [4].
Most existing studies examine consumer psychology and behavior in terms of online restaurant reviews, and some of the most relevant research publications in the field are listed in Table 1. Consumer emotional expression is prevalent in online reviews and other forms of computer-mediated communication [22]. Some scholars mined emotional information from online restaurant reviews to provide practice guidance. Luo and Xu applied a deep learning approach to analyze aspect restaurant sentiment during the COVID-19 pandemic period and found that the deep learning model achieved better results overall compared to machine learning algorithms [23]. Micu et al., used Naïve Bayes to classify the sentiment of restaurant reviews, which helps marketers to grasp the characteristics and interests of consumers [24]. Some scholars have studied the methodological perspective of sentiment analysis of restaurant reviews. Kim et al. used word co-occurrence method to calculate the co-occurrence frequency of words in sentences and assigned the highest scoring implicit features to the sentences, while the author introduced a threshold parameter to filter potential features with low scores, and the results showed that this threshold-based approach has good performance for sentiment analysis [25]. Li  of online reviews using a text mining method and through empirical analysis they found that positive emotions had a negative impact on reviews, while negative emotions had a positive impact, in addition expressing angry emotions was more useful than expressing positive emotions [26]. Krishna et al., used machine learning methods to perform sentiment analysis on online restaurant reviews, and SVM achieves optimal results based on a specific data set [27].
Although many studies have paid attention to analyzing online restaurant reviews sentiment to help merchants on the platform to improve their services, there are still some questions: (1) Can the accuracy of a restaurant review sentiment classifier be further improved? (2) Does the method and efficiency of sentiment analysis of online restaurant reviews in Chinese differ from other languages due to the more ambiguous expressions in Chinese?

Sentiment Analysis Method
Sentiment analysis, also known as opinion mining, is a computational study of people's needs, attitudes, and emotions toward an entity [28]. Sentiment analysis is able to obtain the positive or negative sentiments of evaluation subjects and their intensity, and the results of sentiment analysis can be useful in many fields, such as online sentiment opinion analysis, topic monitoring, word-of-mouth evaluation of massive products, and so on.
Feature selection is a fundamental task in the field of sentiment analysis, and effective feature selection from subjective texts can significantly improve the efficiency of sentiment analysis [29,30]. Many scholars have conducted research from the feature perspective to find an effective feature selection method. Zhang et al., selected N-char-grams and N-POSgrams as potential sentiment features and used Boolean weighting method to calculate feature weights, and the results showed that the feature characterization method they chose was able to obtain better accuracy [30]. Hogenboom et al., used a vectorized representation based on text structure for multi-domain English text sentiment analysis, and the conclusion showed that this method works better than word-based feature representation [31]. Sentiment analysis is more domain-sensitive and product feature selection can be seen as the identification of domain-specific named entities, which leads to the fact that most sentiment analysis methods require domain-specific knowledge to improve the performance of the system. Most of the existing studies on feature selection have limitations, and the efficiency of sentiment analysis decreases significantly once it is removed from a specific domain.
Many studies used sentiment dictionaries as well as machine learning methods to analyze restaurant reviews [12,32], and although relatively good results have been achieved, the data processing effort is relatively high and the methods transfer poorly across domains. Meanwhile, deep learning-based sentiment analysis methods are gaining popularity because deep learning provides automatic feature extraction, richer representations, and better performance [33]. Abdi et al. proposed a deep learning-based approach to classify user opinions expressed in reviews (called RNSA), which overcomes the disadvantage of traditional methods that lose temporal and positional information and achieves good results in sentence-level sentiment classification [34]. Al-Smadi used Long Short Term Memory (LSTM) for sentiment analysis of Arabic hotel reviews in two ways: first, by combining Bi-directional Long Short Term Memory (Bi-LSTM) and conditional random fields for the formulation of opinion requirements classification, and second, by sentiment analysis using LSTM. Both outperformed the previous baseline study [35].
In the field of sentiment analysis, many scholars have used methods based on sentiment dictionaries or traditional machine learning. The results of these methods are not satisfactory, as the performance of the model is heavily dependent on the feature selection strategy and the tuning of the parameters. Deep learning includes Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short Term Memory (LSTM), and other network structures [29]. Deep learning-based sentiment analysis models use neural network to learn to extract complex features from data with minimal external contributions, and it has achieved good performance in natural language processing [36]. Compared to sentiment analysis techniques using machine learning methods, deep learning-based sentiment analysis is more generalizable, and in addition, deep learning-based methods have better performance in terms of feature extraction and nonlinear fitting capabilities.
In this paper, we built a neural network model using Bi-GRU to fully consider the semantic dependency of the context of reviews in online ordering platforms and used the attention mechanism to enhance the efficiency of sentiment classification.

Methodology
In this paper, we propose a deep learning-based sentiment analysis framework for online restaurant reviews. The research framework of this paper is shown in Figure 1. This framework consists of four main components: (1) Web Crawler; (2) Pre-Processing; (3) Word Vector; (4) Sentiment Analysis.

(1) Web Crawler: We crawled the restaurant review data needed for the study from online ordering platforms. (2) Pre-Processing: For the crawled dataset, it is necessary to remove null values as well as duplicate values. In addition, we split the reviews into smaller units of study and marked the part of speech. (3) Word Vector: To convert unstructured text into structured text, we applied Word2vec, a method of word embedding, to vectorize the words. (4) Finally, a deep learning method is used to construct a sentiment classification model for the online ordering platform.

Word Embeddings
Word embeddings are often used in sentiment analysis tasks to transform words into low-dimensional vectors that can be processed by programs. Traditional bag-of-words-based methods suffer from excessive dimensionality and sparsity, while Word2vec can provide a relatively accurate description of the semantics of words, so this paper uses the Word2vec approach to generate word vectors.
Word2vec uses two language models, CBOW and Skip-gram, to learn distributed word representations while reducing the complexity of the algorithm [37]. The mechanism inherent in the CBOW model is to predict the probability of occurrence of the central word from the contextual words. The mechanism inherent in the Skip-gram model is the reverse: to predict the contextual words based on the current given word. In this paper, the CBOW model was used to train the vectors and its framework is shown in Figure 2.
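To make the CBOW objective concrete, the following minimal sketch builds the (context words, central word) training examples that a CBOW model would learn from. It is an illustration only, not the training code used in this paper; the function name `cbow_pairs` and the toy token list are hypothetical.

```python
def cbow_pairs(tokens, window=5):
    """Build (context_words, center_word) examples for a CBOW-style model.

    For each position, the context is made of up to `window` tokens on
    each side of the central word.
    """
    pairs = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip single-token reviews, which have no context
            pairs.append((context, center))
    return pairs


# Toy segmented review (hypothetical), with a small window for readability.
tokens = ["service", "fast", "price", "expensive"]
pairs = cbow_pairs(tokens, window=1)
for context, center in pairs:
    print(context, "->", center)
```

Skip-gram would use the same windows but swap the roles, predicting each context word from the central word.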


Bi-GRU
In this paper, the restaurant review sentiment classifier was constructed using the Bi-directional Gated Recurrent Unit (Bi-GRU) approach. Next, Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), and Bi-GRU are briefly introduced.
RNN uses a feedback loop in which the output of each step is fed back into the network and therefore influences the next output, a process that is repeated at each subsequent step. This feedback mechanism allows the RNN to learn sequence features dynamically and thus improves the efficiency of sentiment analysis. The computational equations are as follows:

$$o_t = g(V s_t)$$

$$s_t = f(U x_t + W s_{t-1})$$

where $s_t$ denotes the value of the hidden layer, $o_t$ denotes the output at step $t$, $f$ and $g$ denote the activation functions, $U$ denotes the weight matrix of $x_t$, $W$ denotes the weight matrix of $s_{t-1}$, and $V$ denotes the weight matrix of the hidden layer.
Chung et al. proposed a GRU model with experimental results similar to LSTM, but with a simpler structure and a more efficient computational process [38]. Like the input-output structure of RNN, GRU is influenced by the current input $x_t$ and the hidden state $h_{t-1}$ passed from the previous node. The Gated Recurrent Unit alleviates the gradient problems of the plain RNN with a simpler structure by introducing a reset gate $r$ and an update gate $z$, as shown in Figure 3.

First, the input from the current node and the state passed down from the previous node are used to obtain the reset and update gating states, which are calculated as follows:

$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$$

$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$$

Secondly, after obtaining the gate signals, the reset gate is used to record the state of the current moment. The specific calculation formula is as follows:

$$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])$$

The last step is to update the memory. The specific calculation formula is as follows:

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $t$ denotes a certain moment, $\sigma$ denotes the activation function, $W$ denotes the weight, $r_t$ denotes the reset gate at moment $t$, $z_t$ denotes the update gate at moment $t$, $\tilde{h}_t$ denotes the candidate state, and $h_t$ denotes the activation state at moment $t$.
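The three steps above (gating, candidate state, memory update) can be sketched in plain Python as a single GRU step. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: biases are omitted, the update convention $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$ is assumed, and the helper names (`gru_step`, `matvec`) and the toy weights are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU step; each weight matrix has shape hidden x (hidden + input)."""
    concat = h_prev + x_t                                   # [h_{t-1}, x_t]
    r_t = [sigmoid(a) for a in matvec(W_r, concat)]         # reset gate
    z_t = [sigmoid(a) for a in matvec(W_z, concat)]         # update gate
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t      # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(a) for a in matvec(W_h, gated)]    # candidate state
    # Blend the old state and the candidate state with the update gate.
    return [(1 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_tilde)]


# Toy check with all-zero weights: both gates equal 0.5 and the candidate
# state is 0, so the new state is exactly half of the previous one.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
h_new = gru_step([1.0], [0.4, -0.2], zeros, zeros, zeros)
```

With trained weights, a gate value close to 0 keeps the previous memory and a value close to 1 overwrites it with the candidate state.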
A unidirectional GRU captures only the preceding context. Bi-GRU adds a second pass that reads the sequence in reverse, so the hidden layer captures both historical and future contextual information; it is therefore commonly applied in text classification tasks. Bi-GRU's operation mechanism is shown in Figure 4. At each step, the same weight matrix is multiplied with the input or with the hidden layer of the previous time point, so the processing is symmetric. This symmetry ensures that the neural network can fully take the context into account and ultimately improves the classification performance of the model.
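The bidirectional idea can be illustrated independently of the GRU cell itself. The sketch below runs any recurrent step function forward and backward over a sequence and concatenates the two hidden states at each position. The names `run_direction` and `bidirectional` are hypothetical, and the toy step function (a running sum) merely stands in for a GRU step so that the effect is easy to read.

```python
def run_direction(step, xs, h0):
    """Apply a recurrent step function over xs and collect every hidden state."""
    h, states = h0, []
    for x in xs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd, step_bwd, xs, h0):
    """Concatenate forward and backward hidden states position by position."""
    fwd = run_direction(step_fwd, xs, h0)
    # Read the sequence in reverse, then flip back to align with positions.
    bwd = run_direction(step_bwd, xs[::-1], h0)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


# Toy stand-in for a GRU step: accumulate the inputs seen so far.
def running_sum(x, h):
    return [h[0] + x]


states = bidirectional(running_sum, running_sum, [1, 2, 3], [0])
# Each position now holds [sum of inputs up to here, sum of inputs from here on].
```

In a real Bi-GRU, `step_fwd` and `step_bwd` would be GRU steps; the concatenated state at each word then summarizes both the words before it and the words after it.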

Attention Mechanism
The attention mechanism was originally derived from the human visual attention mechanism and was later applied to the field of artificial intelligence [39]. The attention mechanism is a simple method to encode sequential data based on the importance score assigned to each unit. As an information resource allocation scheme, it is widely used in various information streamlining tasks [40]. Deep learning models based on the attention mechanism can capture global and local connections flexibly, making the model less complex and with fewer parameters, improving the efficiency of model training.

Specifically, the attention mechanism assigns different weights to the inputs of the model, which allows it to quickly extract the key information from the data and improves the robustness of the results. For example, if the input words of the sentiment classification model are "Restaurant", "Environment", and "Nice", the attention mechanism will take the word probability distribution of 0.2, 0.3, and 0.5 into account in the output of the model, which ultimately improves the quality of the sentiment analysis. The model after the introduction of the attention mechanism is shown in Figure 5.
The underlying form of the attention mechanism is shown below:

$$e_i = a(u, v_i)$$

$$\alpha_i = \frac{\exp(e_i)}{\sum_j \exp(e_j)}$$

$$c = \sum_i \alpha_i v_i$$

where $u$ is the matching feature vector based on the current task for interaction with the context, $a(\cdot,\cdot)$ is a scoring function, $v_i$ is the feature vector for a time stamp in the time series, $e_i$ is the initial attention score without normalization, $\alpha_i$ is the attention score after the normalization operation (written here as a softmax), and $c$ is the contextual feature for the current time stamp, calculated as the sum of the attention scores multiplied by the feature vectors $v_i$.
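The scoring, normalization, and weighted-sum steps described above can be sketched as follows. This is an illustrative sketch only: the dot product used as the scoring function and the softmax used for normalization are assumptions, and the function name `attention_pool` and the toy vectors are hypothetical.

```python
import math


def attention_pool(vectors, u):
    """Return (attention weights, context vector) for a list of feature vectors."""
    # Unnormalized score e_i: dot product between u and v_i (an assumption).
    scores = [sum(a * b for a, b in zip(u, v)) for v in vectors]
    # Normalize with a numerically stable softmax to obtain alpha_i.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector c: attention-weighted sum of the feature vectors.
    dim = len(vectors[0])
    context = [sum(a * v[d] for a, v in zip(alphas, vectors)) for d in range(dim)]
    return alphas, context


# Three toy hidden states, e.g. for "Restaurant", "Environment", "Nice".
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alphas, context = attention_pool(vectors, u=[1.0, 1.0])
```

The weights always sum to 1, and the vector that matches `u` best (here the third one) receives the largest share of the context vector.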
In this paper, we used Bi-GRU to analyze the sentiment of reviews on online food ordering platforms, taking into account both the preceding-text and the following-text features to improve the accuracy of the results. During the training process, we used dropout to randomly remove neurons in the hidden layer to prevent overfitting and make the model more generalizable, and used softmax in the output layer to map the results to the range from 0 to 1. Finally, binary cross-entropy was used as the loss function with the following equation:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

where $N$ denotes the number of samples, $y_i$ denotes the label of sample $i$, and $p_i$ denotes the probability that the sample is predicted to be positive.
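As a small worked example of the loss, the sketch below computes the mean binary cross-entropy for a handful of predictions. The function name `binary_cross_entropy` and the clipping constant `eps` (added to avoid taking the logarithm of 0) are illustrative assumptions, and the labels and probabilities are made up.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy; probs are predicted positive-class probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(labels)


# An uninformative classifier (p = 0.5 everywhere) gives a loss of log(2).
uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])
# Confident, correct predictions give a loss close to 0.
confident = binary_cross_entropy([1, 0], [0.99, 0.01])
```

The loss grows quickly when the model is confidently wrong, which is what pushes the predicted probabilities toward the true labels during training.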

Model Evaluation Metrics
Confusion matrices are commonly used in binary supervised classification tasks to determine the gap between predicted and true values, in the form shown in Table 2 [41]. A single confusion matrix entry cannot measure the merit of a model on its own. Therefore, Precision, Recall, and F1-Score were used as the evaluation metrics for model performance in this paper. With $TP$, $FP$, and $FN$ denoting the numbers of true positives, false positives, and false negatives in the confusion matrix, Precision is calculated as:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

Recall is calculated as:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

The F1-Score is commonly used in statistics to measure the performance of a binary classification model. It takes into account both precision and recall and is calculated as follows:

$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
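The three metrics can be computed directly from confusion matrix counts, as in the following minimal sketch. The function name `precision_recall_f1` is hypothetical, and the counts in the example are invented for illustration; they are not results from this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute Precision, Recall, and F1-Score from confusion matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"Precision={precision:.3f} Recall={recall:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of Precision and Recall, it always lies between the two and is pulled toward the lower one.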

Data Description
The experimental data in this paper come from Dianping.com (accessed on 1 May 2021), which is now the leading local lifestyle consumption platform in China. We randomly crawled a total of 35,248 reviews from 130 stores, which contain fields such as username, taste rating, environment rating, service rating, review content, and review time. An example of a review is shown in Figure 6. Research on online reviews usually considers the textual sentiment of a review to be consistent with its numerical rating [42]. In this paper, the average of the taste, environment, and service ratings was taken as the composite score: a review has positive sentiment polarity if the composite score is greater than 3 and negative sentiment polarity if it is less than or equal to 3. In total, 26,703 positive sentiment reviews and 8,545 negative sentiment reviews were obtained. The descriptive statistics of the review data are shown in Table 3. The ratings ranged from 0.5 to 5, the sentiment polarity of the reviews took the values 0 and 1, and the length of the review text ranged from 5 to 2093. The length distribution of positive and negative reviews is shown in Figure 7.

Figure 6. Review Score and Review Screenshot. Review text shown in the screenshot: "Very suitable for family dinner, several relatives holidays or usually go to the restaurant of choice, the price is reasonable and the amount is sufficient, the taste is also very good, very quiet is also good, often go to Jiangpu Road that, Wenjiang other branches have also eaten, the taste difference is not big. Recommended meat dishes Figure 1 ......"

The Chinese restaurant review data were segmented into words using the jieba library in Python, and meaningless words were removed using the HIT stop words list [43]. The reviews after word segmentation are shown in Table 4.
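The labeling rule described above (average of the three ratings, positive if greater than 3) can be sketched as a small helper. The function name `sentiment_label` and the example ratings are hypothetical; the threshold and the averaging follow the description in the text.

```python
def sentiment_label(taste, environment, service, threshold=3.0):
    """Return 1 (positive) if the composite score exceeds the threshold, else 0."""
    composite = (taste + environment + service) / 3.0
    return 1 if composite > threshold else 0


# Hypothetical rating triples (taste, environment, service).
examples = [(4.0, 3.0, 3.0), (3.0, 3.0, 3.0), (5.0, 2.0, 2.0), (0.5, 1.0, 2.0)]
labels = [sentiment_label(*ratings) for ratings in examples]
```

Note that a composite score of exactly 3 is labeled negative, which matches the "less than or equal to 3" rule.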

Table 4. Results of Word Segmentation (Example).

Number: 1
Raw review: Today at lunch time went to eat the king shrimp, the overall feeling is still good. Shrimp is a large large, generally larger than the shrimp outside a circle. The service is also relatively fast, ordered 5 min on the up. then the price is expensive, three people ate 400 yuan ~ and in addition to shrimp nothing else to eat. Would have liked to match some snacks ah, porridge ah, cold dishes ah, brine ah ...... through no. Only the barbecue, noon is not supplied. Overall still good.
Review after word segmentation: Lunch/eat/king/shrimp/overall/feel/good/shrimp/large/large/outside/shrimp/a circle/service/order/minute/price/expensive/three/eat/400/yuan/shrimp/eat/would have/want/match/match/snacks/porridge/cold dish/brine/through/barbecue/noon/supply/good

Number: 2
Raw review: The taste can only be said to be so-so, will not take a special detour to eat, soup from beginning to end have not been added, the waiter just started to see after never mind our table, eat to the middle to find even the dipping saucer are not! Three people set menu dishes are also too shabby, shrimp and meatballs such as meat are counted by the number, drinks cannot be changed outside the set menu of other drinks, I personally feel that every aspect of even the small red robe is not comparable, hope to improve.
Review after word segmentation: ....../robe/not comparable/improve

Experimental Setup
In this paper, the word vector was trained using the Gensim library, a third-party library in Python, where the size of window was set to 5, the dimensionality of the word vector was set to 300 dimensions, the learning rate was set to 0.01, and the rest of the parameters used the default initial settings. The dimensionality of the lexicon was reduced to two dimensions using principal component analysis and visualized as shown in Figure 8. The distribution of restaurant review length is shown in Figure 9. By the statistics of review word length, we found that more than 90% of the review length is below 90, so we constructed the sentence vector embedding matrix with length 90, and the values in the matrix are the corresponding word indexes. prove.
robe/not comparable/improve

Experimental Setup
In this paper, the word vector was trained using the Gensim library, a third-party library in Python, where the size of window was set to 5, the dimensionality of the word vector was set to 300 dimensions, the learning rate was set to 0.01, and the rest of the parameters used the default initial settings. The dimensionality of the lexicon was reduced to two dimensions using principal component analysis and visualized as shown in Figure  8. The distribution of restaurant review length is shown in Figure 9. By the statistics of review word length, we found that more than 90% of the review length is below 90, so we constructed the sentence vector embedding matrix with length 90, and the values in the matrix are the corresponding word indexes.   prove.
robe/not comparable/improve
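The preprocessing step described above (word segmentation followed by stop-word removal) can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the helper names `load_stop_words` and `preprocess` are hypothetical, and the segmenter is passed in as a function so that the sketch runs without third-party packages. With jieba installed, `jieba.lcut` could be supplied as the segmenter, and the stop-word lines could be read from a local copy of the HIT list.

```python
from typing import Callable, Iterable, List, Set


def load_stop_words(lines: Iterable[str]) -> Set[str]:
    """Build a stop-word set from an iterable of lines (one word per line)."""
    return {line.strip() for line in lines if line.strip()}


def preprocess(
    text: str,
    segment: Callable[[str], List[str]],
    stop_words: Set[str],
) -> List[str]:
    """Segment a review and drop empty tokens and stop words."""
    tokens = (token.strip() for token in segment(text))
    return [token for token in tokens if token and token not in stop_words]


if __name__ == "__main__":
    # Whitespace splitting stands in for a Chinese word segmenter here;
    # with jieba installed one would pass segment=jieba.lcut instead.
    stop_words = load_stop_words(["the\n", "is\n", "very\n"])
    tokens = preprocess("the shrimp is very good", str.split, stop_words)
    print("/".join(tokens))
```

Passing the segmenter as an argument keeps the filtering logic testable on its own and makes it easy to swap in a different tokenizer.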

Experimental Setup
In this paper, the word vectors were trained using Gensim, a third-party Python library. The window size was set to 5, the dimensionality of the word vectors was set to 300, the learning rate was set to 0.01, and the remaining parameters were left at their default values. The word vectors were reduced to two dimensions using principal component analysis and visualized as shown in Figure 8. The distribution of restaurant review lengths is shown in Figure 9. From the statistics of review length in words, we found that more than 90% of the reviews are shorter than 90 words, so we constructed the sentence embedding matrix with a length of 90, where the values in the matrix are the corresponding word indexes.

The TensorFlow deep learning framework was used to build the Word2vec+Bi-GRU+Attention deep learning model, and the data were divided into training and test sets at a 4:1 ratio. Dropout lets the deep neural network ignore a random subset of features during training, which reduces overfitting. To verify the impact of the dropout parameter on model performance, we tested the accuracy of the model for dropout values from 0.1 to 0.9. As shown in Figure 10, the accuracy of the model is highest when dropout is 0.2. Figure 11 compares the model performance under three batch_size settings; considering the results as a whole, a batch_size of 128 performed best. To improve the adaptability of the training process across different subsets, we set aside a proportion of the data as a validation set. Figure 12 compares the model performance under three validation_split settings; the model performed better when the proportion of the validation set was 0.4. The detailed settings of each parameter in the neural network model are given in Table 5.
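The fixed input length of 90 described above can be illustrated with a short sketch: one helper picks the smallest length that covers a given percentage of reviews, and another pads or truncates a sequence of word indexes to that length. The function names, the use of 0 as the padding index, and padding at the end of the sequence are assumptions for illustration; the paper does not specify these details.

```python
from typing import List, Sequence


def coverage_length(lengths: Sequence[int], coverage_percent: int = 90) -> int:
    """Smallest length L such that at least coverage_percent% of reviews
    have length <= L (computed with integer arithmetic)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths)
    # Integer ceiling of coverage_percent * n / 100.
    rank = -(-coverage_percent * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


def pad_or_truncate(
    indexes: Sequence[int], max_len: int = 90, pad_value: int = 0
) -> List[int]:
    """Cut a word-index sequence to max_len, or right-pad it with pad_value."""
    clipped = list(indexes)[:max_len]
    return clipped + [pad_value] * (max_len - len(clipped))


if __name__ == "__main__":
    # Hypothetical review lengths: nine short reviews and one very long one.
    print(coverage_length([10] * 9 + [200]))
    print(pad_or_truncate([3, 1, 4], max_len=5))
    print(len(pad_or_truncate(range(200))))
```

Choosing the length from a coverage percentile rather than the maximum keeps a few very long reviews from inflating the input size.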

Baseline Model
To verify the validity of the sentiment classification model proposed in this paper, both machine learning and deep learning methods were applied to sentiment analysis of online ordering platform reviews. The baseline models are briefly introduced below:
• K Nearest Neighbor (KNN). KNN is a classification algorithm whose basic principle is to find the K training samples most similar to a given sample and assign it the majority class among them.
• Support Vector Machine (SVM). SVM maps the sample space to a high-dimensional feature space through a nonlinear mapping, converting a problem that is not linearly separable in the original space into one that is linearly separable in the feature space [44]. It has been shown to perform well and efficiently in sentiment analysis.
• Convolutional Neural Network (CNN). Convolutional neural networks can effectively combine information from different positions in the input, and they are widely used in image processing and in natural language processing tasks such as sentiment analysis and summary extraction [45].
• Bi-directional Long Short Term Memory (Bi-LSTM). Bi-LSTM fully considers context dependency and achieves good results in sentiment analysis [46].
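To make the KNN baseline concrete, the following minimal sketch classifies a query vector by majority vote among its K nearest training vectors. It is illustrative only: the paper does not state which features or distance measure its KNN baseline used, so Euclidean distance over generic numeric vectors and the function name `knn_predict` are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def knn_predict(
    train_vectors: Sequence[Sequence[float]],
    train_labels: Sequence[Hashable],
    query: Sequence[float],
    k: int = 3,
) -> Hashable:
    """Return the majority label among the k training vectors closest to query."""
    # Pair each training label with its Euclidean distance to the query.
    scored = [
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    ]
    # Keep the k closest neighbors (sorted is stable, so ties keep input order).
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors with binary sentiment labels.
    vectors = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    labels = [0, 0, 1, 1]
    print(knn_predict(vectors, labels, (0.2, 0.1), k=3))
    print(knn_predict(vectors, labels, (5.5, 5.0), k=3))
```

An odd K avoids tied votes in the binary setting used here.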

Experimental Results
Finally, the accuracy, recall, and F1 values of all models were compared; the results are shown in Figure 13 and Table 6. The overall performance of the Word2vec+Bi-GRU+Attention sentiment analysis model was better than that of the other models.
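For reference, the evaluation metrics named above can be computed from predicted and true labels as in the following sketch. It uses the standard definitions for binary classification; precision is included because F1 is the harmonic mean of precision and recall. The function name `binary_metrics` and the choice of 1 as the positive class are assumptions for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

Because the data set is imbalanced, recall and F1 are more informative here than accuracy alone.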

Discussion
As more and more unstructured restaurant reviews are exposed to consumers, how to perform rapid sentiment analysis and demand recognition on the text has become a research hotspot. Based on review data from Dianping.com (accessed on 1 May 2021) obtained by a web crawler, this paper used the Word2vec+Bi-GRU+Attention method to construct a sentiment analysis model for online ordering platform reviews. We found that the Word2vec+Bi-GRU+Attention method outperforms the commonly used sentiment analysis models.
The research in this article has certain theoretical and practical implications. First, in terms of theoretical implications, many scholars currently use specialized sentiment dictionaries and machine learning methods to perform sentiment analysis on restaurant review texts. Traditional sentiment analysis methods rely on domain-specific dictionaries and are influenced by the number of positive and negative words. Sentiment analysis models based on deep learning have been shown to perform better. This article uses the Word2vec+Bi-GRU+Attention method to perform sentiment analysis on online restaurant reviews. Testing on the test set shows that, in the setting of online ordering platforms, the comprehensive performance of Word2vec+Bi-GRU+Attention is better than that of the commonly used machine learning and deep learning methods.
Second, in terms of practical implications, in the face of massive user reviews, sentiment analysis can provide consumers with decision support at a lower cost and faster speed. For example, when consumers choose a restaurant to dine at, they can select a higher quality restaurant by judging the ratio of positive reviews to negative reviews. They no longer need to read all the text, but can simply combine keywords with emotional tendencies to quickly grasp the attitudes and opinions of reviewers. In addition, automated emotion recognition can enhance user satisfaction with the platform and ultimately increase consumer activity. By clustering reviews by aspect and counting the ratio of positive to negative reviews under each aspect, a platform can let consumers choose a restaurant that suits their taste based on how sentiment is distributed across aspects. With such an aspect-level analysis, consumers can quickly grasp a restaurant's strengths and weaknesses without spending a lot of time reading and understanding each review. Meanwhile, for aspects where consumers have strong opinions, restaurants can make targeted improvements to increase consumer satisfaction.

Conclusions
The research in this article has some limitations. First, the numbers of positive and negative reviews are not balanced. The number of positive reviews is significantly higher than the number of negative reviews, which may bias the results. Second, this article uses ratings to determine the polarity of reviews, which ignores high-scoring negative reviews and low-scoring positive reviews. In reality, some consumers give high scores while the text of their review is negative, and others do the opposite. A reviewer may, for example, write in a sarcastic tone.
In the future, we can consider using publicly available balanced datasets for training, or consider a combination of over-sampling and clustering techniques to make the samples more balanced. Furthermore, fine-grained machine learning methods can play an important role in identifying inconsistent reviews.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.