Predicting the Helpfulness of Online Restaurant Reviews Using Different Machine Learning Algorithms: A Case Study of Yelp

Abstract: Helpful online reviews could be utilized to create sustainable marketing strategies in the restaurant industry, which contributes to national sustainable economic development. In this study, the main aspects (including food/taste, experience, location, and value) were extracted empirically from 294,034 reviews on Yelp.com using Latent Dirichlet Allocation (LDA), and positive and negative sentiments were assigned to each extracted aspect. Positive sentiments were associated with food/taste, while negative sentiments were associated with value. This study further shows that a robust classification algorithm based on Support Vector Machine (SVM) with a Fuzzy Domain Ontology (FDO) algorithm outperforms other traditional classification algorithms such as Naïve Bayes (NB) and SVM ontology in predicting the helpfulness of online reviews. This study enriches the literature on managerial aspects of sustainability by analyzing a large amount of plain text data that customers generated. The results of this study could be used as a sustainable marketing strategy for review website developers to design sophisticated, intelligent review systems by enabling customers to sort and filter helpful reviews based on their preferences. The extracted aspects and their assigned sentiments could also help restaurateurs better understand how to meet diverse customers' needs and maintain sustainable competitive advantages.


Introduction
Many people hold the misconception that sustainability relates only to the natural environment and overlook the importance of promoting sustainable economic development and creating sustainable business strategy. Metrics of sustainable economic development include, but are not limited to, local economic growth, local and small business growth, and cost of living [1]. The restaurant industry plays an important role in promoting sustainable economic development in the U.S. According to the National Restaurant Association [2], as of 2019, revenues generated by the restaurant industry in the U.S. were estimated at $863 billion, accounting for 4% of the U.S. gross domestic product (GDP). Moreover, the restaurant industry employs approximately 15.3 million people, which accounts for 10% of the overall U.S. workforce.
However, restaurants are struggling to survive due to a number of factors such as intense competition, rising food prices, and high labor costs [3]. An earlier research study demonstrated that around 60% of restaurants fail within three years [4]. Forbes [5] further reported that the restaurant failure rate is 30% within the first year, and that 30% of those that survive shutter in the following two years of operation. As such, how to achieve sustained business performance becomes a critical issue for the restaurant industry [6]. Given that the development of the restaurant industry could reduce the unemployment rate and promote local economic growth, it is important to develop sustainable marketing strategies to drive restaurant performance.
In today's digital era, eWOM (electronic Word-of-Mouth) has outweighed traditional marketing strategies in its influence on customers' purchase decisions. According to the 2017 TripAdvisor Restaurant Marketing Survey, 94% of diners choose a restaurant based on online reviews. Restaurateurs agreed that online listing services are one of the most effective marketing channels for driving more business [7]. The number of online reviews is growing at an unprecedented rate. According to Statista, the number of reviews submitted to Yelp has reached 148.3 million, which is more than twice the amount from 2014 [8]. Customers often feel overwhelmed when confronting the abundance of messages online [9]. Kwon et al. [10] observed that customers tend to rely on a very limited number of reviews in making purchase decisions. Therefore, they may resort to helpful reviews to gain a general idea of products or services. A helpful review is defined as "a peer-generated product evaluation that facilitates the consumer's purchase decision process" [11]. Presenting helpful reviews could help customers reduce the time and effort of searching for relevant information in a large volume of online reviews [12]. It is also valuable for marketers to gain customers' feedback to improve their products or services. Thus, it is important for restaurant owners and marketers to understand how to make use of helpful online reviews to make their businesses stand out from competitors listed online.
Numerous prior studies have identified factors influencing review helpfulness for both search goods (e.g., furniture, digital camera, cell phone) and experience goods (e.g., restaurant, hotel) [11,13,14]. However, these factors are predominantly measured using Likert scales or numerical metrics (e.g., review volume, star rating, sentence length) [11,15], neglecting more hidden semantic structures, such as emotions and linguistic styles conveyed through textual content. With digital texts increasing in size, a number of studies have applied machine learning (ML), which studies a computer's ability to learn from data without being explicitly programmed [16], to predict review helpfulness. However, prior scholars tend to predict review helpfulness in the hotel and e-commerce industries (e.g., Amazon.com), and relatively few studies predict review helpfulness in the restaurant industry. Also, few studies have made an effort to propose an appropriate ML-based text mining technique to predict restaurant review helpfulness using both important dining aspects and emotional contents.
To address these literature gaps, this study aims to extract both emotions and most frequently mentioned dining aspects, then predict review helpfulness by comparing different ML-based text mining techniques. This study is guided by the following three questions: (1) What dining aspects are most important to customers? (2) What attitudes (positive or negative) are expressed regarding each dining aspect? (3) Considering restaurant aspects and their sentiment, which machine learning method performs better in predicting review helpfulness? It is expected that these study results could serve as a guidance for website developers to design better review systems to keep and achieve a sustainable competitive advantage over numerous online review websites. The extracted restaurant aspects and results from sentiment analysis could also help restaurateurs better understand how to constantly improve customers' dining experiences.
The rest of this paper proceeds as follows. Section 2 illustrates related work on eWOM and review helpfulness prediction. The methods, including data collection and data analysis process are explained in Section 3. The main results are presented in Section 4. In the final section, the conclusion, a discussion on theoretical contributions and practical implications, limitations and future research are presented.

The Role of eWOM in Promoting Business Sustainability
With the proliferation of the Internet, eWOM becomes a popular mode of communication. eWOM is defined as any positive or negative statement about a product or company made by actual, potential, or former customers to a multitude of people in an online format [17]. The importance of maintaining a sustainable business using eWOM lies in its impact on customers' trust, purchase intentions and sales performance [18][19][20][21]. It is reported that 87% of customers will not consider businesses with low ratings, while 92% of customers rely on online reviews to determine whether the businesses are good [22].
Online reviews are particularly important in purchasing experiential goods with intangible attributes (e.g., hotels, restaurants), since prospective customers cannot experience the products in advance [23]. Given that the restaurant industry is a collection of experiences dominated by intangible and impalpable elements, customers are unable to objectively assess the characteristics of products or services prior to consumption or spatial movement. Therefore, customers tend to obtain detailed information from different sources before making a decision as a means to reduce perceived uncertainty and risk [24]. The volume of restaurant reviews that customers write was found to be an indicator of the popularity of a restaurant [25]. Kim et al. [26] further found that the number of online reviews had a positive impact on restaurant performance.
Additionally, understanding online reviews is also beneficial for business owners and marketers to gain clearer insight into customers' attitudes and behavior, which practitioners can use to improve their service and create a sustainable competitive advantage. Prior studies have analyzed online reviews to understand customer experience and satisfaction across different service contexts, including the hotel industry, short-term rental industry, airline industry, and wellness industry [27][28][29]. In the restaurant industry, Pantelidis [30] empirically examined meal experience using 2471 online restaurant comments, revealing six salient factors in a diner's evaluation of a restaurant: food, service, ambience, price, menu, and décor. Yan et al. [31] analyzed quantitative scores of 10,136 Chinese restaurant reviews and found a similar result, revealing that food quality, price and value, service quality, and atmosphere influenced customers' revisit intention.

Studies on Review Helpfulness Prediction
A number of online platforms have offered the mechanisms for other users to evaluate online reviews [32]. For example, Amazon.com and TripAdvisor allow customers to vote for the reviews that are perceived as helpful in their decision-making process. The number of helpful votes could signal the quality of message contents [33]. Retail website developers could also increase their website traffic by presenting helpful reviews as a differentiation strategy [11].
Numerous previous scholars have identified the factors influencing the "helpfulness" of online customer reviews. Based on a heuristic systematic model (HSM), the explored factors influencing review helpfulness can be divided into two types: (1) central route cues, which are associated with review content features, such as review content quality, review length, review readability, review types, and review extremity [13,34,35]; and (2) peripheral cues, which are associated with information source features, such as reviewer expertise, reviewers' gender, and reviewer reputation [36][37][38]. However, Hong et al. [39] noticed that the predictors of review helpfulness yield inconsistent conclusions. They conducted a meta-analysis and found that review readability and review ratings did not significantly influence review helpfulness. Moreover, the aforementioned determinants of review helpfulness are predominantly numerical features [15]. In recent years, research on review helpfulness has focused on semantic features and linguistic style in online reviews [15,40]. Table 1 summarizes the studies on predicting review helpfulness on two popular business review sites (Yelp and TripAdvisor) during the period from 2015 to 2019. Racherla and Friske [37] examined the impact of reviewer factors and review factors on perceived review usefulness across three types of categories, among which restaurants were chosen as experiential-based services. The results indicated that experiential-based services are a function of individual taste. Liu and Park [38] later extended the work of Racherla and Friske [37] by focusing exclusively on restaurant reviews. They added review readability and customer perceived enjoyment as two qualitative antecedents of review usefulness and identified these two variables as the most influential factors of review helpfulness.
Ngo-Ye and Sinha [41] further conducted a recency, frequency, and monetary value (RFM) analysis, covering the recency of purchase, the frequency or total number of purchases, and the average amount spent per transaction, in predicting review helpfulness. In addition to widely examined review and reviewer-specific characteristics, other factors influencing review helpfulness include review order [42]; emotions and linguistic styles [15]; and temporal, exploratory, and sensory cues [43]. Restaurants in New York City, Las Vegas, and Los Angeles are most likely to be selected as study samples [15,37,42]. Although recent studies have started to examine the deeper structure and patterns of textual data, few studies have taken into account both emotions and restaurant experience aspects in predicting restaurant review helpfulness.

Machine Learning for eWOM
A few decades ago, researchers tended to conduct content analysis manually to identify the product or service features most important to customers based on word frequency [30,44]. To better understand the aspects that contribute to a helpful review, machine learning for textual data analysis, which allows a machine to extract and classify online reviews, has been utilized to provide more insights and make predictions from high volumes of reviews [45]. Compared to traditional forms of manual content analysis, machine learning methods for text data are less time consuming and labor intensive. They also provide additional information, such as semantics, structures, sequences, and the context around nearby words.
Text mining classification, which labels unstructured data with relevant categories from a predefined set, is a fundamental text-mining task [46]. The most frequently used machine learning techniques for classification and regression analysis include Naïve Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and classical ontology [47]. NB usually produces less accurate predictive outcomes, but its high processing speed on big data is favored by scholars [48]. SVM is currently one of the most effective methods to categorize unlabeled data [49]. Zhang et al. [50] analyzed restaurant reviews written in Cantonese, revealing that NB achieved equal or better accuracy than SVM. Rafi et al. [51] compared SVM and NB classifiers for text categorization with Wikitology and found that NB performed better. Lau et al. [52] found that fuzzy ontology-based semantic analysis outperformed other algorithms (e.g., SVM, embedded in an experimental system (OBPRM)) given its effectiveness in automatically identifying the aspect-oriented sentiments captured in product ontology. Ali et al. [47] proposed SVM with Fuzzy Domain Ontology (FDO) as a more accurate and efficient algorithm to extract hotel features, given its improved ability to remove irrelevant reviews and classify the feature reviews into more degrees of polarity terms.
It is concluded that the predictive power of these classifiers varies across different online review contexts and could be influenced by interactions between classification models and feature options [50]. Therefore, comparisons should be made across different machine learning algorithms to determine which data-mining algorithm in the restaurant industry provides the highest precision and accuracy.

Data Collection
A web-crawler was programmed in Python to automatically retrieve reviews from Yelp.com. Data were collected from Yelp during 10-16 October, 2018. The study selects three best cities to travel in the U.S. based on TripAdvisor reviews [64], including New York, Los Angeles and Las Vegas. During the crawling process, identifiable information of reviewers and restaurants was removed carefully for privacy protection. In total, 294,034 reviews were crawled by the program. The number of reviews extracted in each city is shown in Table 2. In addition to textual feedback of consumers, other relevant information, such as the elite status of consumer, type of restaurant, review date and star rating of individual reviewers for the restaurant, were also collected. Additionally, the number of votes on "useful" specific to each individual review was acquired to measure the usefulness of review in the study.

Step 1: Data Preprocessing
The text preprocessing procedure follows steps adapted from prior studies [65][66][67], including eliminating non-English characters and words, word text tokenization, part-of-speech tagging (POS tagging or POST), replacing common negative words, word stemming, and removing low frequency words (less than 2%) [65].
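Part of this preprocessing pipeline can be sketched in a few lines of standard-library Python. The sketch below is illustrative only and is not the authors' code: the negation map is hypothetical (the study does not list the words it replaced), the 2% cut-off is interpreted here as a share of documents, which is an assumption, and POS tagging and stemming are omitted because they would normally rely on an external NLP library.

```python
import re
from collections import Counter

# Hypothetical negation map; the study does not list the words it replaced.
NEGATIONS = {"don't": "do not", "can't": "can not", "won't": "will not", "isn't": "is not"}

def tokenize(review):
    """Lowercase, expand common negations, and keep purely ASCII alphabetic tokens."""
    text = review.lower()
    for short, full in NEGATIONS.items():
        text = text.replace(short, full)
    # \b...\b drops tokens that contain non-English letters or digits entirely.
    return re.findall(r"\b[a-z]+\b", text)

def drop_rare_words(docs, min_doc_share=0.02):
    """Remove tokens that occur in fewer than min_doc_share of the documents."""
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    threshold = min_doc_share * len(docs)
    return [[tok for tok in doc if doc_freq[tok] >= threshold] for doc in docs]

tokens = tokenize("I don't like the café!")
filtered = drop_rare_words(
    [["good", "food"], ["good", "pizza"], ["good", "food"]], min_doc_share=0.5
)
```

Here `tokens` keeps the expanded negation and discards the non-ASCII word, while `filtered` no longer contains "pizza", which appears in only one of the three toy documents.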

Step 2: Restaurant Aspect Extraction
After eliminating irrelevant and non-textual contents in the previous step, reviews were transformed into proper vectors. This step aims at identifying major dining aspects from the obtained reviews. Latent Dirichlet Allocation (LDA) was applied to identify underlying aspects from the mass of reviews with restricted human intervention [65].
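To make the aspect extraction step concrete, the following is a minimal collapsed Gibbs sampler for LDA written with the Python standard library only. It is a teaching sketch, not the implementation used in the study (which is not specified here); the function name `lda_gibbs` and the hyperparameter values `alpha` and `beta` are assumptions chosen for illustration.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA.

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab), where
    doc_topic[d][k] counts the tokens of document d assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k_old = assignments[d][i]
                # Take the token out of the counts before resampling its topic.
                doc_topic[d][k_old] -= 1
                topic_word[k_old][wid] -= 1
                topic_total[k_old] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta) / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][wid] += 1
                topic_total[k_new] += 1
    return doc_topic, topic_word, vocab

toy_docs = [["pizza", "pasta", "pizza"], ["price", "cheap", "price"], ["pizza", "price"]]
doc_topic, topic_word, vocab = lda_gibbs(toy_docs, n_topics=2, n_iter=20)
```

The returned `doc_topic` table plays the role of a review-by-aspect count matrix, and sorting each row of `topic_word` by count gives the most frequent words per aspect, which is the kind of information a word cloud per aspect would display.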

Step 3: Sentiment Detection
This stage aims at detecting customers' sentiments toward the different restaurant aspects embedded in their reviews. First, each review is cut into sentences by SentiStrength, and each sentence is then assigned a tuple of a negative value and a positive value, since in reality one sentence may contain positive and negative sentiments simultaneously. SentiStrength also assigns fixed scores to the dictionary tokens, which include regular emoticons. For instance, "good" is scored {3, −1} and "bad" is scored {1, −4}. Note that a word is characterized by a single score only when it is present within the dictionary. In addition, additional marks or attributive terms may lead to a score change; for example, the score of "goood" equals that of "good!!!", and such variants possibly extend the dictionary. Feature sentiments were calculated by applying SentiStrength as follows. Denote the collection of reviews by $R = \{r_1, r_2, \ldots, r_n\}$ and the collection of obtained aspects by $T = \{t_1, t_2, \ldots, t_m\}$. LDA outputs a matrix $W_{n \times m}$, of which the entry $w_{i,j}$ represents the number of times a feature from the $i$th review is associated with the $j$th aspect. Subsequently, the sentiment score attached to a given aspect is the weighted average over the reviews. For every aspect $t_j$, we calculate the aspect sentiment score $t_{sj}$ as noted in Equation (1), where $S = \{s_1, s_2, \ldots, s_l\}$ denotes the sentiment scores of the features associated with aspect $t_j$.
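The weighted average described above can be illustrated with a short Python sketch. This is one plausible reading of the aggregation, not a reproduction of Equation (1): it assumes each review is summarized by a single net sentiment score (for example, the positive value plus the negative value of a SentiStrength pair) and that the LDA counts $w_{i,j}$ serve as the weights. The function name `aspect_sentiment` is hypothetical.

```python
def aspect_sentiment(W, review_scores):
    """Weighted average sentiment per aspect.

    W[i][j] is the count linking review i to aspect j; review_scores[i] is an
    assumed net sentiment score for review i.
    """
    n_aspects = len(W[0])
    scores = []
    for j in range(n_aspects):
        weight_sum = sum(row[j] for row in W)
        weighted = sum(row[j] * s for row, s in zip(W, review_scores))
        # An aspect that no review mentions gets a neutral score of 0.
        scores.append(weighted / weight_sum if weight_sum else 0.0)
    return scores

# Three toy reviews, two aspects; net scores such as 3 + (-1) = 2.
W = [[2, 0], [1, 1], [0, 3]]
net_scores = [2, -1, -3]
aspect_scores = aspect_sentiment(W, net_scores)
```

In this toy input the first aspect is mentioned mostly by the positive review and ends up with a positive score, while the second aspect is dominated by the negative reviews.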

Step 4: Classifier Set Up
Previous helpfulness predictions mainly rely on descriptive review features, such as review rating, review length, and review text, as the most useful features. However, in this exploratory analysis, review helpfulness depends more on review semantics and sentiment than on descriptive aspects. Specifically, the aspects and their sentiments indicated in the review are the key criteria to determine whether a review is helpful.
Because of the lack of useful votes and the limited schema of review arrangement on the websites, we apply classical learning algorithms to the binary classification of online review helpfulness, in other words, to discriminate whether any particular review from the review collection is helpful or not, based on the emotion data and the best performing features. These algorithms are: (1) NB+LR; (2) NB and SVM; and (3) SVM accompanied by FDO. Based on the aspects obtained via the aforementioned steps, reviews were separated into training and testing datasets. The test data are used to estimate the performance of each machine-learning algorithm.
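The binary labeling and the train/test separation can be sketched as follows. The vote threshold (`min_votes=1`) and the 80/20 split are hypothetical values chosen only for illustration; the study does not report the threshold or the split ratio in this section.

```python
import random

def label_and_split(reviews, min_votes=1, test_share=0.2, seed=0):
    """Turn (features, useful_votes) pairs into binary-labeled train and test sets.

    min_votes and test_share are illustrative assumptions, not values from the study.
    """
    labeled = [(features, int(votes >= min_votes)) for features, votes in reviews]
    rng = random.Random(seed)
    rng.shuffle(labeled)  # shuffle so the split does not follow crawl order
    n_test = int(round(test_share * len(labeled)))
    return labeled[n_test:], labeled[:n_test]

toy_reviews = [([i], i % 3) for i in range(10)]  # toy features with 0, 1, or 2 votes
train, test = label_and_split(toy_reviews)
```

Fixing the random seed keeps the split reproducible, so that the classifiers being compared are evaluated on the same held-out reviews.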
NB classifier. NB is defined as a classifier based on Bayes' rule [68] and is a statistical scheme. Under its assumption, attributes are independent of one another and equally important. To classify an unknown case, NB selects the class that is most likely given the evidence in the test case.
NB is widely applied in classifying sentiments for the classification of a given review document d to class c as noted in Equation (2).
Based on Bayes' law, the likelihood that any given document is a member of class $c_i$ is given by Equation (3).
In our context, the hypothesis of conditional independence given the particular class (yes or no) is adopted, so that no dependence is assumed to exist between words. This is the reason why the model is called "naïve" (Equation (4)).
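For orientation, the standard textbook form of the NB classifier is given below. These expressions are the usual formulation and may differ in notation from the authors' Equations (2) to (4); the symbols $f_1, \ldots, f_K$ for the word features of document $d$ are introduced here for illustration.

```latex
\begin{align}
c^{*} &= \arg\max_{c} P(c \mid d) \\
P(c_i \mid d) &= \frac{P(d \mid c_i)\, P(c_i)}{P(d)} \\
P(d \mid c_i) &= \prod_{k=1}^{K} P(f_k \mid c_i)
\end{align}
```

The first line is the decision rule, the second is Bayes' rule applied to a document and a class, and the third is the conditional independence assumption that gives the model its name. Because $P(d)$ does not depend on the class, it can be dropped when comparing classes.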
Further, Logistic Regression (LR) was employed to examine the relationship between discrete variables. LR is often utilized when there is a dichotomous dependent variable, such as fault prone or non-fault prone. Although this statistical technique yields better performance on numerical data, it allows the prediction of discrete variables by a mix of continuous and discrete predictors.
Thus, the performance curves for different review amounts are displayed via the classification methods of NB and LR. Following the description of Afzal [69,70], NB and LR were united after the two were compared (see Equation (5)).
In the derivation, $p(y = q \mid x, \alpha, \beta)$ represents the association between NB and LR, and $\alpha$ and $\beta$ are discrete outcomes of LR. It is also observed from the equation that the two classifiers are linear. If the assumed data distribution is met, discriminant function analysis can yield greater performance. When the outcome of the processed data is continuous, the performance of multiple regression is enhanced under the given assumptions.
NB+SVM. The SVM is an effective machine learning approach [71]. It builds a hyperplane, or a group of hyperplanes, in a high-dimensional space, defined by a weight vector $\vec{w}$. The larger the margin, the lower the error the classifier will exhibit, so training seeks the hyperplane with the maximum distance from the nearest training data point of any class. Hence, the issue of margin maximization arises.
This experiment employed the kernel functions in SVM's training phase. The SVM classifier is trained via restaurant reviews with semantic annotation. While classifying reviews with SVM, training is performed to modify kernel parameters. Then, the most appropriate kernel parameters are identified. Upon the process of training, SVM has a basic goal of finding the largest margin hyperplane for solving the classification task of feature review.
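The margin maximization problem mentioned above is commonly written as the following hard-margin optimization. This is the standard textbook form, stated here as background rather than as the exact formulation used in the study; $\mathbf{x}_i$ denotes a review's feature vector, $y_i \in \{-1, +1\}$ its helpfulness label, and $N$ the number of training reviews, all introduced for illustration.

```latex
\begin{align}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\, \lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \ldots, N
\end{align}
```

The margin between the two classes has width $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert$ maximizes the margin. With a kernel function, the inner products between feature vectors in the dual problem are replaced by kernel evaluations $K(\mathbf{x}_i, \mathbf{x}_j)$, whose parameters are tuned during training.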
SVM_FDO. The results are calculated with the SVM accompanied by FDO. A fuzzy ontology is a quadruple $Ont = \langle X, C, R_{XC}, R_{CC} \rangle$, where $X$ and $C$ denote a set of objects and a set of concepts, respectively. The set of objects is mapped to the set of concepts by the fuzzy relation $R_{XC}: X \times C \to [0, 1]$, which assigns the value of the respective membership. The fuzzy relation $R_{CC}: C \times C \to [0, 1]$ refers to the fuzzy taxonomy relations among the set of concepts $C$.
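The quadruple can be represented directly as a small data structure. The Python sketch below is a hypothetical illustration of the definition only; the class name, method names, and the example terms and membership degrees are assumptions, and it does not reproduce how the FDO was built or coupled with the SVM in the study.

```python
from dataclasses import dataclass, field

@dataclass
class FuzzyOntology:
    """Ont = <X, C, R_XC, R_CC> with membership degrees in [0, 1]."""
    objects: set
    concepts: set
    r_xc: dict = field(default_factory=dict)  # (object, concept) -> membership degree
    r_cc: dict = field(default_factory=dict)  # (concept, concept) -> taxonomy degree

    def relate_object(self, obj, concept, degree):
        """Record that obj belongs to concept with the given membership degree."""
        if obj not in self.objects or concept not in self.concepts:
            raise KeyError("unknown object or concept")
        if not 0.0 <= degree <= 1.0:
            raise ValueError("membership degree must lie in [0, 1]")
        self.r_xc[(obj, concept)] = degree

    def membership(self, obj, concept):
        """Return the stored degree, or 0.0 when no relation is recorded."""
        return self.r_xc.get((obj, concept), 0.0)

# Illustrative terms and degrees only.
ont = FuzzyOntology(objects={"pricey", "fresh"}, concepts={"value", "food/taste"})
ont.relate_object("pricey", "value", 0.8)
ont.relate_object("fresh", "food/taste", 0.9)
```

Graded membership lets a term contribute partially to an aspect instead of being forced into a single crisp category.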

Measurements
The performance of the helpfulness vote classification system is evaluated by prominent methods noted in previous studies [47,72], with the recall, precision, accuracy, and F-measure being computed by means of Equations (6)-(8). The F1 score, i.e., the F-measure, is used to measure the accuracy of a test by combining the recall and the precision:

\[ F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6} \]

\[ \text{Recall} = \frac{\text{number of correct positive predictions}}{\text{number of positive examples}} \tag{7} \]

\[ \text{Precision} = \frac{\text{number of correct positive predictions}}{\text{number of positive predictions}} \tag{8} \]

The entire research process is illustrated in Figure 1.
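These measures can be computed from true and predicted labels as in the following Python sketch, which assumes that label 1 marks a helpful review and label 0 a non-helpful one. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1, and accuracy for binary labels (1 = helpful)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correct positive predictions
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

In this toy example two of the three helpful reviews are found and one non-helpful review is flagged by mistake, so precision, recall, and F1 are all 2/3, while accuracy is 3/5.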


Descriptive Analysis of Online Reviews
A summary of all usable reviews of each city is presented in Table 3. The customer ratings of the restaurants in the three cities were dominated by 5 stars (53.62%), followed by 4 stars (22.78%) and 3 stars (9.94%). Of the textual reviews, 69.02% did not receive any helpfulness votes, while 15.45% received one helpfulness vote. Only approximately 1% of textual reviews received more than 5 helpfulness votes.

LDA Results
LDA, a generative probabilistic model for discovering latent semantic topics from a large text corpus, is utilized to extract and label the dimensions of all Yelp customer-generated reviews in this study. The four LDA-identified restaurant aspects and the top-20 frequent words within each aspect are shown in Figure 2. The font size is linearly proportional to the word frequency. Two scholars originally named the restaurant aspects, based on their recognition of the logical connection among the most frequently used words for a given aspect. Subsequently, the naming was corroborated by other research. Four aspects were considered: value, food/taste, location, and experience. Specifically, food/taste described the tangible products (e.g., food, drinks) that the restaurant provided to the reviewers. Experience was defined as customers' internal responses to any direct interaction with staff in the restaurant. Accordingly, the experience aspect described the greeting, serving, consumption, and after-sale processes involved in the reviewers' dining experience. Location depicted the geographical convenience of the Yelp restaurant. Value captured both the payoff, i.e., the difference between the benefit received and the cost paid, and the monetary outcomes. In terms of the order of significance of these aspects, taste/food and its associated words were most frequently mentioned in the online reviews (N = 1,509,172), followed by value (N = 1,219,085), experience (N = 1,123,405), and location (N = 967,192).


Sentiment Results
After running the aspect extraction, lists of words were designated for each aspect. The extracted words contain not only restaurant features, but also the likelihood or sentiments of users (e.g., great, bad, good, like, hate). Figure 3 presents the degree of sentiment of each derived restaurant aspect (shown in blue). The sentiment is positive if the point representing a restaurant aspect lies outside the inner diamond shape (shown in gray) and negative if the point lies inside it. As shown in Figure 3, positive sentiments tend to be associated with food/taste, followed by experience and location. Customers' perceived value of the restaurant was generally associated with negative feelings.

Figure 2. Word cloud generated for each extracted aspect (panels: taste/food, experience, value, and location).


Model Comparison
The comparison results of the different classification methods are presented in Table 4. First, by adding SVM, the second method increased the F1 accuracy rate from 67.68% to 71.20%, indicating that NB with SVM classifies helpful online Yelp reviews better than simple NB. SVM+FDO was then used for a more precise examination. F1 accuracy, recall, and precision increased significantly during review classification in the case of SVM with FDO. Thus, the third method is the most efficient for helpfulness classification in opinion mining compared with the simpler NB and NB with SVM schemes.

Summary of Results and Discussion
On the basis of 294,034 reviews from Yelp, this study proposes a restaurant review helpfulness prediction model with an emphasis on both dining aspects and emotional aspects. It is revealed that restaurant online reviews are associated with four fundamental aspects: taste/food, experience, value, and location. Most positive reviews are associated with food/taste, while negative reviews are associated with value. The SVM with FDO algorithm achieved the highest prediction accuracy (79.59%) and precision rate (81.62%) in predicting restaurant review usefulness for three U.S.-based cities on Yelp. Among the four extracted fundamental dining aspects, taste/food, value, and experience are consistent with prior studies that apply text-mining analysis to discover hidden restaurant aspects in online reviews [73][74][75][76]. However, location is barely mentioned by previous studies. Based on aspect frequency, quality of food appears to be the most important aspect to customers, which is consistent with prior studies indicating that food is the greatest contributor to the success of any restaurant [77]. However, Cuizon et al. [73] found that service was the most frequently mentioned aspect. Good taste and food quality are more likely to generate positive online reviews. However, this study highlights that customers tend to express negative emotions towards value, which differs from previous findings indicating that restaurant ambience had the lowest, though still positive, sentiment score [78], and that customers tend to complain about service quality [79]. A potential reason is that this study extracted restaurant reviews from three metropolitan cities in the U.S. where the costs of living are relatively high. Therefore, negative feelings are linked to low perceived value.
The comparison of three algorithms for predicting review helpfulness revealed that SVM with FDO is superior to the two other algorithms: NB and the amalgamation of NB and SVM. The SVM with FDO algorithm increased the F1-score, recall, and precision metrics by 11.91, 13.31, and 10.23 percentage points, respectively, compared to NB. This finding supplements prior studies that compared different algorithms in predicting review helpfulness without considering a combination of two algorithms [54,80].

Implications, Limitations, and Future Studies
The implications of this study can be explained from both theoretical and managerial perspectives. First, unlike prior studies that used a perceptional survey to examine the impact of either review contents or reviewer characteristics on review helpfulness, this study contributes to the emerging review helpfulness literature in the hospitality and tourism industry in terms of methodology. As one of the few attempts, this study reveals that the SVM with FDO algorithm significantly improves the accuracy of predicting review helpfulness in the restaurant business domain. This approach is an innovative technique that combines both traditional natural language processing and advanced machine learning algorithms in predicting helpful reviews.
Second, this study provides new insight into sustainable economic development by developing sustainable marketing strategies to maximize the restaurant industry's performance growth. As revealed in Figure 1, words associated with food/taste include menu, special, fresh, option, and portion. This suggests that customers are satisfied not only with food quality, but also with varied menu options and large portion sizes. In addition, waiters' service quality plays an important role in creating a good experience. It is also revealed that location can have a big impact on restaurant performance. To maintain sustainable business development, restaurateurs should consider choosing a high-traffic location where the surrounding area has a well-developed transportation infrastructure. In terms of value, restaurateurs should put much thought and consideration into developing and prioritizing a food pricing strategy, and should take note of how much customers are willing to pay. As suggested by Cao et al. [81], review platforms could use this approach to develop a sorting or recommendation algorithm that accurately shows helpful and valuable reviews to increase readers' stickiness to the review websites. The restaurant attributes identified in this study can help filter large amounts of online reviews and could be used as guidelines to assist restaurant marketers and managers in improving their services and developing sustainable online marketing strategies. Especially for small start-ups, making effective use of online reviews could increase the chances of being discovered and of gaining a sustainable competitive advantage. However, it is important to note that the sustainable aspects of a restaurant (e.g., green packaging, waste management, preservation of energy, and public relations on green activity) identified by Ju and Chang [82] did not emerge as frequently mentioned aspects. A potential reason might be that many U.S. customers are unable to define a green restaurant, even though they have eaten in one [83]. Because customers are willing to pay more for the green restaurant experience [83], and practices with a focus on food and environment could form customers' positive attitudes, which in turn lead to buying behavior [84], restaurateurs should adopt sustainable practices to attract customers or encourage customers to post reviews regarding their sustainability efforts. To raise customers' awareness of restaurants' sustainability activities, review websites should consider taking sustainable aspects of restaurants into account while designing an ML-based technique to predict review helpfulness.
Third, because helpfulness votes are limited in number, this study could also help review websites such as Yelp surface more helpful reviews, even those buried among thousands of other reviews. The filter mechanism built upon SVM + FDO shows substantially higher accuracy than mechanisms built on NB and SVM in previous studies. The outcomes of this study propose an innovative review filtering mechanism that can promote more helpful reviews based on different aspects. Thus, review websites' ability to provide more effective and useful information could attract more visitors. Consequently, adding newly predicted helpful reviews as pop-ups on online review websites could contribute to the long-term development of these platforms.
Fourth, those who would like to become opinion leaders in online communities could write high-quality reviews to become members of the Yelp Elite Squad, a community of active evangelists and role models [85]. However, review manipulation constantly occurs [15], and review platform practitioners should make efforts to detect and ban fake helpful reviews.
This study is subject to some limitations. First, this study only focused on restaurant reviews written in English, and the proposed helpfulness prediction approach might not be applicable to restaurant reviews written in other languages. Future work could test the accuracy of the FDO approach on other languages. Second, only one site with reviews of restaurants located in three U.S. cities was chosen for data collection, which limits the study sample. According to CNBC [86], in addition to New York, Los Angeles, and Las Vegas, the top ten foodie cities in the U.S. also include Portland, San Francisco, Miami, Orlando, Seattle, San Diego, and Austin. Future research should collect more restaurant reviews from a larger sample of U.S. cities. It would also be interesting to compare the importance of dining aspects across different cities or regions. Third, this study does not take the temporal feature of online reviews into account. Yang et al. [57] explained that older online reviews were more likely to receive helpful votes than recent reviews. However, this is not always the case, since customers tend to read the most recently posted reviews to gain the most up-to-date information. Therefore, temporal dimensions should be controlled in future studies. Fourth, this study predicted helpfulness on the basis of emotions and restaurant features conveyed in online reviews. Future studies could examine the impact of reviewers' characteristics (e.g., expertise; history of helpful votes) and dining context (e.g., dining purpose, dining companions) highlighted in the study of Gan et al. [78] through experimental designs. Finally, this study only examines textual reviews; future studies could examine the impact of video and image formats in predicting review helpfulness.

Conflicts of Interest:
The authors declare no conflict of interest.