Wineinformatics: Using the Full Power of the Computational Wine Wheel to Understand 21st Century Bordeaux Wines from the Reviews

Although wine has been produced for several thousand years, the ancient beverage has remained popular and become even more affordable in modern times. Among all wine-making regions, Bordeaux, France is probably one of the most prestigious wine areas in history. Since hundreds of wines are produced in Bordeaux each year, humans are unlikely to be able to examine all wines across multiple vintages to define the characteristics of outstanding 21st century Bordeaux wines. Wineinformatics is a newly proposed data science research area with wine as its application domain, processing large amounts of wine data by computer. The goal of this paper is to build a high-quality computational model on wine reviews processed by the full power of the Computational Wine Wheel to understand 21st century Bordeaux wines. On top of the 985 binary attributes generated by the Computational Wine Wheel in our previous research, we add attributes derived from the CATEGORY and SUBCATEGORY columns, yielding an additional 14 and 34 continuous attributes, respectively, for both the All Bordeaux (14,349 wines) and the 1855 Bordeaux (1359 wines) datasets. We believe that successfully merging the original binary attributes with the new continuous attributes can provide more insights for Naïve Bayes and Support Vector Machine (SVM) to build models for wine grade category prediction. The experimental results suggest that, for the All Bordeaux dataset, with the additional 14 attributes retrieved from the CATEGORY, the Naïve Bayes classification algorithm outperformed the existing research results by increasing accuracy by 2.15%, precision by 8.72%, and the F-score by 1.48%. For the 1855 Bordeaux dataset, with the additional attributes retrieved from the CATEGORY and SUBCATEGORY, the SVM classification algorithm outperformed the existing research results by increasing accuracy by 5%, precision by 2.85%, recall by 5.56%, and the F-score by 4.07%.
The improvements demonstrated in this research show that attributes retrieved from the CATEGORY and SUBCATEGORY have the power to provide more information to classifiers for superior model generation. The model built in this research can better distinguish classic and outstanding 21st century Bordeaux wines. This paper provides new directions in Wineinformatics for technical research in data science, such as regression, multi-target, and classification problems, and for domain-specific research, including wine region terroir analysis, wine quality prediction, and weather impact examination.


Introduction
Science has its origins in attempts to understand the world in an empirically verifiable manner. To understand the world, one relies on testing it, and testing ideas requires data. Each data point is a discrete record of an event or a property of the world. With the rise of the Internet, data has become abundant; the advancement in the application of statistically based machine learning methods has also enabled the rise of data science.

The goal of this paper is to improve on our previous research [29] by incorporating the full power of the Computational Wine Wheel to gain more insights into 21st century Bordeaux wines. More specifically, instead of only using "specific words" to map wine reviews and assign "normalized" binary (0/1) attributes to each wine, as in all previous Wineinformatics research, we include the "category" and "subcategory" in the natural language processing component and assign corresponding continuous attributes. For example, apple and peach fall under "tree fruit" in the subcategory column and "fruity" in the category column. In previous Wineinformatics research, a wine review mentioning peach and apple would be converted into a positive (1) for peach, a positive (1) for apple, and a negative (0) for all other attributes. In this research, the same wine review is additionally converted into a "fruity" category attribute with an accumulated value of 2 and a "tree fruit" subcategory attribute with an accumulated value of 2, on top of all the binary attributes generated previously. We are interested in how word frequencies in the categories and subcategories affect the machine learning process of understanding Bordeaux wines. We believe that successfully merging the original binary attributes with the new continuous attributes can provide more insights for Naïve Bayes and Support Vector Machine (SVM) to build models for wine grade category prediction.

Materials and Methods
Data science is the study of data within an application domain. The source, the preprocessing, and the creation of the data are all major factors in the quality of the data. Since the data in this research consists of wine reviews stored in a human language format, Natural Language Processing (NLP) is needed [32].

Wine Spectator
Many research efforts indicate that wine judges may demonstrate intra- and inter-taster inconsistencies while tasting designated wines [33][34][35][36][37][38][39][40]. Wine Spectator is a wine magazine company that periodically publishes consistent, credible wine reviews written by a group of region-specific reviewers. The company has published approximately 385,000 wine reviews. The magazine publishes 15 issues a year, with between 400 and 1000 wine reviews per issue. In our previous Wineinformatics research [11,12], more than 100,000 wine reviews were gathered and analyzed across all wine regions. This dataset was used to test the consistency between Wine Spectator reviewers' reviews and their scores. An SVM model achieved more than 87% accuracy when predicting from the review whether a wine received a score of 90/100 points or above [11]. The regression analysis predicting a wine's actual score from its review also produced an outstanding result, with only a 1.6 average score difference under a Mean Absolute Error (MAE) evaluation [12]. These results support Wine Spectator's prestigious standing in the wine industry. Figure 1 is an example of wine reviews on WineSpectator.com.
Wine Spectator evaluates wines on a 50-100 score scale, described as follows:
95-100 Classic: a great wine
90-94 Outstanding: a wine of superior character and style
85-89 Very good: a wine with special qualities
80-84 Good: a solid, well-made wine
75-79 Mediocre: a drinkable wine that may have minor flaws
50-74 Not recommended

Bordeaux Datasets
In our previous research [29], we explored all 21st century Bordeaux wines by creating a publicly available dataset with 14,349 Bordeaux wines [30]. In addition, based on the official Bordeaux wine classification of 1855 [31], we generated the 1855 Bordeaux Wine Official Classification dataset to investigate the elite wines. This dataset is a subset of all 21st century Bordeaux wines and contains 1359 wines. All of the wine reviews in the datasets came from Wine Spectator in a human language format. We processed the reviews through the Computational Wine Wheel, which works as a dictionary using one-hot encoding to convert words into vectors. For example, a wine review may contain fruit words such as apple, blueberry, or plum. If a word matches an attribute in the Computational Wine Wheel, that attribute is set to 1; otherwise, it is 0. The Computational Wine Wheel is also equipped with a generalization function that maps similar words to the same coding. For example, fresh apple, apple, and ripe apple are generalized into "apple" since they represent the same flavor. Yet, green apple is kept as its own "green apple" attribute since the flavor of green apple differs from apple. The score of the wine is attached to the data as the last attribute, also known as the label. For supervised learning, in order to understand the characteristics of classic (95+) and outstanding (90-94) wines, we use 90 points as the cutoff value. If a wine receives a score equal to or above 90 points out of 100, we assign the wine a positive (+) class label; otherwise, the label is a negative (−) class. Some wines received a ranged score, such as 85-88; we use the average of the ranged score to decide and assign the label. Detailed score distributions and the number of wines reviewed each year are available in Reference [29].
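The labeling rule above can be sketched in a few lines. This is an illustrative sketch of the described rule, not the authors' code; the function name and score format are our assumptions.

```python
# Hypothetical sketch of the labeling rule described above: wines scoring
# 90+ become the positive class (+1), others the negative class (-1);
# ranged scores such as "85-88" are averaged before applying the cutoff.
def label_wine(score: str, cutoff: int = 90) -> int:
    if "-" in score:                          # ranged score, e.g. "85-88"
        lo, hi = (int(s) for s in score.split("-"))
        value = (lo + hi) / 2                 # use the average of the range
    else:
        value = int(score)
    return 1 if value >= cutoff else -1

print(label_wine("92"))     # scores 92 -> positive class (1)
print(label_wine("85-88"))  # average 86.5 -> negative class (-1)
```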
In this paper, we want to use the full power of the Computational Wine Wheel to extract more information from the same Bordeaux datasets. The latest version of the Computational Wine Wheel has four columns: CATEGORY_NAME, SUBCATEGORY_NAME, SPECIFIC_NAME, and NORMALIZED_NAME. There are 14 attributes in CATEGORY_NAME, 34 in SUBCATEGORY_NAME, 1881 in SPECIFIC_NAME, and 985 in NORMALIZED_NAME. The details can be found in Table 1. In contrast to the previous data, which used only SPECIFIC_NAME to match wine reviews and NORMALIZED_NAME to encode wine attributes, this research also includes the two remaining columns of the Computational Wine Wheel, CATEGORY_NAME and SUBCATEGORY_NAME. For example, for a wine with "fresh apple" and "crushed cherry" in the review, the Computational Wine Wheel encodes a 1 for the "apple" attribute, since apple is the normalized attribute for "fresh apple," and a 1 for the "cherry" attribute, since cherry is the normalized attribute for "crushed cherry." In this research, not only will the "apple" and "cherry" attributes be 1, but the new data will also count "tree fruit" twice and "fruity" twice, since both apple and cherry fall under the "tree fruit" SUBCATEGORY_NAME and the "fruity" CATEGORY_NAME. The dataset thus changes from a purely binary dataset to a mixed-format dataset. The flowchart presented in Figure 2 shows the process of converting reviews into a machine-readable format: reviews are processed by the Computational Wine Wheel into NORMALIZED_NAME, SUBCATEGORY_NAME, and CATEGORY_NAME attributes. Figure 2. The process of using the Computational Wine Wheel to convert NORMALIZED_NAME, SUBCATEGORY_NAME, and CATEGORY_NAME into a machine-readable format.
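The encoding described above can be sketched as follows. The tiny two-entry wheel mapping here is purely illustrative (the real Computational Wine Wheel has 1881 SPECIFIC_NAME entries); the "fresh apple"/"crushed cherry" example and the "tree fruit"/"fruity" groupings follow the text.

```python
# Illustrative mini-wheel: SPECIFIC_NAME -> (NORMALIZED, SUBCATEGORY, CATEGORY).
WHEEL = {
    "fresh apple":    ("APPLE",  "TREE FRUIT", "FRUITY"),
    "crushed cherry": ("CHERRY", "TREE FRUIT", "FRUITY"),
}

def encode(review: str):
    binary, counts = {}, {}
    for specific, (norm, subcat, cat) in WHEEL.items():
        if specific in review.lower():
            binary[norm] = 1                             # one-hot normalized attribute
            counts[subcat] = counts.get(subcat, 0) + 1   # accumulated subcategory count
            counts[cat] = counts.get(cat, 0) + 1         # accumulated category count
    return binary, counts

b, c = encode("Fresh apple and crushed cherry notes.")
print(b)  # {'APPLE': 1, 'CHERRY': 1}
print(c)  # {'TREE FRUIT': 2, 'FRUITY': 2}
```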
After preprocessing, 14 category and 34 subcategory attributes were added to both datasets, creating a total of 1033 attributes. However, the attributes from SUBCATEGORY_NAME and CATEGORY_NAME are continuous rather than binary, and the range of each attribute varies. The learning algorithm will give one feature more weight than another if we do not normalize the attributes, especially since the attributes from NORMALIZED_NAME are binary. Normalization can also help speed up the learning process and avoid numerical problems, such as loss of accuracy due to arithmetic overflow [41]. In this research, we used Min-Max normalization (Equation (1)) to scale the attribute values to 0-1. The formula is as follows:
x' = (x − min(x)) / (max(x) − min(x)) (1)

where x is an original value, x' is the normalized value, min(x) is the minimum value among all x, and max(x) is the maximum value among all x.
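Equation (1) applied per attribute can be sketched as below; the constant-column guard is our addition to keep the sketch safe when max(x) equals min(x).

```python
# Min-Max normalization (Equation (1)) of one attribute column to [0, 1].
def min_max(values):
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max([0, 1, 2, 4]))  # -> [0.0, 0.25, 0.5, 1.0]
```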

Supervised Learning Algorithms and Evaluations
Supervised learning builds a model from labeled datasets to make predictions. If the label is a continuous variable, the task is known as regression; if the label is a discrete variable, it is known as classification. In this research, we are trying to understand what makes an outstanding 21st century Bordeaux wine, and more specifically, what the characteristics of classic (95+) and outstanding (90-94) Bordeaux wines are. To achieve this goal, 90 points was chosen as the cutoff value. If a wine receives a score equal to or above 90 points out of 100, we mark the label as a positive (+) class for the wine; otherwise, the label is a negative (−) class. The task is therefore classification-based supervised learning. The evaluation metrics (accuracy, precision, recall, and F-score) were used to evaluate the performance of models with five-fold cross-validation. In this research, we applied Naïve Bayes and SVM as our classification algorithms to build models of 21st century Bordeaux wines. If a model can accurately predict a wine's grade category from its review, it captures the essence of high-quality wines.

Naïve Bayes Classifier
Throughout Wineinformatics research, the Naïve Bayes classifier has been considered the most suitable white-box classification algorithm. A Naïve Bayes classifier is a simple probabilistic classifier that applies Bayes' Theorem under two assumptions: (1) attributes are independent of one another; for example, the word APPLE appearing in a review has nothing to do with the appearance of the word FRUITY, even though both words might appear in the same review; and (2) each attribute is treated as equally important to the label/outcome; for example, the words APPLE and FRUITY have equal influence on the prediction of wine quality. The assumptions made by Naïve Bayes are often incorrect in real-world applications; as a matter of fact, the independence assumption is almost always violated, yet Naïve Bayes still often works well in practice [42,43].
Bayes' Theorem is:

P(Y|X) = P(X|Y) P(Y) / P(X) (2)

where P(Y|X) is the posterior probability that Y belongs to a particular class given that X happens, P(X|Y) is the likelihood of X given Y, P(Y) is the prior probability of Y, and P(X) is the prior probability of X.
Based on Bayes' Theorem, the Naïve Bayes classifier is defined as:

P(Y|X1, X2, . . . , Xn) ∝ P(Y) ∏i P(Xi|Y) (3)

Equation (3) computes the posterior probabilities of all values in X for all values in Y. For a traditional bi-class classification problem with positive and negative cases, the Naïve Bayes classifier calculates a posterior probability for the positive case and a posterior probability for the negative case; the case with the higher posterior probability is the prediction made by the classifier. There are two main Naïve Bayes classifiers based on the type of attributes: the Gaussian Naïve Bayes classifier [44] for continuous attributes and the Bernoulli Naïve Bayes classifier [45] for binary attributes.
In the Gaussian Naïve Bayes classifier (Equation (4)) [44], the continuous values associated with each attribute are assumed to follow a Gaussian distribution, also known as a normal distribution:

P(Xi = x | Y = y) = (1 / √(2πσy²)) exp(−(x − µy)² / (2σy²)) (4)

where µy is the sample mean and σy is the sample standard deviation of the attribute within class y.
In the Bernoulli Naïve Bayes classifier (Equation (5)) [45], the attributes are binary variables, and the frequency of a word in the reviews is used as the probability:

P(Xi|Y) = Nic / Nc (5)

where Nic is the number of samples/reviews having attribute Xi and belonging to class Y, and Nc is the number of samples/reviews belonging to class Y. When a value of X never appears in the training set, its estimated probability is 0. Without any correction, P(Y|X1, X2, . . . , Xn) would then be 0 even when the probabilities of the other attributes are significant, which is unfair to those attributes. Therefore, we use Laplace smoothing (Equation (6)) [45] to handle zero probabilities by slightly modifying Equation (5):

P(Xi|Y) = (Nic + 1) / (Nc + c) (6)

where c is the number of values in Y.
In our dataset, as mentioned in Section 2, the 985 attributes captured by NORMALIZED_NAME are binary values, while the 14 and 34 attributes captured by CATEGORY_NAME and SUBCATEGORY_NAME, respectively, are continuous values. To calculate the posterior probability of the Naïve Bayes classifier on a training dataset with both continuous and binary values, this research multiplies the posterior probability produced by Gaussian Naïve Bayes on the continuous attributes with the posterior probability produced by Bernoulli Naïve Bayes with Laplace smoothing on the binary attributes.
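The mixed posterior described above can be sketched as follows. This is a hedged sketch, not the authors' implementation; the function names and toy parameter values are our assumptions.

```python
import math

# Class score = prior x Gaussian likelihoods (continuous attributes)
#             x Laplace-smoothed Bernoulli likelihoods (binary attributes).
def gaussian_pdf(x, mu, sigma):
    # Equation (4): Gaussian likelihood of a continuous attribute value.
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def class_score(prior, cont_x, cont_params, bin_x, bin_counts, n_class, c=2):
    score = prior
    for x, (mu, sigma) in zip(cont_x, cont_params):   # Gaussian part
        score *= gaussian_pdf(x, mu, sigma)
    for x, n_ic in zip(bin_x, bin_counts):            # Bernoulli part
        p = (n_ic + 1) / (n_class + c)                # Laplace smoothing, Equation (6)
        score *= p if x == 1 else 1 - p
    return score

# Toy example: one continuous attribute and one binary attribute per class.
pos = class_score(0.6, [2.0], [(2.5, 1.0)], [1], [30], 40)
neg = class_score(0.4, [2.0], [(0.5, 1.0)], [1], [5], 40)
print("positive" if pos > neg else "negative")  # the higher score wins
```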

SVM
SVMs are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis [46]. A classification SVM builds a model by constructing "a hyperplane or set of hyperplanes in a high-dimensional or infinite-dimensional space, which can be used for classification, regression, or other tasks. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data point of any class (so-called functional margin), since, in general, the larger the margin is, the lower the generalization error of the classifier is." After the model is constructed, testing data is used to estimate its accuracy. SVM-light [47] is the SVM implementation used to perform the classification in this project.
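The maximum-margin idea can be illustrated with a minimal linear SVM trained by stochastic sub-gradient descent on the hinge loss (a Pegasos-style sketch; the paper itself used SVM-light, and the toy data below is invented). A constant 1.0 feature is appended to each row to act as the bias term.

```python
import random

# Minimal linear SVM: minimize (lam/2)||w||^2 + hinge loss via SGD.
def train_svm(X, y, lam=0.01, epochs=500, seed=0):
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                        # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [wj * (1.0 - eta * lam) for wj in w]     # shrink (regularization)
            if margin < 1:                               # hinge-loss sub-gradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy, invented mixed binary/continuous wine attributes; last column is the bias feature.
X = [[1, 1, 1.0, 1.0], [1, 0, 0.8, 1.0], [0, 0, 0.1, 1.0], [0, 1, 0.2, 1.0]]
y = [1, 1, -1, -1]
w = train_svm(X, y)
print([predict(w, x) for x in X])
```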

Evaluation Methods
To evaluate the effectiveness of the classification model, several standard statistical evaluation metrics are used in this paper: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). We have refined their definitions for our research as follows.
TP: The real condition is true (1) and predicted as true (1); a 90+ wine correctly classified as a 90+ wine.
TN: The real condition is false (−1) and predicted as false (−1); an 89− wine correctly classified as an 89− wine.
FP: The real condition is false (−1) but predicted as true (1); an 89− wine incorrectly classified as a 90+ wine.
FN: The real condition is true (1) but predicted as false (−1); a 90+ wine incorrectly classified as an 89− wine.
With 90 points as the cutoff value in this research, a TP would be "a wine that scores equal to or above 90 and that the classification model also predicts as equal to or above 90." Below are our evaluation metrics.
Accuracy (Equation (7)): the proportion of wines correctly classified among all wines. Accuracy is a very intuitive metric.

Accuracy = (TP + TN) / (TP + TN + FP + FN) (7)

Recall (Equation (8)): the proportion of 90+ wines identified correctly. Recall reflects the sensitivity of the model to 90+ wines.

Recall = TP / (TP + FN) (8)

Precision (Equation (9)): the proportion of predicted 90+ wines that were actually correct.

Precision = TP / (TP + FP) (9)

F-score (Equation (10)): the harmonic mean of recall and precision. The F-score takes both recall and precision into account, combining them into a single metric.

F-score = 2 × Precision × Recall / (Precision + Recall) (10)
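Equations (7)-(10) can be computed directly from the four confusion-matrix counts; the counts in the example call are invented for illustration.

```python
# Accuracy, precision, recall, and F-score from confusion-matrix counts.
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Equation (7)
    recall = tp / (tp + fn)                      # Equation (8)
    precision = tp / (tp + fp)                   # Equation (9)
    f_score = 2 * precision * recall / (precision + recall)  # Equation (10)
    return accuracy, precision, recall, f_score

a, p, r, f = metrics(80, 60, 10, 20)
print(round(a, 3), round(p, 3), round(r, 3), round(f, 3))
```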

Results and Discussion
By using the full power of the Computational Wine Wheel, we can generate 1033 (14 category + 34 subcategory + 985 normalized) attributes. Answering the questions "Will the newly generated category and subcategory attribute help in building more accurate classification models?" and "How can a system be designed to integrate three different types of attributes?" are the main areas of focus for this research.
Since there are three sets of attributes, seven datasets were built to realize all possible combinations. These datasets are defined as follows: datasets 1, 2, and 3 contain the three individual sets of attributes, dataset 4 encompasses all three sets, and datasets 5, 6, and 7 each combine two of the three sets. Details of each dataset can be found in Table 2. Each dataset is trained with both the Naïve Bayes and SVM algorithms discussed in Section 3 using five-fold cross-validation. Please note that All Bordeaux dataset 3 and 1855 Bordeaux dataset 3 are the original datasets used in Reference [29]; results generated from dataset 3 therefore serve as the comparison with previous research. Dataset 4 is the complete dataset with all attributes.
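The seven combinations can be enumerated mechanically; the sketch below lists every non-empty combination of the three attribute sets with its total attribute count (the mapping of combinations to dataset numbers 1-7 follows Table 2, which is not reproduced here).

```python
from itertools import combinations

# The three attribute sets and their sizes, as described in the text.
sets = {"CATEGORY": 14, "SUBCATEGORY": 34, "NORMALIZED": 985}

datasets = []
for r in (1, 2, 3):                       # singles, pairs, and the full triple
    for combo in combinations(sets, r):
        datasets.append((combo, sum(sets[name] for name in combo)))

for names, width in datasets:
    print("+".join(names), width)         # e.g. CATEGORY+NORMALIZED 999
```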

Results of All Bordeaux Wine Datasets
This research applied the Naïve Bayes classifier to All-Bordeaux datasets 1-7 and received encouraging results, as shown in Table 3. All-Bordeaux datasets 1, 2, and 5 did not perform as well as the other datasets since their numbers of attributes are much lower. Compared with All-Bordeaux dataset 3, surprisingly, All-Bordeaux dataset 4 performs worse in terms of accuracy, recall, and F-score; more attributes do not necessarily provide better information for the Naïve Bayes classification model. However, All-Bordeaux dataset 7, which combines the 14 category and 985 normalized attributes, generated the best results in three out of four evaluation metrics. These results suggest that attributes from the subcategory do not increase the information quality of the All-Bordeaux datasets under the Naïve Bayes algorithm. Table 3. Accuracy, precision, recall, and F-score with the Naïve Bayes classifier on the All-Bordeaux datasets. The highest results in each column are marked in bold. Table 4 shows the results generated by SVM on the All-Bordeaux datasets. Overall, SVM demonstrated more consistent prediction performance: all accuracies are higher than 80%, even for the datasets with fewer attributes. Compared with All-Bordeaux dataset 3, All-Bordeaux datasets 4 and 6 performed slightly better in accuracy, recall, and F-score, although the improvement is modest. These results suggest that attributes from the category and subcategory slightly increase the information quality of the All-Bordeaux datasets under SVM. Table 4. Accuracy, precision, recall, and F-score with the SVM classifier on the All-Bordeaux datasets. The highest results in each column are marked in bold. Comparing the results given in Tables 3 and 4 suggests that the best performance came from All-Bordeaux dataset 7 using the Naïve Bayes classification algorithm.
This result is encouraging since SVM, a black-box classification algorithm, usually performs better than the Naïve Bayes algorithm, as in the results for All-Bordeaux dataset 3. This means Naïve Bayes can capture extra information from the 14 category attributes to build a more accurate classification model, even better than SVM.

Results of 1855 Bordeaux Wine Official Classification Dataset
This research also applied the Naïve Bayes classifier to 1855 Bordeaux datasets 1-7 and received even more encouraging results, as shown in Table 5. Except for datasets 1, 2, and 5, all datasets with additional information performed better than the previous results from dataset 3. The 1855 Bordeaux dataset 4 achieved the highest accuracy, recall, and F-score. The results suggest that the additional information from the CATEGORY and SUBCATEGORY does help the Naïve Bayes algorithm build the classification model. The 94.33% recall indicates that more than 94% of the 90+ wines in the 1855 Bordeaux dataset can be recognized by our classification model. However, the best model for practical use might be 1855 Bordeaux dataset 7, as it produced the highest precision, 91.34%. Table 5. Accuracy, precision, recall, and F-score with the Naïve Bayes classifier on the 1855 Bordeaux datasets. The highest results in each column are marked in bold. Table 6 shows the results generated by SVM on the 1855 Bordeaux datasets. The 1855 Bordeaux dataset 4 achieved the highest results in all evaluation metrics. The consistent results in Tables 5 and 6 suggest that the smaller dataset, the 1855 Bordeaux dataset, benefits more from the full power of the Computational Wine Wheel. Comparing the results given in Tables 5 and 6, this research suggests that 1855 Bordeaux dataset 4 is the best-processed data for understanding 90+ point 21st century elite Bordeaux wines; both Naïve Bayes and SVM produced outstanding results. Unlike the results shown in Table 4, SVM does capture more information from the additional attributes, with accuracy increases of 5.00% and 4.19% on 1855 Bordeaux datasets 4 and 7, respectively. The 1855 Bordeaux dataset 5, which uses only the CATEGORY and SUBCATEGORY (48 attributes in total), performed better than 1855 Bordeaux dataset 3 and achieved the highest recall.
This finding indicates that SVM places emphasis on the CATEGORY and SUBCATEGORY attributes.

Bordeaux Dataset
In deciding which classification algorithm performed better, the F-score serves as the tie-breaker and indicates that Naïve Bayes slightly outperformed SVM. Moreover, Naïve Bayes is a white-box classification algorithm, so its results can be interpreted. Therefore, based on the findings in this paper, Naïve Bayes is the best classification algorithm for Wineinformatics research utilizing the full power of the Computational Wine Wheel.

Comparison of Datasets 4 and 7
Dataset 4 contains 1033 attributes (14 category + 34 subcategory + 985 normalized) and dataset 7 contains 999 attributes (14 category + 985 normalized). Experimental results suggest that both datasets 4 and 7 outperformed dataset 3, which was used in previous research in the Bordeaux wine application domain. However, the conclusion that "the more attributes, the better" is not clearly supported. To visualize the experimental results for datasets 4 and 7, Figure 3 was created based on Tables 3 and 4, and Figure 4 based on Tables 5 and 6.
In Figure 3, all results are very similar except those for dataset 4 using Naïve Bayes, where the recall in particular is poor. In the All-Bordeaux dataset, with a total of 14,349 wines, there are 4263 90+ wines and 10,086 89− wines. This class imbalance might be the cause of Naïve Bayes' poor performance on dataset 4, which suggests that the 34 attributes from the SUBCATEGORY do not provide a positive impact on the model generated from All-Bordeaux wines.
In Figure 4, in terms of accuracy and F-score, both 1855 datasets 4 and 7 performed very similarly under the SVM and Naïve Bayes algorithms. The one result that needs to be addressed is the recall of 1855 dataset 4 using Naïve Bayes, which reaches as high as 94%. This result might be affected by the same kind of class imbalance. In the 1855 Bordeaux dataset, with 1359 collected wines, there are 882 90+ wines and 477 89− wines. Unlike the first dataset, which has many more 89− wines than 90+ wines, the 1855 Bordeaux dataset has more 90+ wines than 89− wines. Therefore, we believe dataset 4, which uses all attributes, is more likely to be affected by the imbalanced dataset. In either case, this research suggests that attributes retrieved from the CATEGORY and SUBCATEGORY have the power to provide more information to classifiers for superior model generation. This finding has a strong impact on many Wineinformatics research efforts in different wine-related topics, such as the evaluation of wine reviewers [11], wine grade and price regression analysis [12], terroir study in a single wine region [48], weather impact examination [49], and multi-target classification of a wine's score, price, and region [50].

Conclusions
In this research, with the full power of the Computational Wine Wheel used to extract new attributes from the CATEGORY and SUBCATEGORY, we studied two Bordeaux datasets. The first dataset contains all Bordeaux wines from 2000 to 2016, and the second contains all wines listed in a famous collection of Bordeaux wines, the 1855 Bordeaux Wine Official Classification, from 2000 to 2016. Both SVM and Naïve Bayes techniques were utilized to build models for predicting 90+ (Classic and Outstanding) and 89− (Very Good and Good) wines, and we directly compared the performance with the latest research. According to the results presented in Tables 3-6, the datasets with additional attributes generate better performance on almost all evaluation metrics. For the All-Bordeaux dataset, with the additional attributes retrieved from the CATEGORY, the Naïve Bayes classification algorithm outperformed the existing research results by increasing accuracy by 2.15%, precision by 8.72%, and the F-score by 1.48%. For the 1855 Bordeaux dataset, with the additional attributes retrieved from the CATEGORY and SUBCATEGORY, the SVM classification algorithm outperformed the existing research results by increasing accuracy by 5%, precision by 2.85%, recall by 5.56%, and the F-score by 4.07%. The improvements demonstrated in this research suggest that attributes retrieved from the CATEGORY and SUBCATEGORY have the power to provide more information to classifiers for superior model generation. The main contributions of this paper are (1) providing a method to process wine reviews through the full power of the Computational Wine Wheel so that more information is retained and (2) demonstrating how to merge the original binary attributes and the new continuous attributes in Wineinformatics data so that classification models can achieve better performance.
This finding has strong implications for Wineinformatics research across many wine-related topics and has the potential to provide useful information to wine makers, consumers, and distributors.