1. Introduction
Data is everywhere, and the volume of digital data is growing faster than ever, doubling roughly every two years and changing our way of life. At the end of 2020, it was estimated that approximately 1.7 MB of new information was generated per second for every person on Earth. This trend has made it extremely important to know how to extract useful meaning from such large volumes of data. Data science is a field of study that combines techniques and theories from different fields, including mathematics, computer science, economics, and business administration, to gain insight from data in a given domain. Different data science problems require different techniques. Depending on the type of problem, four major categories of machine-learning algorithms can be applied to the data: supervised learning [1], unsupervised learning [2], semi-supervised learning [3], and reinforcement learning [4]. These methods aid in the discovery of interesting information from large amounts of data within a specific application area.
Wine has been produced for several thousand years. This beverage is typically made from fermented grape juice; it has remained popular and has become even more affordable in modern times. Consumers can choose from a seemingly endless number of varieties and flavors; because few of them are wine experts, their choices can be influenced by the reviews and scores that reputed experts and websites assign to wines. Therefore, what these experts say about the quality of produced wines can be relied upon when manufacturing them [5]. The beneficiaries of wine reviews are not only consumers; winemakers can also gain valuable knowledge from expert reviews about which factors contribute most to whether a wine should be drunk or held. Uncovering meaningful information from the large number of wine reviews currently available is a major task, one that would be useful for wine producers, distributors, and consumers.
Wineinformatics is a new and emerging data science that uses wine as domain knowledge and integrates data systems and wine-related data sets, including physicochemical laboratory data and wine review analysis [6]. Wine reviews, which are written in natural language, describe a judge’s perception of a wine, including colors, smells, tastes, and overall feelings. Wine judges also assign a rating of 50–100 to the wines they review. In wineinformatics, wine reviews are processed by a natural language processing tool named the Computational Wine Wheel, which categorizes and maps the various wine terminologies found in reviews into a consolidated set of descriptors [7].
Unlike many other foods, the evolution of wine’s sensory qualities is thought to peak after a period in the bottle [8]. The length of this period can vary enormously depending on the wine: some wines evolve very rapidly toward an optimum followed by a decline, whereas others can withstand several years of aging, during which their overall sensory characters evolve favorably [8]. This aging process is usually considered the second phase, while the first is called maturation, which refers to the changes in wines after fermentation and before bottling [9]. Wines’ aromas change dramatically during bottle-aging through a complex array of chemical reactions [8]. “Chardonnay”, “Cabernet Sauvignon”, “Merlot”, and “Zinfandel”, considered premium varietals, benefit most from maturation by developing a more complex flavor profile. Over 86% of Bordeaux wines are red wines made with Merlot, Cabernet Sauvignon, and Cabernet Franc grapes; therefore, Bordeaux wine has an established history of aging and evolving in the bottle.
Experienced wine reviewers should be able to combine olfactory and gustatory clues to judge the aging potential of red wines [9]. Wine reviews from Wine Spectator usually include aging information at the end of the review in the form of “Best from YearA through YearB”. Together with the vintage of the wine, this allows the suggested wine-holding year (YearA − vintage), shelf-life (YearB − vintage), and aging capacity (YearB − YearA) to be calculated, providing crucial information to the field of wineinformatics.
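The three quantities defined above can be illustrated with a minimal sketch. It assumes the phrase appears literally as "Best from YearA through YearB" with four-digit years; the function name `aging_info`, the `AgingInfo` type, and the example review are hypothetical and are not taken from the dataset used in this paper.

```python
import re
from typing import NamedTuple, Optional


class AgingInfo(NamedTuple):
    holding_years: int   # YearA - vintage
    shelf_life: int      # YearB - vintage
    aging_capacity: int  # YearB - YearA


# Matches the closing phrase "Best from YearA through YearB".
_BEST_FROM = re.compile(r"Best from (\d{4}) through (\d{4})", re.IGNORECASE)


def aging_info(review: str, vintage: int) -> Optional[AgingInfo]:
    """Return the three aging measures, or None if the phrase is absent."""
    match = _BEST_FROM.search(review)
    if match is None:
        return None
    year_a, year_b = int(match.group(1)), int(match.group(2))
    return AgingInfo(year_a - vintage, year_b - vintage, year_b - year_a)


# Hypothetical review text, invented for illustration only.
example = "Dense and focused, with plum and cassis. Best from 2022 through 2035."
print(aging_info(example, vintage=2016))
```

For this invented 2016 wine, the sketch yields a holding year of 6, a shelf-life of 19, and an aging capacity of 13 years. Reviews without the phrase return `None` so that they can be filtered out rather than guessed at.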
The goal of this paper is to test whether wine reviews describing olfactory and gustatory information reveal wines’ suggested holding-year information. The aging capacity of a wine is usually determined by a panel, since the task is very subjective. However, little to no research has evaluated wine professionals’ judgment of aging potential through olfactory and gustatory clues. To the best of our knowledge, no similar work has examined the relationship between wine reviews and the aging property of a wine, or used wine-aging capability as a class label in classification research.
3. Results
In prior wineinformatics research [12, 13], naïve Bayes and support vector machines (SVM) have been the best classification algorithms for predicting wine grade. Each algorithm has its pros and cons. In short, SVM is a black-box classification algorithm that handles outliers. K-nearest neighbors (KNN) and naïve Bayes are white-box classification algorithms, which means that the classification process can be understood by humans. KNN is a lazy learner and is non-parametric. The naïve Bayes algorithm is parametric and builds probability models to make its predictions. We also noticed that KNN outperformed many white-box classification approaches; therefore, we report the findings with all three algorithms in this study.
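To make the "lazy learner" description of KNN concrete, the following minimal sketch classifies a review represented as the set of attributes it mentions. The Jaccard distance, the value k = 3, the attribute names, and the toy training data are all assumptions made for illustration; they are not the settings or data used in this study.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

Example = Tuple[FrozenSet[str], str]  # (attributes present in the review, label)


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def knn_predict(train: List[Example], query: FrozenSet[str], k: int = 3) -> str:
    """Majority label among the k training reviews closest to the query."""
    # Lazy learner: no model is built; all work happens at query time.
    nearest = sorted(train, key=lambda ex: jaccard_distance(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data invented for illustration; not taken from the paper's dataset.
train = [
    (frozenset({"TANNINS", "DENSE", "CASSIS"}), "Hold"),
    (frozenset({"TANNINS", "DENSE", "PLUM"}), "Hold"),
    (frozenset({"FRESH", "LIGHT", "CHERRY"}), "Drink"),
    (frozenset({"FRESH", "SOFT", "CHERRY"}), "Drink"),
]
print(knn_predict(train, frozenset({"DENSE", "TANNINS", "FINISH"})))  # prints "Hold"
```

Because the prediction is simply a vote among the nearest stored reviews, the neighbors themselves can be inspected to see why a wine was labeled as it was, which is the sense in which KNN is a white-box method.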
The four-step process mentioned in the previous section was followed for each of the three algorithms, and the results were recorded. All three procedures mentioned in Section 2.4.1 were used for all algorithms: the first procedure included all wine attributes, while in the second and third procedures some attributes were removed from the dataset to improve the performance of the algorithms. Each table reports the accuracy, recall, precision, and F-Score for each of the algorithms used.
Table 4 shows the results of using KNN to predict whether a wine can be held for more than six years. The results suggest that, using a KNN algorithm, the suggested holding-year class of more than 70% of the wines could be identified from their reviews. The results also show that removing the FINISH attribute from the dataset had a negative impact on the performance of the KNN algorithm. However, in procedure III, where we removed the FRUIT, PLUM, GREAT, and FINISH attributes, recall was significantly improved (+11.54%) while precision dropped (−6.58%).
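The attribute removal used in the procedures can be sketched as a simple filtering step applied before training. The row layout (attribute name mapped to 0 or 1) and the helper `drop_attributes` are hypothetical; the attribute list for procedure III is stated above, whereas treating procedure II as the removal of FINISH alone is an assumption inferred from the text.

```python
from typing import Dict, Iterable, List

Row = Dict[str, int]  # attribute name -> 1 if the review mentions it, else 0


def drop_attributes(rows: List[Row], to_drop: Iterable[str]) -> List[Row]:
    """Return copies of the rows without the listed attributes."""
    dropped = set(to_drop)
    return [{k: v for k, v in row.items() if k not in dropped} for row in rows]


# Attribute sets per procedure. Procedure III is stated in the text;
# "FINISH only" for procedure II is an assumption inferred from the text.
PROCEDURES = {
    "I": [],
    "II": ["FINISH"],
    "III": ["FRUIT", "PLUM", "GREAT", "FINISH"],
}

# One invented review row, for illustration only.
rows = [{"FRUIT": 1, "PLUM": 0, "GREAT": 0, "FINISH": 1, "TANNINS": 1}]
for name, attrs in PROCEDURES.items():
    print(name, sorted(drop_attributes(rows, attrs)[0]))
```

The function returns new rows rather than modifying the originals, so the same base dataset can be reused for all three procedures.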
Table 5 shows the results of using a naïve Bayes algorithm. The results indicate that naïve Bayes is not well suited to this research task, since its prediction accuracy is only slightly higher than 50%, which is the baseline for a two-class classification problem. Removing attributes from the dataset had no impact on the performance of the naïve Bayes algorithm; only runtime improved.
Table 6 shows the results of using an SVM algorithm. The results indicate that it provided the best prediction results of the three algorithms. Removing attributes from the dataset consistently improved its performance across the board.
F-Score considers both precision and recall; it is their harmonic mean. The best possible F-Score is one, which is reached only when both precision and recall are perfect, and the score is high only when the two measures are balanced at a high level. Conversely, the F-Score will be lower if one measure is improved at the expense of the other. Due to the distribution of Drink (44.3%) versus Hold (55.7%) in the dataset, the F-Score serves as a key evaluator of overall performance for all three algorithms.
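The four reported measures can be computed from predicted and true labels as in the minimal sketch below. Which class is treated as the positive class is not stated in this section, so `positive` is left as an explicit parameter; the function name `binary_metrics` and the toy labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[str], y_pred: Sequence[str],
                   positive: str) -> Dict[str, float]:
    """Accuracy, precision, recall, and F-Score for a two-class problem."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Toy labels invented for illustration; choosing "Hold" as positive is an assumption.
y_true = ["Hold", "Hold", "Hold", "Drink", "Drink"]
y_pred = ["Hold", "Hold", "Drink", "Hold", "Drink"]
print(binary_metrics(y_true, y_pred, positive="Hold"))
```

A classifier that trades recall for precision, or the reverse, is penalized by the harmonic mean: for example, a precision of 1.0 with a recall of 0.2 gives an F-Score of about 0.33, well below their arithmetic mean of 0.6.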
It is unsurprising that SVM was the top-performing algorithm (F-Score = 78.75%). However, it is a black-box classification algorithm, so it is difficult to understand why or how it made its decisions when performing the classification. The KNN algorithm (F-Score = 74.86%), though producing “stable” results (there is no gradual increase in performance when moving from procedure to procedure), performed quite well when predicting whether a wine should be drunk or held. Finally, the naïve Bayes algorithm (F-Score = 34.33%) performed the worst of the three, mainly because it generated a model that was heavily biased toward predicting that a wine should be held.
Figure 3 describes the performance of each algorithm on the wine dataset for each of the three procedures used.
The goal of this paper is to associate wine reviews with aging-capability information and to test whether the reviews can be used to predict whether a wine can be held for more than six years. Three algorithms, K-nearest neighbors (KNN), naïve Bayes, and support vector machines (SVM), were used to search for such hidden patterns. Two of the classification algorithms achieved more than 70% accuracy. Several frequent attributes were deleted to test the impact on the algorithms’ performance. While runtimes improved when four attributes (FRUIT, PLUM, GREAT, and FINISH) were removed, SVM also gained minor improvements in all of its evaluation metrics. Based on these results, the KNN and SVM algorithms performed better on the dataset than the naïve Bayes algorithm did. This paper provides a new approach for using machine learning to understand the linkage between a wine’s reviews and its aging capability.
This research opens a new door for discovering the relationship between wines’ aging capabilities and their tasting notes. Different labels can be used to extract distinct information; for example, instead of calculating the minimum holding years by subtracting the vintage year from the “best from” year, the aging capability can be derived as the “through” year minus the “best from” year, or the maximum drinkable years can be assessed as the “through” year minus the vintage year of the wine. Bordeaux wines are considered exemplary Old World wines, which usually have longer aging capabilities than New World wines. Similar research could be carried out on New World wines, such as wines from the United States or Australia; the aging-years threshold would be expected to be shorter, and the resulting classification models might differ widely from those in this research. Finally, the results suggest that a more sophisticated approach to feature selection might improve the classification performance of all the algorithms evaluated in this paper. Such a step would also help identify the important attributes linked with aging capacity. Although dimension reduction does not always yield better results, there seems to be enough academic literature supporting this idea [20, 21] to warrant its exploration in the wine dataset.