MDPI - Publisher of Open Access Journals

15 pages, 2618 KB

Open Access Article

Wineinformatics: Wine Score Prediction with Wine Price and Reviews

by Yuka Nagayoshi and Bernard Chen

Fermentation 2024, 10(12), 598; https://doi.org/10.3390/fermentation10120598 - 23 Nov 2024

Viewed by 2082

Wineinformatics is a new field that applies data science to wine-related data. The goal of this paper is to determine whether incorporating wine price can improve the accuracy of score prediction. To explore the relationship between wine price and wine score, a naive Bayes classifier and a support vector machine (SVM) classifier are employed to predict whether a score is 90 or above, or below 90. The price values are normalized using four different methods: mean, median, boxplot mean, and boxplot median. To conduct a proper comparison, the original dataset from previous research, which includes a total of 14,349 wine reviews, was preprocessed by filtering out all null price values, resulting in 9721 wine reviews. Using this dataset, these classifiers, and these normalization methods, the models with and without the price feature were compared. The SVM classifier with the mean normalization method (USD 50.04) achieved the best accuracy of 87.98%, while the naive Bayes classifier with the boxplot median normalization method (USD 28.00) showed the greatest improvement of 0.99%. From all the results, we concluded that boxplot median normalization (USD 28.00) is the most effective method in this study. These results indicate that incorporating price as an attribute enhances machine learning algorithms’ ability to recognize the correlation between wine reviews and scores.

(This article belongs to the Special Issue Applications of Computer Science and AI to Fermented Foods and Beverages)
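
The four price normalization methods named in this abstract can be illustrated with a rough sketch. The abstract does not define them, so everything here is an assumption: the "boxplot" variants are taken to mean the mean or median after dropping values outside the usual 1.5 × IQR boxplot fences, and normalization is taken to mean turning a price into a binary feature by comparing it with the chosen threshold. The function names `price_thresholds` and `binarize` are hypothetical.

```python
from statistics import mean, median, quantiles


def price_thresholds(prices):
    """Return the four candidate thresholds (assumed definitions)."""
    q1, _, q3 = quantiles(prices, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # standard boxplot fences
    # Assumption: "boxplot" statistics ignore prices outside the fences.
    inliers = [p for p in prices if low <= p <= high]
    return {
        "mean": mean(prices),
        "median": median(prices),
        "boxplot_mean": mean(inliers),
        "boxplot_median": median(inliers),
    }


def binarize(price, threshold):
    """Assumed normalization: 1 if the price reaches the threshold, else 0."""
    return 1 if price >= threshold else 0
```

With made-up prices such as `[10, 20, 30, 40, 50, 60, 70, 1000]`, the single extreme price pulls the plain mean far above the boxplot-based values, which is one plausible reason to compare several thresholds.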

23 pages, 2989 KB

Open Access Article

Applying Neural Networks in Wineinformatics with the New Computational Wine Wheel

by Long Le, Pedro Navarrete Hurtado, Ian Lawrence, Qiuyun Tian and Bernard Chen

Fermentation 2023, 9(7), 629; https://doi.org/10.3390/fermentation9070629 - 1 Jul 2023

Cited by 5 | Viewed by 2984

Wineinformatics involves the application of data science techniques to wine-related datasets generated during the grape growing, wine production, and wine evaluation processes. Its aim is to extract valuable insights that can benefit wine producers, distributors, and consumers. This study highlights the potential of neural networks as the most effective black-box classification algorithm in wineinformatics for analyzing wine reviews processed by the Computational Wine Wheel (CWW). Additionally, the paper provides a detailed overview of the enhancements made to the CWW and presents a thorough comparison between the latest version and its predecessors. In comparison to the highest accuracy results obtained in the latest research work utilizing an elite Bordeaux dataset, which achieved approximately 75% accuracy for Robert Parker’s reviews and 78% accuracy for the Wine Spectator’s reviews, the combination of neural networks and CWW3.0 consistently yields improved performance. Specifically, this combination achieves an accuracy of 82% for Robert Parker’s reviews and 86% for the Wine Spectator’s reviews on the elite Bordeaux dataset as well as a newly created dataset that contains more than 10,000 wines. The adoption of machine learning algorithms for wine reviews helps researchers understand more about quality wines by analyzing the end product and deconstructing the sensory attributes of the wine; this process is similar to reverse engineering in the context of wine to study and improve the winemaking techniques employed.

(This article belongs to the Section Fermentation Process Design)

14 pages, 3637 KB

Open Access Article

Wineinformatics: Comparing and Combining SVM Models Built by Wine Reviews from Robert Parker and Wine Spectator for 95 + Point Wine Prediction

by Qiuyun Tian, Brittany Whiting and Bernard Chen

Fermentation 2022, 8(4), 164; https://doi.org/10.3390/fermentation8040164 - 4 Apr 2022

Cited by 3 | Viewed by 3658

Wineinformatics is among the new fields in data science that use wine as domain knowledge. To process large amounts of wine review data in human language format, the computational wine wheel is applied. In previous research, the computational wine wheel was created and applied to different datasets of wine reviews developed by Wine Spectator. The goal of this research is to explore the development and application of the computational wine wheel to reviews from a different reviewer, Robert Parker. For comparison, this research collects 513 elite Bordeaux wines that were reviewed by both Robert Parker and Wine Spectator. The full power of the computational wine wheel is utilized, including NORMALIZED, CATEGORY, and SUBCATEGORY attributes. The datasets are then used to predict whether the wine is a classic wine (scores of 95+) or not (scores of 94−) using the black-box classification algorithm support vector machine. The Wine Spectator’s dataset, with a combination of NORMALIZED, CATEGORY, and SUBCATEGORY attributes, achieves the best accuracy of 76.02%. Robert Parker’s dataset achieves a best accuracy of 75.63% across all the attribute combinations, which demonstrates the usefulness of the computational wine wheel and that it can be effectively adopted in different wine reviewers’ systems. This paper also attempts to build a classification model using both Robert Parker’s and Wine Spectator’s reviews, resulting in comparable prediction power.

(This article belongs to the Section Fermentation for Food and Beverages)

11 pages, 1078 KB

Open Access Article

Wineinformatics: Can Wine Reviews in Bordeaux Reveal Wine Aging Capability?

by William Kwabla, Falla Coulibaly, Yerkebulan Zhenis and Bernard Chen

Fermentation 2021, 7(4), 236; https://doi.org/10.3390/fermentation7040236 - 20 Oct 2021

Cited by 5 | Viewed by 3114

Wineinformatics is a new and emerging data science that uses wine as domain knowledge and integrates data systems and wine-related data sets. Wine reviews from Wine Spectator usually include aging information at the end of the review in the form of “Best from YearA through YearB”. With the vintage of the wine included, the suggested holding year (YearA − vintage), shelf-life (YearB − vintage), and aging capacity (YearB − YearA) can be calculated, providing crucial information for the study of wineinformatics. The goal of this paper is to test whether wine reviews describing olfactory and gustatory information reveal wines’ suggested holding-year information. Wine reviews from Wine Spectator are extracted and processed by a natural language processing tool named the Computational Wine Wheel, which categorizes and maps various wine terminologies from wine reviews into a consolidated set of descriptors. The suggested aging capability is also calculated from the review and serves as a label for classification problems. The study compares different learning algorithms, analyzes their performance, and uses the best-performing algorithm(s) to build a model for the prediction of a wine’s aging properties. The results suggest that both the support vector machine (SVM) and the K-nearest neighbor (KNN) algorithms achieve more than 70% accuracy. These results suggest that the algorithms are capable of capturing a hidden informational relationship between a wine’s reviews and its aging capability.

(This article belongs to the Section Fermentation for Food and Beverages)
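
The three aging quantities defined in this abstract follow directly from the review's closing phrase and the vintage. A minimal sketch, assuming the phrase always appears literally as "Best from YearA through YearB" with four-digit years; the function name `aging_from_review`, the `Aging` record, and the sample review text are hypothetical and not taken from the paper.

```python
import re
from typing import NamedTuple, Optional


class Aging(NamedTuple):
    holding_years: int   # YearA - vintage
    shelf_life: int      # YearB - vintage
    aging_capacity: int  # YearB - YearA


# Assumed phrasing, based on the form quoted in the abstract.
_BEST_FROM = re.compile(r"Best from (\d{4}) through (\d{4})")


def aging_from_review(review: str, vintage: int) -> Optional[Aging]:
    """Extract the aging labels, or None if the phrase is absent."""
    match = _BEST_FROM.search(review)
    if match is None:
        return None
    year_a, year_b = int(match.group(1)), int(match.group(2))
    return Aging(year_a - vintage, year_b - vintage, year_b - year_a)


# Made-up review text, for illustration only.
example = aging_from_review("Ripe plum and cedar. Best from 2020 through 2030.", 2015)
```

Returning `None` for reviews without the phrase keeps such wines out of the labeled set instead of guessing a label for them.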

19 pages, 9050 KB

Open Access Article

Clustering in Wineinformatics with Attribute Selection to Increase Uniqueness of Clusters

by Jared McCune, Alex Riley and Bernard Chen

Fermentation 2021, 7(1), 27; https://doi.org/10.3390/fermentation7010027 - 18 Feb 2021

Cited by 5 | Viewed by 4651

Wineinformatics is a new data science research area that focuses on large amounts of wine-related data. Most current Wineinformatics research focuses on supervised learning to predict wine quality, price, region, and weather. In this research, unsupervised learning using K-means clustering with an optimal K search and a filtration process is studied on a Bordeaux-region-specific dataset to form clusters and find representative wines in each cluster. The 14,349 wines of the 21st century Bordeaux dataset are clustered into 43 and 13 clusters, with detailed analysis of the number of wines, dominant wine characteristics, average wine grades, and representative wines in each cluster. Similar results are also generated and presented for 435 elite wines (wines that scored 95 points and above on a 100-point scale). The information generated from this research can help wine vendors make a selection given the limited number of wines they can realistically offer, help connoisseurs study wines in a target region/vintage/price with a representative short list, and help wine consumers get recommendations. Further studies can adopt the same process to analyze and find representative wines in different winemaking regions/countries, vintages, or pivot points. This paper opens a new door for unsupervised learning research in Wineinformatics.

(This article belongs to the Special Issue Control of Wine Fermentation)

15 pages, 1522 KB

Open Access Article

Wineinformatics: Using the Full Power of the Computational Wine Wheel to Understand 21st Century Bordeaux Wines from the Reviews

by Zeqing Dong, Travis Atkison and Bernard Chen

Beverages 2021, 7(1), 3; https://doi.org/10.3390/beverages7010003 - 4 Jan 2021

Cited by 8 | Viewed by 5311

Although wine has been produced for several thousands of years, the ancient beverage has remained popular and has become even more affordable in modern times. Among all winemaking regions, Bordeaux, France is probably one of the most prestigious wine areas in history. Since hundreds of wines are produced in Bordeaux each year, humans are not likely to be able to examine all wines across multiple vintages to define the characteristics of outstanding 21st century Bordeaux wines. Wineinformatics is a newly proposed data science research area that uses wine as its application domain and processes large amounts of wine data by computer. The goal of this paper is to build a high-quality computational model on wine reviews processed by the full power of the Computational Wine Wheel to understand 21st century Bordeaux wines. On top of the 985 binary attributes generated from the Computational Wine Wheel in our previous research, we add attributes by utilizing CATEGORY and SUBCATEGORY, which contribute an additional 14 and 34 continuous attributes, respectively, to the All Bordeaux (14,349 wines) and the 1855 Bordeaux (1359 wines) datasets. We believe that successfully merging the original binary attributes and the new continuous attributes can provide more insights for Naïve Bayes and Support Vector Machine (SVM) to build the model for wine grade category prediction. The experimental results suggest that, for the All Bordeaux dataset, with the additional 14 attributes retrieved from CATEGORY, the Naïve Bayes classification algorithm was able to outperform the existing research results by increasing accuracy by 2.15%, precision by 8.72%, and the F-score by 1.48%. For the 1855 Bordeaux dataset, with the additional attributes retrieved from the CATEGORY and SUBCATEGORY, the SVM classification algorithm was able to outperform the existing research results by increasing accuracy by 5%, precision by 2.85%, recall by 5.56%, and the F-score by 4.07%.
The improvements demonstrated in the research show that attributes retrieved from the CATEGORY and SUBCATEGORY have the power to provide more information to classifiers for superior model generation. The model built in this research can better distinguish outstanding and classic 21st century Bordeaux wines. This paper provides new directions in Wineinformatics for technical research in data science, such as regression, multi-target, classification and domain specific research, including wine region terroir analysis, wine quality prediction, and weather impact examination.

16 pages, 2500 KB

Open Access Article

Understanding 21st Century Bordeaux Wines from Wine Reviews Using Naïve Bayes Classifier

by Zeqing Dong, Xiaowan Guo, Syamala Rajana and Bernard Chen

Beverages 2020, 6(1), 5; https://doi.org/10.3390/beverages6010005 - 14 Jan 2020

Cited by 13 | Viewed by 6077

Wine has been popular with the public for centuries; in the market, there are a variety of wines to choose from. Among all regions, Bordeaux, France, is considered the most famous wine region in the world. In this paper, we try to understand Bordeaux wines made in the 21st century through a Wineinformatics study. We developed and studied two datasets: the first contains all Bordeaux wines from 2000 to 2016, and the second contains all wines listed in a famous collection of Bordeaux wines, the 1855 Bordeaux Wine Official Classification, from 2000 to 2016. A total of 14,349 wine reviews are collected in the first dataset, and 1359 wine reviews in the second dataset. In order to understand the relation between wine quality and characteristics, a Naïve Bayes classifier is applied to predict the quality (90+/89−) of wines. A Support Vector Machine (SVM) classifier is also applied as a comparison. In the first dataset, the SVM classifier achieves the best accuracy of 86.97%; in the second dataset, the Naïve Bayes classifier achieves the best accuracy of 84.62%. Precision, recall, and F-score are also used as measures to describe the performance of our models. Meaningful features associated with high-quality 21st century Bordeaux wines are presented in this research paper.
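
The evaluation measures named in this abstract can be sketched for the 90+/89− labeling as follows. This is a generic illustration of the standard definitions of accuracy, precision, recall, and F-score, not the authors' code; the function names `label` and `binary_metrics` are hypothetical, and the 90+ class is assumed to be the positive class.

```python
def label(score, cutoff=90):
    """True for the 90+ class, False for the 89- class."""
    return score >= cutoff


def binary_metrics(actual, predicted):
    """Accuracy, precision, recall, and F-score with 90+ as the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    fn = sum(1 for a, p in pairs if a and not p)      # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}
```

Reporting precision and recall alongside accuracy matters when the two classes are unevenly sized, since a classifier can reach high accuracy by favoring the larger class.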

10 pages, 678 KB

Open Access Article

Wineinformatics: Regression on the Grade and Price of Wines through Their Sensory Attributes

by James Palmer and Bernard Chen

Fermentation 2018, 4(4), 84; https://doi.org/10.3390/fermentation4040084 - 29 Sep 2018

Cited by 15 | Viewed by 5304

Wineinformatics is a field that uses machine-learning and data-mining techniques to glean useful information from wine. In this work, attributes extracted from a large dataset of over 100,000 wine reviews are used to make predictions on two variables: quality based on a “100-point scale”, and price per 750 mL bottle. These predictions were built using support vector regression. Several evaluation metrics were used for model evaluation. In addition, these regression models were compared to classification accuracies achieved in a prior work. When regression was used for classification, the results were somewhat poor; however, this was expected since the main purpose of the regression was not to classify the wines. Therefore, this paper also compares the advantages and disadvantages of both classification and regression. Regression models can successfully predict within a few points of the correct grade of a wine. On average, the model was only 1.6 points away from the actual grade and off by about $13 per bottle of wine. To the best of our knowledge, this is the first work to use a large-scale dataset of wine reviews to perform regression predictions on grade and price.

(This article belongs to the Special Issue Bioprocess and Fermentation Monitoring)

16 pages, 2216 KB

Open Access Article

Wineinformatics: A Quantitative Analysis of Wine Reviewers

by Bernard Chen, Valentin Velchev, James Palmer and Travis Atkison

Fermentation 2018, 4(4), 82; https://doi.org/10.3390/fermentation4040082 - 25 Sep 2018

Cited by 24 | Viewed by 5918

Data science is a successful field of study that incorporates various techniques and theories from distinct fields, including mathematics, computer science, economics, and business, together with domain knowledge. Among all components of data science, domain knowledge is the key to creating high-quality data products. Wineinformatics is a new data science application that uses wine as the domain knowledge and incorporates data science and wine-related datasets, including physicochemical laboratory data and wine reviews. This paper produces a brand-new dataset that contains more than 100,000 wine reviews made available by the Computational Wine Wheel. This dataset is then used to quantitatively evaluate the consistency of the Wine Spectator and all of its major reviewers through both white-box and black-box classification algorithms. Wine Spectator reviewers receive more than 87% accuracy when evaluated with the SVM method. This result supports Wine Spectator’s prestigious standing in the wine industry.

(This article belongs to the Special Issue Bioprocess and Fermentation Monitoring)

Search Results (9)
