1. Introduction
Wine is an aesthetic product [1] and its appreciation is considered mostly subjective, sometimes likened to art appreciation, especially when the terms used for its description are borrowed from art: “complexity”, “balance/harmony”, “development” [2].
Quality is described as a multidimensional concept [3] or multifaceted construct [4], difficult to define and therefore often avoided in scientific works [1]. As there is a lack of clear definition and/or defined parameters [1], quality is evaluated by proxy [1,3]. For wine, the three main proxies are color, taste and mouthfeel, and aroma (or flavor) [2,5]. As there is still uncertainty regarding its nature, quality is assessed through the perception of these proxies, which carry different weights in the final quality score [6,7,8,9].
Quality manifests a range of intrinsic and extrinsic dimensions; between the two, intrinsic dimensions are perceived as more important [10]. Interestingly, “pleasure” is considered as one of the intrinsic dimensions, but subordinate to cognitive gustatory dimensions [1].
Quality evaluation can be carried out by experts, trained panels, and consumers alike, but it was demonstrated that these groups differ in their perception of quality and its intrinsic and extrinsic dimensions [11,12,13,14,15]. The context for evaluating quality is relevant. In competitions, it is usually carried out by tasters with experience; competitions have an audit system and procedures in place to check the consistency of judges [16]. Evaluations can also be carried out outside of competition, in which case the intent is different (profiling, preference, quality assessment, etc.) [2]. The results for quality evaluation depend on the panel, on the manner of evaluation, and on the information received prior to evaluation [1,17].
Generally, experts are preferred for wine quality assessment. This might be linked to the perceived “expert objectivity” [1]. Experienced judges are considered to have a more objective (and systematic) approach to wine tasting and a better technique, lexicon, acuity, and consistency, developed over time through experience and exposure [2,3,13,14,15,18,19], but their emotional response cannot be ignored [20,21]. It was demonstrated that for experts, preference is correlated to the quality score [11,15], but inconsistencies are not unusual [15]. Experts also tend to use a combination of descriptive and hedonic terms when describing wines [11,22]. Moreover, experts may agree on which sensory characteristics drive quality, but can lack concept alignment (i.e., interpretation of quality as a concept) and ratings can vary according to each individual’s concept of quality [3,11].
Writing about wine quality also brings to mind difficulties in communicating the findings: while scientific writing should be clear and offer critical commentary of the product, wine descriptions are often used to evoke an emotional response or an image. The use of precise terms, for example the ones included in flavor wheels [23,24], could constitute a more objective and systematic approach when dealing with wine descriptors [2,25].
Despite all the issues related to defining and consistently evaluating quality, wines are often valued for their quality as perceived by professionals in the industry and every year many prestigious local and international competitions take place. Wine competitions play an important role in the wine industry. They are used by producers as a marketing tool and, to an extent, as a benchmarking exercise. Consumers obtain quality assurance, which can simplify purchase decisions and reduce perceived risk, as these competitions are seen as a “trusted seal of approval” [26].
Researchers and the industry are keen to explain why certain wines win by correlating success in a competition (and, by extension, quality) with more objective measures such as chemical composition, used as an independent measurement of the quality proxies: color, aroma (flavor), and taste. Linking quality to chemistry was achieved with various degrees of success, depending on the types of chemical analysis and the statistical technique employed (Multilinear Regression Analysis/MRA [22], Partial Least Squares/PLS [27], PLS2 [11], Sequential and Orthogonalized PLS/SO-PLS [28]). In general, the literature reports on the use of targeted chemical analyses (for example, volatile aroma compounds [22,27]) and of statistical methods directed at the prediction of quality based on the chemistry and/or sensory data acquired.
Another way to elucidate why certain wines win is to consider sensory evaluation out of competition and identify descriptors associated with a quality wine. Quality is not simply equivalent to a wine’s sensory profile, but the profile acts as an indicator of, or a means through which, quality is assessed. In this case, the methods reported in the literature varied from traditional quality assessments (Quantitative Descriptive Analysis/QDA) [22], Descriptive Analysis (DA) [4,11], to DA combined with expert-quality sorting [11] and sorting [18].
In this context, the aim of the current study was to establish a methodology that would assist researchers and the industry in determining the sensory and chemistry drivers of quality. The case studies for illustrating this methodology were the winners of the Top 10 Chenin Blanc Challenge and Top 10 Pinotage competition in 2018 in South Africa. To answer questions such as “What are the characteristics of the winning wines?” or “What do winning wines taste and smell like?” the aroma and taste sensory profiles of the wines that won were established. Check-all-that-apply (CATA, a multiple choice-based rapid sensory method) was used, followed by quality rating on a 20-point scale [9,29], resulting in the profiling of the wines and also the re-evaluation of their quality scoring in a non-competition setting. Wine fingerprinting by high-resolution mass spectrometry (HRMS, an untargeted metabolomics-type approach) completed the samples’ characterization using an information-rich chemical technique. The results of the investigation could be used to explore the sensory space of a wine sample set with the added dimension of the quality drivers which, in turn, highlight the experts’ opinions on what makes a winning wine.
2. Materials and Methods
2.1. Wine Samples
The Pinotage set selected for the study was composed of the winning wines of the 2018 ABSA Top 10 Pinotage competition [30] and an additional five low-scoring wines. The Chenin Blanc set contained the winners of the 2018 Standard Bank Top 10 Chenin Blanc Challenge [31] and an additional three low-scoring wines. The wines were supplied by the Pinotage Association and the Chenin Blanc Association.
For both competitions, the top 10 wines are released in alphabetical order and are all considered winners. The details of the wines and the codes used are given in Table 1.
2.2. Sensory Evaluation
The tasting panel consisted of 27 industry experts (44% female, 65% male, average age 41) for the Chenin Blanc evaluation and 20 experts (37% female, 63% male, average age 43) for Pinotage, including winemakers and cellar masters that did not judge the wines during the competition. All the experts had more than 5 years’ experience working in the wine industry. More than 75% indicated 10 years of experience or more.
A single session was used to capture both sensory profiling data (using CATA) and quality scores. The panel was instructed to evaluate the nose/odor of the wines using a CATA list which was based on previous research and the Chenin Blanc [32] and Pinotage [33] aroma wheels, respectively. The overall perceived quality of each wine sample was scored out of 20 according to the internationally used wine quality rating system [9,29]. All sensory data were captured on 10.1” Samsung Galaxy Tab A (2018) tablets, using the Compusense Cloud software (Compusense Inc., Guelph, ON, Canada).
The sensory evaluation was performed in a well-ventilated and temperature-controlled room free of extraneous odors or noises. Wines were presented at 20 °C ± 2 °C in international tasting glasses (ISO 3591:1977). Each glass was coded with a random three-digit code and covered with a Petri dish as a lid. The panel received 25 mL of each wine. Monadic sample presentation was applied. The order of sample presentation was randomized across judges according to a Williams Latin square design. Judges were not allowed to communicate with each other during the session. Information about the wines was shared only at the end of the sensory evaluation session.
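The presentation orders described above can be illustrated with a short sketch. It uses the standard cyclic construction of a Williams design (first row 0, 1, n−1, 2, n−2, ..., then cyclic shifts, with mirrored rows added when the number of samples is odd). The function name `williams_design` is hypothetical, and this is not the routine used by the Compusense software; how rows were assigned to individual judges is not stated in the text.

```python
def williams_design(n_samples):
    """Return presentation orders (lists of sample indices 0..n-1) balanced
    for position and first-order carry-over effects.

    Even n: n rows. Odd n: 2n rows (the square plus its mirror image).
    Illustrative sketch only; judge-to-row assignment is left to the user.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples")
    # First row interleaves low and high indices: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n_samples - 1
    take_low = True
    while len(first) < n_samples:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    # Remaining rows are cyclic shifts of the first row.
    square = [[(s + shift) % n_samples for s in first] for shift in range(n_samples)]
    if n_samples % 2 == 1:
        # For an odd number of samples, carry-over balance needs the mirrored rows too.
        square += [list(reversed(row)) for row in square]
    return square


def carry_over_counts(design):
    """Count how often each ordered pair (previous, next) occurs across all rows."""
    counts = {}
    for row in design:
        for prev, nxt in zip(row, row[1:]):
            counts[(prev, nxt)] = counts.get((prev, nxt), 0) + 1
    return counts


design_even = williams_design(4)
design_odd = williams_design(5)
```

With more judges than rows, the rows would typically be reused in turn; that assignment step is an assumption and is not shown here.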
2.3. Chemical Analysis
All solvents were of MS-grade purity and were purchased from Merck Chemicals Pty. Ltd. (Germiston, South Africa). HRMS coupled to liquid chromatography (LC-HRMS) was used for wine fingerprinting. The samples were analyzed by Ultra Performance Liquid Chromatography (UPLC, Waters Corporation) equipped with a Synapt G2 quadrupole time-of-flight mass spectrometer (Waters Corporation). The separation was carried out on an Acquity UPLC HSS T3 column (1.8 μm particle size, 2.1 mm × 100 mm, Waters Corporation) using 0.1% formic acid (mobile phase A) and acetonitrile (mobile phase B) and a scouting gradient over 10 min. Flow rate was 0.3 mL min−1 and the column temperature 55 °C. The injection volume was 2 μL and the samples were injected directly without pre-treatment. Mass calibration was performed according to the manufacturer’s procedure. The MS was operated in both positive and negative mode, and the total number of features acquired as RT_m/z was 1466 for each sample. The software is directly integrated with SIMCA-P (SIMCA 14.1, Umetrics, Sweden) and the statistical algorithms are directly applied to the processed datasets [34].
2.4. Statistical Data Analysis
Data obtained from quality scoring were subjected to one-way Analysis of Variance (ANOVA). When a significant ANOVA result was obtained (at p < 0.05) the Fisher’s LSD post-hoc test was applied to perform pairwise comparisons of the wines (XLSTAT 2018, Addinsoft SARL, New York, NY, USA).
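As a rough illustration of the procedure above, the following sketch computes the one-way ANOVA F statistic with wine as the only factor, and the Fisher LSD threshold for a pair of wines. It is not the XLSTAT implementation. The wine codes and scores are invented for illustration, and the critical t value is supplied by the user (here 2.447, the two-sided 5% value for 6 residual degrees of freedom), because the Python standard library has no t or F distribution.

```python
import math


def one_way_anova(groups):
    """groups: dict mapping a wine code to its list of quality scores.

    Returns (F, df_between, df_within, mean_square_error).
    """
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    means = {wine: sum(scores) / len(scores) for wine, scores in groups.items()}
    ss_between = sum(len(scores) * (means[wine] - grand_mean) ** 2
                     for wine, scores in groups.items())
    ss_within = sum((x - means[wine]) ** 2
                    for wine, scores in groups.items() for x in scores)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    mse = ss_within / df_within
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


def fisher_lsd(mse, n_i, n_j, t_crit):
    """Least significant difference between two group means.

    t_crit is the two-sided critical t value for df_within (user supplied).
    """
    return t_crit * math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))


# Hypothetical scores out of 20 for three wines, three judges each.
scores = {"A": [14, 15, 16], "B": [10, 11, 12], "C": [14, 16, 15]}
f_stat, df_b, df_w, mse = one_way_anova(scores)
lsd = fisher_lsd(mse, 3, 3, t_crit=2.447)  # assumed critical value for df = 6
mean = {w: sum(s) / len(s) for w, s in scores.items()}
a_vs_b_differs = abs(mean["A"] - mean["B"]) > lsd
a_vs_c_differs = abs(mean["A"] - mean["C"]) > lsd
```

In practice the p-value for F would be obtained from a statistics package, and the pairwise comparisons would only be run when that p-value is below 0.05, as described above.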
Contingency tables containing the CATA data were constructed by counting the number of citations for each attribute across the judges for every wine sample (Microsoft Excel 2016, Microsoft Corporation, Redmond, Washington, USA). The attributes were tabulated as variables in the columns and the wine samples as objects in the rows. The intersection of a row and column represented the number of times that the attribute in the corresponding column was cited by all the judges to describe the wine in the corresponding row.
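The construction of such a contingency table can be sketched as follows. The record layout (judge, wine, set of ticked attributes), the function name `cata_contingency`, and the example attributes are assumptions made for illustration. The study built its tables in Excel.

```python
from collections import Counter


def cata_contingency(responses, wines, attributes):
    """Build a wines x attributes citation-count table from CATA records.

    responses: iterable of (judge, wine, ticked_attributes) tuples, where
    ticked_attributes is the set of terms the judge selected for that wine.
    Returns a list of rows (one per wine, in the order of `wines`), each row
    holding the citation counts in the order of `attributes`.
    """
    counts = {wine: Counter() for wine in wines}
    for _judge, wine, ticked in responses:
        # Only terms that belong to the CATA list are counted.
        counts[wine].update(term for term in ticked if term in attributes)
    return [[counts[wine][term] for term in attributes] for wine in wines]


# Hypothetical example: two judges, two wines, three attributes.
wines = ["W1", "W2"]
attributes = ["guava", "oak", "honey"]
responses = [
    ("J1", "W1", {"guava", "honey"}),
    ("J2", "W1", {"guava"}),
    ("J1", "W2", {"oak"}),
    ("J2", "W2", {"oak", "honey"}),
]
table = cata_contingency(responses, wines, attributes)
```

Each cell of `table` is the number of judges who cited the column attribute for the row wine, which matches the layout described above.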
The contingency tables were submitted to heatmap analysis, which included Hierarchical Cluster Analysis (HCA), Chi-square tests, and Cochran’s Q tests. Attributes identified as significant were subjected to Correspondence Analysis/CA (XLSTAT 2018, Addinsoft SARL, New York, NY, USA). Pearson’s correlation coefficients between the standardised deviates (Statistica 13, TIBCO Software Inc., Palo Alto, CA, USA), obtained from the CATA analysis, and the quality scores were calculated to identify negative and positive quality drivers [18].
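The driver calculation can be sketched as below, under one explicit assumption: the "standardised deviates" are taken to be the chi-square standardized residuals of the contingency table, (observed − expected)/√expected. The text does not define them, so this reading may differ from what Statistica computes. The function names and the example data are hypothetical.

```python
import math


def standardized_residuals(table):
    """(observed - expected) / sqrt(expected) for each cell of a count table.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            res_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(res_row)
    return residuals


def pearson_r(x, y):
    """Pearson product-moment correlation; NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def quality_drivers(table, attributes, mean_quality):
    """Correlate each attribute's residuals (across wines) with mean quality."""
    residuals = standardized_residuals(table)
    return {attr: pearson_r([row[j] for row in residuals], mean_quality)
            for j, attr in enumerate(attributes)}


# Hypothetical data: three wines (rows), two attributes, mean quality out of 20.
drivers = quality_drivers([[8, 1], [5, 5], [1, 8]],
                          ["tropical fruit", "oxidised"],
                          [16.0, 14.0, 11.0])
```

A positive coefficient marks an attribute cited more often for higher-scoring wines (a positive driver), a negative one the opposite. With only 13 or 15 wines per set, such coefficients are exploratory.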
For the chemistry data, Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA) were applied in order to find natural configurations in the data according to treatments and samples by grouping/clustering. The variables with the highest squared cosines for the first three dimensions were considered for variable selection. Regression vector (RV) coefficients between the CA, performed on the sensory data, and the PCA, performed on the chemical data, were calculated using the first three dimensions of the CA and PCA outputs (XLSTAT 2018, Addinsoft SARL, New York, NY, USA).
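For readers unfamiliar with the RV coefficient, the sketch below uses its usual definition for two configurations of the same samples: with column-centered matrices X and Y, RV = tr(XXᵀYYᵀ) / √(tr((XXᵀ)²) · tr((YYᵀ)²)). It ranges from 0 to 1 and is unaffected by rotation or overall scaling of either configuration. The function names and coordinates are hypothetical, and XLSTAT's exact computation (including any significance test) is not reproduced.

```python
import math


def center_columns(rows):
    """Subtract each column mean from a row-wise matrix (list of lists)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def cross_product(rows):
    """Return the n x n matrix X X^T for a row-wise matrix X."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def trace_of_product(s1, s2):
    """tr(S1 S2) for symmetric matrices equals the sum of elementwise products."""
    return sum(a * b for row1, row2 in zip(s1, s2) for a, b in zip(row1, row2))


def rv_coefficient(x_rows, y_rows):
    """RV coefficient between two configurations with samples in the same row order."""
    if len(x_rows) != len(y_rows):
        raise ValueError("both configurations must describe the same samples")
    sx = cross_product(center_columns(x_rows))
    sy = cross_product(center_columns(y_rows))
    return trace_of_product(sx, sy) / math.sqrt(
        trace_of_product(sx, sx) * trace_of_product(sy, sy))


# Hypothetical coordinates of four wines on two sensory (CA) dimensions.
sensory = [[1.0, 2.0], [3.0, 1.0], [2.0, 5.0], [6.0, 4.0]]
# Same configuration with axes swapped and doubled: RV should stay at 1.
rescaled = [[2.0 * b, 2.0 * a] for a, b in sensory]
# An unrelated one-dimensional configuration for comparison.
other = [[1.0], [-1.0], [1.0], [-1.0]]
rv_same = rv_coefficient(sensory, rescaled)
rv_other = rv_coefficient(sensory, other)
```

In the setting described above, `x_rows` would hold the wines' coordinates on the first three CA dimensions and `y_rows` their coordinates on the first three PCA dimensions, with the wines in the same order.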
4. Discussion
In view of the methodological aim of the work, it was demonstrated that combining CATA with quality rating is a quick way of profiling the selected wines and determining their quality scores in a non-competition environment. Compared to the previously proposed methodology, which combined sorting as a rapid method with quality rating [18], the current methodology completed the tasks in one step rather than two. This approach also avoided the wines being presented to the judges once as a group and the second time monadically; the rating was carried out at the same time as the profiling for one wine at a time.
One of the challenges for CATA is the attributes list [18]. A relevant aspect to consider when compiling such a list that will eventually be used for correlation or comparison with quality rating is that the terms included must be linked to quality proxies [3]. The number and the nature of the attributes chosen also have to cover the possible sensory space of the wine set, which will be more complex if the samples are commercial and come from a variety of producers and styles. By choosing Chenin Blanc and Pinotage, the current study aimed to benefit from the in-depth knowledge South African industry professionals have of these two iconic wines. Additionally, aroma wheels for both cultivars are available, were used in this study [32,33], and are familiar to the judges, so in this case the choice of terms for the CATA list was straightforward.
It would be difficult to compare the performance of various methodologies unless the evaluations were completed on the same set and with the same aim. The choice of samples can affect the results of both quality rating and sensory evaluation through the attributes generated. When evaluating typicality (of a high quality wine in this case), various degrees of representativeness for the prototype are required [35,36] in order to have examples of both high and low quality (the latter constituting, in this case, the borders of the concept of high quality). The choice of the researcher can be difficult; if the wines in the set are all considered representative of a category, region, or style, they might not be varied enough in terms of quality. The number of samples included in a set can also influence the outcome in terms of information generated, explained variance within the set, and the robustness of results. Studies approached these issues differently with various degrees of success. For the current study, the wines chosen were from local competitions; in addition to the Top 10 winners, low scoring wines were included as representatives of the boundaries of the high-quality concept. In the case of Californian Cabernet Sauvignon wines, the authors chose 27 wines from three quality categories according to competition results [4,11], while other authors opted for experimental wines considered free of faults as the simplest rule for quality [28,37] but with a higher number of wine samples (60 Cabernet Sauvignon and 50 Chardonnay wines) or even wines all with ratings higher than 90 points according to wine critics (83 Australian Chardonnay wines) [27]. In a previous study conducted in South Africa, only eight Sauvignon Blanc wines were included in the set, all chosen by industry professionals as “representing premium quality” [18]. However, in that case, the goal of the study was to propose a new methodology and the associated fast workflow appropriate to use in an industry context.
In the current study, other than the wine sensory characteristics, the difference in the numbers of wines in each sample set could have been one of the causes of the difference in explained variance in the CA (higher for Chenin Blanc, the smaller sample set). The number of wines in a sample set also had to take into account the sensory tasks and the judges that performed them. Even though quality can be evaluated with consumers, industry professionals were chosen in the current study due to their familiarity with the method and the lexicon included in the CATA lists used. The judges are used to evaluating a large number of samples in one session, but the scope of this study was not known to them prior to the tasting session and what was asked of them was also different from competition conditions in terms of sample profiling.
The evaluation of wine quality (or of quality wines) is carried out with various goals in mind, using different types of judges, and thus the sensory methods and the statistical data handling vary. The correlation of rating (for example of hedonic rating by consumers) with other properties of the product can be carried out by generating an external preference map [38,39]. It was proposed that if the rating is carried out by experts and it is aimed at quality, the same approach would be an external quality mapping [3]. These types of approaches are aimed ultimately at correlating quality rating with sensory attributes and/or generating drivers for quality and predicting the sensory attributes of a high quality wine using supervised statistical techniques [4,11,28,37]. In the case of Australian Sauvignon Blanc and Chardonnay, the wines were sorted into quality groups and described by DA; the statistical analyses consisted of CVA and MDS and the results were linked through GPA [37]. For Californian Cabernet Sauvignon wines, results generated through DA were subjected to PCA and the quality scores to DISTATIS; the correlation between these datasets was completed by PLS2 and cross-validated by a leave-one-out procedure as the number of samples was limited [11]. When the aim of a study is exploring the sensory space of high-quality wines, the sensory methods and the statistical approach will differ. In the case of the current study, an unsupervised method such as CA was considered more appropriate for exploring the sensory space of the wine sets, then combined with the Pearson coefficient to determine the drivers for quality [18]. As in the cited work, the aim of the current study was not to predict quality based on sensory data, but rather to elucidate the sensory drivers for quality for the specific set.
Seen as a much more objective way of characterizing a wine, chemical analysis is sometimes included in studies focused on wine quality. It is easy to see why certain classes of compounds would be related to quality through proxies; for example, volatile compounds contribute to aroma, polyphenols to color and taste. Normally, a limited number of compounds analyzed does not provide a comprehensive picture of the wine chemical space. An information-rich technique such as MS could be used to fingerprint the wines and appropriate statistical tools would reveal the compounds driving the quality. Even though in the current study, the same statistical approach was to be considered for the chemistry data as for the sensory results (calculating the Pearson’s correlation coefficients between the quality scores and the standardised deviates from the PCA), the chemical data proved to be too complex and the 1466 features included in the PCA contained a high level of noise. Even after feature selection, which led to noise reduction, this operation could not be performed. In the case of supervised statistical methods, feature selection is more straightforward as it leads to better separation, regression, etc., aspects whose performance indicators are easier to evaluate; conversely, the aim of the current work was to explore the space of the sample sets, and the statistical methods were unsupervised. Orthogonal PLS-Discriminant Analysis (OPLS-DA) and the associated S-plots could have been an option in the case of supervised modelling; even in that case, the very limited number of samples and the criteria for choice of classes (based on competition results, quality rating or sensory evaluation) would have made this analysis inadvisable. Another aspect of statistical relevance was that the matrix was not balanced, containing 13 and 15 samples (observations) for Chenin Blanc and Pinotage, respectively, compared to the 1466 MS signals (variables).
To obtain a more balanced matrix, the number of variables should be reduced and/or the number of samples should be increased considerably. One way is reducing the number of variables through statistical means, as was done in this study. Additionally, the number of variables can be limited a priori by targeting compounds (analytically), but this approach makes the assumption that the researcher would know which compounds are critical for the quality or the quality proxies. If the list of targeted chemical compounds contributing to the wine quality is comprehensive, the methodology would correspond to a targeted metabolomics approach, but even in that case, the same assumption is made even if not to the same extent. The alternative, increasing the number of samples, was impossible in the circumstances of the current study. If the same type of study were to be carried out over a number of years, the chances of success for the statistical analysis would increase. However, one should consider that the style of winning wines might change over time, that the panel used for quality evaluation (during and outside the competition) would change, and even the palate of the judges would change.
Therefore, the sample configurations (score plots) derived from the PCA on the MS data were considered sufficient in the current study, as they allowed for the comparison of sensory and chemistry spaces for each dataset through RV coefficients. This type of approach is not usual in literature related to wine quality. Previous works reported on the correlation between quality proxies such as judgement points and/or expert scores and a limited number of individual aroma compounds or even chemical elements; in the latter case, the more likely explanation was that the elements were rather markers for the origin of the wine and thus, indirectly, the authors linked the quality indicators to the wine origin, which they already regarded as a quality proxy [4]. In the same study, the chemical profile obtained by HS-SPME-GC-MS for 64 volatiles and the sensory profile by DA were submitted separately to PCA and then compared pair-wise through Pearson’s product-moment correlation coefficient, showing both positive and negative correlations between compounds and aromas. For Australian Chardonnay, 83 wines were scored on a 20-point scale and chemically analyzed for 39 volatiles by HS-SPME-GC-MS; the two datasets were correlated through PLS [27].