Odor Characterization of White Wines Produced from Indigenous Greek Grape Varieties Using the Frequency of Attribute Citation Method with Trained Assessors

The aim of this study was to investigate the sensory aroma profiles of white wines of the indigenous Greek grape varieties Assyrtiko, Malagousia, Moschofilero, and Roditis. Twenty-three panelists evaluated 17 wines of the aforementioned varieties using the frequency of attribute citation method. Three indices were calculated to assess panel performance in terms of reproducibility. Correspondence analysis and cluster analysis were employed to investigate the sensory space of the wines. Samples of the Roditis variety were characterized mainly by Banana and Vanilla odors; Assyrtiko samples had Earthy, Mushroom, and Nutty odors, as well as Lemon and Honey for some of the samples. Malagousia wines were described as having Lemon, Grapefruit, and Citrus blossom character, and they shared some descriptors with Assyrtiko wines, such as Mushroom and Earthy, and some with Moschofilero samples, i.e., floral and citrus notes. All Moschofilero wines exhibited a floral odor profile: specifically, Rose, Jasmine, or more Citrus blossom-like. Moreover, some Moschofilero samples also revealed a Grapefruit, Lemon, and/or Earthy character, while others expressed Honey notes. In conclusion, despite common characteristics found within varieties, some samples of different varieties exhibited overlapping profiles, and in some cases, samples of the same variety were quite different from each other.


Introduction
In the last decade, the wine produced from indigenous Greek grape varieties has been increasingly appreciated in the global wine market [1]. Assyrtiko, Malagousia, Moschofilero, and Roditis, as white varieties, and Agiorgitiko and Xinomavro, as red varieties, are gaining ground among other prestigious international grape varieties [2]. Assyrtiko is the most well-known Greek grape variety and initiated the discussion about Greek wine throughout the wine world [3]. Furthermore, it is the first Greek grape variety planted for commercial purposes in other countries, such as Australia, Cyprus, and Lebanon,

Wine Samples
Seventeen commercial white wines from four different grape varieties were used for this study, namely, Assyrtiko (3 samples), Malagousia (4 samples), Moschofilero (7 samples), and Roditis (3 samples). Table 1 shows the details of each wine sample, along with their physicochemical characteristics. All of the samples were 2018 vintage, except for ASR4 (2016). Wine samples were chosen according to the suggestions of the National Inter-Professional Organization of Vine and Wine; different places of origin were chosen in order to cover a high degree of variability for each variety. The wines had no contact with wood during their elaboration process. Before the final sensory evaluation by the panel, a team of enologists checked the wines for flaws, and none were found. However, some wines were judged to be rather 'tired' and to have 'lost' their character and were thus eliminated before evaluation. Samples were stored in the Laboratory of Oenology and Alcoholic Drinks of the Agricultural University of Athens, and the sensory evaluation took place in December 2019.

Panelists
Twenty-three panelists (16 females and 7 males; mean age: 43 years; age range: 21-63 years), students and staff of the Agricultural University of Athens, participated in the study. They were recruited on the basis of their interest and availability to participate in the study. They were habitual consumers of white wine. All of them passed through a series of screening tests to confirm their ability to take part in wine sensory assessments. Panelists were not paid for their participation.

Training Process
In total, 21 training sessions of 45 min took place over a period of five months. Panelists were instructed to refrain from eating, drinking, and smoking for 1 h prior to the sessions. Furthermore, they were asked to avoid the use of perfumes or perfumed cosmetics. Panelists attended these sessions over a period of 4 months. Training included the smelling of odor reference standards and describing the odors of wines.
In particular, during the first 3 sessions, panelists smelled aroma reference standards from an aroma box (Pulltex, Barcelona, Spain) in order to become familiarized with odors that can be found in wine. Furthermore, in the 2nd and 3rd sessions, they described two wines in each session using the wine aroma wheel of Noble et al. [27]. From the 4th training session onwards, a predetermined descriptor list was given to the panelists. This list was compiled after searching the literature for the varieties used in this study. To the best of our knowledge, no previous scientific work investigating the typical aromas of these varieties has been published. Thus, we checked the available grey literature, including blogs [3,5,8], magazines [28], and books [4,6].
A common list of 28 descriptors for all varieties was compiled using the aforementioned sources. Furthermore, from the 4th session onwards, different reference standards were presented to the panelists to familiarize them with the vocabulary. In some cases, different recipes for the same attribute were presented in order to narrow the list down to the most appropriate terms for the panel. In sessions 4-11, the first part included reference smelling, and the second part entailed describing two wines. At the end of each session, the results were discussed, and the most cited attributes for each wine were highlighted by the panel leader. In session 12, panelists evaluated 5 wines in booths using Compusense Cloud (Compusense, Guelph, ON, Canada) in order to familiarize them with the evaluation procedure. In sessions 13-18, panelists smelled reference standards during the first part, while, during the second part, they described 2-3 wines that were similar to those used in the study using up to five descriptors. In this stage, they were allowed to modify the list, i.e., to add descriptors that they felt were missing or remove descriptors that they did not find relevant or did not use. Table 2 shows the final list of the 25 descriptors along with odor categories of the attributes that were created by the panel in order to make the list more intuitive. Reference standards are also described in Table 2. References made by natural products were prepared at the beginning of each day in order to maintain their freshness throughout the training session.
In training session 19, panelists were asked to identify blind-coded reference standards. In session 20, a pretest was conducted using 4 wines of the study in duplicate in order to evaluate their performance. Training session 21 included feedback on the previous sessions and the smelling of reference standards. After these last sessions, the panel was deemed successfully trained and continued with the evaluation of the wines.

Wine Evaluation
The evaluation of the wine samples (17 wines in two replicates, i.e., 34 samples in total) was divided into four sessions of 8 or 9 samples over a period of two weeks. Wine bottles were opened 30 min prior to the test session and were verified to be free of cork taint by the panel leader. Then, 30 mL was poured into transparent INAO wine glasses [29] covered by plastic Petri dishes in order to allow volatiles to move to the headspace. Samples were left to reach room temperature (20 ± 1 °C). The testing room was ventilated and air-conditioned. Samples were evaluated orthonasally in individual booths.
Panelists were asked to check 2-5 descriptors from the list that applied to the sample in front of them. A 10 min break was mandatory after the first 5 samples in each session. Between samples, a 1 min break was enforced. Samples were presented with 3-digit blinding codes in a monadic sequence according to a Latin Square Design. Data were collected using Compusense Cloud, Academic Consortium (Compusense, Guelph, ON, Canada).
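As an illustration of how a Latin square can balance serving positions, the following minimal Python sketch builds a cyclic Latin square and assigns one row to each panelist. The text does not state which Latin square was used or how it was generated, so the cyclic construction, the function names, and the blinding codes below are assumptions made purely for illustration.

```python
from typing import List, Sequence


def cyclic_latin_square(n: int) -> List[List[int]]:
    """Return an n x n cyclic Latin square: row i is 0..n-1 shifted by i."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def serving_orders(codes: Sequence[str], n_panelists: int) -> List[List[str]]:
    """Assign each panelist one row of the square, reusing rows cyclically."""
    n = len(codes)
    square = cyclic_latin_square(n)
    return [[codes[k] for k in square[p % n]] for p in range(n_panelists)]


# Hypothetical 3-digit blinding codes for a session of 8 samples.
example_codes = ["314", "159", "265", "358", "979", "323", "846", "264"]
orders = serving_orders(example_codes, n_panelists=23)
```

In this construction, every sample appears exactly once in each serving position across any block of 8 consecutive panelists, which is the balancing property a Latin square is meant to provide.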

Individual Panelist Performance
The analysis of panel performance was initially focused on assessing panelists' reproducibility. For carrying out this task, three indices were computed for each of the 23 panelists. Specifically, $n_{1\cdot}$ and $n_{\cdot 1}$ denote the numbers of descriptors chosen by the panelist in the first and second replicate, respectively, $n_{11}$ denotes the number of common descriptors chosen by the panelist in both replicates, and $n$ denotes the total number of tested wines. The indices are defined as (a) $R = n^{-1} \sum 2n_{11}/(n_{1\cdot} + n_{\cdot 1})$, wherein the sum is extended over the $n$ wines; (b) $p_{11} = n^{-1} \sum n_{11}/n_{1\cdot}$; and (c) the average chi-square statistic (under the assumption of independence) from the $n$ wines.
The first two indices (i.e., $R$ and $p_{11}$) have values in the interval [0, 1], and values close to 1 indicate a high reproducibility; these indices mainly focus on the estimation of the probability of choosing a descriptor in the second replicate given that it was chosen at the first replicate. The index denoted by $R$ was used previously by Campo et al. [25]. The last index is the mean value of the $n$ chi-square distances under the independence assumption between the two replicates; analytically, after computing the values for each panelist and each wine, a contingency table (with two rows and two columns) that included the frequency distribution of the two replicates was generated; then, the chi-square statistic was computed under the assumption of independent evaluations. The index for each panelist is the mean value of the $n$ resulting chi-square statistics, with large values indicating a lack of independence (reproducibility) between the two replicates.
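The three indices can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that the 2 × 2 table for each wine cross-classifies the 25 descriptors of the final list by "chosen / not chosen" in each replicate, and that the chi-square statistic is the uncorrected Pearson statistic. The function names and the example descriptor sets are hypothetical.

```python
from typing import List, Set, Tuple

N_DESCRIPTORS = 25  # size of the final descriptor list (assumed table total)


def wine_indices(rep1: Set[str], rep2: Set[str],
                 n_descriptors: int = N_DESCRIPTORS) -> Tuple[float, float, float]:
    """Per-wine terms of R, p11 and the chi-square index for one panelist."""
    n1 = len(rep1)           # descriptors chosen in replicate 1
    n2 = len(rep2)           # descriptors chosen in replicate 2
    n11 = len(rep1 & rep2)   # descriptors chosen in both replicates
    r_term = 2 * n11 / (n1 + n2)
    p11_term = n11 / n1
    # Remaining cells of the assumed 2 x 2 table (chosen / not chosen).
    n10 = n1 - n11
    n01 = n2 - n11
    n00 = n_descriptors - n11 - n10 - n01
    denom = n1 * (n_descriptors - n1) * n2 * (n_descriptors - n2)
    chi2 = n_descriptors * (n11 * n00 - n10 * n01) ** 2 / denom
    return r_term, p11_term, chi2


def panelist_indices(pairs: List[Tuple[Set[str], Set[str]]]) -> Tuple[float, float, float]:
    """Average the per-wine terms over all wines evaluated by one panelist."""
    terms = [wine_indices(a, b) for a, b in pairs]
    n = len(terms)
    return tuple(sum(t[i] for t in terms) / n for i in range(3))


# Hypothetical selections of one panelist for two wines (replicate 1, replicate 2).
example = [
    ({"Rose", "Lemon", "Honey"}, {"Rose", "Lemon", "Banana", "Nuts"}),
    ({"Rose", "Jasmine"}, {"Rose", "Jasmine"}),
]
R, p11, chi2_mean = panelist_indices(example)
```

Because each panelist checked between 2 and 5 descriptors per sample, none of the marginal totals is zero and the denominators above are always positive.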
In order to identify poor performances among panelists and exclude their evaluations from the dataset, a descriptive analysis was carried out on the three indices, along with a set of statistical tests for detecting outliers (Dixon test, chi-square test, and Grubbs test, provided by the "outliers" package in R [30-33]).

Overall Panel Performance
For assessing the overall panel performance, we took the following into account: (a) the effect of the replicate (reproducibility) and product (discrimination ability) on the selection of descriptors, (b) the average reproducibility for each wine, and (c) the distances between the two replications for each wine (reproducibility). A descriptive analysis and methods such as Friedman's test, Cochran's Q test [34], and the non-metric multidimensional scaling (Sammon's non-linear mapping [35,36]) were conducted.
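For readers unfamiliar with Cochran's Q test, the statistic can be computed directly from a binary table with one row per block (e.g., one panelist-by-wine evaluation) and one column per condition (e.g., replicate or product). The sketch below is a generic illustration of the standard formula, not the authors' R code; the helper `chi2_sf` only covers 1 or an even number of degrees of freedom, which keeps it free of external libraries, and all names and example data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cochran_q(table: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Cochran's Q statistic and its degrees of freedom for a 0/1 table."""
    k = len(table[0])                                   # number of conditions
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - total * total)
    denominator = k * total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every row is all 0s or all 1s")
    return numerator / denominator, k - 1


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail chi-square probability for df = 1 or even df (closed forms)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd df > 1 is not covered in this sketch")


# Hypothetical citations (1 = cited) of one descriptor: rows are blocks,
# columns are the two replicates.
replicate_table = [[1, 0], [1, 0], [1, 0], [0, 1], [1, 1], [1, 1]]
q_value, dof = cochran_q(replicate_table)
p_value = chi2_sf(q_value, dof)
```

With only two columns, as in the replicate comparison, Cochran's Q coincides with the uncorrected McNemar statistic.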

Product Characterization
Products' sensory profiles were first evaluated by descriptive analysis, and for each wine, the most frequently selected descriptors by the panel were identified. Correspondence analysis on a contingency table containing the average citation frequency of descriptors (for the most selected descriptors) was conducted in order to visualize the product and descriptor space. Moreover, the coordinates derived from the correspondence analysis were submitted to cluster analysis. Analytically, the decision about the underlying number of clusters was based on the package "NbClust" [37] using five distance measures (Euclidean, Manhattan, maximum, Canberra, and Minkowski) and six clustering methods (kmeans, ward.D, ward.D2, single, complete, and average) and evaluating any solution in which the number of clusters ranged from 2 to 10. Among the suggested optimal clustering solutions (30 in total), we chose the one suggested by the most combinations (measures and methods). The sensory profile of the derived clusters was further examined by correspondence analysis. All data analyses were run using the R-project [38]. Finally, a spider diagram with the averaged citation proportion of the most cited odor attributes for each cluster was created using Microsoft Excel 2016 (www.microsoft.com, Microsoft, Redmond, WA, USA). In other words, for wines found in the same cluster, the average proportion of citations for each attribute was computed.
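The averaging step behind the spider diagram can be sketched as follows. This is a minimal Python illustration under stated assumptions: the wine labels, cluster assignments, and citation proportions are invented placeholders rather than study data, and a descriptor missing from a wine's profile is treated as a proportion of zero.

```python
from collections import defaultdict
from typing import Dict


def cluster_profiles(citations: Dict[str, Dict[str, float]],
                     clusters: Dict[str, int]) -> Dict[int, Dict[str, float]]:
    """Average the citation proportion of each descriptor over a cluster's wines."""
    sums: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    counts: Dict[int, int] = defaultdict(int)
    for wine, profile in citations.items():
        cluster = clusters[wine]
        counts[cluster] += 1
        for descriptor, proportion in profile.items():
            sums[cluster][descriptor] += proportion
    # Dividing by the number of wines treats absent descriptors as zero.
    return {c: {d: s / counts[c] for d, s in by_desc.items()}
            for c, by_desc in sums.items()}


# Placeholder data: wine -> {descriptor: proportion of panelists citing it}.
example_citations = {
    "W1": {"Rose": 0.6, "Banana": 0.1},
    "W2": {"Rose": 0.4, "Banana": 0.3},
    "W3": {"Rose": 0.2, "Banana": 0.5},
}
example_clusters = {"W1": 1, "W2": 1, "W3": 2}
profiles = cluster_profiles(example_citations, example_clusters)
```

Each resulting per-cluster dictionary corresponds to one polygon of a spider diagram, with one axis per descriptor.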

Individual Panelist Performance
The individual assessment of panel performance included the analysis of the three indices: chi-square, $R$, and $p_{11}$ (Table 3, Supplementary Table S1). The minimum values of both $R$ and $p_{11}$ (0.167) were found for panelist 2, whereas panelist 17 had the minimum value of the chi-square index. It should be mentioned that panelist 21 had the second smallest values of $R$ and $p_{11}$ (note that these indices are highly correlated) and the third smallest value of the chi-square index; panelist 23 had the best performance according to all three indices. The Dixon test and Grubbs test for outliers (the normality assumption is not rejected at α = 5%) support a lack of outliers; this is not the case for the chi-square test (the alternative hypothesis is that the lowest value is an outlier). However, by excluding panelists 2 and 21 from our analysis, all of the tests for $R$ and $p_{11}$ became non-significant; without these two panelists, the lowest values of $R$ and $p_{11}$ were 0.215 and 0.242, respectively (both calculated for panelist 16). On the basis of these findings, it was decided that panelists 2 and 21 should be excluded from further analysis.

Overall Panel Performance
Cochran's Q test (Table 4) showed that the overall effect of replicate on the descriptors is non-significant (p = 0.138), providing us with evidence for the panel's reproducibility. This is also the case for each descriptor separately, except for Nuts (p = 0.004; the descriptor "nuts" was chosen in 11.5% of the cases in the first session and in 18.5% of the cases in the second session). Table 4 also contains the p-values for the effect of product on the descriptors; there are significant effects on 16 descriptors (at a 10% significance level), which supports the ability of the panel to discriminate among products.
The average values of the three indices (chi-square, $R$, and $p_{11}$) for each wine can be found in Table 5; the p-values from the Friedman's test show that there is no statistically significant difference among wines (evidence for the panel's reproducibility). However, it seems that panelists faced some difficulties in reproducing descriptors for MLG4, ASR1, and MSF1, in contrast to MSF9, MSF10, and ASR3, which resulted in large values for the three indices. The chi-square distances between the two replications for each wine were assessed by a non-metric multidimensional scaling (based on Sammon's non-linear mapping). Indeed, the expected small distances between the two replications are confirmed by the results in Figure 1 for most of the 17 wines.
The few reproducibility difficulties mentioned above are partially supported by the multidimensional scaling method.

Product Characterization
Product sensory profiles were initially evaluated by the frequency of attribute citation. Table 6 shows the six most frequently selected descriptors for each wine. Among the top three most selected descriptors for ROD1, ROD3, and ROD4, we encounter three common descriptors: Banana, Rose, and Citrus blossoms. Lemon, Honey, Nuts, Earthy, and Citrus blossoms are some common descriptors for ASR1, ASR3, and ASR4. Moreover, Rose is the most selected descriptor for MSF3, MSF4, MSF9, and MSF10 (MSF9 and MSF10 have more similarities), whereas MSF1, MSF7, and MSF8 share Lemon and Citrus blossoms among their most frequently attributed descriptors. Some common descriptors for MLG1-MLG4 are Citrus blossoms, Rose, and Lemon. It is worth noting that the most selected descriptors for the wines MSF10 and MSF9, for which we obtained the largest reproducibility indices, were Rose, Citrus blossoms, Honey, and Jasmine.
A correspondence analysis (the p-value from the chi-square test equals 0.04) on a contingency table containing the average citation frequency of descriptors was conducted in order to visualize the product and descriptor space. Specifically, the contingency table was computed based on the descriptors Banana, Citrus blossoms, Earthy, Grapefruit, Honey, Jasmine, Lemon, Mushroom, Nuts, Rose, and Vanilla; these descriptors are among the top three most selected descriptors, as reported in Table 6. Note that although Mushroom is not among the top three most selected descriptors, it was included in the contingency table because it was generally selected many times. Figure 2 shows products and attributes in the first two dimensions of the correspondence analysis (symmetrical plot), which explain 63.4% of the total variability. Dim1 is characterized mainly by Lemon, Earthy, Mushroom, and Nuts in the positive part and by Honey, Jasmine, and Rose in the negative part. Dim2 is mainly characterized by Honey and Lemon in the positive part and by Banana and Vanilla in the negative part.
Moreover, from the residuals of the contingency table (not shown here due to space limitations) and their contribution to the total variability, we can derive the most influential relationships; according to this analysis, these relationships are as follows: (1) Earthy (3.89%) and Nutty (2.71%) odors in ASR3, (2) a Honey (3.40%) odor in ASR4, (3) Rose (2.01%) and Jasmine (4.79%) odors in MSF10, (4) a Honey (5.37%) odor in MSF9, and (5) a Banana (4.60%) odor in ROD1. As can also be seen in Figure 2, ROD1, MSF9, ASR3, and MSF10 are the most influential wines overall. Among the samples of the variety Roditis, ROD1, ROD3, and ROD4 seem to have a similar profile (also see Table 6), which is mainly characterized by odors such as Banana, Vanilla, and Rose. ASR3 is characterized by Earthy and Nuts, while ASR1 is characterized by Lemon and, to some extent, Grapefruit, which can also be found in the description of ASR3. Among the samples of the Malagousia variety, MLG1 seems to have some distinct elements, such as Honey, while MLG3 is characterized by Citrus blossoms and Grapefruit. Samples of the Moschofilero variety seem to be more dispersed in the sensory space. MSF9 and MSF10 can be described as having Jasmine, Rose, and Honey odors, while MSF8 is described as Lemon and Earthy.
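The percentage contribution of each cell to the total variability can be obtained from the squared Pearson residuals of the contingency table. The following Python sketch shows one common way of doing this; it is an assumption that the percentages reported above were computed in this manner, and the small table used here is an invented placeholder rather than study data.

```python
from typing import List, Sequence, Tuple


def chi2_contributions(table: Sequence[Sequence[float]]) -> Tuple[float, List[List[float]]]:
    """Return the chi-square statistic and each cell's percentage share of it."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Squared Pearson residual of each cell: (observed - expected)^2 / expected.
    squared = []
    for i, row in enumerate(table):
        cells = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            cells.append((observed - expected) ** 2 / expected)
        squared.append(cells)
    chi2 = sum(sum(cells) for cells in squared)
    shares = [[100 * value / chi2 for value in cells] for cells in squared]
    return chi2, shares


# Placeholder table: rows could be descriptors and columns wines.
chi2_value, shares = chi2_contributions([[10, 20], [30, 40]])
```

Since the total inertia of a correspondence analysis equals the chi-square statistic divided by the table total, the same percentages describe each cell's share of the total inertia.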
Cluster analysis, carried out on the first three dimensions (explaining 76.7% of the variability) of the correspondence analysis, indicated the existence of 4 clusters in our set of wines (Table 7). The results show a clear grouping of the three wine samples from the Roditis variety. Additionally, the samples of the Malagousia variety are divided into two groups. Most of the Moschofilero samples are grouped in the first cluster with MLG3. MSF9 and MSF10 are very similar, and both are grouped in cluster 3. Among the Assyrtiko samples, ASR1, ASR3, and ASR4 are grouped in the same cluster. A correspondence analysis on a contingency table, in which rows are determined by descriptors and columns are determined by clusters, was run to visualize the sensory profile of each cluster (also see the Pearson residuals in the last part of Table 7). Hence, Figure 3 shows the products and the cluster space in the first two dimensions (symmetrical plot; explains 84.39% of the variability). A distinction among the four clusters can partially be seen (note the low quality of representation of Cluster 1, which is mainly due to the low representation of Citrus blossoms): for example, Cluster 1 is mainly characterized by (in order) Citrus blossoms, Lemon, Banana, and Grapefruit; Cluster 2 is characterized by (in order) Earthy, Lemon, Mushroom, Nuts, and Grapefruit; Cluster 3 is characterized by (in order) Honey, Jasmine, and Rose; and Cluster 4 is characterized by (in order) Banana, Vanilla, Nuts, and Rose. Moreover, as shown in Figure 4, the most cited attribute for Cluster 1 is Citrus blossoms (over 50%), followed by Rose (about 33%). In Cluster 2, about 35% of the panelists cited Citrus blossoms; the second most cited term for Cluster 2 is Lemon (about 32%). Cluster 3 shows the maximum proportion of citations for Rose (65%), followed by Citrus blossoms, which was cited by more than 40% of the panelists.
Finally, Banana was cited by 50% of the panel in Cluster 4, followed by Rose (over 40%). These results indicate that a flower character prevails in the aroma profiles of Clusters 1 and 3, while Cluster 2 also shows a lemonish character, and Cluster 4 is characterized by a banana aroma.

Discussion
In this study, data analysis showed that the panelists had an adequate performance as a whole, after excluding two of them, and individually. The overall performance of the panel was based on the reproducibility of their evaluations and their discrimination ability. The former was evaluated through multidimensional scaling (Figure 1), Cochran's Q test (Table 4), and Friedman's test (Table 5), while the latter was assessed through the wine effect on attribute selection by Cochran's Q test (Table 4). Cochran's Q test is widely used to analyze binomial data obtained by sensory panels [39,40]. To evaluate individual performance, we proposed the use of three different indices in order to draw more objective conclusions. Thus, we used index $R$, first used by Campo et al. [25], index $p_{11}$, and the average chi-square statistic for the 17 wines. These analyses showed satisfactory results as well, allowing us to continue with further analysis of our data.
With regard to the sensory space of products and attributes, Figure 2 (Table 6). This can also be seen in Figure 2, where, additionally, the odor of Mushroom seems to play a role for MLG2. Interestingly, some samples of different grape varieties, such as ASR4 and MLG1 or ASR1 and MSF3, seem to be more similar to each other than to other samples of the variety to which they belong. This is also supported by the results of cluster analysis (Table 7, Figure 3).
Indeed, some clusters are formed by samples of different grape varieties. Specifically, Cluster 1 consists of samples from two varieties, and Cluster 2 comprises samples from three varieties, whereas Clusters 3 and 4 are homogeneous in terms of varietal character. Taking into consideration Figure 3 and Table 7, the odor attributes describing Cluster 1 are Citrus blossoms, Banana, Grapefruit, and Lemon. Cluster 2 is characterized by Grapefruit, Lemon, Nutty, Earthy, and Mushroom odors. Floral aromas such as Jasmine and Rose, as well as Honey, are attributed to Cluster 3. Finally, Banana, Vanilla, and Rose seem to be important attributes for Cluster 4. It should be noted that Citrus blossom seems to be a common characteristic for all clusters (this can also be seen in Table 6, where Citrus blossom is amongst the top three most cited terms for all samples), although, for Cluster 1, it is more important (see Table 7). These characteristics are also evident in the spider diagram (see Figure 4), where Citrus blossom is apparently cited frequently in all of the clusters, but it does not play an important role in the formation of all clusters, especially in Clusters 2 and 4 (see Table 7). Furthermore, although the attributes Earthy, Mushroom, and Nuts are not stressed in the spider diagram, they seem to be important for Cluster 2 in the cluster analysis.
Regarding cluster formation and varietal origin of the studied wines, Ballester et al. [41], similar to our work, reported that just belonging to a specific category is not enough for an item of this category to be described by the typical characteristics of the category. Furthermore, Rosch and Mervis [42] stated that a typical product can be described not only by the attributes of its own category but also by some attributes of other categories. Within a category, items do have common attributes, but each item does not possess all of the key attributes of that category. Thus, membership in a category has been described as a continuum, where some items may exhibit more typical characteristics than others. Hence, it should not be regarded as an indicator of the absolute inclusion in a category or exclusion from that category [43]. The importance of including samples from different varieties has also been stressed before in order to better explore the limits of the sensory space of a specific variety [41].
Previous research on volatile compounds has revealed the presence of 2-phenylethanol and phenylethyl acetate in Moschofilero wines [18,44]. These compounds have been further associated with flower aromas, especially rose [45]. This is in congruence with our finding that Moschofilero wines exhibit a floral aroma, including a rose aroma. Moreover, Kechagia et al. [19] found key odorants that are responsible for fruity, honey, and floral aromas in Assyrtiko wines. Accordingly, in our study, Lemon, Grapefruit, and Honey were some of the most cited attributes for Assyrtiko samples. Tyrosol has been previously identified as one of the main phenolic compounds in seven Malagousia wines [46], and it is also known to be responsible for the honey aroma [47]. In our work, we observed that the honey aroma was frequently cited for one Malagousia wine (MLG1), as well as in wines of Assyrtiko (ASR4) and Moschofilero varieties (MSF9, MSF10). Moreover, tropical fruit and banana notes in wines of the Roditis variety have been previously reported [48], but no scientific sources are available on the aroma character of this variety. These correlations from the literature stress the importance of investigating key odor compounds in the study of the characteristic aromas of wines of these grape varieties.
The present study is, to our knowledge, the first scientific attempt to systematically investigate the sensory aroma profiles of wines of the indigenous Greek grape varieties Assyrtiko, Roditis, Malagousia, and Moschofilero, followed by the assessment of panel performance. Other studies have researched the chemical profiles of some of these varieties and their association with other factors, such as prefermentative treatments [19] and yeast interactions [18]. However, there are no published reports in the literature that aim to identify typical aromas of these varieties. We implemented extensive panel training and a sensory technique, i.e., the frequency of attribute citation, which has been used over the last decade as a reliable and more intuitive alternative to other sensory descriptive methods [24,26,49]. Furthermore, we used three indices to check panel reproducibility, followed by robust statistical analysis, in order to ensure objectivity in our conclusions. Reproducibility indices for binomial data have also been used in prior studies [25,50,51], and they are of indisputable value, as all our data rely on panel assessments. Moreover, we used a common descriptor list for all wines, and no information about the variety of each wine was given during the evaluation, so that we could obtain results free of such cognitive bias. However, we should mention that although all samples were of the same vintage, except for one, we could not account for possible different winemaking processes, as these samples were commercial wines from different winemakers and different areas. Another limitation of this study is that the number of samples for the Assyrtiko and Roditis varieties was too low to allow firm conclusions regarding their aroma profiles.

Conclusions
Overall, this study provides a guide to evaluate the performance of a whole panel and individual panelists through statistical indices and analysis. Furthermore, sensory data indicate patterns in the aroma profiles of the wines of four indigenous Greek white grape varieties. Specifically, wine samples of the Roditis variety exhibit mainly a Banana and Vanilla odor profile. Assyrtiko wine samples are characterized by Earthy, Mushroom, and Nutty odors, as well as Lemon and Honey for some of the samples. Malagousia wines are described as having Lemon, Grapefruit, and Citrus blossom characters, and they also share some descriptors with Assyrtiko wines, such as Mushroom and Earthy, and some with Moschofilero samples, i.e., floral and citrus notes. All Moschofilero wine samples exhibit a floral odor profile: specifically, Rose, Jasmine, or more Citrus blossom-like. Moreover, some samples of Moschofilero also reveal a Grapefruit, Lemon, and/or Earthy character, while others express Honey notes. Thus, as is already known, the present work shows that although some descriptors are more characteristic of each of the studied varieties, some samples of different varieties have overlapping profiles, and in some cases, samples of the same variety are quite different from each other.
Future work should focus on using samples that have been vinified with the same protocol. This way, the confounding factor of different winemaking processes will be excluded, and firmer conclusions about aromas that characterize each variety can be drawn. Nonetheless, we should focus not only on finding the key attributes of each variety but also on attributes to define the boundaries of the sensory space of each variety. These findings should ultimately be related and explained by chemical analysis in order to find key odorants that are responsible for those aromas.
Supplementary Materials: The following are available online at http://www.mdpi.com/2304-8158/9/10/1396/s1: Table S1: Chi-square, $R$, and $p_{11}$ values for each of the panelists individually.
Funding: This research was performed within the framework of the project "The Vineyard Roads, Subproject 2: Chemical/organoleptic characterization of varieties-bio-synthetic paths-vinification". This research has been financed by Greek national funds through the Public Investments Program (PIP) of the General Secretariat for Research and Technology (GSRT), under the action "The Vineyard Roads" (Project code: 2018ΣE01300000; Title of project: "Emblematic research action of national scope for the exploitation of new technologies in the agri-food sector, specializing in genomic technologies and pilot application in the value chains of 'olive', 'grapevine', 'honey', and 'livestock'").