Comparison of Three Approaches to Assess the Flavour Characteristics of Scotch Whisky Spirit

This study compared the use of three sensory and analytical techniques, Quantitative Descriptive Analysis (QDA), Napping, and Gas Chromatography-Mass Spectrometry (GC-MS), for the assessment of flavour in nine unmatured whisky spirits produced using different yeasts. Hierarchical Multiple Factor Analysis (HMFA) showed a similar pattern of sample discrimination (RV scores: 0.895–0.927) across the techniques: spirits were mostly separated by their Alcohol by Volume (ABV). Low ABV spirits tended to have heavier flavour characteristics (feinty, cereal, sour, oily, sulphury) than high ABV spirits, which were lighter in character (fruity, sweet, floral, solventy, soapy). QDA differentiated best between low ABV spirits and GC-MS between high ABV spirits, with Napping having the lowest resolution. QDA was time-consuming but provided quantitative flavour profiles of each spirit that could be readily compared. Napping, although quicker, gave only an overview of the flavour differences of the spirits, while GC-MS provided semi-quantitative ratios of 96 flavour compounds for differentiating between spirits. Esters, arenes and certain alcohols were found in higher concentrations in high ABV spirits and other alcohols and aldehydes in low ABV spirits. The most comprehensive insights on spirit flavour differences produced by different yeast strains are obtained through the application of a combination of approaches.


Introduction
Scotch whisky is one of the most important products for Scotland's economy, with 133 operational distilleries adding a gross value of GBP 5.5 billion to the economy every year. This industry employs over 10,000 people in Scotland alone and attracts 2 million visitors to distilleries every year [1]. Scotch malt whiskies must be produced using malted barley, yeast and water, before being matured for at least three years in Scotland in oak casks [2]. This maintains the high standard of this spirit. Apart from caramel for colour standardisation, nothing can be added. So, the process steps and quality of the raw materials are essential to the final flavour.
The production of Scotch malt whisky includes several steps: malting, mashing, fermentation, distillation, and maturation. The choice of raw materials and process parameters dictates the final flavour. Malting allows for the amylolysis of barley starch and its conversion to fermentable sugars. Flavours originating from this process include malty, bready, chocolate-like, and damp-straw notes. Furfural is a compound responsible for a grainy/marzipan aroma, and 2- and 3-methylbutanal are responsible for a malty aroma [3]. Other aldehydes, such as hexanal and 2,4-decadienal, give grassy and bean-like aromas. These are created during malting but are reduced by kilning, the subsequent drying of the malt [4].
Mashing uses hot water to break down the starch further into fermentable sugars. There is relatively little published research on mashing and flavour, though off-flavours related to oxidation during mashing have been analysed [4,5]. Yeast is added to the resulting wort and the alcoholic fermentation converts the sugars to ethanol and carbon dioxide. The focus of this study is the secondary metabolites produced by yeast. Esters and higher alcohols are produced during fermentation, resulting in fruity and floral notes. Aldehydes originating from the malt can be reduced to their saturated counterparts [4]. Peppery/pungent notes can arise during fermentation due to other active microorganisms, such as Lactobacillus spp., which can produce acrolein [3]. Distillation subsequently reduces sulphur and meaty notes due to interactions of sulphur compounds with copper in the still. These include dimethyl trisulphide, a compound responsible for rotten vegetable type flavours [6]. The cut point, which determines the fraction of the spirit collected, also affects the composition and flavour. The distilled spirit is then filled into oak casks and matured for at least 3 years to become Scotch whisky. Desirable flavour compounds are extracted from oak, such as vanillin (vanilla aroma) and oak lactone (coconut). At the same time unpleasant flavours, such as dimethyl sulphide, are reduced [3,7].
Most published research and data on Scotch whisky and its flavour composition are related to the maturation or the authenticity of the final product. However, new make spirit analysis and research are important to ensure a complementary pairing of the style and character of the spirit with the right cask. For mashing and fermentation, the focus to date has mainly been on high yields [7,9–11]. Recently there has been increasing interest in other areas, such as use of heritage barley varieties [12] and new yeast strains to enhance flavour diversity before maturation. The spirits examined in this paper were produced using a range of yeast strains. The metabolism of different yeast strains can vary, which alters the ratios of the secondary metabolites, i.e., flavour compounds, that they produce [13]. These strains had been used with the aim of exploring the impact of yeast on flavour. It is important that the whisky industry has a means of measuring flavour characteristics, so that these differences can be both described and quantified. In this study three different methods for assessing the flavour of new make (unmatured) Scotch whisky spirits were compared: two sensory methods (Quantitative Descriptive Analysis and Napping) and one instrumental method (Gas Chromatography-Mass Spectrometry).
Sensory evaluation is the assessment of samples by a panel of human assessors using controlled methods and conditions. This paper considers Quantitative Descriptive Analysis (QDA), one of the most common approaches for profiling the flavour characteristics of a product. This technique was developed in the 1970s [14,15] and since then has been widely used in food and beverage applications. Panellists individually rate the intensities of a pre-selected set of flavour characteristics using a scaling system. In the case of whisky spirits, the vocabulary is already well established, being based on The Scotch Whisky Research Institute Flavour Wheel. This flavour wheel was first developed in the 1970s and is now widely used across the industry [3,16,17]. QDA is time consuming and requires a trained sensory panel. However, it has been shown to give informative and comprehensive results, highlighting subtle differences and making it easy to compare samples [15,17,18].
In recent years, several rapid sensory methods have been developed and used for the evaluation of similar product categories, such as wine [19,20] and beer [21–23]. One of these rapid methods is Napping, also called projective mapping. Napping was first introduced in 1994, being further developed in the 2000s [24–26]. It gets its name from the word "nappé" which is the French word for tablecloth, on which the technique was first performed. Panellists are asked to arrange samples in a two-dimensional space based on similarities and differences between them, placing samples that are different far apart and samples which are alike close to each other [18]. Once placed on the nappé, the panellists give a short description of each sample or group of samples. Untrained panellists can be used for this approach, as they use their own criteria to sort and describe the samples. Previous studies with no or low alcohol products have shown that QDA and Napping give similar product maps [18,27–31].
Gas Chromatography-Mass Spectrometry (GC-MS) is an analytical method that can be used to separate and identify compounds present in a complex mixture, such as whisky. In whisky spirit, many of these compounds are flavour active, so differences in their relative abundance contribute to flavour differences. GC-MS is used in the whisky industry for quality control purposes, product development and for product authentication [32,33]. However, due to equipment costs and the need for trained personnel, it is not a standard method, generally only being used in larger laboratories. GC-MS is also used for research purposes, often in combination with sensory techniques [34–36].
The aim of this study was to evaluate the information that each of the three techniques (QDA, Napping and GC-MS) provides on flavour differences across a set of unmatured whisky spirits, including how the spirits were grouped using the three methods, and the strengths and weaknesses of each approach.

Spirit Samples
Nine new make (unmatured) Scotch whisky spirits were examined in this study. These spirits had been produced in the laboratory under controlled fermentation and distillation conditions, each using a different strain of yeast. The different metabolic characteristics of the yeasts resulted in spirits with a range of compositions and flavours. The aim of this study was to focus on the differences between the measurement methods. Hence, the impact of individual yeast strains is not reported here but will be published elsewhere. Spirits were simply coded with A-I.
As a result of differences in fermentation performance, the spirits varied in their alcohol strength. The alcohol by volume (ABV) was determined with a density meter, DMA 35 (Anton Paar, Graz, Austria). Details of the ABVs of each spirit are shown in Table 1.

Panellists
The sensory assessments were carried out by The Scotch Whisky Research Institute (SWRI) sensory panel. This panel consisted of 17 panellists (mixed gender, over 18 years old), who were trained and experienced in the evaluation of Scotch whisky and other spirits. Panellists were preselected for the panel based on their performance in an odour recognition screening test, which evaluates their ability to identify and describe everyday aromas. Panellists were then trained in specific whisky related flavours, with training centred around the characteristics on the SWRI Flavour Wheel. They then undertook a period as trainee panellists, during which time they were introduced to a range of sensory techniques and familiarised with different spirit types. All panellists used in this study had been judged by the panel leader as having the suitable level of expertise based on their individual performances in relation to panel means. Regular participation in the FlavorActiv Whisky Sensory Proficiency Scheme (https://www.flavoractiv.com/) is used as a further check of panellist performance. The same group of 17 panellists carried out both the QDA and Napping procedures.

Spirit Preparation and Presentation
In accordance with standard industry practice, the spirits were diluted to a uniform alcohol strength of 20% ABV using water. Then, 20 mL portions were presented in 100 mL blue nosing glasses and covered with watch-glasses. Spirits were coded using three-digit random codes to hide their identities. Since the interest was in the volatile flavour compounds of the spirits, assessments were based only on aroma with no tasting being carried out. This is typical of standard industry practice for laboratory produced spirits.

Quantitative Descriptive Analysis
QDA was carried out in accordance with the procedures outlined by Lawless and Heymann [15] and Stone et al. [14]. Spirits were presented in a randomised order. Panellists scored the intensities of a range of pre-defined attributes using a line scale of 0 to 3. The attributes were selected based on previous experience of new make whisky spirit, being selected from descriptors on the SWRI Flavour Wheel: smoky, feinty, cereal, green, floral, fruity, solventy, soapy, sweet, oily, sour, sulphury, meaty, and stale. The panellists scored all spirits for each attribute before moving on to the next attribute. Data collection was split into three sessions to minimise sensory fatigue, with four or five attributes evaluated in each session. Tests were carried out in individual sensory booths under red light in the SWRI sensory laboratory. Data were collected using Compusense software (West Guelph, Canada). In addition to recording scores, the software also logged the time taken for panellists to complete each session.
Data were analysed using JMP 14.3.0 software (32-bit, SAS Institute Inc., Cary, NC, USA). A two-way ANOVA was carried out, followed by a Tukey-Kramer HSD multiple comparison test. In both cases, a p-value < 0.05 was taken as a statistically significant difference. Normality was assumed on the basis that a trained and regularly tested panel was used [15]. The mean panel scores for all 14 attributes were further summarised by Principal Component Analysis (PCA) based on correlation, and the results were used to create a product map using RStudio Version 1.3.103 (RStudio PBC, Boston, MA, USA) and the packages FactoMineR and ggplot2 [37]. For the purposes of comparing the measurement techniques, the first two dimensions of this analysis are presented in this paper.
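As an illustration of the ANOVA step, the following is a minimal sketch of a two-way ANOVA without replication for a single attribute. It assumes the two factors are spirit and panellist, which the text does not state explicitly, and it stops at the F ratio rather than computing a p-value or the Tukey-Kramer comparisons. The function name and the scores are hypothetical and are not taken from the study or from JMP.

```python
def two_way_anova_f(scores):
    """Two-way ANOVA without replication for a single attribute.

    scores[i][j] is the score given by panellist i to spirit j.
    Returns (F ratio for the spirit effect, df_spirit, df_error).
    """
    n_pan = len(scores)
    n_spi = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_pan * n_spi)

    # Marginal means for each panellist (row) and each spirit (column).
    pan_means = [sum(row) / n_spi for row in scores]
    spi_means = [sum(row[j] for row in scores) / n_pan for j in range(n_spi)]

    # Partition the total sum of squares into panellist, spirit and error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_pan = n_spi * sum((m - grand) ** 2 for m in pan_means)
    ss_spi = n_pan * sum((m - grand) ** 2 for m in spi_means)
    ss_err = ss_total - ss_pan - ss_spi

    df_spi = n_spi - 1
    df_err = (n_pan - 1) * (n_spi - 1)
    f_spirit = (ss_spi / df_spi) / (ss_err / df_err)
    return f_spirit, df_spi, df_err


# Hypothetical scores on the 0-3 scale: 3 panellists (rows) x 3 spirits (columns).
example_scores = [
    [0.5, 1.0, 2.0],
    [1.0, 1.5, 2.0],
    [0.0, 1.0, 2.5],
]
f_spirit, df_spirit, df_error = two_way_anova_f(example_scores)
print(f"F({df_spirit}, {df_error}) = {f_spirit:.2f}")
```

In practice the F ratio would be compared with the F distribution for the stated degrees of freedom to obtain the p-value that is tested against 0.05.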

Napping
Napping was carried out in accordance with the method described by Perrin et al. [38]. Panellists were presented with all nine spirits to assess and asked to arrange them on a piece of paper (A0; 841 × 1189 mm) with the instructions that spirits that were similar should be placed close to each other, while spirits which were most different placed furthest apart. Panellists could place the spirits freely on the paper, rearrange them as often as they liked, and reassess and regroup them based on their own criteria. Once the panellists had positioned the spirits, they were asked to note the dominant flavour characteristics next to each spirit or group. These tests were carried out individually in a well ventilated meeting room under ambient lighting. Due to room conditions, no red light was used. The time taken for each panellist to perform the task was recorded.
Once the panellist had completed the test, the position of the glasses was recorded by noting the X- and Y-coordinates of each spirit on the paper. The flavour descriptions were also noted. These descriptors were then categorised according to the flavour categories used in the QDA (Section 2.2.3). An extra category, nutty, was added as this was frequently mentioned by the panellists. RStudio Version 1.3.103 (RStudio PBC, Boston, MA, USA) and the packages FactoMineR and ggplot2 were used to carry out a Multiple Factor Analysis (MFA) based on the coordinates of each spirit to create a product map. Again, the first two dimensions are presented in this paper for the purposes of comparing the measurement techniques.
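The recorded coordinates are typically arranged as one row per spirit with an X and a Y column for each panellist before an MFA is run. The sketch below shows one way such a table could be assembled. The function name, panellist identifiers and coordinates (in mm) are hypothetical, and the layout is an assumption about the input format rather than a description of the authors' files.

```python
def build_napping_table(placements, spirits):
    """Arrange Napping coordinates as one row per spirit.

    placements[panellist][spirit] = (x_mm, y_mm) measured on the sheet.
    Returns (header, rows) with an X and a Y column for every panellist.
    """
    panellists = sorted(placements)
    header = ["spirit"] + [f"{p}_{axis}" for p in panellists for axis in ("X", "Y")]
    rows = []
    for spirit in spirits:
        row = [spirit]
        for p in panellists:
            x_mm, y_mm = placements[p][spirit]
            row.extend([x_mm, y_mm])
        rows.append(row)
    return header, rows


# Hypothetical placements for two panellists and three of the nine spirits.
placements = {
    "P01": {"A": (120, 900), "B": (150, 870), "H": (700, 200)},
    "P02": {"A": (600, 300), "B": (640, 350), "H": (100, 1000)},
}
header, rows = build_napping_table(placements, ["A", "B", "H"])
for line in [header] + rows:
    print(line)
```

With 17 panellists and nine spirits, the same layout would give a table of 9 rows and 34 coordinate columns, with each panellist's pair of columns forming one group in the MFA.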
The number of times a flavour attribute (from the categorisation process) was noted for a spirit was totalled and used as supplementary data [39]. The RV coefficient was used to assess the consensus between panellists. RV scores > 0.5 relative to the centroid were considered acceptable [39].
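For readers unfamiliar with the RV coefficient, the following is a minimal sketch of its usual definition for two configurations of the same samples: RV = trace(XX'YY') / sqrt(trace(XX'XX') · trace(YY'YY')), computed on column-centred tables. The function names and coordinates are hypothetical. In the study the coefficient is computed against the consensus (centroid) configuration, which this sketch does not construct.

```python
from math import sqrt


def centre(table):
    """Subtract the column mean from every value."""
    n = len(table)
    means = [sum(col) / n for col in zip(*table)]
    return [[v - m for v, m in zip(row, means)] for row in table]


def cross_product(table):
    """Sample-by-sample cross-product matrix XX'."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in table] for r1 in table]


def trace_of_product(a, b):
    """trace(AB) for symmetric A and B, i.e. the sum of elementwise products."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def rv_coefficient(x, y):
    """RV coefficient between two tables that share the same rows (samples)."""
    sx = cross_product(centre(x))
    sy = cross_product(centre(y))
    return trace_of_product(sx, sy) / sqrt(
        trace_of_product(sx, sx) * trace_of_product(sy, sy)
    )


# Hypothetical 2-D placements of four samples.
config_x = [[0, 0], [1, 0], [0, 2], [3, 1]]
# The same layout rotated by 90 degrees and doubled in size.
config_rotated = [[0, 0], [0, 2], [-4, 0], [-2, 6]]
# A genuinely different layout.
config_other = [[0, 0], [0, 1], [5, 5], [1, 0]]

rv_same = rv_coefficient(config_x, config_rotated)
rv_diff = rv_coefficient(config_x, config_other)
print(round(rv_same, 3), round(rv_diff, 3))
```

Because the coefficient depends only on the cross-product matrices, it is unaffected by rotation or uniform rescaling of a panellist's sheet, which is why it suits the comparison of freely arranged Napping maps.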

Gas-Chromatography-Mass Spectrometry
Solid phase microextraction (SPME) gas chromatography-mass spectrometry (GC-MS) was used to analyse the volatile compounds of the spirits. A 2.5 mL portion of each spirit was placed in a 10 mL magnetic screw cap vial (Gerstel, Mülheim an der Ruhr, Germany) and adjusted to 20% ABV with the addition of 7.5 mL of a water-ethanol mixture. Then, 50 μL of methyl heptanoate (Sigma-Aldrich, St. Louis, MO, USA) at a concentration of 22 ppm was added to each spirit as an internal standard. Spirits were prepared at least 24 h prior to analysis.
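The strength of the water-ethanol mixture needed for a given spirit follows from a simple ethanol balance. The sketch below assumes that volumes are additive (it ignores the slight volume contraction of ethanol-water mixtures) and neglects the 50 μL of internal standard. The function name and the example ABVs are hypothetical and are not the values in Table 1.

```python
def diluent_abv(spirit_abv, spirit_ml=2.5, diluent_ml=7.5, target_abv=20.0):
    """ABV (%) the water-ethanol diluent needs so the mix reaches target_abv.

    Ethanol balance, assuming additive volumes:
        spirit_abv * spirit_ml + diluent_abv * diluent_ml
            = target_abv * (spirit_ml + diluent_ml)
    """
    total_ml = spirit_ml + diluent_ml
    needed = (target_abv * total_ml - spirit_abv * spirit_ml) / diluent_ml
    if not 0.0 <= needed <= 100.0:
        raise ValueError("target strength cannot be reached with these volumes")
    return needed


# Hypothetical spirit strengths (% ABV).
for abv in (60.0, 65.0, 70.0):
    print(abv, round(diluent_abv(abv), 2))
```

Under these assumptions, a spirit at 80% ABV would need plain water, and anything stronger could not be brought to 20% ABV at a fixed 2.5 mL to 7.5 mL ratio.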
Spirits were incubated for 10 min at 35 °C. A DVB/CAR/PDMS 50/30 μm fibre (Supelco Inc., Bellefonte, PA, USA) was used to extract the volatiles, which were injected splitless onto the column with an injector temperature of 250 °C, a split flow of 40 mL/min and a splitless time of 1 min. The oven temperature program was 2 min at 35 °C, followed by a ramp of 5 °C/min to 250 °C, with this temperature held for an additional 10 min; the flow rate was 1.4 mL/min. Four independent samples were taken from each spirit and measured in duplicate, giving a total of eight measurements per spirit. A full scan was carried out over m/z 35 to 350 with a solvent delay of 1 min. Peaks were included where the peak height exceeded a minimum signal-to-noise ratio of 3.0 and the mass spectra could be identified using the NIST Mass Spectral Search Program and NIST/EPA/NIH Mass Spectral Library Version 1.7 (National Institute of Standards and Technology, Gaithersburg, MD, USA). Xcalibur software (Home page version 1.2, ThermoScientific, Waltham, MA, USA) was used to analyse the data.
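The oven program described above implies a fixed run length, which can be worked out from the hold times and the ramp rate. The following sketch does this arithmetic; the function names are hypothetical and the calculation ignores any cool-down or equilibration time between injections.

```python
def oven_runtime_min(start_c, initial_hold_min, rate_c_per_min, final_c, final_hold_min):
    """Total length of a single-ramp GC oven program in minutes."""
    ramp_min = (final_c - start_c) / rate_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def oven_temperature_c(t_min, start_c, initial_hold_min, rate_c_per_min, final_c):
    """Programmed oven temperature at t_min minutes after injection."""
    if t_min <= initial_hold_min:
        return start_c
    return min(final_c, start_c + rate_c_per_min * (t_min - initial_hold_min))


# Values from the method: 2 min at 35 °C, 5 °C/min to 250 °C, 10 min hold.
runtime = oven_runtime_min(35.0, 2.0, 5.0, 250.0, 10.0)
temp_at_12_min = oven_temperature_c(12.0, 35.0, 2.0, 5.0, 250.0)
print(runtime, temp_at_12_min)
```

With nine spirits measured eight times each, the run length gives a rough lower bound on the instrument time needed for the whole sample set.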
Peak areas were determined for each compound and semi-quantitatively analysed: peak areas were measured and compared, but no standards and calibration lines were created. Compounds were assigned to groups based on both chemical structure and flavour. Flavour groupings were assigned based on the compound flavour descriptions given on the Good Scents Company website http://www.thegoodscentscompany.com/. The compounds were then grouped according to the flavour attributes used in the QDA evaluation described in Section 2.2.3. Compounds without listed descriptors (no aroma or not listed) were assigned as not described.
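One common way to make peak areas comparable between runs is to express each as a ratio to the internal standard peak and then average over the replicate measurements. The sketch below illustrates that calculation. Whether the authors normalised in exactly this way is an assumption, and the function names, compound names other than the internal standard, and all peak areas are hypothetical.

```python
from statistics import mean

INTERNAL_STANDARD = "methyl heptanoate"


def relative_areas(peak_areas, internal_standard=INTERNAL_STANDARD):
    """Express every peak area as a ratio to the internal standard peak."""
    is_area = peak_areas[internal_standard]
    return {
        name: area / is_area
        for name, area in peak_areas.items()
        if name != internal_standard
    }


def mean_relative_areas(runs):
    """Average the relative areas over replicate measurements of one spirit."""
    per_run = [relative_areas(run) for run in runs]
    return {name: mean(r[name] for r in per_run) for name in per_run[0]}


# Hypothetical peak areas for two replicate runs of one spirit.
runs = [
    {"methyl heptanoate": 1000.0, "ethyl hexanoate": 2000.0, "isoamyl acetate": 500.0},
    {"methyl heptanoate": 800.0, "ethyl hexanoate": 2400.0, "isoamyl acetate": 400.0},
]
ratios = mean_relative_areas(runs)
print(ratios)
```

Such ratios allow spirits to be compared with each other, but without calibration lines they do not give absolute concentrations.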
PCA based on correlation was used to summarise the data and produce a product map. This was carried out using RStudio Version 1.3.103 (RStudio PBC, Boston, MA, USA) and the packages FactoMineR and ggplot2. The first two dimensions of this analysis are presented in this paper.

Statistical Comparison of the Three Methods Using Hierarchical Multiple Factor Analysis
Hierarchical Multiple Factor Analysis (HMFA) was carried out in order to statistically compare the three methods [38,40,41]. The QDA data were entered in the form of mean panel scores, the same format used in the PCA analysis. The Napping coordinates from individual panellists were used, while the GC-MS data were the mean of the eight measurements per spirit. The data structure of the HMFA is displayed in Figure 1. For the data analysis, each row was allocated to a spirit, while the columns described the variables belonging to QDA, Napping and GC-MS. Each data set was normalised by performing a PCA on it and then dividing its elements by the square root of the first eigenvalue obtained [42], allowing the comparison of all three methods in the same space. Data were balanced at each level, which allowed firstly the analysis of the three different techniques on Level 1, followed by a comparison of the two sensory approaches on Level 2 and finally of the sensory approaches with GC-MS on Level 3. This was carried out using RStudio Version 1.3.103 (RStudio PBC, Boston, MA, USA) and the packages FactoMineR and ggplot2.
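The normalisation step can be illustrated with a small sketch of MFA-style weighting as it is commonly described: each table is divided by its first singular value (the square root of the first eigenvalue of its PCA), so that the first eigenvalue of every weighted table equals 1 and no single method dominates the joint analysis. This is not the FactoMineR implementation. The function names and data are hypothetical, every column is standardised for simplicity (Napping coordinates are often left unscaled), and the hierarchical balancing across levels is omitted.

```python
from math import sqrt


def standardise(table):
    """Centre each column and scale it to unit (population) variance.

    Assumes no column is constant.
    """
    n = len(table)
    columns = []
    for col in zip(*table):
        m = sum(col) / n
        sd = sqrt(sum((v - m) ** 2 for v in col) / n)
        columns.append([(v - m) / sd for v in col])
    return [list(row) for row in zip(*columns)]


def first_eigenvalue(table, iterations=500):
    """Largest eigenvalue of X'X / n, estimated by power iteration."""
    n, p = len(table), len(table[0])
    cov = [[sum(r[i] * r[j] for r in table) / n for j in range(p)] for i in range(p)]
    vec = [1.0 / (i + 1) for i in range(p)]  # arbitrary, non-symmetric start
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        value = sqrt(sum(v * v for v in nxt))
        vec = [v / value for v in nxt]
    return value


def mfa_weight(table):
    """Standardise a table, then divide it by the square root of its first eigenvalue."""
    z = standardise(table)
    scale = sqrt(first_eigenvalue(z))
    return [[v / scale for v in row] for row in z]


def concatenate(*tables):
    """Join weighted tables side by side (same samples in the same row order)."""
    return [sum((list(t[i]) for t in tables), []) for i in range(len(tables[0]))]


# Hypothetical data for four samples: three sensory scores, two peak-area ratios.
qda_like = [[1, 2, 0.5], [2, 1, 1.5], [3, 4, 1.0], [4, 3, 2.5]]
gcms_like = [[10, 200], [12, 180], [30, 90], [28, 100]]

weighted_qda = mfa_weight(qda_like)
weighted_gcms = mfa_weight(gcms_like)
combined = concatenate(weighted_qda, weighted_gcms)
print(round(first_eigenvalue(weighted_qda), 6), round(first_eigenvalue(weighted_gcms), 6))
```

A PCA of the combined table then gives the joint product map, with each method contributing on an equal footing along its main direction of variation.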

Quantitative Descriptive Analysis
The first two dimensions from the PCA of the QDA data are shown in Figure 2. These two dimensions accounted for 77.3% of the variance in the data.
Dimension 1 separated fruity, solventy, sweet, floral, and soapy spirits (negative end) from feinty, cereal, sulphury, meaty, oily, stale, and sour spirits (positive end). This separation related to the ABV of the spirits, with the high ABV spirits (A, B, C) being fruity, solventy, sweet, floral, soapy, and green and the lower ABV spirits (G, H, I) having feinty, cereal, sulphury and other heavier notes. The intermediate ABV spirits were found in the middle of this dimension (D, E, F), though spirit E was located closer to the high ABV spirits and D and F closer to the low ABV spirits than might have been expected, based on their ABVs.
The separation across Dimension 2 did not relate to ABV. Spirits C and F, and to a lesser extent E, were on the negative end of this dimension, tending towards more smoky, stale aromas, as opposed to soapy, green, solventy notes. In addition to the product map produced from the PCA of the data, QDA provides further information on the significance of differences between the spirits. Mean panel scores and the accompanying statistical analysis (Table A1, Appendix A) showed that all nine spirits had been given very low scores for stale, meaty and smoky attributes. This indicates that none of these are key attributes in any of the spirits, which in turn helps in the interpretation of Dimension 2 of the PCA. Five of the fourteen attributes (feinty, cereal, fruity, sulphury and solventy) showed statistically significant differences across the sample set, indicating that these attributes are the characteristics differentiating the spirits.
Mean panel scores from the QDA can be plotted in the form of flavour profiles in a spider diagram. An example is given in Figure A1 (Appendix A), which shows the flavour profiles of spirits B, D and H. This presentation of the data makes it easy to view the key characteristics of each spirit and quickly identify the main flavour differences between spirits.

Napping
Analysis of the Napping data showed that the RV values for all panellists were greater than 0.5, so all 17 data sets could be included in the MFA. The first two dimensions from the MFA are shown in Figure 3. These two dimensions accounted for 63.2% of the variance in the data. Dimension 1 separated fruity, sweet, floral, soapy and solventy spirits (positive end) from sulphury, feinty, stale, cereal and sour spirits (negative end). As observed in the QDA data (Section 3.1), this related to the ABV of the spirits. Again, as detected by QDA, the intermediate spirits did not fully follow this trend. While they were all located between the high and low ABV spirits, spirit E was located closer to the high ABV spirits and D and F close to the low ABV spirits.
The separation across Dimension 2 was not correlated with ABV. This dimension showed a separation of nutty, oily spirits (positive end) from those with cereal character (negative end), with spirits I and F separated out at the two extremes. This separation differed from the QDA mapping, where spirit I was close to the other low ABV spirits.
Using the Napping approach, it is not possible to plot the individual flavour profiles of the spirits.

Time Expenditure for QDA Versus Napping
The time taken for each panellist to carry out QDA and Napping on all nine spirits was noted and the mean calculated. While the Napping session took on average 7.7 ± 2.9 min, the QDA assessment across all three sessions took on average 18.7 ± 4.6 min. In addition, the QDA was run over three sessions, so additional sample preparation time was required for this approach. Napping required the manual measurement of the spirits' positions, which took approximately 5 min per sample set. Overall, Napping was less time consuming for both the panellists and the panel leader.

Gas-Chromatography-Mass Spectrometry
GC-MS identified a total of 96 compounds in the spirits. A full list of the detected compounds, their typical aroma descriptors, chemical group, and the peak area in each of the spirits can be found in Table A2 (Appendix A). PCA of the peak area data is shown in Figures 4 and 5. The separation of the spirits across Dimension 1, shown in the scores plot in Figure 4, was broadly related to ABV. High ABV spirits were located on the positive end of this dimension and low ABV spirits on the negative end. However, this tendency was not fully followed by the intermediate ABV spirits; spirits D and F were located close to the low ABV spirits, while E was located more towards the positive end, only being separated from the high ABV group by Dimension 2.
The associated loadings plot (Figure 5) showed that the spirits were separated by chemical groups. Most of the compounds on the positive end of Dimension 1 were esters, certain alcohols and arenes, while those on the negative end were mainly other alcohols and aldehydes. Looking in more detail at the different compounds, the low ABV spirits contained higher levels of compounds that eluted early in the GC-MS run and alcohols (Table A2, Appendix A).
A further separation was observed across Dimension 2. Spirits E and B, which were both on the positive end of Dimension 1, were separated across Dimension 2. Spirit B was relatively high in propyl acetate, isobutyl acetate and pentyl acetate, while spirit E was rich in a variety of decanoate esters, such as isobutyl decanoate and isoamyl decanoate.
In addition to the product map produced from the PCA, further information can be extracted from the GC-MS data by looking in more detail at the flavour characteristics typically associated with the identified compounds (Table A2, Appendix A). Over half of the compounds were typically described as having fruity characteristics, while the next most prevalent flavour descriptions were oily, sweet, and green. Compounds with soapy, feinty, sour and sulphury flavours were least common. Roughly one-eighth of the compounds could not be described. This meant that there was either no flavour description, indicating that the flavour of these compounds had not previously been researched, or that they were simply not flavour active. Some of the key flavour attributes used in the QDA and mentioned by panellists during Napping, namely cereal, meaty, and smoky, were not directly linked to any of the compounds measured using GC-MS. This indicates that either the compounds responsible for these flavours are not being detected by the instrument, or that these flavours arise from a more complex interaction of compounds. Correlating the sensory with the analytical data using statistical methods, such as Partial Least Squares, can help to understand these interactions.

Statistical Comparison of the Three Methods Using Hierarchical Multiple Factor Analysis
Results of the HMFA of the three data sets are shown in Figure 6. The first two dimensions of this analysis accounted for 66.4% of the total variance. The grouping of the spirits across these dimensions shows that all three methods largely discriminated between the spirits in the same way. This was further confirmed by comparing the RV scores (Table 2). High RV scores between methods mean that they produce similar product maps, with an RV > 0.8 considered as similar [43,44]. The RV scores in this study were all around 0.9. Some differences could be observed, however. For example, there were bigger differences between the QDA and Napping data for the lowest ABV spirits (H and I) than for the highest ABV spirits (A and B). Napping differentiated less between the spirits than QDA and GC-MS, resulting in similar product maps but with a lower resolution for finer differences. GC-MS appeared to differentiate the high ABV spirits better, while the resolution for the low ABV spirits was better using QDA.

Discussion
The aim of this study was to compare two sensory approaches (QDA and Napping) and an analytical technique (GC-MS) for the evaluation of the flavour differences among a set of new make Scotch whisky spirits produced in the laboratory using different yeasts. The method of production and the yeast used resulted in variable ABVs. HMFA was used to explore the relationships between the data produced using the three methods. The results of this analysis showed that the three techniques grouped the spirits in a similar way, with RV scores between 0.895 and 0.927. QDA discriminated the low ABV spirits better, while GC-MS discriminated the higher ABV spirits more effectively. All three techniques discriminated the spirits in broadly the same way, although Napping had a lower resolution for distinguishing spirits from each other. The three methods are not directly interchangeable, as the data from each provide a unique insight into how the spirits differ. There are also advantages and disadvantages associated with each method, which we have compiled based on existing knowledge and on experience gained from this study.
Sensory methods give a measure of human perceptions and hence are relevant to the consumer. However, sensory evaluation can be time consuming and even with trained panellists the data are noisy. QDA is an established sensory method, where spirits are evaluated in terms of perceived intensity of pre-selected flavour attributes. The advantage of this approach is that flavour profiles can be produced for each spirit. These data can be presented visually in a form that is easy to understand and allows spirits to be easily compared. Differences can be confirmed through statistical evaluation of the data. The main downsides of QDA are the requirement for trained and experienced panellists [15], and the time required to evaluate each individual flavour descriptor. Using a pre-selected vocabulary may result in important flavours being overlooked. This is a particular concern when using QDA to evaluate novel products, such as the spirits produced from new yeast strains used in this study. The descriptor nutty was used frequently during Napping but had not been included in the QDA vocabulary. The solution to this would be to carry out a preliminary session to fine-tune the vocabulary. However, this would add to the time and sample volume required to carry out the QDA, which are the main limiting factors when conducting sensory work.
This study showed that Napping was significantly less time consuming than QDA, taking around a third of the time to carry out, although it had a lower resolution in distinguishing the spirits. Similarly, in a study by Dehlholm et al. [43], the time was decreased to nearly one third. Although QDA requires trained panellists, this is not mandatory for Napping. This significantly reduces resource requirements if no pretrained panel is available, but the data are likely to be noisier. Although not studied here, panellists' familiarity with the products and their experience in describing flavours could also reduce the number of panellists required to analyse a sample set due to an increase in the quality of their evaluations [19,43,45–49].
Although considerably more rapid, Napping resulted in a similar product map to QDA. Other studies comparing Napping to descriptive techniques showed similar results in most cases [18,27–31,43,50–52]. With Napping the evaluation is based on the main characteristics of the sample, rather than full evaluation of single flavour descriptors, so subtle differences may be overlooked. This may depend on the nature of the samples. Some studies show that Napping is the best approach when the samples are very different, with QDA being more efficient for picking up more subtle differences between samples [53]. However, there have also been studies that have found a poor correlation between the techniques [54]. Unlike most of the previous research, our study used the same panel in both evaluations, so any differences in the product maps were due to the technique used rather than the panellists.
When using QDA, the evaluations were split over several sessions, while Napping was carried out in a single session. This reduces sample preparation time and the amount of spirit required. This can be particularly important in applications where there is a limited amount of sample available.
The data analysis used in Napping, MFA, gives a product map but does not provide information on the degree of difference in the sample set [18]. The nature of the data means that they can only be used to compare samples assessed in the same session, while with QDA data can be compared across sessions. The nine spirits used in this study were a manageable number, but there is a limit to the number of samples that can be compared using Napping.
Even when sensory tests are carried out under carefully controlled conditions by highly trained panellists, the data are subjective and unavoidably influenced by the panellists' sensitivities to individual flavour compounds. A panel of assessors is required to obtain meaningful results. Other researchers have observed that, with high alcohol strength spirits, only a limited number of spirits and attributes can be assessed during a session if the quality of the data is to be retained [55]. The alternative is to use an analytical method, which is more reproducible, less subjective and allows for the analysis of large sets of samples.
GC-MS is an established method which is used to understand how the chemical composition of a product influences flavour, to predict flavour, and to show which flavour compounds are important in creating specific characteristics [34,52,56]. In this study, the GC-MS semi-quantitative analysis of 96 chemical compounds gave a similar product map showing the same spirit groupings as the two sensory approaches. The spirits which had high ABVs and lighter flavour characteristics (fruity, floral, solventy, and sweet) contained higher concentrations of esters, certain alcohols and arenes. The low ABV spirits, which had heavier flavour characteristics (feinty, cereal, sour, oily, and sulphury), contained lower levels of these compounds and increased concentrations of other alcohols and aldehydes. This agrees with previous research, which shows that aldehydes originate from malt and can be reduced to their unsaturated counterparts during fermentation. Hence the degree of fermentation, reflected in the final ABV, is important for this reduction. In the spirits with a low ABV, the fermentation was incomplete, leaving a higher level of aldehydes [4]. Additionally, furfural originates from the grain [3].
While GC-MS gave rich information about the chemical composition, the abundance of compounds and their ratios, the link between this and perceived flavour is more difficult to interpret due to masking effects and compound interactions. For example, the most abundant compounds present in the spirits were esters, which have fruity flavours, but the spirits were not perceived as predominantly fruity. Previous research on American bourbon whiskey has suggested that a chemical recreation with a selection of compounds is possible but is not an easy task [57]. A disadvantage of chemical compositional analysis is that a single flavour can rarely be attributed to an individual compound. For example, the nutty character of malt whisky has been positively linked to 2,5-dimethylpyrazine, 2-furanmethanol and ethyl benzoate, while γ-nonalactone has a negative correlation [58]. In another study, the green flavour of malt whisky was linked to aldehydes and nonan-2-ol. However, even after extensive research the levels of these compounds only gave an indication and could not totally recreate the flavour [56]. This study showed that while there was a range of compounds contributing to certain flavour characteristics, other flavours did not appear to be linked to single compounds. A good example is the cereal note, which was perceived at relatively high levels in several of the spirits. None of the compounds measured in our GC-MS analysis were described as having this flavour. Either the compounds responsible were not detected, the flavour descriptors on the Good Scents Company website were too vague or, as previous research suggests, this cereal character is due to an interaction of several compounds.
Around one eighth of the compounds identified in the spirits could not be classified as there was no flavour description available in the literature. This may be because the compound is not flavour active, in which case it can be ignored. However, there may have been compounds detected in this analysis that had not previously been included in sensory studies, in which case their potential flavour impact may have been overlooked. Gas chromatography-olfactometry-mass spectrometry (GC-O-MS), which combines instrumental and sensory analysis, could be applied to identify which compounds are flavour-active. Sensory thresholds are also important: compounds with low thresholds have more of an impact at low levels than compounds with higher thresholds. However, since there are no data available on threshold levels for many of these compounds, and they were not quantified, it was not possible to account for this factor in the analysis. Previous research conducted to determine the flavour threshold of single compounds showed that it is still difficult to calculate the overall flavour perception in mixtures of several compounds [34,52,56,57].
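The role of thresholds can be illustrated with the odour activity value, a widely used ratio of a compound's concentration to its sensory threshold. The sketch below is purely illustrative and was not part of this study, which had neither quantified concentrations nor thresholds for many compounds. The compound names, concentrations and thresholds are hypothetical placeholders, as is the function name odour_activity_values.

```python
def odour_activity_values(concentrations, thresholds):
    """Concentration divided by threshold, for compounds with a known threshold."""
    return {
        name: concentrations[name] / thresholds[name]
        for name in concentrations
        if name in thresholds
    }


# Hypothetical values in the same arbitrary concentration unit (not study data).
concentrations = {"compound_a": 100.0, "compound_b": 5.0, "compound_c": 40.0}
thresholds = {"compound_a": 50.0, "compound_b": 0.5}  # none known for compound_c

oav = odour_activity_values(concentrations, thresholds)
for name, value in sorted(oav.items(), key=lambda item: item[1], reverse=True):
    print(name, value)
```

In this toy case the less abundant compound_b ranks above compound_a because its threshold is much lower, while compound_c cannot be ranked at all, mirroring the gap in threshold data described above. Such ratios treat each compound in isolation and do not capture masking or interaction effects in mixtures.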
The final issue that causes problems when linking GC-MS data to flavour perception is that, although an appropriate method was selected for this analysis, SPME extracts some compounds more efficiently than others. Less volatile compounds, in particular, are more difficult to detect using GC-MS. This makes it harder to identify all compounds that give heavier characteristics, such as phenols, fatty acids or arenes, which are likely to have been more prevalent in the lower ABV spirits. Nevertheless, the data can be compared semi-quantitatively. While this comparison showed that spirits with a high ABV had, in general, higher peak areas of compounds, the actual flavour can only be determined through sensory evaluation.
Data collected across the three analyses revealed two main differences among the spirits. The biggest difference in both composition and sensory character was related to the ABV of the spirit, the level of alcohol that had been produced by the various yeast strains during fermentation. Higher alcohols, such as 2-methyl-1-propanol or isoamyl alcohols, distil later because their volatility increases with reducing alcohol concentration [59]. The opposite of what would be expected from this distillation effect alone was observed, with these compounds being found at higher levels in the high ABV spirits. However, these compounds are fermentation products and are more prevalent in complete fermentations, which result in higher ABVs. Most of the late eluting compounds detected by the chosen GC-MS method were linked to flavour descriptors such as sweet, oily, or fruity. This could indicate that heavier compounds linked to sulphury, feinty, cereal, or stale flavours were not detected by the chosen method, or that they were only produced at low levels, which could be linked back to the yeast strains used.
All three analyses also revealed more subtle differences between the spirits. Spirit B was differentiated from the other spirits by all three methods, with GC-MS differentiating it most. While the sensory techniques could only highlight that this sample was different in flavour terms, GC-MS gave further insights. Spirit B was higher in propyl acetate, isobutyl acetate and pentyl acetate than the other spirits. GC-MS also differentiated spirit E, which was rich in a variety of decanoates, such as isobutyl decanoate and isoamyl decanoate, though this difference in composition did not appear to have a significant impact on flavour. Such differences between the spirits can be linked back to the metabolism of the yeast.

Conclusions
The three methods compared in this study each provided different insights into the differences among whisky spirits. QDA gave quantitative data on the perceived intensity of flavour attributes, which could be presented in an easily interpreted graphical format and allowed statistical significance to be determined between spirits. However, this method was time-consuming, and required a pre-determined vocabulary of attributes and a trained and experienced panel. In comparison, Napping required only around one third of the time taken for QDA. With Napping, panellists provided their own descriptions of sample differences, so no pre-selected vocabulary was needed. Although in this case the same panel was used, it is possible to carry out this approach using less experienced assessors. With Napping, differences between samples can be mapped out using PCA, but flavour profiles of individual samples cannot be produced. GC-MS provides information on the compositional differences between samples. The results of this analysis can be linked to flavour differences based on previous knowledge of the sensory impacts of the measured congeners. The advantages and disadvantages of the three techniques are summarised in Table 3. Overall, the study showed that the high ABV spirits contained relatively high concentrations of esters and certain alcohols, which were linked to lighter flavour characteristics, such as fruity, floral, solventy, and sweet. The low ABV samples were richer in heavy flavour characteristics, such as feinty, cereal, sour, oily, and sulphury, and contained more aldehydes and certain other alcohols. The medium ABV spirits were more variable in terms of flavour and composition. It was concluded that the nature of these spirits was likely to be due to other differences in metabolism between the yeast strains that were not directly correlated with the amount of alcohol that they produce.
HMFA was used to statistically compare the three approaches. The results revealed that QDA showed a better resolution between the low ABV spirits, while GC-MS gave the best discrimination between the high ABV spirits. Napping showed the lowest resolution of the samples, though overall the product map was similar. When designing a study to determine the flavour characteristics of a set of food or drink samples, the researcher must select a method that best matches both the information that they want to obtain and the resources that they have available. Overall, this research has shown that the best understanding of the flavour differences in whisky spirits will be obtained through using a combination of sensory and compositional analyses.

Data Availability Statement:
The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.

Appendix A
Figure A1. Typical presentation of QDA mean panel scores in a spider diagram for three selected spirits. * displays significant differences (p < 0.05).