Single Origin Coffee Aroma: From Optimized Flavor Protocols and Coffee Customization to Instrumental Volatile Characterization and Chemometrics

In this study, the aroma profile of 10 single origin Arabica coffees originating from eight different growing locations, from Central America to Indonesia, was analyzed using headspace SPME-GC-MS. Roasting was performed under temperature–time conditions customized for each sample to reach specific sensory brew characteristics, in an attempt to highlight the customization of roast profiles, and the implementation of separate roastings followed by blending, as a means to tailor cup quality. A total of 138 volatile compounds were identified across the coffee samples, mainly furan (~24–41%) and pyrazine (~25–39%) derivatives, many of which are recognized as key coffee odorants, while the main formation mechanism was the Maillard reaction. The volatile composition data were also processed chemometrically using an HCA heatmap, PCA and HCA to explore whether they meet the expected aroma quality attributes and whether they can serve as an indicator of coffee origin. The desired brew characteristics of the samples were satisfactorily captured by the volatile compounds formed, which contribute to the aroma potential of each sample. Furthermore, the volatile compounds varied strongly with the applied roasting conditions: lighter roasted samples were efficiently differentiated from darker roasted samples, and the effect of roasting degree outweighed that of the geographical origin of the coffee. The coffee samples were separated into two groups, with the first two PCs accounting for 73.66% of the total variation; the separation was attributed mainly to higher quantities of furans and pyrazines, as well as to other chemical classes (e.g., dihydrofuranone and phenol derivatives), while HCA confirmed these results, rendering roasting conditions the underlying criterion for differentiation.


Introduction
Coffee is the second most favored beverage in the world after tea owing to its unique sensory perception by billions of consumers. Its consumption rate has been steadily increasing throughout the years, placing it among the most traded products with great economic importance on the global market. According to the International Coffee Organization (ICO), world coffee consumption for the year 2020/21 is expected to increase by 1.9%, reaching 167.58 million (60 kg) bags [1]. At the same time, the demand for high-quality coffee is also rising considerably, with consumers, especially in Europe, being more selective in their choices, appreciating products with a label of origin and searching for specific sensory attributes. The flavor of coffee is deemed the top criterion of its quality, affected by factors related to the variety, the agroecological zone of cultivation (e.g., climate, altitude, soil), harvest and post-harvest processing conditions, roasting, and grinding. In the present study, the volatile composition data were processed chemometrically to explore whether they meet the expected quality attributes, as well as whether they could be an indicator of coffee origin and roasting degree.

Volatile Compounds Profile of the Studied Coffee Samples
The volatile profile of roasted coffee samples from a wide range of geographical locations (Table 1, Figure 1) was evaluated. Headspace-SPME and GC-MS were applied, which enabled the detection of more than 130 compounds belonging to a wide range of chemical classes. Specifically, a total of 138 compounds, including heterocyclic (lactones, pyranone, furan, pyrazine, pyridine, pyrrole, and thiophene derivatives), isocyclic (alcohols, aldehydes, ketones, and phenols) and aliphatic volatile compounds (aldehydes, alcohols, monocarboxylic acids and their esters, thiols, and ketones), were tentatively identified in the headspace of the samples. The volatile compounds were divided into 17 chemical groups and are presented in Table 2, along with each compound's categorization (e.g., aliphatic aldehydes, α- and γ-diketones, norisoprenoid ketones, isocyclic aldehydes, carboxylic acids, etc.), odor description, and classification according to the odorant attributes of the Coffee Taster's Flavor Wheel.

Most of the volatile compounds that contribute to the aroma of roasted coffee are formed during roasting, while some of them come from the raw material (green coffee beans) without, however, participating actively in the aroma, because they are found in concentrations below their odor threshold [12]. The volatile composition of the individual samples tested showed both qualitative and quantitative differences. These can be attributed to the origin of the raw material, but mainly to the roast profile applied (not provided due to a non-disclosure agreement). Roasting time and temperature not only result in the intended roasting degree and color, but also trigger chemical reactions in green coffee beans. The volatile compounds formed during roasting are products of the Maillard reaction, Strecker degradation of amino acids and related reactions, caramelization reactions, thermal decomposition of food components, such as unsaturated fatty acids, carotenoids, amino acids, hexoses, pentoses and pentose polymers, trigonelline, phenolic acids, and L-ascorbic acid, and thermal oxidative degradation of fatty acids [11]. Of the detected compounds, around 30-40% were formed by the Maillard reaction, 12-16% by the Maillard reaction and thermal decomposition of food components, 16-18% by the Strecker degradation of amino acids, and 30-37% by other formation mechanisms.
The volatile compound accounting for the largest percentage of the total volatiles was 5-methylfurfural (10-25%), followed by furfural and 2-furanmethanol. All of them are furan derivatives, the major class of volatile compounds present in the roasted coffee samples (~25-41%). Furan derivatives are mostly formed during the Maillard reaction, but can also arise from other pathways, such as thermal oxidative degradation of polyunsaturated fatty acids, degradation of thiamine and breakdown of nucleosides, dehydration of sugars during sugar caramelization, and thermal degradation of carbohydrates, ascorbic acid, or unsaturated fatty acids during roasting [7,12]. 5-Methylfurfural is characterized by a caramel, maple, spicy aroma, and furfural by a sweet, woody, almond-like aroma. 2-Furanmethanol is also present in the volatile fraction of green coffee and is correlated with the undesirable burnt bitter note of dark roasted coffees, offering sweet, caramel, coffee-like notes [12]. All three are classified in the sweet category.
Along with furan derivatives, pyrazine derivatives represent the second main class of volatile compounds contributing to the characteristic aroma of coffee (~25-39%). These compounds are mainly derived from Strecker degradation of α-amino acids. Twenty-two pyrazines were found in the coffee samples, including 2-ethyl-3,5-dimethylpyrazine, 3-ethyl-2,5-dimethylpyrazine, methylpyrazine, 2,5-dimethylpyrazine and 2,6-dimethylpyrazine. They are all considered potent odorants in coffee, characterized by nutty, cocoa, roasted odor notes, and are consequently classified in the nutty/cocoa category. The coffee samples from Honduras (HND) and Brazil (BRA1) were found to contain the highest levels of pyrazine derivatives, with 36.1% and 39.1%, respectively, which is reflected in the nutty scent of their brews (Table 2).
Esters were found to represent the third most abundant class of volatile compounds identified in the volatile fraction of the examined roasted coffee samples (5-8%), mainly due to the presence of 2-furanmethanol acetate (3.7-5.8%) followed by 2-furanmethanol propanoate (0-1.3%), conferring the fruity aroma of the roasted samples.
Pyrrole derivatives (~5-12%) are a minor class of volatile compounds present in roasted coffee. They appear to be formed by reactions of aldoses with alkylamines and may result from the thermal degradation of Amadori intermediates, while caramelization, pyrolysis and trigonelline degradation may also contribute [9,12]. The most abundant in the volatile fraction of the examined roasted coffee samples was 1-methyl-1H-pyrrole-2-carboxaldehyde, followed by 2-acetylpyrrole and 2-formylpyrrole, three Maillard reaction products with roasted, nutty notes, contributing to the nutty/cocoa category.
There is little information available regarding the contribution of pyridine derivatives (~3-6%) to the roasted coffee aroma. They can be produced by thermal degradation of Amadori intermediates, but also by pyrolysis of amino acids and trigonelline [12]. The compound with the highest percentage was pyridine itself, a trigonelline degradation product, with fishy, sour notes, part of the sour/fermented category.
Aldehydes and ketones have been found to represent ~1-6% of the volatile organic compounds present in different roasted coffee samples; however, single compounds do not exceed 1% of the total sample contribution. Aldehydes of different chain length present different sensory characteristics and can be formed by the Maillard reaction, by the oxidative degradation of amino acids during their interaction with sugars at high temperatures (Strecker degradation), and during the interaction of amino acids and polyphenols in the presence of polyphenol oxidase at normal temperatures [12]. The main pathway for ketone formation in the raw coffee seeds is the oxidation of fatty acids [26]. During roasting, ketones are also formed as Maillard reaction and caramelization products. The formation of volatile aldehydes and ketones has also been attributed to self-oxidation of alcohols and autoxidation of unsaturated fatty acids via the breakdown of hydroperoxide intermediates [7,12].
Sulfur-containing compounds, such as thiols, sulfides and thiophenes, are among the key odorants of coffee aroma, despite their quite low concentration [2]. 2-Furfurylthiol, described as having a strong, fresh roasted coffee aroma, has been reported to be of great importance because of its very low sensory threshold. The coffee samples COL, HND and PER were characterized by the highest percentage of 2-furfurylthiol (~0.3%) compared to all the other samples, which suggests that the roasted odor contributes more intensely to their aroma (Table 1) [15]. Other S-containing compounds, such as furfuryl methyl sulfide and furfuryl methyl disulfide, were also detected in the headspace of the samples, exhibiting alliaceous/vegetable and roasted characters, respectively [14,27].
Carboxylic acids (~1.5-4%) are not crucial to coffee aroma. Their contribution could be either positive, with cheese, cream, chocolate notes, or negative, with sweat-like notes [12]. The most abundant was 3-methyl-2-butenoic acid, with phenolic, dairy, green odor notes.

Volatile Odor Description and Categorization-Heatmap Analysis
The sample with the greatest number of volatile compounds (114) was COL, probably due to its dark roast. Medium-roasted coffee samples, such as HND, PER and ETH2, also contained a large number of volatile compounds (111, 111 and 102, respectively), while 80-94 volatile compounds were detected in the headspace of the remaining samples (ETH1, ETH3, BRA1, SLV, MEX, PNG).
Volatile compounds were characterized using odor descriptors extracted from the Good Scents Company Information System [23], Flavornet [24] and The Pherobase [25] online databases. Moreover, all volatile compounds detected in the coffee samples were categorized according to the SCA Coffee Taster's Flavor Wheel [18], into nine categories-specifically, roasted, spices, nutty/cocoa, sweet, floral, fruity, sour/fermented, green/vegetative and other (containing chemical and papery/musty odors)-on the basis of each compound's odor descriptors. Co-eluted compounds of different characterization were not included.
Most of the identified compounds belong to the "nutty/cocoa" category (20-24 volatiles in every sample), which mainly consists of pyrazine, furan, and pyrrole derivatives, followed by the "sweet" category (10-18 volatiles in every sample), contributed by aldehydes and ketones, and the "other" category (6-16 volatiles in every sample). The "other" category comprises various compounds, such as furan, thiophene, and pyrrole derivatives, aldehydes, ketones, esters, and thiols. The "floral" category consists mostly of alcohols, ketones, monoterpenes, and pyrrole derivatives; the "fruity" category of aldehydes, ketones, and esters; the "green/vegetative" category of pyridine, pyrazine, aldehydes and furan derivatives; the "roasted" category of pyrazine, pyridine, furan, and thiophene derivatives; the "sour/fermented" category of acids; and the "spices" category of furan derivatives.
A Hierarchical Cluster Analysis Heatmap (HCA Heatmap) was constructed to illustrate the similarities and differences among the studied roasted coffee samples in terms of the volatile compound odor categories and to observe a correlation with desired brew characteristics. The dataset consisted of 10 observations (roasted coffee samples) and nine variables (volatile compound categories).
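The 10 × 9 dataset described above could be assembled roughly as in the following sketch. It assumes that each heatmap variable is the sum of the relative peak areas (%) of the compounds assigned to a category, which the text implies but does not state explicitly. The function name `category_profile` and the percentages are hypothetical; only the category assignments of the example compounds are taken from the text.

```python
# The nine odor categories used in the text (SCA Coffee Taster's Flavor Wheel).
CATEGORIES = ["roasted", "spices", "nutty/cocoa", "sweet", "floral",
              "fruity", "sour/fermented", "green/vegetative", "other"]


def category_profile(relative_areas, compound_category):
    """Sum relative peak areas (%) per odor category for one sample.

    relative_areas: compound name -> relative peak area (%).
    compound_category: compound name -> one of CATEGORIES.
    Compounds without a category (e.g. excluded co-eluted peaks) are skipped.
    """
    totals = dict.fromkeys(CATEGORIES, 0.0)
    for compound, percent in relative_areas.items():
        category = compound_category.get(compound)
        if category is not None:
            totals[category] += percent
    return [totals[c] for c in CATEGORIES]


# Category assignments as stated in the text; percentages are made up.
compound_category = {
    "5-methylfurfural": "sweet",
    "furfural": "sweet",
    "2,5-dimethylpyrazine": "nutty/cocoa",
    "pyridine": "sour/fermented",
}
sample = {"5-methylfurfural": 15.0, "furfural": 8.0,
          "2,5-dimethylpyrazine": 4.0, "pyridine": 3.0, "unassigned peak": 1.0}
row = category_profile(sample, compound_category)  # one row of the 10 x 9 matrix
```

Stacking one such row per coffee sample gives the observations-by-categories matrix on which the heatmap clustering operates.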
As Figure 2 reveals, the nine volatile compound categories were clearly divided into two clusters. Cluster I contained the spices, floral and green/vegetative categories, with an inner sub-cluster separating spices from the other two. Cluster II contained the nutty/cocoa, fruity, roasted, sweet, other and sour/fermented categories, with sub-clusters separating nutty/cocoa, fruity and roasted from sweet, other and sour/fermented. The main clustering is driven by the percentage of volatiles in each flavor category and cannot be connected to the chemical classes included in the categories. The percentage of compounds in Cluster I is much lower than in Cluster II.
It can be observed that the aroma contribution is higher in the dark- and medium-roasted samples COL, HND, PER and ETH2. COL, although the darkest roasted sample, is comparatively low in the Cluster I categories, mostly spices, green/vegetative and, to a lesser extent, floral, while the remaining categories are in high abundance. COL is also high in nutty/cocoa and roasted-smelling compounds, which is consistent with its desired brew characteristics of nutty and toasted notes (Table 1). ETH1 and ETH3 are the lightest roasted samples, with a minor abundance of volatiles, specifically of those included in the nutty/cocoa, fruity and roasted categories. In line with the brew characteristics of ETH1, "floral" seems to be the category with the highest contribution; however, an estimation cannot be made for ETH3. In MEX, BRA1, SLV and PNG, the orange tiles of medium abundance predominate. The HND, PER and ETH2 samples are described by yellow and yellowish tiles of major abundance. HND is mostly characterized by spices, nutty/cocoa, fruity, roasted, but also sweet odors, in accordance with its vanilla, butter, caramel and nut notes and chocolate taste, and PER by green/vegetative, floral, spices and roasted-smelling volatiles, in accordance with its spicy, floral, citrus aroma. The aroma profile of ETH2 is composed mostly of floral and sour/fermented volatile compounds, but also of fruity ones to a great extent, in agreement with its berries, rose, floral, or apricot notes. In general, a decent correlation could be observed between samples and their brew characteristics.

Sample Differentiation-Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA)
PCA, an unsupervised method, was used to map the natural groupings of coffee samples based on the abundances of the detected volatile compounds. The distribution of the volatile compounds within each coffee sample, as well as the relationships between them, is reflected in the associated PCA score plot, shown in Figure 3. For easier reading, all volatile compounds were numbered, grouped into chemical categories, and colored according to the enclosed table. The first two principal components explain 73.66% of the total variance, of which 55.02% relates to F1 and 18.64% to F2. The clustering of the roasted coffees is dominated by roasting degree as the main discrimination parameter. In particular, the PCA score plot separated the coffee samples into two main subgroups. The first component (F1) placed the first subgroup (highlighted with a pink-colored circle) of dark and medium roasted samples (COL, ETH2, HND, PER) at positive values (right area), and the second subgroup (highlighted with a yellow-colored circle), which includes mostly light and medium roasted samples (ETH1, ETH3, SLV, MEX, PNG, BRA1), at negative values (left area). Within the first subgroup, F2 placed the medium roasted PER and HND samples in the upper quadrant (positive F2), while the medium/dark ETH2 and dark COL were located in the lower quadrant (negative F2), with COL being clearly distant from the other samples. Within the second subgroup, F2 placed the medium roasted BRA1 and the medium/light roasted MEX and PNG samples in the upper quadrant (positive F2), while the medium/light roasted SLV and the light roasted ETH1 and ETH3 were in the lower quadrant (negative F2). The medium/light roasted SLV sample was closer to the medium/light roasted PNG than to ETH1 and ETH3, the lightest roasted samples.
As can be observed in the score plot, darker roasted samples (pink-colored circle) were influenced heavily by compounds belonging to the furan and pyrazine derivatives (yellow and light grey colored active observations, respectively), dihydrofuranone and phenol derivatives (green and deep red colored symbols) as well as γ-lactones (deep grey colored symbols), and were also associated with a larger number of volatile compounds than the lighter roasted samples. These compound categories are associated with dark roasting and sweet, nutty, cocoa, caramel, spicy notes, observed mostly in the COL and HND brew characteristics (Table 1).
Although PCA showed a natural separation of coffee samples by roasting degree and a correlation to desired brew characteristics, it did not demonstrate a separation between samples based on their geographical origin [5].
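As a rough illustration of the calculation behind such a score plot, the sketch below runs a PCA on the correlation matrix with NumPy. It is not the authors' workflow (the analysis was run in XLSTAT); it assumes that a Pearson-type PCA corresponds to PCA on standardized variables, and it uses random placeholder data instead of the study's measurements. The function name `pearson_pca` is hypothetical.

```python
import numpy as np


def pearson_pca(X, n_components=2):
    """PCA on the correlation matrix (variables standardized to z-scores).

    X: array of shape (n_samples, n_variables), e.g. relative peak areas.
    Returns (scores, loadings, explained), where explained holds the fraction
    of total variance carried by each retained component.
    Assumes no variable is constant across samples (non-zero variance).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = eigvecs[:, order]
    scores = Z @ loadings                          # sample coordinates on F1, F2, ...
    explained = eigvals[order] / eigvals.sum()
    return scores, loadings, explained


# Placeholder data: 10 samples x 6 variables of random numbers.
rng = np.random.default_rng(0)
X = rng.random((10, 6))
scores, loadings, explained = pearson_pca(X)
```

The sum of the first two entries of `explained` plays the role of the cumulative figure quoted in the text (55.02% + 18.64% = 73.66%), and the sign of each sample's first score decides on which side of the F1 axis it is plotted.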
Similarly, hierarchical cluster analysis (HCA) was applied to the set of variables employed for PCA, in order to interpret the results in a graphical way. The x-axis represents the analyzed coffee samples and the y-axis shows the dissimilarity (Figure 4). The clustering of the roasted coffee samples showed a grouping by roasting degree and expected brew characteristics and, secondarily, by geographical origin. Specifically, two distinct groups of coffee samples of different roasting degree were clearly observed, in accordance with the PCA results. The first (blue) cluster contains the samples with lighter roast (ETH1, ETH3, BRA1, SLV, MEX, PNG) and the second (red) cluster the samples with darker roast and more intense roasting conditions (COL, ETH2, HND, PER). Within each group, the effect of geographical origin can be partially observed.
In the first group, the samples form one sub-cluster of light roasted African coffees (ETH1, ETH3) and another of medium/light or medium roasted coffees from America and Oceania. In a second clustering, the South American, medium roasted sample BRA1 is differentiated from the medium/light roasted North American samples MEX and SLV and the Oceanian PNG.
In the second group containing the darker roasted coffee samples, geographical origin is a secondary criterion for differentiation. South American COL is clustered alone due to its intense dark roast and volatile compounds' abundance, compared to African medium/dark roasted ETH2 and, in the next grouping, medium roasted North American HND and South American PER.
The similarity between samples inside each major group indicates that roasting degree outweighs the geographical origin of coffee, meaning that volatile composition is affected by roasting conditions to a greater extent than by origin and coffee bean composition. For example, two of the coffee samples from Ethiopia (ETH1 and ETH3) belong to the same inner group, while ETH2 is found in the second distinct group, due to its more intense roast. It thus became apparent that the PCA and HCA multivariate statistical analyses were able to adequately differentiate samples according to their roasting degree, on the basis of their volatile composition.
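The grouping logic of such a dendrogram can be illustrated with a small, dependency-free sketch of agglomerative clustering. The text does not state the dissimilarity measure or linkage rule used, so Euclidean distance and average linkage are assumptions here. The sample labels and values are hypothetical, and `agglomerative_clusters` is an illustrative name.

```python
from itertools import combinations
from math import dist


def agglomerative_clusters(profiles, n_clusters=2):
    """Average-linkage agglomerative clustering on Euclidean distances.

    profiles: sample label -> list of variable values (equal lengths).
    Repeatedly merges the two closest clusters until n_clusters remain.
    Returns the clusters as sorted lists of labels.
    """
    clusters = [[label] for label in profiles]

    def average_distance(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical two-variable profiles for four made-up samples.
profiles = {
    "light_1": [1.0, 0.9],
    "light_2": [1.1, 1.0],
    "dark_1": [5.0, 5.2],
    "dark_2": [5.1, 4.9],
}
groups = agglomerative_clusters(profiles, n_clusters=2)
```

Cutting the tree at two clusters mirrors the two roast-degree groups described above. For real data, a library routine such as `scipy.cluster.hierarchy.linkage` would normally be preferred over this hand-written loop.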

Coffee Samples
Ten Arabica (Coffea arabica L.) coffee samples were provided by AVEK S.A. (Athens, Greece). The samples were of single origin, covering a wide geographical distribution from Central America to Indonesia (i.e., Brazil, Colombia, El Salvador, Ethiopia, Honduras, Mexico, Papua New Guinea, Peru) (Figure 1). Table 1 lists the production country and region, roasting degree, and brew sensory characteristics of the studied samples.
Roasting was accomplished by the company using a custom-made fluidized bed roaster (Coffeetool, Athens, Greece). Charge temperature for all samples was 190 °C following a 14 min reference profile with the end temperature being between 200-220 °C. The roasting conditions applied were the result of preliminary roasting trials, were unique for each coffee sample, and aimed to reach the desirable cup quality shown in Table 1. Their selection was based on the sensory evaluation of the respective espresso coffee brews conducted according to SCA Cupping protocols [29]. A typical time-temperature roasting curve is available as Supplementary Material (Figure S1). Due to confidentiality reasons, detailed roasting conditions were not disclosed by the company.
Roasted coffee samples were immediately air-cooled and allowed to rest for 1 h, before grinding with a Swiss Ditting grinder for 10 s in order to pass a ~500 µm sieve, and were stored in sealed containers under nitrogen atmosphere at 4 °C for a maximum of 24 h. Prior to HS-SPME analysis, the coffee samples were brought to room temperature (1 h).

Volatile Compounds Analysis
The volatile compounds of the coffee samples were analyzed by HS-SPME/GC-MS according to the method described by Lee et al. [30] and Papageorgiou et al. [31], with certain modifications. Specifically, 1.2 g of ground coffee were placed into a 15 mL vial, which was immediately sealed with a Teflon-lined septum and screw cap (Supelco, Bellefonte, PA, USA). A minimum time of 10 min was determined for equilibration, during which the samples were incubated at 50 °C under magnetic stirring. The headspace of the ground coffee was then sampled (30 min) by using an SPME fiber coated with CAR/PDMS (75 µm) (Supelco, Bellefonte, PA, USA), pre-conditioned according to manufacturer recommendations. During the extraction process the sample was continuously magnetically stirred. Afterwards, the fiber was thermally desorbed at 250 °C for 5 min in splitless mode. Carry-over effects were diminished by holding the fiber in the injection port for an extra 5 min.
Volatile compounds analysis was performed using an Agilent 6890A gas chromatograph equipped with an MSD5973 mass spectrometer (Palo Alto, CA, USA). The volatile compounds were separated on a polar DB-Wax column (60 m × 0.32 mm i.d., film thickness 0.25 µm; Agilent J&W, Palo Alto, CA, USA). Helium was used as carrier gas at a constant flow rate of 1.0 mL/min. The oven temperature was set at 50 °C for 5 min, followed by an increase of 3 °C/min up to 230 °C (15 min) (total run time 80 min). The transfer line temperature was set at 240 °C. Mass spectra were acquired at 70 eV with a scan range between 35 and 500 amu at 2 scans/s. The MS source and MS Quad temperatures were 230 °C and 150 °C, respectively. Compound identification was based on the comparison of the experimental mass spectra with those stored in the NIST (Version 2.0g, 2011) and AMDIS libraries, considering a similarity level >800, as well as on the comparison of calculated GC retention indices (RI) (using C7-C30 alkanes, Sigma-Aldrich, Laramie, WY, USA) with RIs reported in the literature. Data were processed by the MSD ChemStation software and expressed as the relative percentage of each compound peak area to the total GC-MS peak area. The samples were evaluated in triplicate and the coefficient of variation (CV) for the vast majority of the compounds was lower than 6.5%. The suitability of the analysis conditions and the absence of contaminants were verified by running blank samples every two injections.
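Two of the calculations implied above can be sketched in a few lines. The text does not say which retention index formula was used; because the oven is temperature-programmed, the linear (van den Dool and Kratz) form is assumed here. The alkane retention times are hypothetical placeholders and both function names are illustrative.

```python
from bisect import bisect_right


def linear_retention_index(t_x, alkane_times):
    """Linear retention index of a peak eluting at t_x (min).

    alkane_times: carbon number -> retention time (min) of that n-alkane,
    with times increasing with carbon number.
    The van den Dool and Kratz form is an assumption, not stated in the text.
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if not times[0] <= t_x < times[-1]:
        raise ValueError("peak must elute between the first and last alkane")
    i = bisect_right(times, t_x) - 1  # alkane eluting at or just before the peak
    n, n_next = carbons[i], carbons[i + 1]
    return 100 * (n + (n_next - n) * (t_x - times[i]) / (times[i + 1] - times[i]))


def oven_run_time(initial_hold, start_temp, end_temp, ramp_rate, final_hold):
    """Total run time (min) of a single-ramp oven program."""
    return initial_hold + (end_temp - start_temp) / ramp_rate + final_hold


# Hypothetical alkane retention times (min); real values come from the C7-C30 run.
alkanes = {10: 20.0, 11: 24.0, 12: 28.0}
ri = linear_retention_index(25.0, alkanes)    # 1125.0
run_time = oven_run_time(5, 50, 230, 3, 15)   # 5 + 60 + 15 = 80.0 min
```

The second call reproduces the stated total run time: a 5 min hold, a 60 min ramp from 50 °C to 230 °C at 3 °C/min, and a 15 min final hold.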

Statistical Analysis
The results were expressed as a mean of at least three measurements (n = 3). Multivariate statistical analysis, specifically Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA) and Hierarchical Cluster Heatmap Analysis (HCA-Heatmap), was performed using XLSTAT 2021.2 (Addinsoft, New York, NY, USA). Pearson correlation was adopted.
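How triplicate injections translate into the reported means and coefficients of variation can be sketched as follows. The peak areas are hypothetical, the sample standard deviation (n − 1) is an assumption, and the function names are illustrative rather than part of any software named in the text.

```python
from statistics import mean, stdev


def relative_percentages(peak_areas):
    """Express each peak area of one injection as % of the total peak area."""
    total = sum(peak_areas.values())
    return {name: 100 * area / total for name, area in peak_areas.items()}


def summarize_replicates(replicates):
    """Mean relative % and coefficient of variation (CV, %) per compound.

    replicates: list of dicts (compound -> raw peak area), one per injection,
    all containing the same compounds. Uses the sample standard deviation.
    """
    per_run = [relative_percentages(run) for run in replicates]
    summary = {}
    for name in per_run[0]:
        values = [run[name] for run in per_run]
        average = mean(values)
        cv = 100 * stdev(values) / average if average else float("nan")
        summary[name] = (average, cv)
    return summary


# Hypothetical raw peak areas for three injections of one sample.
replicates = [
    {"furfural": 300.0, "pyridine": 100.0, "2-furanmethanol": 600.0},
    {"furfural": 310.0, "pyridine": 95.0, "2-furanmethanol": 595.0},
    {"furfural": 295.0, "pyridine": 105.0, "2-furanmethanol": 600.0},
]
summary = summarize_replicates(replicates)
```

Normalizing within each injection before averaging keeps every replicate on the same 0-100% scale, so the mean percentages of all compounds still sum to 100.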

Conclusions
In this work, the headspace composition of 10 single origin Arabica coffee samples from different geographical regions, roasted separately to different roasting degrees under conditions adapted for each sample to reach specific sensory attributes of their brews, was successfully profiled using SPME/GC-MS combined with chemometrics. Volatile compounds differed among the roasted coffees and showed a strong variation with the applied roasting profile. A total of 138 volatile compounds were tentatively identified across the samples, of which pyrazine and furan derivatives were predominant. As the chemometric analysis revealed, the desired brew characteristics were satisfactorily captured by the volatile compounds formed, which contribute to the aroma potential of each sample. Furthermore, lighter roasted samples were efficiently differentiated from darker roasted samples, while roasting degree prevailed over the geographical origin of the coffee. In general, customizing roast profiles to highlight the finest features of single origin coffee samples, and thereby obtain the best aroma profile and sensory outcome, determines the differentiation between samples and allows their subsequent blending and the tailoring of cup quality, thus providing a valuable tool for the coffee industry.