Metabolite Profiling of Premium Civet Luwak Bio-Transformed Coffee Compared with Conventional Coffee Types, as Analyzed Using Chemometric Tools

Luwak (civet) coffee is one of the most precious and exotic coffee commodities in the world. It has garnered an increasing reputation as the rarest and most expensive coffee, with an annual production of around 500 pounds. Many targeted analytical techniques have been reported for the discrimination of specialty coffee commodities, such as Luwak coffee, from other ordinary coffee. This study presents the first comparative metabolomics approach for Luwak coffee analysis compared to other coffee products, targeting secondary and aroma metabolites using nuclear magnetic resonance (NMR), gas chromatography (GC), or liquid chromatography (LC) coupled with mass spectrometry (MS). Chemometric modeling of these datasets showed significant classification among all samples and aided in identifying potential novel markers that distinguish Luwak coffee from other coffee samples. The markers indicated that C. arabica was the source of Luwak coffee, with several new markers being identified, including kahweol, chlorogenic acid lactones, and elaidic acid. Aroma profiling using solid-phase micro-extraction (SPME) coupled with GC/MS revealed higher levels of guaiacol derivatives, pyrazines, and furans in roasted Luwak coffee compared with roasted C. arabica. Quantification of the major metabolites was attempted using NMR for Luwak coffee to enable future standardization. Lower levels of alkaloids (caffeine 2.85 µg/mg, trigonelline 0.14 µg/mg, and xanthine 0.03 µg/mg) were detected, compared with C. arabica. Other metabolites that were quantified in civet coffee included kahweol and difurfuryl ether at 1.37 and 0.15 µg/mg, respectively.


Introduction
Luwak (civet) coffee is one of the most precious exotic coffee commodities traded in the world [1]. It has garnered an increasing reputation as the rarest and most expensive coffee, with an annual production of around 500 pounds and a price of around 600 dollars per pound, which is approximately one hundred times that of normal coffee [2]. Such a high price is attributed to its high consumer demand and superior sensory attributes. Luwak coffee is low in caffeine, low in fat, and low in bitterness. A study of different samples of civet coffee from the Gayo Highlands showed that nutty, fishy, chocolaty, herby, toasty, and earthy flavors were the dominant characteristics [3]. Coffea arabica berries are first digested by the Asian palm civet (Paradoxurus hermaphroditus), then the stools are either collected in the wild or harvested from caged animals. This arboreal animal is an excellent tree climber; it uses its strong sense of smell and eyesight to select only the ripest and sweetest coffee cherries [4]. The civet digests the berries' pericarp and ejects coffee beans that then undergo cleaning, wet fermentation, sun-drying, and, finally, roasting [1]. Throughout the fermentation [...]
Metabolites 2023, 13, 173
[...] and provide a detailed profile of their composition. Liquid chromatography is a powerful tool for investigating the differences in the chemical profile between closely related taxa and species, due to its excellent resolution and high sensitivity. Most previous metabolomics studies have targeted the diverse commercial types of coffee and have provided a detailed profile of the different coffee species, i.e., C. arabica, C. canephora, C. liberica, etc. Although the unique coffee analyzed in this paper has attracted many coffee lovers and has increased in demand worldwide, there is little research into the identification of significant phytochemicals and discriminant markers for its authentication, compared to other coffee types. The objective of this study was to employ chemometric tools for the first time to assess the phytochemicals in Luwak coffee, targeting its aroma and secondary metabolites, and to standardize its major components, as one of the priciest and rarest coffees available.

Figure 1.
Graphical sketch summarizing the current paper's objectives and the techniques employed for comparisons between Luwak and regular coffee types.

Coffee Specimens, Chemicals, and Extraction
Commercial Luwak coffee was purchased from Bogor, Indonesia, as 100% pure, coarsely powdered coffee, ca. 2-4 mm in size, which had been heavily roasted and freeze-dried. It was compared to samples from two coffee-producing species: C. arabica, commonly known as arabica coffee, as roasted (RCA) and green (GCA) samples; and C. canephora var. robusta, as green robusta coffee (GCC) and roasted robusta coffee (RCC). These were collected from the Mina Gerais University Arboretum, Brazil, as entire seeds that were further powdered in a mortar using liquid nitrogen. Analysis was performed via NMR spectroscopy, ultra-performance liquid chromatography coupled with mass spectrometry (UPLC-MS), and solid-phase microextraction coupled with gas chromatography-mass spectrometry (SPME/GC-MS). Samples subjected to SPME/GC-MS included Luwak coffee, roasted coffee, and roasted coffee blended with cardamom, to comparatively evaluate the aroma profile. NMR fingerprinting of the coffee extracts was also conducted.
Freeze-dried coffee seeds were prepared for NMR analysis following the same protocol as used for herbal extracts [15][16][17]; about 150 mg of each coffee powder (n = 3) was homogenized with 6 mL of 100% MeOH containing 10 µg/mL umbelliferone (an internal standard for relative quantification using LC-MS), using an Ultra-Turrax (IKA, Staufen, Germany) at 11,000 rpm for 5 × 60 s, with 1 min break intervals. The extract was vortexed for 1 min, centrifuged at 3000× g for 30 min, and then filtered. Afterward, 4 mL of the supernatant was aliquoted for NMR analysis and dried in a stream of nitrogen. The dried extract was re-suspended in 800 µL of 100% methanol-d4 containing HMDS adjusted to a final concentration of 0.94 mM. After centrifugation (13,000× g for 1 min), the supernatant was transferred to a 5 mm NMR tube for measurement. For the LC-MS analysis, 1 mL of the sample was aliquoted and placed on a 500 mg octadecylsilane (C18) cartridge that was preconditioned with methanol. The samples were then eluted using 3 × 0.5 mL methanol; the eluent was evaporated under a nitrogen stream and the obtained dry residue was resuspended in 1 mL of 100% methanol.
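The amounts implied by this protocol can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the "powder equivalent" assumes that the extract is homogeneous and that no solvent is lost before the 4 mL aliquot is taken.

```python
def amount_ug(conc_ug_per_ml: float, volume_ml: float) -> float:
    """Mass of a solute (µg) in a given volume of solution."""
    return conc_ug_per_ml * volume_ml


def amount_nmol(conc_mM: float, volume_uL: float) -> float:
    """Amount of substance (nmol): mM x µL = 1e-3 mol/L x 1e-6 L = 1e-9 mol."""
    return conc_mM * volume_uL


def powder_equivalent_mg(powder_mg: float, extract_ml: float, aliquot_ml: float) -> float:
    """Powder mass represented by an aliquot (assumes a homogeneous extract)."""
    return powder_mg * aliquot_ml / extract_ml


if __name__ == "__main__":
    # Values taken from the extraction protocol described above.
    print(amount_ug(10.0, 6.0))                    # umbelliferone per extract, µg
    print(amount_nmol(0.94, 800.0))                # HMDS per NMR sample, nmol
    print(powder_equivalent_mg(150.0, 6.0, 4.0))   # powder equivalent in the NMR tube, mg
```

Under these assumptions, each NMR sample corresponds to about 100 mg of coffee powder and contains about 752 nmol of HMDS.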

UPLC-MS Profiling of Secondary Metabolites
UPLC-MS acquisition was performed using ion-trap high-resolution testing under the same conditions as those used for coffee testing by El-Hawary et al. [13].
To profile the metabolites, 150 mg of each coffee powder specimen was homogenized with 5 mL MeOH (100% v/v) containing 10 µg/mL umbelliferone as an internal standard, using an Ultra-Turrax mixer (IKA, Staufen, Germany) set to 11,000 rpm, mixed in five 20-second periods, with intervals of 1 min between each mixing period to guard against temperature increases and heating effects. The resulting suspensions were then vortexed vigorously, centrifuged at 3000× g for 30 min, and filtered through a 22 µm pore-size filter to remove plant debris. Then, 1 mL of the sample was aliquoted and pre-treated by placement on a 500 mg C18 cartridge that was pre-conditioned with MeOH and Milli-Q water before elution; this was performed twice, using 3 mL of MeOH. Afterward, the eluent was evaporated under a nitrogen stream, and the obtained dry residue was re-suspended in 1 mL of MeOH.
The principal step of UPLC-ESI-HRMS analysis was conducted in triplicate (n = 3), with 2 µL introduced to a Dionex 3000 UPLC system (Thermo Fisher Scientific, Bremen, Germany), equipped with an HSS T3 column (100 × 1.0 mm, 1.8 µm; Waters®; column temperature: 40 °C) and a photodiode array detector (PDA, Thermo Fisher Scientific, Bremen). The chromatographic conditions were optimized for improved peak elution, using a binary gradient elution protocol at a flow rate of 150 µL/min. The composition of the mobile phase varied between water/formic acid at 99.9/0.1 (v/v) (A) and acetonitrile/formic acid at 99.9/0.1 (v/v) (B). The protocol consisted of an isocratic step for 1 min with 5% mobile phase B, followed by a linear increase of B from 5% to 100% over 11 min. The mobile phase was then kept isocratic at 100% B between 11 and 19 min. After this, there was a return to 5% B within 1 min, and, finally, an additional 10 min, i.e., 20-30 min overall, for column re-equilibration using 5% B. The wavelength range of the PDA measurements used for detection was 190-600 nm.
The UPLC system was coupled with an Orbitrap Elite high-resolution mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a HESI electrospray ion source (spray voltage, positive ion mode 4 kV, negative ion mode 3 kV; source heater temperature, 250 °C; capillary temperature, 300 °C; FTMS resolution, 30,000). Nitrogen was used as both the sheath and auxiliary gas. The CID mass spectra (buffer gas: helium; FTMS resolution: 15,000) were recorded in a data-dependent acquisition mode (DDA) using normalized collision energies (NCE) of 35% and 45%. The instrument was externally calibrated with Pierce® LTQ Velos ESI positive ion calibration solution (product number 88323, Thermo Fisher Scientific, Rockford, IL, USA) and Pierce® LTQ Velos ESI negative ion calibration solution (product number 88324, Thermo Fisher Scientific, Rockford, IL, USA).
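For readers who wish to reproduce the gradient, the program can be written as a piecewise-linear function of time. This is a minimal sketch, not the instrument method file: the function name is hypothetical, and it assumes that the linear ramp starts at 1 min and reaches 100% B at 11 min, which is one reading of the description above (consistent with the 100% B hold beginning at 11 min).

```python
def percent_b(t_min: float) -> float:
    """Percentage of mobile phase B at time t (min) for the 30 min program.

    Assumed segments: 0-1 min hold at 5% B; 1-11 min linear ramp to 100% B;
    11-19 min hold at 100% B; 19-20 min return to 5% B; 20-30 min
    re-equilibration at 5% B.
    """
    if not 0.0 <= t_min <= 30.0:
        raise ValueError("time outside the 0-30 min program")
    if t_min <= 1.0:
        return 5.0
    if t_min <= 11.0:
        return 5.0 + (t_min - 1.0) * 95.0 / 10.0
    if t_min <= 19.0:
        return 100.0
    if t_min <= 20.0:
        return 100.0 - (t_min - 19.0) * 95.0
    return 5.0


if __name__ == "__main__":
    for t in (0, 1, 6, 11, 15, 19.5, 20, 30):
        print(f"{t:>5} min: {percent_b(t):6.1f}% B")
```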

Headspace SPME GC-MS Profiling of Aroma Compounds
GC/MS analysis of coffee volatiles was performed exactly as previously described by Farag et al. [17,18]. Three biological replicates were analyzed for each specimen, using a Shimadzu GC-17A gas chromatograph equipped with a DB-5 column (30 m × 0.25 mm, 0.25 µm film thickness; Supelco®, Merck SA, Darmstadt, Germany) coupled to a Shimadzu QP5050A mass spectrometer.

NMR Fingerprinting of Coffee Extracts
All spectra were recorded using an Agilent VNMRS 600 NMR spectrometer (Varian, Palo Alto, CA, USA) at a proton NMR frequency of 599.83 MHz, using a 5 mm inverse detection cryoprobe. The 1H-NMR spectra were recorded with the following parameters: a digital resolution of 0.367 Hz/point, pulse width (pw) of 3 µs (90°), relaxation delay of 23.7 s, and an acquisition time of 2.7 s; the number of transients was 160. Zero filling up to 128 K (lb = 0.4) was used prior to the Fourier transformation. The NMR spectra were processed with MestReNova software (Mestrelab Research Mnova 14.1.0 Build 24037) to aid in the peak picking (δH, δC, and δH/C) of detected NMR signals, measured in parts per million (ppm) relative to the internal standard hexamethyldisilazane (HMDS). In addition, the 1H-NMR spectra were automatically Fourier-transformed to ESP files using the ACD/NMR Manager Lab version 10.0 software (Toronto, Canada), based on the method described in [15,16].
Further processing was applied for multivariate analysis (MVA), including binning of the spectra into buckets of equal width (0.04 ppm) within the region of δH 11.4-0.4 ppm and the exclusion of signals at δH 5.0-4.7 ppm and δH 3.4-3.25 ppm, corresponding to the residual water and methanol signals, respectively. The data were then subjected to principal component analysis (PCA), hierarchical cluster analysis (HCA), and orthogonal partial least-squares discriminant analysis (OPLS-DA) using the SIMCA-P version 14.1 software package (Umetrics, Umeå, Sweden). All variables were mean-centered and Pareto-scaled. The models were derived from both the full scale of the chemical shift (δH: 1-10 ppm) and the aromatic region (δH: 5.4-10 ppm). Quantification followed the exact formulae and procedures described in earlier works [15,16].
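The bucketing step described above can be illustrated with a short, dependency-free sketch. It is not the ACD/NMR Manager or SIMCA procedure, whose internals are not described here; the function name is hypothetical, and summing intensities within each bucket is an assumption (integration conventions differ between packages).

```python
def bin_spectrum(ppm, intensity, lo=0.4, hi=11.4, width=0.04,
                 excluded=((4.7, 5.0), (3.25, 3.4))):
    """Sum intensities into equal-width buckets, skipping excluded regions.

    Returns a list of (bucket_center_ppm, summed_intensity) tuples.
    """
    n_bins = round((hi - lo) / width)          # 275 buckets for 0.4-11.4 ppm
    sums = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if not lo <= x < hi:
            continue                            # outside the binned region
        if any(a <= x <= b for a, b in excluded):
            continue                            # residual water / methanol
        idx = min(int((x - lo) / width), n_bins - 1)
        sums[idx] += y
    buckets = []
    for i, total in enumerate(sums):
        center = lo + (i + 0.5) * width
        if any(a <= center <= b for a, b in excluded):
            continue                            # drop buckets in excluded regions
        buckets.append((round(center, 2), total))
    return buckets


if __name__ == "__main__":
    # Hypothetical data points, for illustration only.
    ppm = [7.58, 7.59, 4.85, 3.30, 1.30]
    signal = [1.0, 2.0, 50.0, 60.0, 4.0]
    nonzero = [(c, s) for c, s in bin_spectrum(ppm, signal) if s > 0]
    print(nonzero)
```

In this toy input, the points at 4.85 and 3.30 ppm fall in the excluded solvent regions and do not contribute to any bucket.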

SPME-GC/MS Dataset Volatiles Identification and Modeling
Volatiles were identified by the comparison of peak retention time, the Kovats index (KI), and the mass spectrum with those of reference metabolites in the NIST/EPA/NIH mass spectral database (NIST 11). For peak identification, peaks were first deconvoluted using the AMDIS software (www.amdis.net) (accessed on 1 December 2022), prior to spectral matching. The relative content of each metabolite was obtained by the area normalization of all responses related to the identified hits. Average responses across the injection replicates were then calculated for each metabolite. Afterward, the data were subjected to multivariate analysis (MVA), as described earlier for the NMR dataset, for which models were derived from both the full scale of the chemical shift and the aromatic region.
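Area normalization and replicate averaging can be sketched as follows. The function names and the peak areas in the usage example are hypothetical; treating a volatile that is missing from a replicate as zero is an assumption, not a documented choice of the study.

```python
def area_normalize(peak_areas):
    """Express each identified peak as a percentage of the total identified area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


def mean_over_replicates(replicates):
    """Average the relative content per metabolite across replicate injections.

    A metabolite absent from a replicate is counted as 0 (an assumption).
    """
    names = sorted(set().union(*(r.keys() for r in replicates)))
    n = len(replicates)
    return {name: sum(r.get(name, 0.0) for r in replicates) / n for name in names}


if __name__ == "__main__":
    # Hypothetical peak areas for three replicate injections.
    runs = [
        {"guaiacol": 120.0, "2-methylpyrazine": 480.0},
        {"guaiacol": 100.0, "2-methylpyrazine": 500.0},
        {"guaiacol": 110.0, "2-methylpyrazine": 490.0},
    ]
    print(mean_over_replicates([area_normalize(r) for r in runs]))
```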

UPLC-ESI-HRMS Dataset Metabolite Identification and Modeling
All metabolites were identified by their accurate mass, retention time, MS fragments, isotopic distribution, and mass error. The Xcalibur software Qual Browser (https://www.thermofisher.com/) (accessed on 1 December 2022) was used for the imported high-resolution files. The analysis was performed in negative mode, and the ion mass spectra that were derived from the anions [M−H]− were accompanied by many fragmentation patterns. Relative comparisons of the spectral data were made with the literature references, in-house data, and the natural products database of the standard phytochemical dictionary (CRC, Wiley).
The original LC-MS files of all authenticated samples (GCC, GCA, RCA, and RCC) and the Luwak samples were converted into mzML files using the MSConvert GUI (http://proteowizard.sourceforge.net/download.html) (accessed on 1 December 2022) and then converted to .abf files using the ABF converter (https://www.reifycs.com/AbfConverter/) (accessed on 1 December 2022), with the exact parameters described in our previous study [13]. The peak abundance mass list was then exported for multivariate data analysis, wherein the final IDs and metabolite abundances were Pareto-scaled using SIMCA (Umetrics, Umeå, Sweden). The unsupervised principal component analysis (PCA) models were validated based on R2 and Q2, in addition to a hierarchical cluster analysis (HCA) of the authenticated and Luwak samples. Supervised OPLS-DA analysis was used on the pre-classified groups to identify the requisite markers, via an S-plot that was validated using the p-value, covariance (p), and correlation (pcor).
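As an illustration of the accurate-mass criterion, the sketch below computes the theoretical m/z of a deprotonated ion [M−H]− from a molecular formula and the mass error in ppm. It is a generic calculation, not the Xcalibur workflow; the function names are hypothetical, only C, H, N, and O are supported, and the "measured" value in the usage example is invented for demonstration.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
PROTON = 1.007276466  # mass of a proton (u)


def monoisotopic_mass(formula: str) -> float:
    """Monoisotopic mass of a simple formula such as 'C16H18O9'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def mz_deprotonated(formula: str) -> float:
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return monoisotopic_mass(formula) - PROTON


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    theo = mz_deprotonated("C16H18O9")   # caffeoylquinic (chlorogenic) acid
    print(round(theo, 4))
    print(round(ppm_error(353.0885, theo), 2))  # hypothetical measured value
```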
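Pareto scaling, as applied to the peak table before modeling, mean-centers each variable and divides it by the square root of its standard deviation. The following is a minimal stand-alone sketch of that transformation, not SIMCA's implementation; the function name is hypothetical, and leaving constant columns unscaled is an assumption made to avoid division by zero.

```python
import math


def pareto_scale(matrix):
    """Pareto-scale a samples x variables table (list of equal-length rows).

    Each column is mean-centered and divided by sqrt(sample standard deviation).
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("at least two samples are required")
    scaled_columns = []
    for column in zip(*matrix):
        mean = sum(column) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
        denom = math.sqrt(sd) if sd > 0 else 1.0   # constant column: center only
        scaled_columns.append([(v - mean) / denom for v in column])
    return [list(row) for row in zip(*scaled_columns)]


if __name__ == "__main__":
    # Hypothetical peak abundances: three samples, two features.
    table = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
    for row in pareto_scale(table):
        print(row)
```

Compared with unit-variance scaling, this keeps high-abundance features more influential while still reducing their dominance, which is one common reason the approach is chosen for MS peak tables.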

1H-NMR Assignments and the Quantification of Coffee Metabolites
Identification of Coffee Metabolites
1H-NMR analysis was employed for Luwak coffee characterization and the quantification of its major peaks, to be used for its future standardization (Table 1). The representative 1D 1H-NMR spectra are depicted in Figure 2. The Luwak coffee sample displayed a signal richness that can mostly be ascribed to primary metabolites found in the aliphatic region from 0 to 5 ppm, with the lower-intensity signals ascribed to secondary metabolites found in the region from 5.5 to 10 ppm. Metabolites that were identified from both ranges included several major coffee metabolites, i.e., caffeine, trigonelline, N-methylpyridinium, kahweol, sucrose, caffeoyl shikimic acid, quinic acid, malic acid, lactic acid, acetic acid, sterols, and fatty acids. Considering that coffee's health-promoting effects are mostly ascribed to its secondary metabolites, assignments of its key metabolites were attempted based on one- and two-dimensional NMR experiments, such as heteronuclear single-quantum correlation spectroscopy (HSQC), heteronuclear multiple-bond correlation (HMBC), etc.
With regard to the secondary bioactives, the 1H-NMR spectrum was characterized by dense signals in the mid-spectrum region (δH 3.4-4 ppm) belonging to caffeine, which was detected at δH 7.58 (H5), and three methyl signals at δH 3.34, 3.52, and 3.97 for CH3-8, CH3-7, and CH3-6, respectively [19]. Lower-intensity NMR signals were observed for another alkaloid, i.e., trigonelline, annotated from a singlet signal at δH 4.42 [20], owing to the N-methyl group. The trigonelline structure was confirmed based on multiple signals at δH 8.0-9.5 ppm, which can be attributed to the aromatic protons. Due to its structural similarity to trigonelline, N-methylpyridinium showed overlapping signals at δH 4.4 ppm and 8-9 ppm, owing to the N-methyl group and aromatic protons, respectively [21]. N-methylpyridinium (NMP) is a thermal degradation product of trigonelline and is hypothesized to exert several health benefits in humans [22]; it is likely to be generated during the roasting process of Luwak coffee. Dark-roasted coffee that is rich in NMP was shown to reduce body weight [23]; this effect has yet to be examined in the case of Luwak coffee.
Kahweol is a diterpene that is reported as a marker of C. arabica, identified based on its key doublet signals at δH 5.9 ppm (due to H-2), δH 6.2 ppm (due to H-1 and H-18), and δH 7.2 ppm (due to H-1 and H-19) [19]. The characterization of its signals indicated that C. arabica was the source material of this Luwak coffee product; an analysis of Luwak coffee samples from other origins could further confirm this hypothesis.
With regard to primary metabolites that contribute more to coffee's sensory and nutritive attributes, malic acid was identified as the major organic acid to be characterized, based on a multiplet at δH 2.8 ppm, while the high-field region up to δH 2.0 ppm showed lactic acid and acetic acid signals [24]. Furthermore, signals characteristic of free sugars that might account for the taste of the coffee could be readily assigned in the 1 H-NMR spectrum, including the anomeric proton of sucrose at δH 5.39 (d, J = 5.0 Hz) [17].
Organic acids, such as quinic acid, were identified at δH 3.57 (dd, J = 9.5, 3.2 Hz) and 4.14 (dd, J = 10.8, 6.4 Hz) ppm [25] as the major acids that contribute to the production of key coffee phenolics, i.e., chlorogenic acid, the major antioxidant in coffee. Other phenolic acid derivatives were identified downstream of quinic acid, due to the acylation of an acid moiety, including caffeoyl quinic acid (chlorogenic acid), which appeared at δH 5.32, owing to the presence of 5-CQA H10 [26], as identified in Figure 2. Caffeoyl shikimic acid signals were detected at δH 6.34 and 7.23 ppm, which correspond to H-4 and H-6, respectively [27].
Generally, the δH 0-3 ppm region showed considerably higher-intensity signals that are typical for organic/fatty acids and sterols [19]. A few of the signals characteristic of fatty acids could be readily assigned in the 1H-NMR spectrum, such as those at δH 1.28 and 1.32 ppm for the repeated methylene groups of fatty acids [19].
To overcome the signal overlap observed in the 1D-NMR spectra, a set of 2D-NMR spectroscopic experiments was employed for the assignment of coffee metabolites. The unsaturation in some fatty acid chains, as in the case of octadec-9-enoic acid (elaidic acid), was confirmed by the presence of a triplet at δH/C (1H, 5.35/129, t, J = 5.0 Hz), showing the HSQC cross-peak correlation to the aliphatic methylene (C-8, C-11) at δC 27.5 ppm. The annotation of free fatty acid was based on its carbonyl at δC 176 and the adjacent α-methylene at δH/C 2.38/35.0 and was consistent with that reported in [28] (see Figure S1C,D in the Supplementary Materials).
Elaidic acid is the trans-unsaturated fatty acid isomer of oleic acid. It has been regarded as detrimental to the sensory quality of coffee [29] and is generated during the thermal processing of coffee beans, which leads to the transformation of fatty acids from a cis configuration to a trans configuration. Generally, elaidic acid is detected at low levels in coffee beans; its detection in Luwak coffee by NMR indicates the effect of the post-fermentation roasting step. Although linoleic acid has been characterized via NMR in several studies, this is the first report to characterize elaidic acid in Luwak coffee beans.
Caffeine was the main alkaloid, detected at 2.85 µg/mg (compared to 12.2 µg/mg and 11.1 µg/mg in GCA and RCA, respectively), whereas trigonelline was detected at a much lower level of 0.14 µg/mg (compared to 10.6 µg/mg and 7.2 µg/mg in GCA and RCA, respectively). The low trigonelline level is likely attributable to the roasting of Luwak coffee seeds during sample preparation, as trigonelline is thermolabile [19]. Xanthine, a caffeine derivative, was quantified at a low level of 0.03 µg/mg; whether this is derived from microbiota-mediated fermentation inside the animal gut is yet to be determined.
The diterpene kahweol, which is a marker of C. arabica species, was detected at 1.378 µg/mg. This was much less than that in authentic roasted C. arabica, RCA (8.8 µg/mg), and its green counterpart, GCA (9.7 µg/mg). Kahweol was not detected in either roasted or green robusta coffee samples [19].
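The exact quantification formulae used in this study are those of [15,16] and are not reproduced here. As a hedged illustration of the general principle of internal-standard NMR quantification, the sketch below uses the generic relation in which the analyte amount equals the per-proton integral ratio to the standard multiplied by the amount of standard. The function name, the integrals in the usage example, and the 100 mg powder equivalent are assumptions for demonstration only.

```python
def qnmr_content_ug_per_mg(i_analyte, n_h_analyte, i_std, n_h_std,
                           std_umol, mw_analyte, sample_mg):
    """Generic internal-standard qNMR estimate of analyte content (µg/mg).

    i_*    : integrated signal areas (same arbitrary units)
    n_h_*  : number of protons giving rise to each integrated signal
    std_umol   : amount of internal standard in the NMR sample (µmol)
    mw_analyte : molecular weight of the analyte (g/mol, i.e., µg/µmol)
    sample_mg  : mass of sample represented in the NMR tube (mg)
    """
    values = (i_analyte, n_h_analyte, i_std, n_h_std, std_umol, mw_analyte, sample_mg)
    if min(values) <= 0:
        raise ValueError("all inputs must be positive")
    analyte_umol = (i_analyte / n_h_analyte) / (i_std / n_h_std) * std_umol
    return analyte_umol * mw_analyte / sample_mg


if __name__ == "__main__":
    # Hypothetical integrals: a one-proton caffeine signal versus the 18 methyl
    # protons of HMDS (0.94 mM in 800 µL = 0.752 µmol); caffeine MW 194.19 g/mol;
    # assumed 100 mg powder equivalent in the tube.
    print(qnmr_content_ug_per_mg(1.0, 1, 10.0, 18, 0.752, 194.19, 100.0))
```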

Metabolite Profiling via UPLC-ESI-HRMS
Kopi Luwak extract was subjected to UPLC-MS analysis, allowing the annotation of 24 metabolites, which are listed along with their spectroscopic data in Table 2. The elution order of the metabolites followed that in our previous paper [13] on authenticated green and roasted coffees, and the metabolites included organic acids, phenolic acids (i.e., hydroxycinnamates, feruloyl, and coumaroyl derivatives), amino acids, and fatty acids. The fragmentation patterns of the identified metabolites have been presented in previous reports [13,32-34].

SPME/GC-MS Analysis of Luwak Coffee Aroma
A powdered sample of civet coffee was subjected to headspace extraction, coupled with GC-MS for aroma profiling. The obtained chromatograms were evaluated in comparison with chromatograms previously obtained from samples of authentic roasted C. arabica (RCA) and C. arabica with cardamom as a major blended coffee type, analyzed using the same method [14]. RCA was chosen for comparison because of the common origin and similar processing of the samples: civet coffee is obtained from C. arabica cherries eaten by the Luwak animal, as revealed by NMR and LC-MS modeling using PCA and HCA, which showed close grouping between civet coffee samples and roasted C. arabica samples. Commercial coffee with cardamom was included owing to the distinct pattern of volatile metabolites associated with cardamom supplementation, as outlined in our previous report [14], and to assess whether civet coffee has an improved aroma profile.
As identified in Table 3, the pyrazines showed a high abundance in Luwak coffee, compared with roasted C. arabica; they were most likely generated during the thermal processing (Maillard reaction) of coffee. The higher amount detected in Luwak coffee, compared to other roasted C. arabica samples, is in accordance with a previous report indicating the impact of solid fermentation on the amino acid and sugar precursors of pyrazines, such as phenylalanine, aspartic acid, and glutamic acid. Substantial levels of 2-methylpyrazine were detected, which indicated high levels of both glutamic acid and aspartic acid precursors in the Luwak coffee prior to roasting [35].
In a similar way, a higher abundance of phenolics, such as 4-ethylguaiacol (9.59%), 4-vinylguaiacol (1.78%), and guaiacol (1.52%), which were only detected in Luwak coffee, indicated the impact of fermentation in the civet's gut on the hydroxycinnamic acid precursors of these phenolics, whereas the higher abundance of 4-ethylguaiacol, compared with 4-vinylguaiacol, indicated the thermal processing during dark roasting [35].

1H-NMR Multivariate Data Analysis of Luwak Coffee and Authenticated Green and Roasted Coffees
Visual examination revealed similar 1H-NMR spectra for Luwak coffee and the authenticated green (GCA and GCC) and roasted (RCA and RCC) coffee samples [19]; chemometric tools were therefore used to reveal, in an untargeted manner, which type Luwak coffee most closely resembles. The exact chemical characterization of the NMR data of these coffee samples has been reported previously by our group [19].
The NMR-derived dataset, based on samples of authenticated green, roasted, and Luwak coffee, was modeled using both unsupervised and supervised analysis, as seen in Figure 3. In the principal component analysis (PCA) plot, PC1 and PC2 represented 57% and 25% of the total variance, respectively, with acceptable values for the goodness-of-fit and goodness-of-prediction (R2 = 0.57 and Q2 = 0.39), suggesting an acceptable model, as shown in Figure 3A. The corresponding loading plot revealed the enrichment of sugars in green coffee, while roasted and Luwak coffees were more abundant in fatty acids (see Figure 3B). HCA showed a similar segregation pattern (Figure 3C), in which samples were segregated into two main clusters. The first cluster included all green samples of both species; the Luwak coffee, along with all the roasted samples of both species (RCA and RCC), which were rich in fatty acids, was present in the second cluster, suggesting that the roasting process was more influential than the genotype in both the full-scan and aromatic models.
To confirm the results revealed by the unsupervised PCA, supervised OPLS-DA analysis of the full NMR (Figure 3D,E) and aromatic (Figure 3F,G) spectral regions was attempted, with good model parameters: R2 = 0.93 and Q2 = 0.91 for the full region (δH: 0-10 ppm) and R2 = 0.83 and Q2 = 0.80 for the aromatic region (δH: 5.5-10 ppm), with a p-value of less than 0.05. The full scan model (δH: 0-10 ppm) provided better classification than the aromatic region (δH: 5.5-10 ppm), based on these validation parameters. The OPLS S-loading plot (δH: 0-10 ppm) confirmed the PCA results for the high abundance of fatty acids in Luwak coffee versus the enrichment of roasted coffee in sugars as the most discriminatory 1H-NMR signals (see Figure 3E). Lastly, an OPLS model of the aromatic region's 1H-NMR signals (δH: 5.5-10 ppm) showed a higher abundance of caffeine and trigonelline in the roasted samples of both species, whereas, interestingly, no markers were detected for Luwak (see Figure 3G).
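The goodness-of-fit (R2) and goodness-of-prediction (Q2) values quoted above follow the usual definitions: one minus the ratio of the residual (or cross-validated prediction) sum of squares to the total sum of squares. The sketch below shows these definitions for a single response vector; it is illustrative only, uses hypothetical function names and data, and does not reproduce how SIMCA partitions the samples for cross-validation.

```python
def _total_sum_of_squares(y_obs):
    mean = sum(y_obs) / len(y_obs)
    return sum((y - mean) ** 2 for y in y_obs)


def r2(y_obs, y_fit):
    """Goodness of fit: 1 - RSS/TSS, using values fitted on all samples."""
    rss = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))
    return 1.0 - rss / _total_sum_of_squares(y_obs)


def q2(y_obs, y_cv_pred):
    """Goodness of prediction: 1 - PRESS/TSS, using cross-validated predictions
    (each sample predicted by a model built without it)."""
    press = sum((y - p) ** 2 for y, p in zip(y_obs, y_cv_pred))
    return 1.0 - press / _total_sum_of_squares(y_obs)


if __name__ == "__main__":
    # Hypothetical class scores (e.g., 0 = roasted, 1 = Luwak) and predictions.
    observed = [0.0, 0.0, 1.0, 1.0]
    fitted = [0.1, 0.0, 0.9, 1.0]
    cross_validated = [0.3, 0.2, 0.7, 0.8]
    print(round(r2(observed, fitted), 3), round(q2(observed, cross_validated), 3))
```

Because cross-validated predictions are made for samples left out of model building, Q2 is typically lower than R2, and a large gap between the two is a common sign of overfitting.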

UPLC-HRMS Multivariate Data Analysis of the Luwak Coffee and Authenticated Coffee Samples
The UPLC-HRMS dataset was classified using multivariate data analysis, including the previously characterized coffee samples (authenticated green and roasted coffee) and Luwak coffee, all analyzed using the same method [13]. Both unsupervised analyses, i.e., PCA and HCA, and supervised analyses, i.e., OPLS-DA, were constructed for specimen classification and for the identification of distinct markers for Luwak coffee (Figure 4).
Firstly, the PCA model, as an unsupervised model, was applied to five sample groups, namely Luwak (PWN) and the authenticated green and roasted samples, denoted by different symbols: green canephora coffee (GCC), green C. arabica (GCA), roasted C. arabica (RCA), and roasted canephora coffee (RCC). In the PCA score plot (Figure 4A), PC1 explained 48% of the total variance, whereas the second principal component, PC2, explained 12% of the variance, with acceptable goodness-of-fit and goodness-of-prediction values (R² = 0.48 and Q² = 0.39). The HCA model offers another unsupervised analysis method, with a graphical display (Figure 4C) showing two main clusters (I and II). Cluster I encompassed only the green arabica sample (GCA), while the rest of the samples were embedded in cluster II. Cluster II comprised two subclasses: the green canephora specimen was present alone in subclass A, whereas the roasted authentic samples, along with the Luwak, were grouped in subclass B, revealing the similarity between roasted and Luwak coffees, as also revealed by the PCA. Both the PCA score plot and HCA showed the segregation of Luwak coffee toward the roasted samples, with a closer aggregation with the roasted arabica samples (RCA), which is in agreement with the NMR results (Figure 4A). Further examination of the PCA loading plot (Figure 4B) indicated that the phenolic acids and diterpenes were more abundant in the roasted samples than in the Luwak samples.
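The percentage of variance attributed to each principal component can be illustrated with a minimal sketch. It is not the software workflow used in this study: the data are synthetic, the function name `pca_explained_variance` is hypothetical, and unit-variance scaling is an assumed pre-treatment (Pareto scaling is another common choice).

```python
import numpy as np

def pca_explained_variance(X, n_components=2):
    """Return PCA scores and the fraction of total variance per component."""
    X = np.asarray(X, dtype=float)
    # Mean-center and scale each variable to unit variance (assumed pre-treatment).
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant variables unscaled
    Xs = Xc / std
    # Singular value decomposition: squared singular values are proportional
    # to the variance captured along each principal axis.
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = U[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Synthetic stand-in for a samples x features intensity table (not study data).
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(6, 30))
group_b = rng.normal(0.0, 1.0, size=(6, 30))
group_b[:, :20] += 5.0  # twenty features elevated in the second group
X = np.vstack([group_a, group_b])

scores, explained = pca_explained_variance(X)
print("Fraction of variance in PC1 and PC2:", np.round(explained, 3))
print("Cumulative R2X for two components:", round(float(explained.sum()), 3))
```

In this sketch the cumulative fraction over the retained components plays the role of the goodness-of-fit value R²X; Q² would additionally require cross-validation.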
In another attempt to investigate more markers, a supervised OPLS-DA model was established to compare the roasted samples against the Luwak coffee. The supervised model showed R² and Q² values of 0.98 and 0.84, respectively, supporting good model fitness and predictability, and explaining its significant markers with a p-value of less than 0.05 (Figure 4D). The OPLS-DA-derived S-plot (Figure 4E) showed other distinctive markers for Luwak, such as citric acid (L4) and a fatty acid series, viz., L20, L22, L23, and L24. On the other hand, hydroxycinnamic acids (caffeoyl, feruloyl, coumaroyl, and dicaffeoyl quinic acids) were more abundant in the roasted C. arabica samples. The chlorogenic acid lactones were also distinguished as markers for the roasted C. arabica samples. These results are aligned with our previously published work on authentic coffee samples [13].
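How an S-plot ranks candidate markers can be sketched as follows. Each variable is placed by its covariance with the predictive score (magnitude) and by its correlation with that score (reliability). This is an illustration under stated assumptions rather than the model used here: the data are synthetic, the function name `s_plot_coordinates` is hypothetical, and the score of a one-component PLS model stands in for the OPLS-DA predictive score.

```python
import numpy as np

def s_plot_coordinates(X, y):
    """Covariance and correlation of each variable with a predictive score."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # One-component PLS weight vector and score
    # (assumed stand-in for the OPLS-DA predictive component).
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)
    t = Xc @ w
    n = X.shape[0]
    cov = (Xc.T @ t) / (n - 1)      # x-axis of the S-plot
    x_std = Xc.std(axis=0, ddof=1)
    x_std[x_std == 0] = np.inf      # constant variables get correlation 0
    corr = cov / (t.std(ddof=1) * x_std)  # y-axis of the S-plot
    return cov, corr

# Synthetic two-class example (not study data): class 0 versus class 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 10))
y = np.array([0] * 6 + [1] * 6)
X[y == 1, 0] += 5.0  # variable 0 is elevated in class 1
X[y == 0, 1] += 5.0  # variable 1 is elevated in class 0

cov, corr = s_plot_coordinates(X, y)
order = np.argsort(corr)
print("Variable most associated with class 0:", order[0])
print("Variable most associated with class 1:", order[-1])
```

Variables lying at the two extremes of both axes are the ones usually proposed as markers for the respective classes.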
Both the NMR and UPLC-HRMS techniques showed good segregation of all samples, suggesting the related composition of roasted and Luwak coffees, compared to green coffee. However, more markers were detected using the LC-MS technique, such as chlorogenic acids and citric acid, which were not revealed using NMR.

SPME-GC/MS Multivariate Data Analysis of Luwak Coffee, Roasted Coffee, and Roasted Coffee with Cardamom
Multivariate data analysis (MVA) visualized further differences in the aroma profile of Luwak coffee, compared with those of the RCA and roasted coffee with cardamom. The PCA score plot derived from all the coffee samples showed two components that accounted for 39.6% and 24.3% of the total variance and revealed the distinct separation of civet coffee from other roasted coffees (Figure 5A). Civet coffee was clearly separated from the roasted C. arabica along the PC1-axis. The PCA model showed high quality in terms of the goodness-of-fit (R²X = 0.639). Repeating the model using a subset of RCA versus Luwak coffee showed a higher goodness-of-fit (R²X = 0.825), as indicated in Figure 5B. The loading plot of Luwak coffee versus RCA (Figure 5C) showed a higher 4-ethylguaiacol level as a potential marker of the Luwak coffee aroma profile.
OPLS-DA modeling was further employed to indicate the role of animal fermentation on the aroma profile of roasted coffee. The OPLS-DA score plot of the total data set of Luwak coffee versus that of RCA revealed segregation between the samples on the basis of animal fermentation (see Figure S3A in the Supplementary Materials). Civet coffee and the roasted coffee samples were clearly separated in the predictive component (t[1]). The OPLS-DA model was built with an R²Y value of 0.995 and a Q² value of 0.946. R²Y describes how well the model fits the data with respect to class separation, whereas Q² describes its predictive ability. The high Q² value argues against overfitting; a Q² value of 0.5 is considered to be acceptable for a model derived from biological samples [12].
Permutation tests were performed on the corresponding PLS-DA model to confirm the quality of the OPLS-DA model. According to Setoyama et al., if the OPLS-DA model were overfitted, the R²Y and Q² values would remain virtually unchanged after permutation [36]. Both parameters were within the range required for a reliable model: the R²Y-intercept values fluctuated between 0.0 and 0.795, and the Q²-intercept was below 0.05. These values indicated that the two parameters did change upon permutation (see Figure S3B in the Supplementary Materials).
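The relationship between R²Y, Q², and the permutation intercepts can be made concrete with a minimal sketch. Several choices here are assumptions and not the procedure of this study: the data are synthetic, all function names are hypothetical, a one-component PLS model is swapped in for OPLS-DA, and leave-one-out cross-validation replaces whatever cross-validation scheme the modeling software applied.

```python
import numpy as np

def fit_pls1(X, y):
    """Fit a one-component PLS1 model on mean-centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc                          # weights proportional to cov(X, y)
    norm = np.linalg.norm(w)
    if norm > 0:
        w = w / norm
    t = Xc @ w                             # sample scores
    tt = t @ t
    c = (t @ yc) / tt if tt > 0 else 0.0   # regression of y on the score
    return x_mean, y_mean, w, c

def predict_pls1(model, X_new):
    x_mean, y_mean, w, c = model
    return y_mean + ((X_new - x_mean) @ w) * c

def r2y_and_q2(X, y):
    """R2Y from the full fit; Q2 from leave-one-out cross-validation."""
    tss = np.sum((y - y.mean()) ** 2)
    rss = np.sum((y - predict_pls1(fit_pls1(X, y), X)) ** 2)
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit_pls1(X[keep], y[keep])
        press += (y[i] - predict_pls1(model, X[i])) ** 2
    return 1.0 - rss / tss, 1.0 - press / tss

def permutation_intercepts(X, y, n_perm=50, seed=0):
    """Regress R2Y and Q2 on |corr(y, permuted y)|; return values at corr = 0."""
    rng = np.random.default_rng(seed)
    r2_obs, q2_obs = r2y_and_q2(X, y)
    corrs, r2s, q2s = [1.0], [r2_obs], [q2_obs]
    for _ in range(n_perm):
        y_perm = rng.permutation(y)        # shuffle the class labels
        r2, q2 = r2y_and_q2(X, y_perm)
        corrs.append(abs(np.corrcoef(y, y_perm)[0, 1]))
        r2s.append(r2)
        q2s.append(q2)
    r2_intercept = np.polyfit(corrs, r2s, 1)[1]
    q2_intercept = np.polyfit(corrs, q2s, 1)[1]
    return r2_obs, q2_obs, r2_intercept, q2_intercept

# Synthetic two-class data (not study data): five features differ between classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 20))
y = np.array([0.0] * 6 + [1.0] * 6)
X[y == 1, :5] += 4.0

r2_obs, q2_obs, r2_int, q2_int = permutation_intercepts(X, y)
print("R2Y = %.3f, Q2 = %.3f" % (r2_obs, q2_obs))
print("R2Y-intercept = %.3f, Q2-intercept = %.3f" % (r2_int, q2_int))
```

When the class labels carry real information, the values obtained with shuffled labels drop, so the intercepts fall well below the values of the original model; intercepts close to the original values would instead point to overfitting.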
The OPLS-DA S-plot showed that terpenes and esters were more abundant in conventionally roasted coffee, with the most influential volatiles, located in the right corner, being terpinyl acetate, p-anisylacetone, cinnamic aldehyde, and acetone. In contrast, the S-plot showed that the civet coffee is rich in phenolics alongside furans, with the most influential volatiles being 4-ethylguaiacol, furfuryl alcohol, and difurfuryl ether (see Figure S3C in the Supplementary Materials).
This study is the first report on the civet coffee metabolome in comparison with conventional coffee and identified several civet coffee markers. Most recent publications have focused on analytical techniques for determining the authenticity of the common coffee types and minimizing adulteration [37]. Several markers were reported for the authenticity evaluation of Luwak coffee: caffeine, inositol, and pyroglutamic acid were identified in robusta Luwak coffee using GC-MS [38], while citric acid, malic acid, and glycolic acid were reported to be characteristic of arabica Luwak coffee [39]. The citric acid and malic acid reported as marker compounds of Luwak coffee [2] were in alignment with the NMR spectra evaluation in this study. Likewise, our results revealed a higher abundance of furans, pyridine, and pyrazine derivatives in Luwak coffee post-roasting, compared with unfermented roasted coffee, in accordance with the previous report [40]. Moreover, although elaidic acid was previously reported in arabica coffee pulp and husk [41], our research is the first to report it in Luwak coffee.
With regard to the study limitations, the current research examined civet coffee from one commercial source that has yet to be distinguished from other specimens. Further comparisons with the same coffee prior to the fermentation step should aid in dissecting the impact of this step on the Luwak coffee metabolome, as the analyzed coffee had been subjected to both fermentation and roasting processes, as is evident from the presence of furan compounds. The administration of different coffee types to the civet and the monitoring of changes in the metabolome using the same approach as that described herein should aid in identifying the best sources for producing this premium type of coffee.
Future biological studies are recommended to reveal the effects of civet coffee, especially on the central nervous system (CNS), compared to other coffee types. Such studies will aid in correlating the metabolome composition with specific targeted effects.

Conclusions
Three different technology platforms, namely NMR, LC-MS, and SPME/GC-MS, were employed for Luwak coffee classification; they showed significant classification among all samples and aided in identifying potential novel markers to distinguish Luwak coffee from other coffee samples. The markers indicated that C. arabica was the source of Luwak coffee. The roasting process applied to both Luwak coffee and roasted C. arabica had a pivotal role in the similarity of their metabolite profiles and their distance from the green coffee samples, as revealed by the NMR and LC-MS models. The Luwak coffee metabolite markers revealed by NMR included elaidic acid, kahweol, and difurfuryl ether. The latter was also identified as a marker for Luwak coffee using SPME/GC-MS analysis. This study also confirmed the impact of the fermentation step prior to roasting on the aroma profile by using SPME coupled with GC/MS, as exemplified by the higher abundance of guaiacol derivatives, pyrazines, and furans in roasted Luwak coffee compared with roasted C. arabica. Finally, such a comparative metabolomics approach overcomes the detection limitations of any single technique. For example, some metabolite markers, such as citric acid, were identified by LC-MS and elaidic and other fatty acids by NMR, whereas other markers, such as difurfuryl ether, were detected by both NMR and SPME/GC-MS. Such an approach can be used in the future for the quality control assessment of other distinctive or premium coffee products against regular ones. A comparison of the health benefits of Luwak coffee with those of roasted coffee should also follow, based on the findings revealed using metabolomics.