How Does LC/MS Compare to UV in Coffee Authentication and Determination of Antioxidant Effects? Brazilian and Middle Eastern Coffee as Case Studies

Coffee is a popular beverage owing to its unique flavor and diverse health benefits. The current study aimed at investigating the antioxidant activity, in relation to the phytochemical composition, of authenticated Brazilian green and roasted Coffea arabica and C. robusta, along with 15 commercial specimens collected from the Middle East. Ultra-high-performance liquid chromatography coupled to high-resolution mass spectrometry (UHPLC-ESI–HRMS) and UV spectrometry were employed for profiling and fingerprinting, respectively. With the aid of global natural product social molecular networking (GNPS), a total of 88 peaks were annotated as belonging to different chemical classes, of which 11 metabolites are reported for the first time in coffee seeds. Moreover, chemometric tools showed comparable results between both platforms, with more advantages for UV in the annotation of roasting products, suggesting that UV can serve as a discriminative tool. Additionally, antioxidant assays coupled with the UHPLC-ESI–HRMS dataset using partial least-squares discriminant analysis (PLS-DA) demonstrated that caffeoylquinic acid and caffeine were potential antioxidant markers in unroasted coffee versus dicaffeoyl quinolactone and melanoidins in roasted coffee. The study presents a multiplex metabolomics approach to the quality control of coffee, one of the most consumed beverages.


Introduction
Beverages containing caffeine, including coffee, are consumed daily worldwide to improve cognitive functions. Approximately three billion cups of coffee are consumed daily, representing an economic value of ca. US $200 billion annually [1,2]. Although there are more than 120 species of Coffea, coffee is brewed mainly from the seeds of Coffea arabica L. and C. canephora L. var. robusta (C. robusta) [3]. Arabica coffee is preferred by most consumers owing to its more intense aroma and flavor compared to robusta [4].
Besides caffeine (1-4%), coffee seeds are rich in other secondary metabolites with several health benefits. Major secondary metabolites in coffee include phenolic acids, i.e., chlorogenic acids, which have various pharmacological properties, such as anti-inflammatory,

Table 1. A list of coffee specimens analyzed using UHPLC-ESI-HRMS and UV spectroscopy, including origin, degree of roasting, and the sample code used in the text.

Sample Extraction and UHPLC-ESI-HRMS Analysis
Following the protocol previously developed by Farag et al. [18,19], with a few modifications, 150 mg of each coffee powder specimen was homogenized with 5 mL MeOH (100% v/v) containing 10 µg/mL umbelliferone as an internal standard using an Ultra-Turrax mixer (IKA, Staufen, Germany) set at 11,000 rpm, five times for 20 s periods, with intervals of 1 min between mixing periods to guard against heating effects. The resultant suspensions were then vortexed vigorously, centrifuged at 3000 × g for 30 min, and filtered through a 22 µm pore size filter to remove plant debris. Then, a 1 mL aliquot was pre-treated by loading onto a 500 mg C18 cartridge preconditioned with MeOH and Milli-Q water, followed by elution, performed twice, using 3 mL MeOH. The eluate was then evaporated under a nitrogen stream, and the dry residue was re-suspended in 1 mL MeOH.
UHPLC-ESI-HRMS analysis was conducted in triplicate (n = 3), with 2 µL injected into a Dionex 3000 UHPLC system (Thermo Fisher Scientific, Bremen, Germany) equipped with an HSS T3 column (100 × 1.0 mm, 1.8 µm; Waters®; column temperature: 40 °C) and a photodiode array detector (PDA, Thermo Fisher Scientific, Bremen). The chromatographic conditions were optimized for improved peak elution using a binary gradient elution protocol at a flow rate of 150 µL/min. The mobile phase consisted of water/formic acid, 99.9/0.1 (v/v) (A) and acetonitrile/formic acid, 99.9/0.1 (v/v) (B). The protocol consisted, first, of an isocratic step for 1 min of 5% mobile phase B, then a linear increase of B from 5% to 100% over 11 min. The mobile phase was kept isocratic between 11-19 min at 100% B. Afterwards, there was a return to 5% B within 1 min, and, finally, an additional 10 min, i.e., 20-30 min, for column re-equilibration using 5% B. The wavelength range of the PDA measurements used for detection was 190-600 nm.
Antioxidants 2022, 11, 131
The UHPLC system was coupled to an Orbitrap Elite high-resolution mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) equipped with a HESI electrospray ion source (spray voltage: positive ion mode 4 kV, negative ion mode 3 kV; source heater temperature: 250 °C; capillary temperature: 300 °C; FTMS resolution: 30,000). Nitrogen was used as sheath and auxiliary gas. The CID mass spectra (buffer gas: helium; FTMS resolution: 15,000) were recorded in data-dependent acquisition (DDA) mode using normalized collision energies (NCE) of 35% and 45%. The instrument was externally calibrated with Pierce® LTQ Velos ESI positive ion calibration solution (product number 88323, Thermo Fisher Scientific, Rockford, IL, USA) and Pierce® LTQ Velos ESI negative ion calibration solution (product number 88324, Thermo Fisher Scientific, Rockford, IL, USA).

UV Measurements and Multivariate Data Analysis
Three grams of each coffee sample was macerated with 30 mL methanol (100%) for 2 h, then centrifuged and filtered as described in Section 2.2. Aliquots of each coffee extract (200 µL) were pipetted into the wells of a 96-well plate in four replicates (n = 4) and measured on a Gen 5 Greener UV microplate reader (BioTek Instruments, Inc., Winooski, VT, USA) fitted with a 96-well quartz cell with 1 nm spectral resolution in the UV region. The absorption spectra were recorded in the range of 200-450 nm.
The spectral dataset was then converted to a data matrix using Excel (Excel 2010, Microsoft, Redmond, WA, USA). The matrix comprised all samples, with their biological replicates, against 250 variables (wavelengths) spanning the readings. Finally, the dataset was subjected to unsupervised multivariate data analysis, i.e., principal component analysis (PCA), using SIMCA software (version 14.1). All variables were mean-centered and Pareto-scaled.
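As an illustration of the pre-treatment step named above, the following minimal sketch mean-centers each wavelength column and divides it by the square root of its standard deviation (Pareto scaling). It is not the SIMCA implementation; the function name `pareto_scale`, the use of the sample standard deviation (n - 1 denominator), and the small absorbance matrix are assumptions made for the example.

```python
import math


def pareto_scale(matrix):
    """Mean-center each column, then divide by the square root of its
    standard deviation (Pareto scaling). Rows are samples, columns are
    variables (here, wavelengths)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_rows
        # Sample standard deviation (assumption: n - 1 denominator).
        variance = sum((x - mean) ** 2 for x in column) / (n_rows - 1)
        sd = math.sqrt(variance)
        # A constant column has sd = 0; leave it centered but unscaled.
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        for i in range(n_rows):
            scaled[i][j] = (matrix[i][j] - mean) / divisor
    return scaled


# Hypothetical absorbance readings: 3 samples x 2 wavelengths.
example = [[0.0, 1.0], [4.0, 2.0], [8.0, 3.0]]
print(pareto_scale(example))
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of intense bands while keeping low-intensity, noisy wavelengths from being inflated to the same weight.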

Tentative Identification of Metabolites Analyzed by UHPLC-ESI-HRMS
Metabolite identification was carried out based on retention times, accurate mass, fragments, comparison with reference standards, isotopic distribution, UV-Vis spectra, and mass errors, with reference to the literature and the Dictionary of Natural Products. The analysis was performed in both positive and negative modes, and the mass spectra derived from the protonated [M + H]+ or deprotonated [M − H]− ions, accompanied by their fragmentation patterns, aided in structural elucidation. The chromatograms show components as a function of their retention time, and the mass spectra show mass-to-charge ratio against relative abundance. The high-resolution mass spectrometry data were evaluated with the Xcalibur software 2.2 SP1 (Thermo Fisher Scientific).
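Accurate-mass matching of this kind is usually expressed as a mass error in parts per million. The sketch below shows that arithmetic under stated assumptions: the helper names are hypothetical, the electron mass is neglected apart from using the proton mass for the adduct, and the observed m/z value is invented for illustration rather than taken from the study.

```python
PROTON_MASS = 1.007276  # Da


def adduct_mz(monoisotopic_mass, mode):
    """Theoretical m/z of the singly charged [M + H]+ or [M - H]- ion."""
    if mode == "positive":
        return monoisotopic_mass + PROTON_MASS
    if mode == "negative":
        return monoisotopic_mass - PROTON_MASS
    raise ValueError("mode must be 'positive' or 'negative'")


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Caffeine, C8H10N4O2, monoisotopic mass 194.0804 Da.
theoretical = adduct_mz(194.0804, "positive")
# Hypothetical observed value, for illustration only.
print(round(theoretical, 4), round(ppm_error(195.0880, theoretical), 2))
```

A candidate formula is typically retained only when the absolute error falls below an instrument-dependent threshold; the threshold applied in the study is not stated here, so none is hard-coded.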

Molecular Based Networking of Coffee Specimens
Molecular networks were generated for negative ionization files using Global Natural Products Social Molecular Networking (GNPS, https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp, accessed on 21 December 2020). For the building of networks, the following parameters were set: 0.02 Da parent mass tolerance, 0.01 Da fragment ion tolerance, a cosine score of 0.7 or above, and a minimum of four matching peaks. In addition, the open-source software Cytoscape (version 3.8.2) was used for network visualization [20].
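To make the role of these parameters concrete, the following simplified sketch scores two MS/MS spectra with a plain cosine similarity, pairing fragment peaks within the 0.01 Da tolerance and accepting an edge only at a score of 0.7 or above with at least four matched peaks. It is not the GNPS algorithm, which uses a modified cosine that also accounts for precursor mass shifts; the function names, the greedy peak pairing, and the example spectrum are assumptions.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.01):
    """Cosine similarity between two spectra given as lists of
    (m/z, intensity). Returns (score, number of matched peaks)."""
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # All candidate peak pairs within the fragment tolerance.
    pairs = []
    for ia, (mz_a, int_a) in enumerate(spec_a):
        for ib, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                pairs.append((int_a * int_b, ia, ib))
    # Greedy assignment: strongest products first, each peak used once.
    pairs.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, ia, ib in pairs:
        if ia not in used_a and ib not in used_b:
            used_a.add(ia)
            used_b.add(ib)
            dot += product
    return dot / (norm_a * norm_b), len(used_a)


def is_edge(spec_a, spec_b, min_cosine=0.7, min_matched=4, tol=0.01):
    """Apply the networking thresholds quoted in the text."""
    score, matched = cosine_score(spec_a, spec_b, tol)
    return score >= min_cosine and matched >= min_matched


# Illustrative fragment list (nominal m/z values, arbitrary intensities).
spectrum = [(191.06, 100.0), (179.03, 45.0), (173.05, 10.0), (135.04, 20.0)]
print(cosine_score(spectrum, spectrum), is_edge(spectrum, spectrum))
```

The minimum-matched-peaks rule guards against high scores that arise from a single shared intense fragment, which is why two spectra sharing only three peaks are rejected even when those peaks align perfectly.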

Determination of Total Phenolic Content
Total phenolic content (TPC) of the coffee specimen extracts, prepared as explained in Section 2.3, was determined colorimetrically using Folin-Ciocâlteu reagent as described by Zhang et al., with slight modifications, with gallic acid being used as a standard for quantification [21]. Briefly, 20 µL aliquots were mixed with 100 µL of 10% Folin-Ciocâlteu reagent and left for 5 min in the dark, followed by the addition of 80 µL 7.5 mg sodium bicarbonate and incubation in the dark for 30 min. The absorbance of all samples was measured at 765 nm. In addition, a standard curve of gallic acid was established in the concentration range of 1-100 µg/mL. All measurements were made in triplicate (n = 3) and the TPC was expressed as mg gallic acid equivalent/mg extract (mg GAE/mg extract).

DPPH Radical Scavenging Assay
The DPPH radical scavenging assay was carried out as described by Hidalgo et al., with slight modifications [22]. Briefly, 30 µL of each extract prepared as described in Section 2.3 was mixed with 270 µL 6 × 10⁻⁵ M DPPH. The mixtures were then left in the dark for 30 min, and absorbance was recorded afterwards at 517 nm. A negative control was prepared with 30 µL 100% methanol instead of the sample aliquot. All measurements were made in triplicate using a microplate reader at different concentrations, i.e., 0.01, 0.1, and 0.5 µg/mL, and the results were expressed as mean ± SD.
The radical scavenging activity was measured for each specimen as percentage inhibition of DPPH = (1 − A_s/A_c) × 100, where A_s stands for the sample absorbance and A_c for the negative control absorbance. The IC50 ± SD (µg/mL) values were then calculated, representing the concentration of sample required to decrease DPPH absorption by 50%; therefore, higher IC50 values indicate lower antioxidant activity of the coffee samples.
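The calculation can be sketched as follows. The inhibition formula is the one given above; the way IC50 is derived is an assumption, since the text does not state the fitting procedure, and linear interpolation between the two bracketing concentrations is used here purely for illustration. The function names and the inhibition values are hypothetical.

```python
def percent_inhibition(a_sample, a_control):
    """Percentage inhibition of DPPH = (1 - A_s / A_c) * 100."""
    return (1 - a_sample / a_control) * 100


def ic50_by_interpolation(concentrations, inhibitions):
    """Concentration giving 50% inhibition, by linear interpolation
    between the two tested points that bracket 50%. Returns None when
    50% is not bracketed by the tested range."""
    points = sorted(zip(concentrations, inhibitions))
    for (c0, i0), (c1, i1) in zip(points, points[1:]):
        if i0 != i1 and (i0 - 50) * (i1 - 50) <= 0:
            return c0 + (50 - i0) * (c1 - c0) / (i1 - i0)
    return None


# Tested concentrations from the text (µg/mL); inhibition values are
# hypothetical and serve only to exercise the function.
concentrations = [0.01, 0.1, 0.5]
inhibitions = [10.0, 30.0, 70.0]
print(percent_inhibition(0.4, 0.8))
print(ic50_by_interpolation(concentrations, inhibitions))
```

With only three concentrations, interpolation is sensitive to where the 50% point falls; a dose-response fit over more concentrations would be a reasonable alternative.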

Ferric Reducing Antioxidant Power (FRAP) Assay
The FRAP assay is also a colorimetric assay that measures the ferric reducing power of samples. The assay was conducted according to the protocol of Fernández-Poyatos et al. Briefly, 175 µL of freshly prepared FRAP reagent, consisting of 10 mM TPTZ (2,4,6-tripyridyl-S-triazine) in 40 mM HCl, acetate buffer (300 mM, pH 3.6), and 20 mM FeCl3, was mixed with 25 µL of extract and incubated in the dark for 30 min before absorbance was recorded at 593 nm. A Trolox calibration curve (0.01-0.1 mg/mL) was constructed. The results were expressed as mg Trolox equivalents per mg extract (mg TE/g) [23]. All the measurements were performed in triplicate (n = 3) and expressed as mean ± SD.
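Reading Trolox equivalents off a calibration curve amounts to fitting a straight line to the standards and inverting it for each sample. The sketch below assumes an ordinary least-squares line; the function names and the absorbance values of the standards are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def equivalents(absorbance, slope, intercept):
    """Invert the calibration line: standard concentration that would
    give the measured absorbance."""
    return (absorbance - intercept) / slope


# Trolox standards within the stated range (mg/mL); the absorbances are
# hypothetical and lie exactly on the line A = 8.0 * c + 0.05.
standards = [0.01, 0.025, 0.05, 0.1]
absorbances = [0.13, 0.25, 0.45, 0.85]
slope, intercept = fit_line(standards, absorbances)
print(round(slope, 6), round(intercept, 6))
print(round(equivalents(0.45, slope, intercept), 6))
```

Dividing the resulting concentration by the extract concentration in the well converts it to equivalents per unit mass of extract; readings outside the calibrated range should be diluted and re-measured rather than extrapolated.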

Results and Discussion
The main objective of this study was to identify heterogeneity in coffee metabolites with reference to suppliers, roasting methods, and different commercial blends in the Middle East. To achieve this objective, methanol extracts prepared from authenticated and commercial coffee specimens were analyzed using UHPLC-ESI-HRMS and UV-Vis. In addition, total phenolic content (TPC) and one of the most important properties of coffee, viz., antioxidant activity, were determined in relation to the UHPLC-ESI-HRMS and UV-Vis datasets for comparison between both analytical platforms.

Metabolite Profiling via UHPLC-ESI-HRMS
Authenticated green and roasted coffee seeds, i.e., GCA, GCC, RCA, and RCC (Table 1), were first subjected to UHPLC-ESI-HRMS metabolite profiling. The results revealed qualitative and quantitative differences in the peaks detected in authenticated coffee seed extracts derived from both species, either in positive or negative mode (Supplementary Figure S1A,B). The metabolites were eluted in the order of organic acids, phenolic acid glycosides, alkaloids, hydroxycinnamic acids, diterpenoids, N-alkanoyl fatty acids, sphingolipids, and fatty acids. While the negative mode was able to detect most classes of metabolites, the positive mode was more suitable for alkaloids, diterpenes, amides, and nitrogenous compounds (see Table 2). Hence, both modes complemented each other, covering the identification of numerous compounds.

Table 2. Metabolites identified in methanol extracts of authenticated green C. robusta (GCC), green C. arabica (GCA), roasted C. robusta (RCC), and roasted C. arabica (RCA) via UHPLC-PDA-ESI-HRMS in both negative and positive ionization modes. Annotation of detected peaks was based on previous literature, retention times, tandem MS, and molecular networking.

GNPS was applied for the visualization of the coffee metabolome obtained from the UHPLC-ESI-HRMS platform for authenticated green and roasted coffee seeds [20]. The graphical display aided in the annotation and dereplication of the metabolites obtained from the UHPLC-ESI-HRMS datasets and in the tentative identification of unknown peaks. The resulting molecular network (MN) encompassed 145 connected nodes in 11 clusters, with the nodes representing the compounds' parent ions and the node colors representing the roasting and species attributes provided in the metadata file (Figure 1A-E). GNPS calculates similarity scores between all MS/MS fragment spectra within a dataset as an early step in building a molecular network, allowing datasets to be analyzed in comparison with all public data. The molecular networks generated by GNPS can be regarded as a visual display of groups of spectra from structurally related molecules. Each node represents a spectrum and carries information from a metadata file describing properties of the submitted files, such as sample species, processing, and type. The edges, on the other hand, correspond to alignments between spectra, and connections between nodes form clusters of similar molecules, known as molecular families, that allow the user to distinguish between the distinct families included in the network. Finally, for the visualization of molecular networks, the data were imported into Cytoscape software for further analysis (gnps.ucsd.edu) [20]. GNPS has been applied successfully to various naturally derived extracts and could potentially identify a number of metabolites [47-49].
A total of 88 peaks were annotated as belonging to different metabolite classes, including hydroxycinnamic acid esters, lactones and amides (30 compounds), fatty acids and sphingolipids (22), diterpenes and diterpene glycosides (17), alkaloids (2), phenolic acid glycosides (2), in addition to other classes that were tentatively identified. Metabolite assignments were mostly based on tandem MS spectra showing unique fragmentation patterns for the metabolites (Supplementary Figures S2-S21), as explained in the next sub-sections for each class. Table 2 lists all spectral data for the identified peaks in coffee seeds of both species.

Alkaloids
Two alkaloids were identified in peaks P5 and P6, annotated as trigonelline and caffeine, respectively, with a higher abundance of caffeine than trigonelline (Supplementary Figure S1B).
Thirty hydroxycinnamic acid (HCA) derivatives were identified, including acids, esters, lactones, and amides, among which dicaffeoyl quinic acid (P19) was the most abundant (Supplementary Figure S1A Figure S5). The compound annotations were consistent with the previous literature [25]. Both compounds were detected in the four types of authenticated seeds.
In agreement with the literature, several minor chlorogenic acid derivatives were also characterized in both green and roasted coffee seeds, i.e., P16, P22, and P28. For example, a peak at m/z 381 found to be more abundant in RCC than RCA was anno-  Figure 1B, and Supplementary Figure S10) [25].
Aside from hydroxycinnamic acid esters, lactones (quinides) were identified specifically in roasted seeds of both species, i.e., RCC and RCA, and were hence considered roasting-associated products. Their annotations were confirmed by their clustering together in the MN (Figure 1B Figure S11) [32]. Lastly, five hydroxycinnamoyl amides were identified, mostly in robusta coffee seeds, i.e., P82 and P83 in GCC and RCC (Table 2). The MN (Figure 1C) grouped most of the hydroxycinnamoyl amides listed in Table 2, indicating their structural similarities. The results were in agreement with the previous literature distinguishing C. robusta products from other Coffea species-containing analogues [46].

Diterpenes
The major diterpenes in coffee are cafestol and kahweol, reported in both C. arabica and C. robusta species [41]. In the current study, with the aid of MN, several diterpenes were identified in investigated coffee samples ( Figure 1D). They were detected in positive ion mode given their lack of an electronegative group, as in phenolic compounds, highlighting the importance of profiling in different ionization modes [50].  Figures S12 and S13).
Additionally, a new diterpene was identified in P45, particularly in roasted seeds of both species, i.e., RCA and RCC. Reported here for the first time in coffee seeds, it is likely derived from dehydrocafestol by a further dehydration step, which may readily occur during seed roasting, and was annotated as a dehydrocafestol derivative. It was assigned based on  Figure S14) [41]. Hence, such compounds may be recognized as roasting markers.
Moreover, mozambioside, a diterpenoid glycoside of furokaurane type, was reported previously as a marker for C. arabica species [39]. It was detected in the current study in both green and roasted arabica (P38) (Supplementary Figure S1B [38], presenting a new marker for green arabica species (GCA), since it was completely absent from roasted RCA seeds, which suggests its degradation upon roasting. P36 was identified in Isodon species and this is the first time it has been reported in coffee seeds (Supplementary Figure S16) [38].

Fatty Acids and Sphingolipids
Seeds are well known for their richness in lipids. Coffee seed analysis revealed that various fatty acids and sphingolipids eluted late, at Rt > 12.00 min of the UHPLC chromatograms, given their relatively nonpolar nature (Table 2). The negative ionization mode revealed several non-hydroxylated, i.e., P54 and P69, and hydroxylated fatty acid, i.e., P52, P71, and P72, peaks. An example of a non-hydroxylated fatty acid was P69  Figure S17). P69 was detected in all investigated coffee samples. In addition, the hydroxylated fatty acids in P52, P71, and P72 showed [M − H]− at m/z 329, 483, and 355, respectively. They were annotated with reference to the literature [11]. It is worth mentioning that P52, i.e., trihydroxy-octadecenoic acid, was only detected in green and roasted arabica coffee (Table 2).
Nitrogen-containing lipids, including various sphingolipids and fatty acyl amides, were also detected. In particular, P74 m/z 338 [M + H] + was annotated as docosenamide, which is a fatty acyl amide showing MS/MS fragments at m/z 321 [M+H-17] + , corresponding to the loss of ammonia, which is in agreement with previous literature [43]. It was detected in both roasted arabica and robusta seeds for the first time (Table 2 and Supplementary Figure S18). Therefore, P74 may be considered a roasting marker. The MN for fatty acids and sphingolipids is illustrated in Figure 1E.

Miscellaneous
Phenylpropanoid esters of sucrose are common secondary metabolites in planta. They are considered potential candidates for drug discovery [36]. However, they have not been reported before in coffee seeds. A feruloyl ester of sucrose, identified at m/z 735 (P33) in negative mode with a product ion at m/z 367, was detected in green arabica and robusta species and annotated as acetyl-diferuloyl sucrose (Supplementary Figure S21) [35].

UHPLC-HRMS Based Multivariate Data Analyses and Fingerprinting of Coffee Samples
Although differences in metabolite composition could be revealed from the visual inspection of the UHPLC-MS chromatograms of coffee specimens (Supplementary Figure S1), the dataset was holistically extracted from the UHPLC-HRMS using multivariate data analyses, especially considering the large number of samples (57 samples) represented by three biological replicates each. Several models were constructed to classify coffee samples, stratifying them according to their species, roasting indices, and different blends common in the Middle East, i.e., cardamom and Qassim, as discussed in the next subsections.

Roasted versus Green Coffee
Based on the UHPLC-HRMS dataset, PCA was first applied as an unsupervised method, with different symbols denoting roasted samples, i.e., RCA, RCC, LRCM, LRS, HRKC, BRK, LRCK, BRA, LRSQ, LRCS, LRCQ, ICA, ICC, versus green or unroasted ones, i.e., GCA, GCC, GCU, GCE, GCS, GCK. The PCA score plot (Supplementary Figure S22A) showed values of R² = 0.49 and Q² = 0.38, indicating an acceptable model, though without clear segregation of roasted from green samples and with some overlap between the investigated specimens. A few markers responsible for such segregation appeared in the PCA loading plot (Supplementary Figure S22B), including caffeine (P6) and, to a lesser extent, trigonelline (P5), which were enriched in roasted samples, while chlorogenic acid isomers, such as feruloyl quinic acid (P11), were found to be abundant in unroasted specimens.
In another attempt to identify better markers and improve the classification potential of roasted versus unroasted samples, a supervised OPLS model was established for the same dataset. The supervised model showed R² and Q² of 0.91 and 0.83, respectively, with better sample segregation (Figure 2A). The S-plot showed other markers for HCAs, i.e., mozambioside (P38) and caffeoyl-dimethoxy cinnamoyl quinic acid (P24) for green specimens. In contrast, caffeoyl-quinolactone (P13) and dicaffeoyl-quinolactone (P18) were characteristic for the roasted samples and likely to be generated upon roasting as a result of dehydration reactions for the quinic acid derivatives and the formation of chlorogenic acid lactones (CGLs) (Figure 2B) [32].

Instant versus Roasted Coffee Samples
As instant coffee production usually involves processing steps additional to roasting, PCA was employed to determine the variability between roasted and instant coffee samples (Supplementary Figure S22C). PCA was employed for 12 coffee samples, including an instant sample (ICA), an instant sample with cardamom (ICC), along with 10 roasted specimens denoted by different symbols (RCA, RCC, LRCM, LRS, HRKC, BRK, LRCK, BRA, LRSQ, LRCS). The PCA model revealed that the plain instant sample, i.e., ICA, was well separated from the other samples in the two PC projections, representing 40% of the total variance. The low PC values might be attributed to the fact that the instant sample blended with cardamom, i.e., ICC, was not segregated from the plain instant sample in addition to variation in the roasting degree of the coffee samples.
Consequently, an OPLS model was constructed, which showed better separation parameters (PC1 and PC2 = 85%). In addition, R² and Q² took the values 0.98 and 0.95, respectively (Figure 2C). The S-plot revealed that caffeine (P6) and dicaffeoyl quinolactone (P19) were the major markers for roasted samples, while sphingolipid conjugates, i.e., P60, were, interestingly, predicted to be the main markers for the instant sample (Figure 2D).
Another chemical that should have contributed to the separation of instant from roasted samples is acrylamide, whose enrichment in instant as opposed to roasted coffee has been reported in many studies [52]. However, it could hardly be detected by LC-MS using the method employed in the current research. This may be attributed to its low molecular mass and concentration, which are better suited to detection by GC-MS or by LC-MS after a clean-up pretreatment [52,53], suggesting that it would be of value to use more than one technique in metabolomics [18].

Blended versus Plain Samples
To determine the impact of blending coffee as is typical in the Middle East region, PCA was employed to model 10 samples, including cardamom blended and plain samples, with specimens denoted by different symbols for cardamom and Qassim blended (LRCM, HRKC, LRCK, LRSQ, LRCS) versus plain ones (RCA, RCC, BRA, BRK, LRS). The score plot model showed the clustering of samples blended with cardamom mainly on the left side, while plain samples along with a few cardamom blended samples were placed on the right side along PC1 to cover 33% of the total variance (Supplementary Figure S22D). The loading plot demonstrated the possible enrichment of caffeine (P6) in plain coffee, while dicaffeoyl quinic acid (P19) and feruloyl quinic acid (P11) were found in blended products (Supplementary Figure S22E).
The OPLS score plot showed improved discrimination between the investigated samples, with PCs accounting for 91% of the total variance (R² = 0.98 and Q² = 0.91) (Figure 2E). Finally, the S-plot confirmed the PCA loading results regarding the higher abundance of caffeine in plain coffee versus the enrichment of chlorogenic acids, i.e., dicaffeoyl quinic acid and feruloyl quinic acid, in coffee blended with cardamom (Figure 2F). The obtained results were in accordance with a previous HPLC analysis of cardamom, which revealed its richness in phenolic compounds, i.e., tannic, caffeic, gallic, and dicaffeoyl quinic acids [54]. OPLS-DA model validation was assessed by the diagnostic metrics R² (explained variance) and Q² (predictive ability), which were greater than 0.8, with most models showing a regression line crossing zero, with negative Q² and R² close to 1, signifying model validity. Moreover, the p-values for each model were calculated using CV-ANOVA (analysis of variance of cross-validated residuals) and were all below 0.005 (Supplementary Figures S23-S25).
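For readers less familiar with these diagnostics, both can be written as one minus a ratio of sums of squares: R² uses the residuals of the fitted model, whereas Q² uses the residuals of predictions made for samples left out during cross-validation. The sketch below is a simplified single-response illustration and not the cumulative, per-component calculation performed by SIMCA; the function name and all numerical values are hypothetical.

```python
def one_minus_ss_ratio(y_true, y_pred):
    """1 - (sum of squared residuals) / (total sum of squares).
    With fitted values this is R2; with cross-validated, held-out
    predictions it is Q2."""
    mean_y = sum(y_true) / len(y_true)
    total_ss = sum((y - mean_y) ** 2 for y in y_true)
    residual_ss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - residual_ss / total_ss


# Dummy class variable: 0 = green, 1 = roasted (hypothetical samples).
y = [0.0, 0.0, 1.0, 1.0]
fitted = [0.1, 0.0, 0.9, 1.0]          # predictions from the full model
cross_validated = [0.2, 0.1, 0.7, 0.9]  # predictions when left out

print(round(one_minus_ss_ratio(y, fitted), 2))           # R2
print(round(one_minus_ss_ratio(y, cross_validated), 2))  # Q2
```

Because held-out predictions are never better than fitted ones on average, Q² is normally below R²; a large gap between the two is a common sign of overfitting.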

UV-Vis Fingerprinting of Coffee Seeds
UV-Vis spectroscopy was further applied as a fingerprinting method, using the same extraction method employed for UHPLC-HRMS, for comparison with the models derived from UHPLC-HRMS-based metabolome analysis of the coffee samples. UV spectral bands mostly provide information about the characteristic profiles of conjugated bioactive components, i.e., phenolic acids (220-325 nm), methylxanthines (244-300 nm), and chlorogenic acids (200-325 nm). In this study, UV-Vis fingerprinting coupled to multivariate analysis was applied for differentiation between coffee specimens based on the same variables examined with UHPLC-HRMS, i.e., genotypes, suppliers, roasting levels, and blending.

Roasted versus Green Coffee Specimens
The PCA score plot of the roasted and unroasted samples successfully classified most of the samples along PC1 (91%) (Supplementary Figure S26A). Moreover, the loading line plot revealed intense absorption by green samples in the region of 220-350 nm, suggesting their richness in phenolic acids. In contrast, roasted samples were segregated based on their higher absorption values between 375 and 450 nm, which could be explained by the brown color or the presence of melanoidins (λmax = 420 nm). Melanoidins have been reported to exhibit different absorption bands over the course of roasting: early roasting (λmax = 280 nm), medium roasting (λmax = 330 nm), and final roasting (λmax = 420 nm). Hence, UV is suggested for determining the roasting index in the coffee industry as a simple, robust, and inexpensive method compared to UHPLC-HRMS.
Moreover, an OPLS-DA model was constructed for maximal separation of the samples. Green samples were distinctly clustered away from roasted samples, and the model showed good validation parameters (R² and Q² = 1) (Figure 3A). Inspection of the corresponding S-line plot revealed that the wavelengths discriminating green from roasted samples were at 210-230 nm and 300-330 nm, which is attributable to the absorption of phenolic acids (Figure 3B). These results were consistent with those obtained from the S-plot for the UHPLC-MS model, which revealed the enrichment of green coffee with chlorogenic acids (Figure 2B).

Instant versus Roasted Samples
The instant samples, i.e., ICA and ICC, were well separated along PC1 (93%), appearing as outliers in the upper part of the plot (Supplementary Figure S26B). The corresponding loading plot revealed that instant samples absorbed more at 220 nm and 290 nm, which is likely attributable to fatty acids and sphingolipid conjugates, in accordance with the UHPLC-HRMS results (Figure 2D). Interestingly, a UV band at λmax = 275 nm was 3.5-fold more intense than in the roasted arabica sample, which is likely attributable to acrylamide, suggesting that instant coffee contains more acrylamide than roasted coffee, consistent with previous literature [52]. These results highlight how UV complemented the results derived from UHPLC-MS by revealing potential coffee markers not detected by the latter technique, including melanoidins and acrylamide, which are indicative of coffee processing level and, further, safety.
To confirm the acrylamide-derived band in UV, the spiking method with an acrylamide reference standard was attempted. The results showed that the absorbance of the sample spiked with acrylamide increased at 273 nm, in agreement with Alfarhani [55] (Supplementary Figure S27).
For further confirmation, an OPLS-DA model was built for better differentiation between the two sets of samples, with R² and Q² values computed to be 0.97 and 0.91, respectively (Figure 3C). Upon investigation of the S-line plot of the OPLS-DA loading plot, characteristic UV regions of instant samples relative to roasted samples were identified in the range of 220-290 nm, likely due to the absorption wavelengths of sphingolipids (λmax = 230 nm) and/or acrylamide (Figure 3D) [52].

Blended versus Plain Coffee Samples
A PCA model was constructed to distinguish blended from plain coffee samples, but it showed no clear segregation of cardamom and Qassim blended coffee samples from roasted samples, with overlap between the two groups (Figure 3E). The loading plot showed that plain samples absorbed in the range of 350-450 nm, suggesting higher melanoidin levels, while cardamom blended samples, along with lightly roasted samples, were richer in phenolic acids, absorbing in the range of 220-350 nm, which reflects the cumulative phenolic content of both cardamom and roasted coffee [54].
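The score and loading plots discussed in this section come from PCA applied to UV absorbance spectra. The sketch below illustrates the idea on a toy data set, extracting only the first principal component by power iteration on mean-centred spectra. The wavelengths, absorbance values, and the function name `first_pc` are hypothetical, and power iteration is used here only to keep the example dependency-free; it is not claimed to be the algorithm of the chemometrics software used in the study.

```python
import math


def first_pc(spectra, n_iter=100):
    """Return (loadings, scores, explained fraction) of PC1."""
    n, p = len(spectra), len(spectra[0])
    # Mean-centre each wavelength (column).
    means = [sum(row[j] for row in spectra) / n for j in range(p)]
    X = [[row[j] - means[j] for j in range(p)] for row in spectra]

    # Start from the centred spectrum with the largest norm.
    v = max(X, key=lambda r: sum(c * c for c in r))
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("all spectra are identical; no variance to explain")
    v = [c / norm for c in v]

    # Power iteration on X^T X converges to the PC1 loading vector.
    for _ in range(n_iter):
        scores = [sum(x[j] * v[j] for j in range(p)) for x in X]
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]

    scores = [sum(x[j] * v[j] for j in range(p)) for x in X]
    total = sum(c * c for x in X for c in x)
    explained = sum(s * s for s in scores) / total
    return v, scores, explained


# Hypothetical absorbances at 275, 325 and 420 nm (illustrative only).
spectra = [
    [0.20, 0.50, 0.60],  # roasted
    [0.25, 0.50, 0.65],  # roasted
    [0.70, 0.50, 0.30],  # instant
    [0.75, 0.50, 0.35],  # instant
]
loadings, scores, explained = first_pc(spectra)
```

In this toy case the loadings at 275 nm and 420 nm have opposite signs, so samples absorbing more at 275 nm fall on one side of PC1 and samples absorbing more at 420 nm fall on the other, which is how a loading plot links sample separation to wavelength regions. The overall sign of a principal component is arbitrary.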

Comparison between UHPLC-MS and UV Fingerprinting Multivariate Data Analysis Models
The classification potential of UV and UHPLC-MS was compared based on their PCA and OPLS results. The two techniques were found to be generally comparable and to complement each other in metabolite detection across the different coffee specimens. The PCA and OPLS loading plots obtained from UHPLC-MS revealed that caffeine and CGLs contributed to the discrimination of roasted samples, whereas chlorogenic acids and diterpenes were abundant in unroasted samples. In contrast, in the UV models the caffeine bands were expected to interfere with those of chlorogenic acids; nevertheless, roasted samples clustered tightly at higher absorption ranges, i.e., 350-450 nm, attributable to melanoidin absorption (λ max = 420 nm), which was not detected using UHPLC-MS. Other chemicals inferred from the UV models were acrylamides, which showed increased absorption in roasted specimens (Figure 3B). On the other hand, unroasted coffee samples showed higher absorbance in the region of 220-350 nm, typical of chlorogenic acid (λ max = 220 and 325 nm) and diterpene (λ max = 298 nm) absorption (Figure 3B).
Likewise, the addition of cardamom to coffee was investigated by both techniques; the PCA score plot obtained from the UHPLC-MS measurements showed tighter clustering of instant samples than roasted samples, while the UV model could not distinguish clearly between roasted and cardamom blended samples, suggesting that blending effects are better revealed using UHPLC-MS than using a UV model.
However, the UHPLC-HRMS-derived model was able to detect neither acrylamide nor melanoidins, which are important markers of the impact of processing on coffee and its safety. In contrast, the UV model loading plot showed that instant samples had strong absorption for acrylamide (λ max = 275 nm), while roasted samples had higher absorbance in the range of 370-420 nm, suggesting variation in melanoidin formation during roasting. Accordingly, instant coffee may be considered less safe than other types of coffee (Figures 2 and 3), although comparative toxicological studies in animals should be pursued in the future to confirm such hypotheses generated from chemical analyses.

Determination of Total Phenolic Content of Coffee Species
UHPLC-MS analysis revealed phenolic enrichment in coffee seeds (Table 2) to be affected by roasting [56]. Therefore, quantitative determinations of total phenolics were investigated in both commercial and authenticated coffee specimens and correlated with coffee seeds' bioactivities, i.e., antioxidant action.
The limit of detection (LOD) and limit of quantitation (LOQ) of the applied assay were calculated as 0.37 and 1.14 mg GAE/mg extract, respectively. The TPC results showed that the highest levels, 50-52 mg GAE/g, were detected in BRA, LRCK, and GCK, while the lowest, 3-7.7 mg GAE/g, were found in ICC and ICA (Supplementary Table S1). In addition, increasing the roasting degree led to a decrease in TPC, with the highest levels observed in lightly roasted and green samples and a marked decline in heavily roasted and instant samples. These findings were consistent with previous reports suggesting the superiority of green and light roasted coffee as a rich source of free polyphenols compared to processed (instant) and roasted coffee [57]. Nevertheless, differences between roasted samples may be attributed to the degradation of chlorogenic acids and their contribution to the development of Maillard reaction products, i.e., melanoidins [58]. Additionally, Arabian coffee blended with cardamom (ICC) and instant C. arabica (ICA) were recognized as having the lowest levels of phenolics, suggesting their degradation during the further processing steps and that instant coffee provides low phenolic levels, even compared to roasted coffee (Figure 4).
Figure caption fragment (truncated in the source): "... Table 1. *: Significant values compared to GCC specimen (p < 0.05)."
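The text does not state how the LOD and LOQ of the TPC assay were derived. One widely used convention estimates them from a calibration curve as LOD = 3.3σ/S and LOQ = 10σ/S, where S is the slope and σ the residual standard deviation of the regression. The reported values have a ratio of about 3.1 (1.14/0.37), which is close to 10/3.3, so this convention is at least compatible with them, but its use here is an assumption. The sketch below applies it to hypothetical gallic acid standards; the concentrations, absorbances, and function name are illustrative and not taken from the study.

```python
import math


def calibration_lod_loq(conc, signal):
    """Return (slope, sigma, LOD, LOQ) from a linear calibration curve.

    Assumed convention: LOD = 3.3 * sigma / slope, LOQ = 10 * sigma / slope,
    with sigma the residual standard deviation of the regression.
    """
    n = len(conc)
    mean_c, mean_s = sum(conc) / n, sum(signal) / n
    sxx = sum((c - mean_c) ** 2 for c in conc)
    sxy = sum((c - mean_c) * (s - mean_s) for c, s in zip(conc, signal))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_c

    # Residual standard deviation with n - 2 degrees of freedom.
    rss = sum((s - (slope * c + intercept)) ** 2 for c, s in zip(conc, signal))
    sigma = math.sqrt(rss / (n - 2))

    return slope, sigma, 3.3 * sigma / slope, 10 * sigma / slope


# Hypothetical gallic acid standards (concentration units arbitrary)
# and their absorbances; values are illustrative only.
standards = [10, 20, 40, 60, 80, 100]
absorbances = [0.11, 0.20, 0.41, 0.59, 0.81, 1.00]
slope, sigma, lod, loq = calibration_lod_loq(standards, absorbances)
```

The LOD and LOQ come out in the same concentration units as the standards, so expressing them per mass of extract requires the extract concentration used in the assay.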
In contrast, lightly roasted samples, including BRK, LRS, and LRCS, showed lower IC50 values, at 27.3, 43.2 and 48.6 µg/mL, respectively, suggesting their potential antioxidant power and that phenolics are more crucial than melanoidins for determining antioxidant action. A few roasted samples, such as RCC and RCA, had IC50 values at 74.2 and 103.3 µg/mL, respectively, indicating improvements in antioxidant activity that may be attributable to the production of melanoidins (Table 1 and Figure 4) [16].

In Vitro FRAP Assay
To confirm the results derived from DPPH, a second antioxidant assay, FRAP, was performed. FRAP results were generally in accordance with the DPPH radical scavenging activity (Supplementary Table S1). The BRK, BRA, and GCC samples showed the strongest antioxidant effect, with FRAP values of 34.1, 28.2, and 26.19 mg TE/mg extract, respectively. In contrast, the heavily roasted and instant samples HRKC, RCA, and ICC showed FRAP values of 6.3, 3.9, and 1.5 mg TE/mg extract, respectively. Cardamom addition in the different coffee blends did not increase FRAP values, as seen in HRKC (6.3 mg TE/mg extract), LRCM (7.9 mg TE/mg extract), and the instant coffee product ICC (1.5 mg TE/mg extract), in accordance with the DPPH assay (Figure 4 and Supplementary Table S1).

Correlation between Biological Assays and UHPLC-MS Metabolite Profile
A correlation between the biological assays, i.e., the antioxidant assays, and the UHPLC-MS dataset was attempted to determine the metabolites responsible for the antioxidant activity of the different coffee samples. Hence, a partial least-squares (PLS) model was constructed, taking the UHPLC-MS metabolites as x-variables and the corresponding biological assay parameters (TPC, DPPH, FRAP) as y-variables. The PLS model explained 99% of the total variance in Y (R² = 0.99) with a predictive ability of 92% (Q² = 0.92), and the loading plot displayed a positive correlation with all assays. Investigation of variable importance in projection (VIP) enabled recognition of the metabolites responsible for the antioxidant effects and pinpointed the relation between the x- and y-variables in the PLS model. The main metabolites with significant VIP scores included caffeine and caffeoylquinic acid, with VIP scores of 6.6 and 6.8, respectively.
The abundance of chlorogenic acid and its derivatives in coffee (7-12%), i.e., caffeoylquinic acids, resulted in additional potential antioxidant activities, including alleviation of cellular oxidative modulation. Additionally, several studies on caffeine revealed that it exerted hydrophilic and lipophilic antioxidant activity. Interestingly, dicaffeoyl quinolactone, which is formed mainly during the roasting process, showed a lower VIP score of 1.9, suggesting that it has a lower correlation potential and a lesser antioxidant effect, as demonstrated in the radar plot (Supplementary Figure S28).
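The VIP scores discussed above rank x-variables by their contribution to a PLS model. A common definition, assumed here because the text does not give the formula used, is VIP_j = sqrt(p · Σ_a SSY_a (w_ja / ‖w_a‖)² / Σ_a SSY_a), where p is the number of x-variables, w_a is the weight vector of component a, and SSY_a is the Y sum of squares explained by that component. The sketch below evaluates this expression for hypothetical weights and explained sums of squares; in practice these would come from a fitted PLS model, and the numbers and function name here are illustrative only.

```python
import math


def vip_scores(weights, ssy):
    """VIP score of each x-variable in a PLS model.

    weights[a][j] is the weight of variable j on component a;
    ssy[a] is the Y sum of squares explained by component a.
    """
    n_vars = len(weights[0])
    total_ssy = sum(ssy)
    vips = []
    for j in range(n_vars):
        acc = 0.0
        for w_a, ssy_a in zip(weights, ssy):
            norm_sq = sum(w * w for w in w_a)
            # Squared normalized weight, scaled by what the component explains.
            acc += ssy_a * (w_a[j] ** 2) / norm_sq
        vips.append(math.sqrt(n_vars * acc / total_ssy))
    return vips


# Hypothetical two-component model with three metabolites.
weights = [
    [0.8, 0.6, 0.0],  # component 1
    [0.0, 0.6, 0.8],  # component 2
]
ssy = [9.0, 1.0]  # component 1 explains most of the Y variance
vips = vip_scores(weights, ssy)
```

Under this definition the squared VIP scores average to 1 across all variables, which is why a value above 1 is often used as a rule of thumb for an influential variable. With many x-variables, as in a metabolomics data set, individual scores well above 1 are possible.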

Conclusions
The study presented a multiplex metabolomics approach using two different platforms, UHPLC-HRMS and UV fingerprinting, for the tentative identification of secondary metabolites in different coffee products. Specimens differed with respect to several variables, such as genotype, roasting process, supplier, and additives. Both UHPLC-HRMS and UV spectroscopy coupled to multivariate data analysis revealed differences among authenticated Brazilian and commercial samples consumed in the Middle East. This comparative metabolomics approach provided the first detailed profile of green and roasted coffee metabolomes in that region. Additionally, GNPS aided in the identification of metabolites via UHPLC-HRMS data analysis, resulting in the tentative identification of several novel phenolics and diterpenes in coffee seeds. In contrast, UV fingerprinting provided preliminary data on the absorption ranges of the main chemical classes, showing its use as an alternative to UHPLC-HRMS that is cheaper and simpler to operate. Interestingly, both techniques were generally comparable with respect to metabolite detection in specimens, with more advantages found for UV in the identification of acrylamide and melanoidins, which are indicative of processing level. The developed comparative metabolomics approach can be considered for future quality control purposes in addition to other spectroscopic techniques. For unknown derivatives belonging to certain classes that GNPS revealed to have characteristic fragments/patterns, other spectral analyses, including NMR, should be considered in future work.
Furthermore, in vitro antioxidant assays provided a measure of how the antioxidant activity of different roasted and green coffee samples correlated with differences in metabolite composition, where phenolic compounds, such as caffeoylquinic acid, were more crucial than melanoidins. In addition, dicaffeoyl quinolactone had a lower impact on the antioxidant effect of coffee products compared to dicaffeoylquinic acid.