Characterization of Sparkling Wine Based on Polyphenolic Profiling by Liquid Chromatography Coupled to Mass Spectrometry

Abstract: Polyphenols are phytochemicals naturally present in wines that arouse much interest in the scientific community due to their health-promoting properties. In addition, their role as descriptors of various wine attributes, such as the geographical origin or the grape variety, should not be underestimated. Here, Pinot Noir and Xarel·lo monovarietal samples belonging to the sparkling wine production process have been studied, corresponding to base wines from a first alcoholic fermentation (plus malolactic fermentation in some cases), base wines resulting from tartaric stabilization, and sparkling wines from a second alcoholic fermentation aged for 3 and 7 months. One of the objectives of this paper is to obtain valuable chemical and oenological information by processing a large amount of data with suitable chemometric methods. High-performance liquid chromatography coupled with ultraviolet spectroscopy and tandem mass spectrometry (HPLC-UV-MS/MS) has been used for the determination of polyphenols in wines and related samples. The method relies on reversed-phase separation followed by detection by multiple reaction monitoring. Concentrations of relevant phenolic compounds have been determined, and the resulting compositional data have been used for characterization purposes. Exploratory studies by principal component analysis have shown that samples can be discriminated according to variety and quality. Further classification models have been established to assign unknown samples to their corresponding classes. For this purpose, a sequential classification tree has been designed involving both variety and quality classes, and an excellent classification rate has been achieved.


Introduction
Phenolic compounds are important phytochemicals naturally occurring in oenological products such as musts and wines. Despite their great structural diversity, they are often classified into various families, with phenolic acids, stilbenes, and flavonoids as the most remarkable classes in grape-derived matrices. The basic structural skeletons are depicted in Table S1 (Supplementary Material), and additional descriptions can be found elsewhere [1,2]. This large group of molecules exhibits remarkable antioxidant attributes responsible for beneficial features such as anti-inflammatory, cardioprotective, antineoplastic, or antimicrobial activities [1,3].
Beyond their well-known health-promoting properties, the role of phenolic compounds as descriptors of different food features, such as the geographical origin, botanical variety, agricultural practices, or fermentation processes, should not be underestimated [4,5]. Phenolic compounds are originally present in the grapes, especially in skins and seeds, while the levels in pulp are lower. Hence, the maceration process during vinification is fundamental to lixiviate these compounds from the solid matter to the must. In white and rosé wines, the maceration is minimal (or very limited), so their phenolic content is significantly lower (10- to 50-fold lower) compared to red wines. Regarding the family type, hydroxycinnamic acids are mainly detected since they are widespread in the pulp, while anthocyanins or tannins, which provide red color and astringency to the wine, are quite residual [6-8].
In this paper, white and rosé base wines and sparkling wines were analyzed to evaluate the ability of polyphenols as variety and quality descriptors. Although varietal issues have commonly been addressed based on phenolics (various representative examples have been cited above), to the best of our knowledge, the use of such data for quality assessment had not been considered previously. The quantification of phenolic acids and flavonoids was carried out by high-performance liquid chromatography with ultraviolet detection coupled with tandem mass spectrometry (HPLC-UV-MS/MS). The complex multivariate nature of the relationships between wine features and concentrations made it difficult to extract the underlying information. Hence, statistical and chemometric methods were required to establish patterns to characterize and authenticate the wine samples. Conclusions were drawn concerning the influence of varietal and quality wine features on the compositional profiles and, although no selective markers were found, some compounds were up-expressed in some wine classes.
For calibration purposes, standard solution mixtures at concentrations from 0.02 to 10 mg L⁻¹ were prepared in methanol:water (1:1, v:v) and stored at 4 °C until use.

Oenological Samples
Samples analyzed in this paper were kindly provided by Codorniu S.A. (Sant Sadurní d'Anoia, Spain). They are monovarietal products summarized in Table 1, including base wines, stabilized wines, and sparkling wines elaborated with a white grape (Xarel·lo) produced in Penedès (Catalonia, Spain) and a red grape (Pinot Noir) from Conca de Barberà and Costers del Segre (both from Catalonia, Spain). This range of wine products is representative of the different steps in the elaboration of sparkling wines following the traditional Champenoise method [10,33]. In more detail, monovarietal base wine, either from Xarel·lo or Pinot Noir varieties, resulted from the first alcoholic fermentation carried out in stainless steel tanks at 15 to 18 °C. When necessary, malolactic fermentation (MLF) was also applied to reduce the unpleasant sour taste due to high levels of malic acid. Subsequently, the monovarietal base wines were clarified and stabilized to avoid further precipitation of tartrate salts, thus resulting in the so-called stabilized wines. The last step consisted of the second alcoholic fermentation in the bottle. After that, samples aged in contact with lees for 3 and 7 months were collected for analysis.
Apart from winemaking aspects, products were classified into four qualities (here coded as A, B, C, and D, with A being the top quality and D the lowest) according to agricultural and vinification criteria. The initial classification of the products into qualities A, B, C, and D was carried out by expert oenologists to pre-establish the oenological, sensory, and commercial possibilities of the future wines. In general, the highest quality wines were likely to be aged for several years and generate very select products with a high market value. On the other hand, qualities C and D were used to produce large volumes of wines that can only be aged for a short time, from 9 to 18 months, and were marketed at much lower prices. Briefly, the type of grape plantation (ecological or conventional), the type of harvest and transport (manual for class A or mechanized for B, C, and D), and the vineyard productivity (from ca. 6000 kg per hectare for A to more than 10,000 kg per hectare for D) were the factors that conditioned the quality assignment. In addition to the harvest and transport, the pressure applied in the pressing to obtain the must was a fundamental issue in the wine quality. The higher the pressure, the higher the production yield, but the lower the quality, since stronger pressing led to a higher proportion of astringent compounds and unwanted acids. In addition, the sensory freshness of the product diminished, and the aging possibilities were more limited. Moreover, MLF was applied to wines of C and D qualities to improve the organoleptic features, leading to creamy flavors characteristic of high levels of lactic acid. Conversely, products of A and B quality were not subjected to MLF, so they presented fresh and fruity flavors in the mouth.

Analytical Procedure
Wines and related samples were filtered with 0.45 µm nylon filters (Whatman, Clifton, NJ, USA) and analyzed by HPLC-UV-MS/MS in the multiple reaction monitoring (MRM) acquisition mode. This method was previously established and validated by Mir-Cerdà et al. [34]. The chromatography equipment was composed of an Agilent 1100 Series liquid chromatograph (Agilent Technologies, Palo Alto, CA, USA), equipped with a vacuum degasser (G1322A), binary pump (G1312A), and autosampler (G1367A), coupled to an Applied Biosystems 4000 QTrap hybrid triple quadrupole/linear ion trap mass spectrometer (AB Sciex, Framingham, MA, USA).
Mass spectrometry in MRM mode was used for analyte confirmation and quantification using the corresponding standards. The electrospray source operated in negative mode at −2500 V and a temperature of 700 °C. Nitrogen was used as nebulizer and auxiliary gas, with the curtain gas, ion source gas 1, and ion source gas 2 set at 20, 50, and 50 arbitrary units, respectively. Declustering potential (DP), collision energy (CE), collision cell exit potential (CXP), and ion transition pairs were optimized elsewhere [34] (see Table S2 in the Supplementary Material for detailed information). LC-UV-MS/MS chromatograms were acquired and processed with Analyst 1.6.2 (AB Sciex, Framingham, MA, USA). For quantitative purposes, standard solutions of each analyte.
Samples were analyzed randomly in triplicate. Quality control (QC) and blank samples were measured every 10 samples. The calibration curve, prepared in the concentration range from 0.02 to 10 mg L⁻¹, was run at the beginning and at the end of the sample set.
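As a rough illustration of how such a calibration curve can be used for quantification, the following sketch fits a straight line by ordinary least squares and back-calculates a concentration from a peak area. It is not the procedure implemented in the acquisition software; the function names and all peak areas are hypothetical, and a linear, unweighted detector response is assumed.

```python
# Illustrative calibration sketch. The peak areas are made-up numbers,
# not data from the study, and a linear unweighted response is assumed.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration(peak_area, slope, intercept):
    """Back-calculate a concentration (mg/L) from a measured peak area."""
    return (peak_area - intercept) / slope


# Standard levels spanning the 0.02-10 mg/L range quoted in the text.
standards_mg_l = [0.02, 0.1, 0.5, 1.0, 5.0, 10.0]
# Hypothetical detector responses (arbitrary units) for those standards.
areas = [50.0 + 1200.0 * c for c in standards_mg_l]

slope, intercept = fit_line(standards_mg_l, areas)
sample_conc = concentration(2450.0, slope, intercept)  # hypothetical sample area
```

Running the curve at the beginning and at the end of the sequence, as described above, would give two such fits whose slopes can be compared to check for response drift.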

Data Analysis
Statistical tests, ANOVA, and boxplots were performed using Microsoft Excel (Microsoft Corporation, Redmond, WA, USA), with α = 0.05 chosen as the significance level. Multivariate studies for preliminary data exploration by Principal Component Analysis (PCA) and sample classification by Partial Least Squares-Discriminant Analysis (PLS-DA) were carried out with SOLO (Eigenvector Research, Inc., Manson, WA, USA). More information on the chemometric algorithms and their possibilities in food characterization and authentication can be found in the literature [35-37].

Results
Concentrations of remarkable phenolic acids and flavonoids occurring in the set of samples were determined by the HPLC-UV-MS/MS method described in the experimental section. This analytical method was previously developed and validated by Mir-Cerdà et al. [34]. Samples were analyzed randomly in triplicate. QC and blank solutions were injected every 10 samples to check the method's reproducibility and control the carryover. Analytes were quantified using the calibration curve prepared with the corresponding standard solutions, which were injected at the beginning and end of the analyses. Table S3 in the Supplementary Material details the compositional profiles of the samples under study.
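The injection scheme described above can be sketched as a short script that randomizes the replicate injections and interleaves control injections. This is only an illustration: the sample identifiers, the random seed, and the function name are hypothetical, and it is assumed that one QC and one blank are injected together after every tenth sample injection.

```python
import random


def build_sequence(sample_ids, replicates=3, qc_every=10, seed=0):
    """Randomize replicate injections and add a QC and a blank after
    every `qc_every` sample injections."""
    injections = [f"{s}_r{r}" for s in sample_ids
                  for r in range(1, replicates + 1)]
    rng = random.Random(seed)  # fixed seed so the order can be reproduced
    rng.shuffle(injections)
    sequence = []
    for i, injection in enumerate(injections, start=1):
        sequence.append(injection)
        if i % qc_every == 0:
            sequence.extend(["QC", "blank"])
    return sequence


samples = [f"S{i:02d}" for i in range(1, 11)]  # ten hypothetical samples
run_order = build_sequence(samples)
```

With ten samples in triplicate, the sequence holds 30 sample injections plus three QC and three blank injections.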
Overall, caftaric acid was the most abundant compound, with concentrations generally higher than 10 mg L⁻¹, reaching values of ca. 26 mg L⁻¹ in some Pinot Noir wines. Other hydroxycinnamic acids, such as coutaric, caffeic, and coumaric acids, were also remarkable, with concentration values ranging from 0.2 to 3.5 mg L⁻¹. The content of hydroxybenzoic acids was lower. Gallic acid was the most representative molecule, occurring at concentrations between 0.2 and 1.6 mg L⁻¹, while other detected compounds were present at sub-mg L⁻¹ levels (for instance, vanillic, syringic, 4-hydroxybenzoic, 3,4-dihydroxybenzoic, and 2,5-dihydroxybenzoic acids). Regarding flavonoids, astilbin was found in concentrations ranging from ca. 0.5 to 5 mg L⁻¹. Other identified flavonoids generally occurring at sub-mg L⁻¹ levels were catechin, epicatechin, and procyanidin dimers.
The compositional profiles were moderately stable throughout the winemaking process, from the base and stabilized wines to the sparkling wines obtained after a second alcoholic fermentation in the bottle (e.g., 3-month and 7-month-aged sparkling wines). This finding was confirmed by ANOVA, which indicated that differences in the phenolic composition between base, stabilized, and sparkling wines were not significant (p-values > 0.05). In contrast, it was found that polyphenolic profiles were highly dependent on product quality and variety. For all compounds tested, both factors were statistically significant, as well as their interaction.
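To make the two-factor test concrete, the following sketch computes the F statistics of a balanced two-way ANOVA with interaction (variety × quality) for a single compound. It is not the spreadsheet calculation used in the study; the function name and the concentrations are hypothetical, and a balanced design with equal replicates per cell is assumed.

```python
def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    `cells` maps (level_a, level_b) -> list of replicate values; every cell
    must hold the same number of replicates (at least two).
    Returns the F statistics for factor A, factor B, and A x B.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    na, nb = len(levels_a), len(levels_b)
    r = len(next(iter(cells.values())))  # replicates per cell

    def mean(values):
        return sum(values) / len(values)

    grand = mean([x for v in cells.values() for x in v])
    mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]])
              for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]])
              for b in levels_b}
    mean_ab = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for the main effects, interaction, and residual.
    ss_a = nb * r * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = na * r * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = r * sum((mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                    for a in levels_a for b in levels_b)
    ss_err = sum((x - mean_ab[k]) ** 2 for k, v in cells.items() for x in v)

    ms_err = ss_err / (na * nb * (r - 1))
    return {
        "A": (ss_a / (na - 1)) / ms_err,
        "B": (ss_b / (nb - 1)) / ms_err,
        "AxB": (ss_ab / ((na - 1) * (nb - 1))) / ms_err,
    }


# Hypothetical concentrations (mg/L) of one compound; keys are (variety, quality).
data = {
    ("Pinot Noir", "A"): [1.0, 1.2],
    ("Pinot Noir", "D"): [2.0, 2.2],
    ("Xarel-lo", "A"): [0.5, 0.7],
    ("Xarel-lo", "D"): [0.9, 1.1],
}
f_values = two_way_anova(data)
```

Each F value is then compared with the critical value of the F distribution for the corresponding degrees of freedom at α = 0.05 (7.71 for 1 and 4 degrees of freedom, as in this toy design), or converted to a p-value with a statistics package.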
In order to illustrate the significance of quality and variety on the phenolic concentration graphically, various representative examples are shown in Figure 1. In the case of gallic acid, for instance, the amount found in the samples increased from high to low qualities. It was also evidenced that, in general, Pinot Noir samples were richer than Xarel·lo ones. A similar pattern was also identified for other phenolic acids and flavonoids (e.g., vanillic acid, astilbin, catechin, or epicatechin). For other (di)hydroxybenzoic acids, contents also increased with decreasing quality, but differences between Pinot Noir and Xarel·lo were not noticeable. The behavior of caftaric acid was more peculiar since the highest levels were obtained for the samples of the best quality. As mentioned above, the full compositional data are in Table S3 (Supplementary Material). Here, a heatmap is depicted with the average analyte concentrations for each type of sample to visualize globally the compositional patterns (see Figure 2). Although no specific class biomarkers were found, some concentrations were up- or down-regulated in some types of samples, so the data would be a valuable source of information to characterize and authenticate this type of wine sample. As can be seen in Figure 2, the greater the intensity of the red hue, the greater the concentration of analytes, while white and pale colors indicate low concentrations. In line with the previous comments, it can be seen that Pinot Noir is richer in phenolic compounds; also, qualities C and D contain higher concentrations of the analytes. As exceptions to this behavior, caffeic and caftaric acids are more abundant in quality A samples, and coumaric and coutaric acids are over-expressed in the Xarel·lo variety.
The dataset, consisting of the compositional profiles of the phenolic acids and flavonoids in the samples, is large, and the potential relationships between variables and sample features are often multivariate in nature. For this reason, extracting global information on the potential role of polyphenols as descriptors of wine quality or variety is a difficult task. However, chemometric methods for exploratory analysis and sample classification can deal with the multivariate nature of the compositional data, thus providing more comprehensive and accurate information.
The dataset was preliminarily studied by PCA. The matrix dimension was 112 × 18, with 112 being the number of 98 sample replicates plus 16 QCs and 18 the number of target compounds under study. Data were autoscaled to equalize the influence of the most abundant compounds (e.g., caftaric, gallic, caffeic and coutaric acids, astilbin, and catechin) with those occurring at lower levels (e.g., syringic, ferulic, and 4-hydroxybenzoic acids). PCA results are depicted in Figure 3. In agreement with previous exploratory results, the distribution of samples in the space of the principal components (PCs), PC1 versus PC2 (Figure 3a), revealed patterns related to the variety and quality of wines. PC1 mainly described the sample quality, with the best quality to the left and the lowest to the right. PC2 discriminated the samples according to variety, with Xarel·lo in the top and Pinot Noir in the bottom sectors. Moreover, QCs were grouped in a compact cluster in the center of the model, thus suggesting that data were highly reproducible throughout the chromatographic sequence of analysis and supporting the soundness of the conclusions. The loading plot showed the distribution of phenolic compounds in the space of PCs (Figure 3b). It was deduced that, in general, the highest qualities corresponded to the samples poorest in phenolic compounds. Although there are no selective markers of quality or variety for this type of product, it was observed that some compounds predominated in the Pinot Noir variety (e.g., gallic, vanillic, and hydroxybenzoic acids, astilbin, and hydroxytyrosol) while others were comparable or even more abundant in Xarel·lo (e.g., caftaric, coutaric, and caffeic acids). All these complex compositional differences were responsible for the sample distribution described above.
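The autoscaling and projection steps can be illustrated with a minimal, dependency-free sketch. It is not the algorithm implemented in the commercial software used in the study: here only the first principal component is extracted, by power iteration, and the small data matrix and function names are hypothetical.

```python
import math


def autoscale(rows):
    """Mean-center each column and divide it by its sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_pc(z, iterations=500):
    """Loadings and scores of PC1 obtained by power iteration on Z^T Z."""
    p = len(z[0])
    w = [1.0 / math.sqrt(p)] * p  # starting guess with unit norm
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, w)) for row in z]
        w_new = [sum(s * row[j] for s, row in zip(scores, z)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w_new))
        w = [v / norm for v in w_new]
    scores = [sum(a * b for a, b in zip(row, w)) for row in z]
    return w, scores


# Hypothetical concentrations (mg/L): rows are samples, columns are compounds
# of very different abundance (e.g., a major acid and two minor species).
x = [
    [10.0, 0.2, 0.5],
    [14.0, 0.5, 1.0],
    [20.0, 0.9, 2.5],
    [26.0, 1.6, 4.0],
]
z = autoscale(x)
loadings, scores = first_pc(z)
# Fraction of total variance captured by PC1 (total variance equals the
# number of columns after autoscaling).
explained = sum(s * s for s in scores) / ((len(z) - 1) * len(z[0]))
```

After autoscaling, every compound contributes unit variance, so the abundant first column no longer dominates the component, which is the purpose of the scaling step described above.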
Based on the natural trends confirmed from the boxplots, statistics, and PCA, further studies were attempted to classify the wine samples according to quality and variety attributes simultaneously using PLS-DA. For this purpose, a preliminary model was established considering varieties and qualities simultaneously, so eight classes were created. Results shown in Figure 4 are similar to those from PCA except for the rotation of the axes and the higher class discrimination by PLS-DA. Subsequently, an 8-class classification tree was defined in which Pinot Noir vs. Xarel·lo types were first separated, as the principal quantitative differences were due to variety. Then, within the Pinot Noir and Xarel·lo classes, further divisions relied on quality, with D being the first class split off, followed by C, and, finally, the A and B classes were separated. Quantitative results from this classification process are summarized in Table 2. As can be seen, Pinot Noir vs. Xarel·lo samples could be perfectly distinguished and assigned in both calibration and cross-validation steps, with a 100% classification rate. In a similar way, the subsequent classification models, once Pinot Noir and Xarel·lo were separated, were also excellent, and only one misclassification occurred among the cross-validation results since one C sample was confounded with a D one.
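The sequential logic of this classification tree can be sketched as follows. In the study each node is a two-class PLS-DA model; in this illustration the nodes are replaced by simple threshold rules that serve only as hypothetical stand-ins, and the thresholds, compound choices, and sample values are invented for the example.

```python
def threshold(compound, limit):
    """Placeholder two-class model: True when the compound exceeds the limit."""
    return lambda sample: sample[compound] > limit


def classify(sample, models):
    """Walk the tree: variety first, then D, then C, and finally A vs. B."""
    variety = "Pinot Noir" if models["variety"](sample) else "Xarel·lo"
    if models[(variety, "D")](sample):
        quality = "D"
    elif models[(variety, "C")](sample):
        quality = "C"
    elif models[(variety, "A")](sample):
        quality = "A"
    else:
        quality = "B"
    return variety, quality


def classification_rate(true_labels, predicted_labels):
    """Percentage of samples assigned to their true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)


# Arbitrary, hypothetical decision rules (concentrations in mg/L).
models = {"variety": threshold("astilbin", 2.0)}
for v in ("Pinot Noir", "Xarel·lo"):
    models[(v, "D")] = threshold("gallic", 1.2)
    models[(v, "C")] = threshold("gallic", 0.8)
    models[(v, "A")] = threshold("caftaric", 15.0)

samples = [
    {"astilbin": 3.0, "gallic": 1.5, "caftaric": 10.0},
    {"astilbin": 1.0, "gallic": 0.3, "caftaric": 20.0},
    {"astilbin": 1.0, "gallic": 0.3, "caftaric": 12.0},
    {"astilbin": 3.0, "gallic": 1.0, "caftaric": 12.0},
]
truth = [("Pinot Noir", "D"), ("Xarel·lo", "A"),
         ("Xarel·lo", "B"), ("Pinot Noir", "D")]
predicted = [classify(s, models) for s in samples]
rate = classification_rate(truth, predicted)
```

In this toy set the last sample, labeled D, is assigned to C, which mimics the kind of confusion between adjacent quality classes reported for the cross-validation results.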

Discussion
It is well known that, in general, red wines are richer in polyphenols, mainly because of the winemaking process. In red vinification, the maceration of the must in contact with skins, seeds, bunch stalks, and other vine solids favors the lixiviation of polyphenolic substances to the liquid to be fermented. In white vinification, this contact is avoided as much as possible, so the lixiviation is minimal, and the polyphenols occurring in the samples come from the squeezed pulp. In contrast, a short maceration process is performed in rosé wines so that a small fraction of the skin and vine soluble components can pass into the must. In this set of wines, Pinot Noir samples of C and D quality were intended to produce rosé wines, and concentrations of phenolic molecules predominant in the skins and seeds, such as flavonoids, increased correspondingly. This finding is clearly shown in Figure 1a,c for gallic acid and astilbin, respectively.
Apart from these differences attributable to the extraction yield depending on the maceration process, the occurrence of up-expressed metabolites when comparing different wine varieties is a common trend. The influence of grape variety on the compositional profiles of phenolic acids and polyphenols has been pointed out in various recent papers [9,11-13]. In any case, specific class biomarkers are quite unusual.
In this study, selective varietal molecules have not been found either, but the differences in the compositional profiles of Pinot Noir and Xarel·lo wines are statistically significant. In general, Pinot Noir samples display higher concentrations of most of the analytes while, as commented above, a small group of hydroxycinnamic acids is more characteristic of Xarel·lo.
As a much more novel aspect to be highlighted, this work has also revealed interesting correlation patterns between the wine quality and the polyphenolic composition. When all the other oenological factors were kept constant (i.e., for samples of the same variety and type), a progressive increase in the phenolic content was observed with decreasing quality, as shown in Figure 1. The only exception to this trend was for caftaric and caffeic acids, which reached the highest concentration values for samples of top quality (A quality). This apparently odd behavior could be explained by the fact that hydroxycinnamic acids are mainly found in the grape pulp; hence, they were better preserved when the products were treated under neat and careful conditions, as with the A quality. In contrast, the other phenolic acids and flavonoids coming from skins, seeds, and other grape residues already started their lixiviation towards the must during the harvest and transport to the cellar, when the integrity of some grape berries was broken, and there was a release of juices to the medium. This grape berry alteration increased with decreasing product quality, and, reasonably, the concentrations of released polyphenols increased from the best to the poorest qualities. Accordingly, the concentrations of gallic and vanillic acids, astilbin, and epicatechin, among others, were higher in the products of lower quality.
The conclusions gained from boxplots and statistics were globally visualized by PCA. Results proved that the phenolic composition was an excellent source of information to address variety and quality issues. The unsupervised data analysis showed the natural sample structure and clustering, offering great possibilities for sample discrimination according to these features. PC1 mainly captured the influence of the quality, with wines of the best quality located in the left section and samples of lower quality distributed to the right. Conversely, PC2 retained the data variance dealing with the variety since Pinot Noir samples were mainly at the bottom and Xarel·lo samples at the top. Moreover, samples sharing the same quality and variety attributes clustered together regardless of other oenological aspects such as the wine type (base wine, stabilized wine, and sparkling wine) and aging (3 and 7 months). This means that although a slight decay in concentrations of the target phenolic acids and flavonoids was found, compositional differences throughout the winemaking process were much lower than those associated with varietal and quality attributes. Here, the analyte levels were maintained approximately constant from the first to the second fermentation and aging, and the extent of oxidation, hydrolysis, and other (bio)chemical reactions was limited.
Regarding the sample descriptors, the loading plot revealed that caffeic acid and its derivative (caftaric acid) were more characteristic of high-quality wines. Despite not being specific markers, the best samples displayed significantly higher amounts of these molecules. In comparison, the rest of the compounds were up-expressed in C and D wines. Pinot Noir and Xarel·lo wines were distinguishable from the levels of other phenolic species such as vanillic, syringic, and gallic acids, and astilbin, which predominated in Pinot Noir. Similarly, for instance, coutaric acid was more characteristic of Xarel·lo products.
The promising sample discrimination already achieved by PCA without imposing any class supervision foresaw the great possibilities of this set of descriptors to conduct classification and authentication studies. This expectation was confirmed by PLS-DA, in which various illustrative cases were considered. Since the principal sample differences were attributable to the grape variety, the classification of wines into Pinot Noir and Xarel·lo was first assessed. All the samples were correctly assigned to their classes. Conclusions on the variety markers agreed with PCA. Further classification studies following the classification tree approach provided excellent results in the sample assignment according to qualities. Again, caftaric and caffeic acids were the principal markers of the top quality.

Conclusions
This manuscript proposes a new approach to characterize and classify wine samples based on polyphenolic profiling and chemometric methods for data processing. The quantitative information was efficiently interpreted based on principal component analysis and partial least squares-discriminant analysis, and the main sample patterns were revealed. Despite not being selective, some tentative quality descriptors such as caffeic and caftaric acids and some varietal markers (e.g., gallic acid, vanillic acid, and astilbin for Pinot Noir, and coutaric acid for Xarel·lo) were confirmed statistically since they were up-expressed in the corresponding classes. The conclusions on the descriptive potential of phenolic acids and polyphenols were based on a limited set of samples involving two grape varieties and four wine qualities. Hence, the example developed is just a proof of concept, but the approach could be generalized to cases dealing with other varieties and coupages. The results obtained are promising and open up new opportunities to study the qualities of wine products using polyphenolic profiles as a source of information. Likewise, the proposed approach can help to distinguish and authenticate wine samples.

Figure 2. Heatmap expressing the average concentration of the phenolic compounds in the set of samples. The red color intensity indicates the concentration level. Sample acronyms have been defined in Table 1.

Fermentation 2023, 9, 223

Figure 3. Exploratory study of samples and variables by principal component analysis. (a) The plot of scores; (b) The plot of loadings.

Figure 4. PLS-DA results considering varieties and classes simultaneously. (a) The plot of scores; (b) The plot of loadings.

Table 1. Set of samples under study.

Table 2. Results from two-class classification models obtained by PLS-DA according to the classification tree.