Multi-Sensor Characterization of Sparkling Wines Based on Data Fusion

Abstract: This paper is focused on the assessment of a multi-sensor approach to improve the overall characterization of sparkling wines (cava wines). Multi-sensor, low-level data fusion can provide a more comprehensive and more accurate view of results compared with the study of simpler data sets from individual techniques. Data from different instrumental platforms were combined in an enriched matrix, integrating information from spectroscopic (UV/Vis and FTIR), chromatographic, and other techniques. Sparkling wines belonging to different classes, which differed in the grape varieties, coupages, and wine-making processes, were analyzed to determine organic acids (e.g., tartaric, lactic, malic, and acetic acids), pH, total acidity, polyphenols, total antioxidant capacity, ethanol, and reducing sugars. The resulting compositional values were treated chemometrically for a more efficient recovery of the underlying information. In this regard, exploratory methods such as principal component analysis showed that phenolic compounds were dependent on varietal and blending issues while organic acids were more affected by fermentation features. The analysis of the multi-sensor data set provided a more comprehensive description of cavas according to grape classes, blends, and vinification processes. Hierarchical Cluster Analysis (HCA) allowed specific groups of samples to be distinguished, featuring malolactic fermentation and the chardonnay and red grape classes. Partial Least Squares-Discriminant Analysis (PLS-DA) also classified samples according to the type of grape varieties and fermentations. Bar charts and complementary statistical tests were used to better define the differences among the studied samples based on the most significant markers of each cava wine type. In conclusion, catechin and gallic, gentisic, caftaric, caffeic, malic, and lactic acids were the most remarkable descriptors that contributed to their discrimination based on varietal, blending, and oenological factors.


Introduction
Cava is a type of sparkling wine with Protected Designation of Origin (PDO), which is elaborated according to the Champenoise method based on a second fermentation in bottle [1][2][3]. Nowadays, more than 200 million liters of cava are produced yearly in Spain, making it the most international type of Spanish wine in terms of turnover. Grape varieties for cava production mainly comprise the classical white varieties of Macabeo, Xarel·lo, and Parellada, which provide freshness, fruity and floral aromas, and a balanced acidity to the future wines. Although monovarietal cavas are sometimes produced, these three varieties are often combined in different percentages according to the enologist's recommendations. In the last decades, however, some typical French grapes such as Chardonnay and Pinot Noir have been introduced to enhance other fruity flavors [1,4]. The Chardonnay grape is increasingly popular, providing a high acidity and a good aromatic potency, while Pinot Noir is used to produce rosé cavas after a short period of must maceration with husks.
on the wine-making steps results in a valuable source of information to track the production progress and to assess the quality of the final product. In this regard, for instance, tartaric acid is often considered a positive parameter that will condition the further aging process, while gluconic and acetic acids are markers of grape or wine spoilage, respectively. Despite the great technological relevance of organic acids, compared to polyphenols and other wine components (e.g., VOCs or elemental composition), they have been scarcely exploited for discrimination, classification, and authentication purposes. Some papers deal with chemometric classifications of wines according to grape varieties, coupages and oenological practices [20], or geographical origins [21], using concentrations of organic acids determined by HPLC-UV/Vis. In another case, the information from organic acids has been complemented with the polyphenolic and elemental composition to try to extract more global conclusions [22].
This idea of combining information from different sources, instrumental techniques, or sensors as a way to improve the overall description of the system has been exploited under the so-called data fusion approach. This topic has been reviewed in excellent publications, which emphasize the chemometric challenges as well as the undeniable capacity for the assessment of food quality and the detection of frauds [23][24][25]. In this regard, electronic tongues and noses have emerged in the analytical field as powerful devices for characterization, classification, and authentication purposes [26,27]. These approaches take advantage of the cross selectivities exhibited by the different matrix components towards the different sensors. As an illustrative example, cava samples have been successfully classified according to the aging period using an array of differently modified graphite-epoxy electrodes [28]. In another application, Cetó and coworkers developed an array of four biosensors as a bioelectronic tongue to assess the phenolic levels of rosé cava wines [29]. Regarding data fusion with e-devices, Men et al. combined e-tongue and e-nose techniques to model beer flavor using multivariate analysis. Results showed that the overall description improved via multi-sensor data fusion [30].
Focusing on the combination of chromatographic data with other instrumental sources, Arslan et al. pointed out that, during the last 20 years, the combination of several techniques, such as near infrared (NIR), mid infrared (MIR), Raman, nuclear magnetic resonance (NMR), fluorescence, and UV/vis spectroscopies, chromatographic techniques, MS, and electroanalytical and optical sensors, together with chemometric tools, has been extensively used to assess the characteristics of alcoholic beverages [31]. Gaena et al. proposed the simultaneous analysis of phenolic and elemental composition to discriminate among Romanian wines according to geographical origin and variety [32]. Di Eligio and coworkers combined Fourier transform (FT)-NIR and (FT)IR spectroscopy for the determination of sugars, alcohols, and phenolic compounds in red wines [33]. Data were treated by PCA and LDA to predict the fermentation stage from initial to final phase. Samples belonging to a particular fermentation step could be correctly classified and the main compositional changes during alcoholic fermentation were assessed. In another paper, a methodology based on multiparametric methods such as FTIR and voltammetric e-tongues was exploited to obtain parameters related to the phenolic content of red wines. In this case, data were treated by PCA to classify samples according to their phenolic content, and PLS provided high correlation coefficients in the prediction of phenolic levels [34]. Other interesting cases of characterization of alcoholic beverages, including wines, beers, and spirits, relying on data fusion are summarized in Table 1. Apart from the diversity of instrumental techniques available, the fusion level and the multivariate methods that have been used are given as well. Low-level fusion corresponds to a simple union of the data without any prior preprocessing of the different individual sets.
As can be seen in Table 1, this approach has been widely used due to its simplicity and enhanced descriptive performance. In most of the cases herein mentioned, authors remark on the improvements attained in characterization and classification from data fusion compared to the independent analysis of each individual data set. For instance, in the case of rum classification by Belmonte-Sanchez et al., authors achieved acceptable results using ¹H-NMR data, although several samples were misclassified. In contrast, 100% classification rates were accomplished for all proposed categories using low-level data fusion to integrate ¹H-NMR, GC-MS, and LC-MS data [43]. In another example, Perez-Beltran and coworkers proposed a method for the authentication of white tequilas by FTIR and chemometrics. A multi-block fusion approach combining IR spectra from different baseline corrections was used to improve the discrimination among pure and mixed tequilas. Results accomplished under low-level fusion avoided issues dealing with the selection of the most appropriate preprocessing procedure [39].
Chemosensors 2021, 9, 200
Data resulting from each method can be combined in a more sophisticated way using mid- or high-level approaches, which involve preliminary comprehensive data preprocessing to condense the information or make data dimensions compatible [23,25]. In this way, each data type can be arranged and treated in a customized mode regardless of the others. For instance, PCA and other chemometric methods can be used to reduce the dimensionality of the original spectroscopic or chromatographic variables, from hundreds or thousands of values into a few highly descriptive principal components. In a similar way, multivariate curve resolution with alternating least squares (MCR-ALS) or parallel factor analysis (PARAFAC) have been widely used to treat multi-way data, such as data tensors obtained, for instance, from excitation-emission fluorescence or chromatography with diode array or mass spectrometry detection [36,37,44]. In these cases, these methods were used to decompose the original 3D data and extract the concentration profiles of species together with pure spectral information. Eventually, concentration profiles, PCA scores, and other types of preprocessed data can be fused to achieve a more accurate and exhaustive interpretation of the sample behavior.
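As a rough illustration of the mid-level idea described above, the following minimal sketch condenses two blocks of raw variables into a few PCA scores each and then joins the scores. The block names, sizes, and random values are hypothetical placeholders and do not come from any of the cited studies; the number of retained components (three per block) is likewise an arbitrary assumption.

```python
import numpy as np

def pca_scores(block, n_components):
    """Return the first PCA scores of a mean-centered data block (via SVD)."""
    block = np.asarray(block, dtype=float)
    centered = block - block.mean(axis=0)
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

rng = np.random.default_rng(1)
spectra = rng.normal(size=(30, 500))          # hypothetical spectroscopic block
chromatograms = rng.normal(size=(30, 1200))   # hypothetical chromatographic block

# Mid-level fusion: each block is compressed separately, then the scores are joined.
fused = np.hstack([pca_scores(spectra, 3), pca_scores(chromatograms, 3)])
print(fused.shape)  # (30, 6)
```

Compressing each block before joining keeps a block with thousands of variables from dominating a block with only a few, which is one reason given for moving beyond low-level fusion.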
Since the scientific literature reports that multi-sensor data fusion provides excellent results for sample description, this paper aimed at exploring the possibilities of the approach, combining data from different sources to improve the characterization of cava samples. Descriptors comprised overall indexes of antioxidant capacity, such as Folin-Ciocalteu (FC) and Ferric Reducing Antioxidant Power (FRAP), acidity, and concentrations of target polyphenols and organic acids, which were obtained from different sensing techniques including UV/vis spectroscopy, potentiometry, FTIR, HPLC, and enzymatic sensors. Data values were arranged in a row-wise augmented data matrix in which each row represents a sample and each column an analyte/sensor according to a low-level fusion approach. The resulting matrix was subsequently treated by chemometric methods such as PCA for a preliminary description of wine features and bar charts to reveal the most significant markers of each wine type. Hierarchical cluster analysis (HCA) provided dendrograms, grouping samples into Blanc de Noirs, rosé, and white wines. PLS-DA emphasized the separation of samples according to the grape variety and oenological processes.
Tartaric, malic, citric, succinic, fumaric, gluconic, acetic, and lactic acids (analytical reagent grade, Merck) were used to prepare stock solutions at g L⁻¹ in Milli-Q water. Standard working solutions were prepared in the range of 1 to 5000 mg L⁻¹ in water/acetonitrile solution (95/5, v/v, pH 2).
Reagents used for the spectrophotometric indexes were Folin-Ciocalteu (FC commercial reagent solution, Panreac AppliChem), Fe(III) chloride (analytical reagent grade, Merck), 2,4,6-tripyridyl-S-triazine (TPTZ, 98%, Alfa Aesar, Germany), and hydrochloric acid (37% m/m, Merck). FRAP reagent was prepared by mixing 20 mmol L⁻¹ Fe(III),
Commercial reagent kits used for the enzymatic determination of L-malic acid, acetic acid, D-gluconic acid, and L-lactic acid were purchased from TDI (Gavà, Spain). All of these kits were ready to use except for acetic acid, which needed a previous preparation according to the manufacturer's specifications.

Samples
Cava samples, kindly provided by Raventós Codorníu Group, corresponded to white and rosé cavas (vintage of 2016) produced in Penedès and Costers del Segre regions (both from Catalonia, Spain). Additional details are given in Table 2. Samples were filtered through a nylon membrane (nylon syringe filters, 13 mm and 0.45 µm pore size, Filter-Lab®, Filtros Anoia, Sant Pere de Riudebitlles, Spain) prior to analysis. To minimize the possible influence of analyte adsorption on the filter, the initial portion of filtrate (about 1 mL) was discarded while the following 1.5 mL were collected in an injection vial and stored at 4 °C until the analyses. In these conditions, samples were stable for at least 2 weeks. A quality control (QC) solution was prepared by mixing 50 µL of each cava sample. The QC was used to evaluate the reproducibility of the analytical methods and the significance of the PCA models. Cava samples were analyzed randomly, and the QC was repeatedly measured every 10 samples.

Instruments and Laboratory Equipment
The chromatographic system was composed of an Agilent Series 1100 HPLC Chromatograph (Agilent Technologies, Palo Alto, CA, USA) equipped with a quaternary pump (G1311A), a degasser (G1322A), an automatic injection system (G1392A), and a diode array detector (G1315B). The instrument was controlled with the Agilent ChemStation for LC 3D (Rev. A. 10.02) software (also used for data acquisition and processing).
Fourier-transform infrared spectroscopy (FTIR) measurements were carried out using a Foss FT2 WineScan™ (Foss, Hilleroed, Denmark) equipped with an ASX-260 autosampler. FTIR spectra were recorded from 2500 to 7785 nm, acquiring absorbance values every 5 nm. Foss software was used for control and data processing and was equipped with the chemometric tool PLS-regression.
L-malic, L-lactic, acetic, and D-gluconic acids were determined enzymatically using a multiparametric analyzer Miura 200 UV-method (TDI, Gavà, Barcelona, Spain). Miura 200 software was used for instrument control and data processing.
Potentiometric analysis was carried out using a pH-meter GLP 21 (Crison, Alella, Barcelona, Spain) equipped with a pH electrode 5021T (Hach, Loveland, CO, USA) with a large cylindrical pH glass membrane, an encapsulated Ag/AgCl reference system, and a built-in Pt temperature sensor for automatic temperature compensation.

Phenolic Profiling by HPLC
Concentrations of target phenolic acids and flavonoids were determined by HPLC-UV according to a previously validated method [45]. Briefly, compounds were separated in reversed-phase mode using a core-shell column (Kinetex, 100 mm × 4.6 mm I.D., 2.6 µm particle size, from Phenomenex, Torrance, CA, USA). Analytes were separated by an elution gradient program created with 0.1% formic acid aqueous solution and methanol as the components of the mobile phase. The flow rate was 1 mL min⁻¹ and the injection volume was 10 µL. Chromatograms were acquired at various wavelengths, namely, 280 nm to monitor benzoic acids and flavanols, 310 nm for cinnamic acids and stilbenes, and 370 nm for flavonols.

FC Assay
Antioxidant activities according to the FC method were estimated as explained elsewhere [46]. The FC procedure consisted of mixing 1 mL of water and 250 µL of FC reagent in an amber glass vial. After 8 min, 250 µL of sample/standard, 75 µL of 7.5% (w/v) sodium carbonate, and water (up to a final volume of 5 mL) were added to the vial to get concentrations in the range of 0.2 to 5 mg L⁻¹ gallic acid. The reaction was developed for 2 h and the absorbance was then recorded at 765 nm using the blank of reagents as the reference. Results were expressed in mg kg⁻¹ gallic acid.
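The quantification step implied by this procedure can be sketched as an external calibration followed by a dilution correction. In the minimal example below, the standard concentrations span the stated 0.2 to 5 mg L⁻¹ range, but the slope, intercept, and sample absorbance are hypothetical values chosen only for illustration; the dilution factor of 20 follows from the stated volumes (250 µL of sample in a final volume of 5 mL). Any conversion to mg kg⁻¹ is not covered here.

```python
import numpy as np

# Gallic acid standards in the measured solution (mg/L), within the stated range.
std_conc = np.array([0.2, 0.5, 1.0, 2.0, 5.0])
# Hypothetical absorbances at 765 nm (assumed slope 0.10 and intercept 0.02).
std_abs = 0.10 * std_conc + 0.02

# Least-squares straight line: absorbance = slope * concentration + intercept.
slope, intercept = np.polyfit(std_conc, std_abs, deg=1)

def vial_concentration(absorbance):
    """Invert the calibration line to get mg/L gallic acid in the measured solution."""
    return (absorbance - intercept) / slope

dilution_factor = 5.0 / 0.250        # 5 mL final volume / 0.250 mL of sample = 20
sample_abs = 0.32                    # hypothetical sample reading
c_vial = vial_concentration(sample_abs)
c_sample = c_vial * dilution_factor  # gallic acid equivalents in the undiluted sample
print(round(c_vial, 3), round(c_sample, 1))
```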

FRAP Assay
The FRAP method was developed according to reference [46] by mixing 300 µL of FRAP reagent and appropriate volumes of each sample/standard. The reacting solution was diluted up to a final volume of 2.5 mL with Milli-Q water. The calibration was carried out in the range 0.2 to 5 mg L⁻¹ Trolox and the absorbance was measured at 595 nm against the reagent blank after 5 min of reaction. Concentration results were expressed in mg kg⁻¹ Trolox.

Organic Acid Profiling by HPLC
Concentrations of each individual organic acid were determined by HPLC using a method optimized and validated elsewhere [20]. Compounds were separated in a C18 polar analytical column (Zorbax SB-Aq 4.6 mm ID × 150 mm, 5 µm particle size, Agilent Technologies) under an isocratic elution mode. The mobile phase was acidified water/acetonitrile solution (95/5, v/v, adjusted to pH 2 with phosphoric acid). The flow rate was 1 mL min⁻¹ and the injection volume was 10 µL. Chromatograms were recorded at 210 nm.

The pH
Values of pH of cava samples were measured with the combined pH electrode previously calibrated with standards of pH 4.0 and 7.0.

Total Acidity
The total acidity was determined by acid-base titration using 0.1 M sodium hydroxide as the reagent. The end point was obtained potentiometrically when the pH of the sample solution was 7.

Enzymatic Determination of Organic Acids
L-malic, L-lactic, acetic, and D-gluconic acids were also determined by enzymatic methodology according to the specifications of the commercial kits. Three µL for L-malic, L-lactic, and acetic acids and 4 µL for D-gluconic acid were injected into the system. The incubation time was 636 s, and the detection wavelength was 340 nm.
The determination of acetic acid relied on its reaction with coenzyme A, catalyzed by acetyl-CoA synthase in the presence of adenosine-5′-triphosphate, to produce acetyl-CoA. Then, citrate synthase catalyzed the reaction of acetyl-CoA and oxaloacetate to form citrate. Oxaloacetate, consumed in this reaction, was formed from L-malic acid in the presence of malate dehydrogenase. In this step, NAD+ was reduced to NADH, with the corresponding increase in absorbance directly dependent on the concentration of acetic acid.
L-malic acid determination relied on the oxidation of L-malic acid to oxaloacetate, catalyzed by L-malate dehydrogenase, with concomitant reduction of NAD+. The increase in absorbance due to NADH formation was proportional to the concentration of L-malic acid in the sample.
L-lactic acid was determined in a similar way, using L-lactate dehydrogenase to catalyze the oxidation of L-lactic acid to pyruvate with the concomitant reduction of NAD+. Again, the increase in absorbance due to NADH was proportional to L-lactic acid concentration in the sample.
D-gluconic acid was determined by reaction with adenosine-5′-triphosphate, catalyzed by D-gluconate kinase, producing D-gluconate-6-phosphate. Subsequently, D-gluconate-6-phosphate dehydrogenase and NADP+ were used to form ribulose-5-phosphate. The reduction of NADP+ to NADPH was responsible for the measured absorbance, which was proportional to the concentration of D-gluconic acid in the sample.

Fourier-Transform Infrared Spectroscopy
Total reducing sugars, pH, acetic acid, total acidity, malic acid, lactic acid, and alcohol degree were determined by FTIR. The injection volume was 15 mL and the stop time was 2 min per analysis. Spectra were recorded from 2500 to 7785 nm in steps of 5 nm. Sensing values of each FTIR parameter were obtained from the standard software of the instrument, operating according to predefined calibration models given by the manufacturer.

Data Analysis
Multivariate methods such as Principal Component Analysis (PCA), Hierarchical Cluster Analysis (HCA), and Partial Least Squares Discriminant Analysis (PLS-DA) were used for sample characterization according to a multi-sensor approach based on low-level data fusion. Multivariate statistics and ANOVA were also applied for data exploration using Microsoft Excel. In the multi-sensor approach, the data matrix of responses, also referred to as X-matrix, was arranged by row-wise augmentation. Each row corresponded to a given sample and each column to the concentration of a target compound from a given method. As a result, the dimensions of the whole data fusion matrix were 60 samples × 37 variables (55 cava wines plus 5 quality controls (QCs), analyzed every 10 samples). For HCA and PLS-DA, there were 55 samples × 37 variables, since QCs were excluded. Additionally, other submatrices were also analyzed, such as those dealing with phenolic and organic acid data, separately. Hence, the matrix dimensions were 60 samples × 16 variables for polyphenols and 60 samples × 21 variables for organic acids.
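The arrangement described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the two blocks are filled with random placeholder numbers (not the measured concentrations), and the helper names `low_level_fusion` and `autoscale` are hypothetical. Only the block dimensions (60 × 16 and 60 × 21) are taken from the text.

```python
import numpy as np

def low_level_fusion(*blocks):
    """Place data blocks side by side; every block must describe the same samples (rows)."""
    blocks = [np.asarray(b, dtype=float) for b in blocks]
    if len({b.shape[0] for b in blocks}) != 1:
        raise ValueError("all blocks must have the same number of samples")
    return np.hstack(blocks)

def autoscale(X):
    """Mean-center each column and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns are centered but not rescaled
    return (X - X.mean(axis=0)) / std

rng = np.random.default_rng(0)
polyphenols = rng.lognormal(size=(60, 16))    # placeholder values, not real data
organic_acids = rng.lognormal(size=(60, 21))  # placeholder values, not real data

X = low_level_fusion(polyphenols, organic_acids)
X_scaled = autoscale(X)
print(X.shape)  # (60, 37)
```

Autoscaling after fusion gives every column the same variance, so a variable measured in g L⁻¹ does not outweigh one measured in mg L⁻¹ simply because of its units.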
PCA is perhaps the most versatile chemometric method, being very efficient for exploratory studies in the field of food analysis. PCA compresses the chemical information contained originally in the X-matrix of responses (here, concentrations of selected species resulting from the different techniques) into a reduced group of mathematical variables, the so-called principal components (PCs). In this process, submatrices of scores (coordinates of the samples) and loadings (eigenvectors) were calculated to retain the maximum amount of relevant information. The scatter plot of scores, for instance, of PC1 vs. PC2, may reveal some patterns, similarities, and differences that might be attributed to wine features such as grape varieties, coupages, and some wine-making practices. Complementarily, the plots of loadings show the distribution of variables (here, some wine compounds), thus revealing some correlations among variables. In addition, significant descriptors or class markers can be identified from the interpretation of the plot of loadings.
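A minimal sketch of how scores and loadings can be obtained is given below, using the singular value decomposition of the autoscaled matrix. This is one common way of computing PCA and is not necessarily the routine used in this work; the synthetic 60 × 37 matrix (two hidden factors plus noise) is a placeholder, not the cava data.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and scale it to unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(X, n_components=2):
    """PCA of the autoscaled matrix via SVD: returns scores, loadings, explained variance."""
    Z = autoscale(X)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates on the PCs
    loadings = Vt[:n_components].T                    # one unit-length column per PC
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained

# Synthetic placeholder data: 60 samples, 37 variables driven by two hidden factors.
rng = np.random.default_rng(42)
latent = rng.normal(size=(60, 2))
mixing = rng.normal(size=(2, 37))
X = latent @ mixing + 0.1 * rng.normal(size=(60, 37))

scores, loadings, explained = pca(X, n_components=3)
print(scores.shape, loadings.shape)  # (60, 3) (37, 3)
```

Plotting the first two columns of `scores` against each other gives the scores map, and the corresponding columns of `loadings` give the loadings plot discussed in the text.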
HCA is an unsupervised method that was here carried out under a divisive approach, as it was found to provide better discrimination results. HCA seeks to build a hierarchy of clusters, from top to bottom, by applying a k-means clustering algorithm, which determines the distance of each object to the centroids and groups the objects on the basis of minimum Euclidean distance. Autoscaling was also applied to the X-matrix as a preprocessing method. Once all samples were assigned, the corresponding dendrogram was constructed, in which samples with similar characteristics were grouped.
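One way to read the top-down procedure described above is as a bisecting k-means scheme, in which the most heterogeneous cluster is repeatedly split in two. The sketch below is written under that assumption and is not the software routine used in this work: the function names, the deterministic choice of starting centroids, and the synthetic two-variable data are all hypothetical, and no dendrogram is drawn.

```python
import numpy as np

def two_means(X, n_iter=50):
    """Split X into two groups with k-means (k = 2) and Euclidean distance."""
    # Deterministic start: the point farthest from the mean, then the point farthest from it.
    a = X[np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))]
    b = X[np.argmax(np.linalg.norm(X - a, axis=1))]
    centroids = np.vstack([a, b])
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)     # assign each object to its nearest centroid
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return labels

def divisive_clustering(X, n_clusters):
    """Top-down clustering: repeatedly bisect the cluster with the largest scatter."""
    X = np.asarray(X, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, n_clusters):
        sse = [np.sum((X[labels == k] - X[labels == k].mean(axis=0)) ** 2)
               for k in range(new_label)]
        idx = np.flatnonzero(labels == int(np.argmax(sse)))
        split = two_means(X[idx])
        if split.min() == split.max():
            break                            # the cluster cannot be split further
        labels[idx[split == 1]] = new_label
    return labels

# Synthetic example: three groups of 10 objects along one axis (placeholder data).
rng = np.random.default_rng(3)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [20.0, 0.0]])
X = np.vstack([c + 0.2 * rng.normal(size=(10, 2)) for c in centers])
labels = divisive_clustering(X, n_clusters=3)
print(labels)
```

In this toy case the first split isolates the distant group and the second split separates the two closer groups, which mirrors how a dendrogram built from top to bottom would branch.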
PLS-DA is a supervised classification method based on a set of well-known samples (calibration or training set) belonging to the (two or more) predefined classes, such as white, rosé, and Blanc de Noirs. The multi-sensor X-matrix was correlated with the Y-matrix of class assignation, which was encoded numerically. The classification model was built to reach the minimum error in the assignation of calibration samples into the corresponding classes. Cross validation is often used to estimate the optimum number of latent variables (LVs) to be used. PLS-DA models can be interpreted in a similar way to PCA to try to find markers of each class. The classification performance can be evaluated by external validation using a test set.
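The steps outlined above (numerical class encoding, model building, and evaluation with a test set) can be sketched as follows. This is a compact PLS2 regression on dummy-coded classes, with each weight vector taken from the SVD of XᵀY; it is offered as an illustration under stated assumptions rather than as the implementation used in this work. The class names follow the text, but the data are synthetic, the rule of assigning each sample to the class with the largest predicted response is an assumption, and the number of latent variables is fixed at two instead of being chosen by cross validation.

```python
import numpy as np

def one_hot(labels):
    """Encode class labels numerically as a dummy (0/1) Y-matrix."""
    classes = np.unique(labels)
    Y = (np.asarray(labels)[:, None] == classes[None, :]).astype(float)
    return Y, classes

def plsda_fit(X, labels, n_components):
    """PLS2 regression of the dummy Y-matrix on X (both mean-centered)."""
    X = np.asarray(X, dtype=float)
    Y, classes = one_hot(labels)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        u = np.linalg.svd(Xr.T @ Yr, full_matrices=False)[0]
        w = u[:, 0]                      # weight vector maximizing covariance with Y
        t = Xr @ w                       # scores of this latent variable
        p = Xr.T @ t / (t @ t)           # X-loadings
        q = Yr.T @ t / (t @ t)           # Y-loadings
        Xr = Xr - np.outer(t, p)         # deflation
        Yr = Yr - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    B = W @ np.linalg.inv(P.T @ W) @ Q.T   # regression coefficients
    return {"x_mean": x_mean, "y_mean": y_mean, "B": B, "classes": classes}

def plsda_predict(model, X):
    """Assign each sample to the class with the largest predicted response (assumed rule)."""
    Y_hat = (np.asarray(X, dtype=float) - model["x_mean"]) @ model["B"] + model["y_mean"]
    return model["classes"][Y_hat.argmax(axis=1)]

# Synthetic placeholder data: 3 classes x 20 samples x 5 variables.
rng = np.random.default_rng(7)
labels = np.repeat(["white", "rose", "blanc_de_noirs"], 20)
centers = {"white": [3, 0, 0, 0, 0], "rose": [0, 3, 0, 0, 0], "blanc_de_noirs": [0, 0, 3, 0, 0]}
X = np.array([rng.normal(centers[c], 0.5) for c in labels])

train = np.arange(len(labels)) % 2 == 0          # even rows: calibration set
model = plsda_fit(X[train], labels[train], n_components=2)
predicted = plsda_predict(model, X[~train])      # odd rows: external test set
accuracy = np.mean(predicted == labels[~train])
print(accuracy)
```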

Phenolic Compounds
Polyphenolic data relied on individual polyphenol concentrations as well as reducing power indexes. Table S1 summarizes the concentration of polyphenolic compounds and related indexes, with average values, standard deviations, and maximum and minimum concentrations. Levels of some abundant analytes, including caftaric, gentisic, vanillic, gallic, homogentisic, caffeic, syringic, ferulic, and p-coumaric acids, (+)-catechin, and (−)-epicatechin, were determined by HPLC-UV/vis. Additionally, the overall polyphenolic concentration and the antioxidant activities were estimated by FC and FRAP spectrophotometric methods. In Table S1, RSD% values indicate the variability of concentrations as a measure of the discriminating power of each variable, thus suggesting the potential descriptive capacity of compounds such as protocatechuic, homogentisic, and syringic acids, with RSD values above 100%.
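The RSD% figure used here as a rough indicator of discriminating power is simply the standard deviation of a variable across samples expressed as a percentage of its mean. A minimal sketch, with made-up numbers rather than values from Table S1:

```python
import numpy as np

def rsd_percent(X):
    """Relative standard deviation (%) of each column: 100 * std / mean."""
    X = np.asarray(X, dtype=float)
    return 100.0 * X.std(axis=0, ddof=1) / X.mean(axis=0)

# Three hypothetical samples (rows) and two hypothetical analytes (columns).
conc = np.array([[1.0, 10.0],
                 [2.0, 10.0],
                 [3.0, 10.0]])
rsd = rsd_percent(conc)
print(rsd)  # [50.  0.]
```

A column that barely changes between samples (second analyte) has an RSD near zero and carries little information for telling classes apart, whereas a column with a large RSD is a candidate descriptor.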
The resulting data consisted of the X-matrix of responses to be treated by PCA. Concentration values were autoscaled to equalize the contribution of major and minor components in the description. Results from PCA are summarized in Figure 1. The scatter plot of scores of PC1 vs. PC2 (Figure 1a) showed three main clusters, which corresponded to rosé, Chardonnay-based, and Macabeu, Xarel·lo, and Parellada combinations, on the bottom-right, top-left, and bottom-left areas, respectively. The scores map indicated that, using polyphenols and antioxidant indexes as the source of information, the description was mainly governed by the main varietal constituents of the coupage. QCs appeared in a compact group in the middle of the scores plot, thus indicating the excellent reproducibility of chromatographic data as well as the descriptive ability of the model.
The plot of loadings (Figure 1b) suggested that rosé samples were richer in this type of analyte (i.e., overall phenolic concentration and antioxidant activities were higher for these samples). Additionally, compounds such as gallic, ferulic, syringic, homogentisic, and protocatechuic acids and ethyl gallate were also more abundant in rosé samples. Species such as caffeic, caftaric, and p-coumaric acids were important in rosé coupages, but their concentrations were also remarkable in blends with a high percentage of Chardonnay, in which gentisic acid was especially significant. In contrast, coupages with predominance of Macabeu, Xarel·lo, and Parellada, in general, displayed poorer overall polyphenol concentrations but a higher quantity of (+)-catechin.
The sample behavior regarding other features, such as the application of malolactic fermentation (MLF) and the aging period, was scarcely explained by this model, thus indicating that the descriptive ability of polyphenols in these issues was more limited.

Organic Acids
Organic acids were determined in different ways, namely, (1) HPLC for tartaric, citric, succinic, acetic, gluconic, malic, and lactic acids; (2) enzymatic methods for acetic acid and specific stereoisomers of D-gluconic, L-malic, and L-lactic acids; and (3) FTIR with multivariate calibration for total reducing sugars, alcoholic degree, pH, total acidity, and malic, lactic, and acetic acids. A data summary for the different classes, including average values, standard deviations, and maximum and minimum values, is given in Tables S2 and S3. According to the RSD values, the most discriminant variables here were malic and lactic acids, since their concentrations varied substantially among samples, especially due to the remarkable influence of whether or not MLF was applied in the vinification.
Results of organic acids and related parameters were collected in an X-matrix to be analyzed by PCA. As above, data were autoscaled prior to PCA modelling to provide similar weights to all the variables. A first PCA model was examined using FTIR data, as this technique has been successfully used in multiple characterization and authentication studies. Results depicted in the bi-plot (see Figure S1 in supplementary material section) indicated that wines not subjected to MLF appeared far away from the rest of the samples. There was no clear distribution of cavas according to other attributes, although blends with predominance of Macabeu, Xarel·lo, and Parellada tended to be located on the right side. It was, thus, concluded that the information gained from FTIR was limited to the occurrence of MLF, and that the main descriptors leading this characterization were malic and lactic acids, which were negatively correlated. In addition, the varieties of rosé cava tended to have a slightly higher alcoholic strength.
Results from PCA revealed that the most dramatic differences occurred between the cava class subjected to MLF and the others, with high and low scores on PC1, respectively. PC2 showed some trends dealing with the abundance of Chardonnay in the blends. Figure 2 depicts the information gained from PC1 and PC3. The scatter plot of scores (Figure 2a) showed that samples were grouped in three marked clusters in which, besides the segregation of class I (non-MLF), a trend was observed from coupages with 100% of the three classical varieties (Macabeu, Xarel·lo, and Parellada) to 100% Chardonnay. Samples with predominance of red grape varieties (classes P, V, and T) were located in close positions in the upper central area. QCs were located in a compact group in the central area of the graph, thus indicating that the overall model was reliable and robust.
The study of loadings (Figure 2b) agreed with previous conclusions on organic acids and, as expected, lactic and malic acids were the respective markers of MLF (or non-MLF). Chardonnay cavas presented the highest acidity with respect to the others, whereas the rosé cavas were characterized by the highest tartaric acid concentrations. Concentrations of sugars were quite homogeneous for all the samples (brut type) except for the Chardonnay one (brut nature type).
The information gained from pH and ethanol was quite limited, as all the values were highly homogeneous, displaying a narrow range of variability, from 2.9 to 3.1 and from 11 to 11.9% (v/v), respectively. Values of acetic acid were practically negligible, with concentrations by FTIR of nearly 0.2 g L⁻¹ in all the studied samples.

Data Fusion
In this part, the two main types of target compounds, namely polyphenols and organic acids, were studied simultaneously, comparing data from different techniques. This approach relied on a multi-sensing matrix (simply combined in a row-wise arrangement according to a low-level data fusion process) to integrate the information from each particular subset in a comprehensive model. Hence, the partial insight deduced from the interpretation of models from a given family or technique could be improved from another point of view.
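The low-level fusion step described above can be illustrated with a minimal sketch. It assumes that each technique provides one block of values per sample and that blocks are joined sample by sample, so that every row of the fused matrix holds all variables for one cava. The function name `fuse_low_level`, the sample IDs, and all numerical values are hypothetical and are not taken from the study.

```python
def fuse_low_level(blocks, names):
    """Concatenate the variable blocks of each sample (low-level data fusion).

    blocks: {technique: {sample_id: [values]}}
    names:  {technique: [variable names]}
    """
    # Only samples measured by every technique can enter the fused matrix.
    common = sorted(set.intersection(*(set(b) for b in blocks.values())))
    # Column labels keep track of the technique each variable came from.
    columns = [f"{tech}:{var}" for tech in blocks for var in names[tech]]
    # One row per sample, with the blocks placed side by side.
    matrix = [[v for tech in blocks for v in blocks[tech][s]] for s in common]
    return common, columns, matrix


# Illustrative (made-up) values for two techniques.
blocks = {
    "hplc": {"A1": [1.2, 0.4], "A2": [0.9, 0.7]},
    "ftir": {"A1": [3.0], "A2": [3.1], "A3": [2.9]},
}
names = {"hplc": ["malic", "lactic"], "ftir": ["pH"]}
samples, columns, fused = fuse_low_level(blocks, names)
```

Sample A3 is dropped here because it lacks HPLC data; how incomplete samples were handled in the actual study is not stated, so this choice is only an assumption.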
Here, a global PCA model was created involving all target compounds (see Figure 3). Concentration values were autoscaled prior to PCA treatment to equalize the contribution of major and minor components in this overall description. In Figure 3a, the scatter plot of scores of PC1 vs. PC2 showed two main areas corresponding to wines with and without MLF (left and right sides, respectively). The variability of the experimental data was assessed from the dispersion of the QCs, which appeared in a compact group in the middle of the scores plot, thus indicating that the overall model was reliable and robust. After the first alcoholic fermentation, MLF was applied to avoid the strong acidity on the palate due to malic acid, converting the latter into lactic acid and thus conferring creamier taste attributes on the wine. All the sparkling wines were subjected to MLF except for coupage I.
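As an illustration of why autoscaling equalizes major and minor components, the following sketch mean-centers each variable, divides it by its standard deviation, and then estimates the first principal component by power iteration. The power-iteration routine is a simple stand-in chosen for this example, not the software used in the study, and the concentration values are invented.

```python
import math
import statistics


def autoscale(matrix):
    """Mean-center each column and divide it by its sample standard deviation."""
    cols = list(zip(*matrix))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in matrix]


def first_pc(scaled, iterations=500):
    """Estimate PC1 loadings and scores by power iteration on X^T X."""
    n_vars = len(scaled[0])
    w = [1.0 / (j + 1) for j in range(n_vars)]  # arbitrary non-uniform start
    for _ in range(iterations):
        scores = [sum(x * wj for x, wj in zip(row, w)) for row in scaled]
        w = [sum(t * row[j] for t, row in zip(scores, scaled)) for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(x * wj for x, wj in zip(row, w)) for row in scaled]
    return w, scores


# Made-up data: columns are malic acid, lactic acid (g/L), and a minor phenolic (g/L).
raw = [
    [3.0, 0.2, 0.010],
    [2.8, 0.3, 0.012],
    [0.3, 2.1, 0.011],
    [0.2, 2.4, 0.013],
    [0.4, 2.0, 0.009],
]
scaled = autoscale(raw)
loadings, scores = first_pc(scaled)
```

After autoscaling, the minor phenolic has the same unit variance as the two acids, so its weight in the model no longer depends on its small absolute concentration.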
The separation according to PC2 showed two main patterns associated with the vinification and type of grape, with red and white grapes in the top and bottom areas, respectively. Red cava wines are more likely to contain high quantities of polyphenols due to the maceration of the must with the grape skins; thus, coupages P, V, and T, made from red grape varieties with a period of maceration, displayed high quantities of polyphenols. In addition, blends W and I, made under blanc de noirs vinification (i.e., red grapes with no maceration), appeared in the middle of the model, between white and rosé cava wines. Figure 3b represents the loadings plot, which shows in more detail which target compounds were relevant in each case. Although this is a multiparametric matrix offering global information on a wide range of compounds, the separation according to PC1 was clearly focused on the MLF process, with malic acid and total acidity dominating the right side of the graph and lactic acid and pH the left. Regarding the phenolic data, except for gentisic acid, which was more specific to the white grapes, all individual compounds and antioxidant indexes appeared in the upper part, in agreement with previous conclusions, indicating that rosé wines were richer in these species.
Since MLF vs. non-MLF was the principal feature of this overall description, another PCA model was created, excluding the samples belonging to class I (non-MLF). The corresponding results are depicted in Figure 3c,d. In this case, PC1 mainly discriminated between rosé and white wines. This distribution was dependent on the phenolic content, with the richest samples on the right and the poorest on the left. In particular, aged wines (class K) were grouped apart from other younger wines, mainly because of their lowest phenolic contents and poorest antioxidant indexes. Hence, the polyphenols protocatechuic, caftaric, caffeic, p-coumaric, homogentisic, syringic, gallic, and ferulic acids and ethyl gallate, as well as the FC and FRAP indexes, predominated in rosé wines. PC2 was able to discriminate among the different white cavas according to the gradation of Chardonnay in the blends. Monovarietal Chardonnay samples were located at the top, while those composed of Macabeu, Xarel·lo, and Parellada were at the bottom. In the middle, different groups were observed, which were distributed according to the Chardonnay percentages. This behavior was driven by the composition of some organic acids and polyphenols.
HCA was applied here to obtain complementary information on the analogies and differences among sample types. A dendrogram based on the K-nearest algorithm is given in Figure 4. As can be seen, the group of samples belonging to Blanc de Noirs without MLF was the most dissimilar cluster. Subsequently, the monovarietal Chardonnay and rosé wines were separated from the main group, which consisted of the rest of the white sparkling wines. Within this group, further fragmentations corresponded to Blanc de Noirs with MLF and cavas with a predominance of Macabeu, Xarel·lo, and Parellada. The remaining cluster contained blends of Chardonnay with Macabeu, Xarel·lo, and Parellada in different percentages. This description was essentially in agreement with the aforementioned PCA results, although no information on potential descriptors was available here.
A case study dealing with the supervised classification of cava samples was carried out using PLS-DA, in which three classes of sparkling wines were defined according to natural patterns found in the PCA and HCA studies, namely: (1) white cavas, produced from white grapes (here, Macabeu, Xarel·lo, Parellada, and Chardonnay); (2) rosé cavas, produced mainly from red grapes (Pinot Noir, Garnatxa Negra, and Trepat) under rosé vinification (must maceration with skins for a pre-established time to extract part of the natural coloring components); a certain percentage of Chardonnay was also added to one blend (coupage T), although the rosé components were always predominant; and (3) blanc de noirs cavas, produced from red grapes of the Pinot Noir variety following a white vinification process (i.e., without must maceration with skins and seeds, so the extraction of coloring components was negligible).
For the classification study, the cava samples were divided into calibration and validation sets to build the model and make independent predictions of class membership, respectively. For that purpose, 60% and 40% of the samples were randomly assigned to each set. Some representative results of the PLS-DA classification are shown in Figure 5. The optimum number of latent variables (LVs) to carry out the predictions was estimated by cross-validation based on Venetian blinds. As can be seen in Figure 5a, three LVs were chosen for modelling the three classes. Figure 5b shows the scatter plot of scores of LV1 vs. LV2, in which samples belonging to the calibration set are represented with empty symbols and those included in the validation set with solid ones. As can be seen, there was a clear sample discrimination according to the defined classes for both training and test sets. The most characteristic variables in this segregation were caftaric, gallic, caffeic, and lactic acids as well as antioxidant indexes for rosé cavas, and malic acid and other organic acids for blanc de noirs, while white cavas contained lower levels of malic acid and most polyphenols.
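The sample partition and the cross-validation scheme mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the seed, the number of samples, and the number of Venetian-blind splits are hypothetical, since the text does not report them, and the function names are introduced here for the example.

```python
import random


def split_calibration_validation(sample_ids, calibration_fraction=0.6, seed=0):
    """Randomly assign samples to calibration and validation sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_cal = round(calibration_fraction * len(shuffled))
    return shuffled[:n_cal], shuffled[n_cal:]


def venetian_blinds(n_samples, n_splits):
    """Return held-out indices per fold: fold k takes samples k, k+n_splits, ..."""
    return [list(range(k, n_samples, n_splits)) for k in range(n_splits)]


sample_ids = [f"cava_{i:02d}" for i in range(20)]  # hypothetical sample names
calibration, validation = split_calibration_validation(sample_ids)
folds = venetian_blinds(len(calibration), n_splits=3)
```

In a Venetian-blinds scheme each fold leaves out every n-th calibration sample, so the order in which samples are stored affects which ones are tested together.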
The predictive ability of PLS-DA was first estimated for both calibration and validation samples. A representative example showing the assignment of white cavas versus the other classes is depicted in Figure 5c. The predictions for the rosé and blanc de noirs samples can be found in Figure S2 of the supplementary material. Additionally, sensitivity and selectivity results are summarized in Table S4. As can be seen, all the calibration samples were correctly assigned to their respective classes. Excellent prediction results were also obtained for the validation samples, with only one misclassified sample (a white sample assigned to the rosé type).
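For readers unfamiliar with these figures of merit, the sketch below computes one-vs-rest sensitivity and specificity for a class. It assumes that the "selectivity" reported in Table S4 corresponds to what is usually called specificity, which the text does not confirm. The label lists are invented; they only mimic the situation of a single white sample predicted as rosé and do not reproduce the real validation set.

```python
def one_vs_rest_rates(true_labels, predicted_labels, target):
    """Return (sensitivity, specificity) for `target` against all other classes."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    tn = sum(t != target and p != target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    sensitivity = tp / (tp + fn)  # fraction of target samples recognized
    specificity = tn / (tn + fp)  # fraction of other samples correctly rejected
    return sensitivity, specificity


# Hypothetical validation labels with one white sample predicted as rosé.
true_labels = ["white"] * 4 + ["rose"] * 3 + ["blanc_de_noirs"] * 2
predicted = ["white"] * 3 + ["rose"] + ["rose"] * 3 + ["blanc_de_noirs"] * 2

white_rates = one_vs_rest_rates(true_labels, predicted, "white")
rose_rates = one_vs_rest_rates(true_labels, predicted, "rose")
```

A single misassignment lowers the sensitivity of the true class (white) and the specificity of the class that received the sample (rosé), while leaving the third class unaffected.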

Bar Charts
Conclusions drawn from PCA regarding differences in the composition of target compounds were cross-checked graphically using bar plots. Figure 6 shows the five most significant polyphenols found (catechin and gallic, gentisic, caftaric, and caffeic acids). As pointed out, all coupages had high levels of gentisic acid, but coupages G and S presented the highest amounts since they were elaborated with high percentages of the Chardonnay variety (100% and 50%, respectively). Coupage V stood out for its high amount of (+)-catechin. Gallic acid was representative of rosé cavas, and caftaric acid was representative of rosé and Chardonnay ones.
Regarding organic acids and other parameters, all coupages presented similar amounts of tartaric acid, but total acidity was more noticeable in the Chardonnay ones. Figure 7 shows that the levels of malic and lactic acids from HPLC profiling were higher than those from the enzymatic and FTIR techniques, which were focused on given L or D stereoisomers. Additionally, as commented above, coupage I was characterized by the highest amount of lactic acid and the lowest of malic acid. Two-factor ANOVA with replicates was applied to evaluate the concentrations of malic and lactic acids obtained from the different methods. It was found that, at the 95% confidence level, concentrations from the methods differed significantly. Indeed, values of these acids from the three methods were different because HPLC accounted for both the L and D isomers, while the FTIR and enzymatic procedures only detected the L isomer.
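The two-factor ANOVA with replicates can be sketched for a balanced design as shown below. The choice of factors (analytical method and sample) is an assumption, since the text only states that methods were compared, and the concentrations are invented for illustration. The sketch returns sums of squares and F ratios; comparing each F ratio with the critical value at the 95% confidence level would still require an F table or a statistics package.

```python
def two_way_anova(data):
    """Balanced two-factor ANOVA; data[i][j] lists the replicates of cell (i, j)."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    grand = sum(x for row in data for cell in row for x in cell) / (a * b * n)
    mean_a = [sum(x for cell in row for x in cell) / (b * n) for row in data]
    mean_b = [sum(x for row in data for x in row[j]) / (a * n) for j in range(b)]
    mean_cell = [[sum(cell) / n for cell in row] for row in data]
    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in mean_a),
        "B": a * n * sum((m - grand) ** 2 for m in mean_b),
        "AB": n * sum((mean_cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                      for i in range(a) for j in range(b)),
        "error": sum((x - mean_cell[i][j]) ** 2
                     for i in range(a) for j in range(b) for x in data[i][j]),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_error = ss["error"] / df["error"]
    f_ratio = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return {"SS": ss, "df": df, "F": f_ratio}


# Made-up malic acid values (g/L): rows = methods (factor A), columns = samples (factor B).
data = [
    [[2.10, 2.14], [0.52, 0.50]],  # HPLC
    [[1.80, 1.84], [0.40, 0.42]],  # enzymatic
    [[1.78, 1.82], [0.41, 0.39]],  # FTIR
]
result = two_way_anova(data)
```

Here `result["F"]["A"]` is the ratio used to judge whether the methods differ; the interaction term indicates whether that difference depends on the sample.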

Conclusions
Results obtained here revealed some trends in the behavior of sparkling wines for both types of target compounds, polyphenols and organic acids. First, rosé vinifications showed the highest overall amounts of phenolic acids due to lixiviation during the maceration of grape skins with the must. Despite being made from a white variety, Chardonnay wines were quite rich in polyphenols. Malic and lactic acids were the most relevant organic acids in these descriptions and allowed us to discriminate cavas according to whether or not malolactic fermentation (MLF) had been applied.
Conclusions extracted separately from the analyses of each family of analytes could be visualized together through the simultaneous analysis of all sensing variables based on a low-level data fusion approach. To this end, concentration values of a wide range of analytes, including both individual and overall data, were joined in an augmented arrangement to be further analyzed by chemometric methods. MLF was again the most remarkable feature for sample discrimination. Nevertheless, once samples not subjected to MLF were excluded from the analysis, interesting patterns were revealed, which depended on both phenolic acid and organic acid components. Hence, additional trends such as aging and the percentage of Chardonnay in the blends could be visualized.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/chemosensors9080200/s1. Table S1: Average, standard deviation, RSD (%), and maximum and minimum concentrations of polyphenols in the set of samples under study. Standard deviation and relative standard deviation indicate the variability of concentrations as a measure of discriminating capacity among samples. Table S2: Average, standard deviation, RSD (%), and maximum and minimum concentrations of organic acids in the set of samples under study from enzymatic and HPLC methods. Table S3: Average, standard deviation, RSD (%), and maximum and minimum values from FTIR, potentiometric, and volumetric methods in the set of samples under study. Table S4: Summary of classification results by PLS-DA for the assignation of white, blanc de noirs, and rosé cava samples. Figure S1: PCA results showing the biplot of PC1 vs. PC2 from the study of FTIR data. Plot of scores (a) and plot of loadings (b). Cava class assignation: see Table 1. Figure