Simultaneous Monitoring of the Evolution of Chemical Parameters in the Fermentation Process of Pineapple Fruit Wine Using the Liquid Probe for Near-Infrared Coupled with Chemometrics

This study used Fourier transform near-infrared (FT-NIR) spectroscopy with a liquid probe, in combination with an efficient wavelength selection method named searching combination moving window partial least squares (SCMWPLS), to determine ethanol, total soluble solids, total acidity, and total volatile acid contents in pineapple fruit wine fermented with Saccharomyces cerevisiae var. burgundy. Two fermentation batches were produced, and the NIR spectral data of the calibration samples in the wavenumber range of 11,536–3952 cm−1 were obtained over the ten-day fermentation period. SCMWPLS coupled with second derivatives searched for and optimized the spectral intervals containing useful information for building calibration models of the four parameters. All models were validated with test samples obtained from an independent fermentation batch. The SCMWPLS models gave better predictions (the lowest prediction error and the highest residual predictive deviation) than the models built on the whole spectral region, with acceptable statistical results (within the confidence limits). The results demonstrate that FT-NIR spectroscopy using a liquid probe coupled with SCMWPLS can select optimized wavelength regions, reducing the number of spectral points while increasing accuracy, for simultaneously monitoring the evolution of four chemical parameters in pineapple fruit wine fermentation.


Introduction
Pineapple (Ananas comosus L.) originates from South America and is one of the most favoured subtropical fruits cultivated and consumed worldwide, accounting for more than 20% of the tropical fruit produced in the world. It is a drought-tolerant plant with good taste [1]. The top three pineapple producers worldwide in 2019 were reported to be Costa Rica (3328.10 × 10³ metric tons), the Philippines (2747.86 × 10³ metric tons), and Brazil (2426.53 × 10³ metric tons), while Thailand ranked seventh (1679.67 × 10³ metric tons) [2]. The fruit is frequently consumed fresh and is used in the food industry (canned fruit, jam, and concentrated juice), in alcoholic beverage production [3], and in fibre production [1]. In Thailand, pineapple wine is popularly consumed because of its unique taste, colour, and flavour. The consumption of wine made from pineapple and other fruits is likely to increase, especially among health-conscious consumers, because fruit wines are also nutritious and healthy [4,5].
Accordingly, wavelength selection is a crucial tool for finding the relevant information needed to improve the quality of prediction models for NIR analysis in wine fermentation. An advanced chemometric method, searching combination moving window partial least squares (SCMWPLS) [24], has been proposed to improve the performance of a PLS model. It functions as a spectral selection method that locates and optimizes informative regions across the spectra. The predictive ability of calibration models can be improved by building the PLS models on the optimized informative regions found by SCMWPLS. The potential of SCMWPLS has been demonstrated in previously published reports [24-27]. However, no reports appear to have been published on the quantification of multiple components in pineapple fruit wine during the fermentation process using NIR spectroscopy in combination with a wavelength selection method.
Therefore, the objectives of this study were (1) to investigate the feasibility of using NIR spectroscopy coupled with SCMWPLS to find and optimize informative spectral regions for simultaneous monitoring of the evolution of ethanol, total soluble solids (TSS), total acidity (TA), and total volatile acids (TVA) in pineapple fruit wine during fermentation, and (2) to use an NIR liquid probe for immediate monitoring without sample preparation.

Yeast Culture Preparation
Saccharomyces cerevisiae var. burgundy, the primary yeast for general wine fermentation used in this study, was obtained from the Institute of Food Research and Product Development (IFRPD), Kasetsart University, Thailand. Yeast strains were activated on YPD agar for 24 to 48 h before use. An inoculum of 5% (v/v) was prepared by mixing pineapple juice with yeast colonies (1 × 10⁵ CFU mL⁻¹) and incubating for 24 h as a starter.

Preparation of Pineapple Must
Pineapple samples at the ripe stage were purchased from a local market in Bangkok, Thailand. They were cleaned, peeled, and crushed. The pineapple juice was diluted with water to the optimum juice-to-water ratio of 2:1. The initial sugar concentration of the pineapple juice was adjusted to 25 °Brix by adding sucrose. Potassium metabisulfite (K₂S₂O₅) was then added for decontamination to a final concentration of 75-100 mg L⁻¹.

Pineapple Juice Fermentation
The fermentation of pineapple wine was performed in a polyethylene terephthalate bottle with a working volume of 15 L. Inoculum yeast cultures (5% v/v) were used as a starter for wine fermentation in sterile pineapple juice. Fermentation was conducted for ten days at a controlled temperature of 30 °C using a water bath system. In total, three fermentation batches were performed independently: two batches were used for calibration development and one batch for testing the predictive performance of the calibration model.

NIR Liquid Probe Spectral Acquisition
A liquid fibre-optic probe (IN271P-02, Bruker Optics GmbH & Co. KG, Ettlingen, Germany) was used to collect the spectral data of the liquid wine sample in transflection mode. The transflectance probe provided an adequate signal that combines transmittance and reflectance information. The probe length was 14 cm, with a fixed optical path length of 2 mm (slit 1 mm). It consisted of a bundle of seven fibres in a stainless-steel housing with a sapphire window; the immersion probe is designed for bubble shedding and is suitable for laboratory and process applications. The liquid probe was connected to an FT-NIR spectrophotometer (MPA II, Multi-Purpose Analyser, Bruker Optics GmbH & Co. KG, Ettlingen, Germany) for spectral acquisition between 11,536 and 3952 cm⁻¹, and it was immersed in the liquid samples for spectral acquisition. Air spectra were collected as the background for the measurements. For the establishment of the calibration model, a 30-40 mL sample was collected aseptically at the beginning of fermentation (0 h), and sampling then continued in a daily loop at 3, 6, and 18 h until 240 h, for NIR scanning and chemical analysis. Before sampling, the fermenting wine in the bottle was stirred with a sterile plastic rod and pipetted into a 50 mL sterile plastic tube. Each sample was divided for the NIR analysis and the chemical analysis. To obtain NIR information as close as possible to that of the actual samples in the fermentation bottle, all samples were scanned directly by the liquid probe without further preparation, with the sterile plastic tube used as the sample holder. With this acquisition procedure, both sample variation and light-scattering variation were included in the study. After sampling, NIR measurements of the samples were performed immediately at a spectral resolution of 16 cm⁻¹ with a data interval of 8 cm⁻¹ and 32 scans per measurement.
Therefore, 99 NIR spectra were obtained from one batch of the pineapple wine fermentation process (1 batch × 11 times (0 h and ten days) × 3 sampling times × 3 subsamples). The test samples used to validate model performance were handled by the same process as described above, except that their NIR spectra were acquired by immersing the liquid fibre-optic probe directly in the fermenting pineapple wine bottle. Furthermore, a plastic tube was used to cover the fibre line to avoid errors caused by movement while the spectra were collected. The sample temperature was controlled at 30 °C throughout the experiment. Figure 1 shows the setting of the liquid fibre-optic probe for the NIR measurement of test samples during the fermentation process.

Pineapple Wine Chemical Analysis for Ethanol, Total Soluble Solids, Total Acidity, and Total Volatile Acids Using the Conventional Reference Methods
Four parameters, namely ethanol, total soluble solids (TSS), total acidity (TA), and total volatile acids (TVA) contents, were monitored during fermentation and employed as the reference chemical data for NIR model development. For the chemical analysis, the samples were filtered through filter paper (No.1, Whatman) before determination as follows: (1) Ethanol concentration was assessed using gas chromatography (Chromosorb-103, GC4000; GL Sciences; Tokyo, Japan) with an HP5 capillary column (30 m × 0.32 mm × 0.25 µm; J&W Scientific; Folsom, CA, USA). (2) The TSS content in the sample was determined using a digital refractometer (PAL-1, ATAGO, Tokyo, Japan).
(3) TA [28] and (4) TVA [29] were determined as citric acid and acetic acid, respectively, by titration using phenolphthalein as an indicator. For TA analysis, a sample (10 mL) was pipetted into a 250 mL Erlenmeyer flask containing 100 mL of hot distilled water. Phenolphthalein (2-3 drops) was added to the flask, and the solution was titrated with 0.1 N NaOH until a pink colour appeared. TVA was separated from the wine samples by steam distillation before titration with sodium hydroxide (0.1 N) to the pink end point indicated by the phenolphthalein solution. All measurements were performed in triplicate.

Preprocessing Method
The NIR spectral data were collected using OPUS software (version 8.2: MPA II system, Bruker Optics GmbH & Co. KG, Ettlingen, Germany) and converted into JCAMP files. The JCAMP files were then imported into Unscrambler software (version 9.8: CAMO AS, Trondheim, Norway) and analysed in two forms: without spectral pretreatment (original spectral data) and after second-derivative (SD) pretreatment based on the Savitzky-Golay method (polynomial order = 2, number of smoothing points = 7), which removes the signal variation (spectral offsets and slopes) caused by light scattering in the fermented samples [15].
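The Savitzky-Golay second derivative with the stated settings (polynomial order 2, seven points) can be illustrated with a short NumPy sketch. It is only an illustration of the technique, not the Unscrambler implementation used in the study: the function names, the `spacing` argument, and the choice to leave the three edge points on each side as NaN are assumptions made here.

```python
import numpy as np

def savgol_kernel(window=7, polyorder=2, spacing=1.0):
    """Convolution weights that return the second derivative of a local
    least-squares polynomial at the window centre (hypothetical helper)."""
    half = window // 2
    offsets = np.arange(-half, half + 1) * spacing
    # Design matrix of the local polynomial: columns 1, x, x^2, ...
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse estimates the x^2 coefficient c2; f''(0) = 2 * c2
    return 2.0 * np.linalg.pinv(design)[2]

def savgol_second_derivative(spectra, window=7, polyorder=2, spacing=1.0):
    """Second derivative along the wavenumber axis (last axis).
    Edge points without a full window are left as NaN (an assumption)."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    kernel = savgol_kernel(window, polyorder, spacing)
    half = window // 2
    out = np.full_like(spectra, np.nan)
    for r, row in enumerate(spectra):
        out[r, half:-half] = np.correlate(row, kernel, mode="valid")
    return out

# Usage on a synthetic "spectrum": a sloped offset plus a Gaussian band.
x = np.arange(200.0)
spectrum = 0.5 + 0.002 * x + np.exp(-0.5 * ((x - 100) / 6.0) ** 2)
d2 = savgol_second_derivative(spectrum)
# The constant offset and the linear slope contribute nothing to d2,
# which is why the pretreatment suppresses baseline offsets and slopes.
```

For spectra sampled every 8 cm⁻¹, passing `spacing=8.0` would scale the result to absorbance per (cm⁻¹)²; with the default `spacing=1.0` the derivative is expressed per data point.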

Searching Combination Moving Window Partial Least Squares (SCMWPLS) Analysis
Two algorithms, moving window partial least squares regression (MWPLSR) [30] and SCMWPLS [24], were employed in sequence in the calculation procedure. The calculation process of SCMWPLS is described as follows.
Step 1: MWPLSR Calculation
In MWPLSR, the calculation starts by building a series of PLS models in a spectral window that starts at the ith spectral channel and ends at the (i + H − 1)th spectral channel, and that moves over the whole spectral region (the m × n calibration matrix X). The spectra in the window form a sub-matrix X_i (m × H matrix) containing the ith to the (i + H − 1)th columns of X. PLS-1 models with different numbers of latent variables (LVs) can then be built to relate the spectra in the window to the concentrations of the analyte as follows:
$$\mathbf{y} = \mathbf{X}_i \mathbf{b}_{i,k} + \mathbf{e}_{i,k}$$
where y (m × 1 vector) holds the analyte concentrations of the calibration samples, b_{i,k} (H × 1 vector) is the vector of regression coefficients estimated using PLS with k LVs, and e_{i,k} is the residue vector obtained with the k-LV model. In this study, the window size for MWPLSR and the maximum number of LVs were set to 20 spectral points and 10 LVs, respectively. The mean-centred spectra in the whole region of 11,536-3952 cm⁻¹ were used. To avoid distorting the residue lines, the window size should be larger than the desired model dimensionality (number of LVs) and smaller than the spectral regions to be discovered. The window is moved over the whole spectral region. At each position, PLS models with varying numbers of LVs are built for the calibration samples, and the logarithms of the sums of squared residues (log(SSR)) are calculated with these PLS models and plotted as a function of the window position.
This will yield a number of residue lines, with each line associated with the log(SSR) for a certain LV in the corresponding window position. Then, the informative NIR spectral regions were discovered by plotting the residue lines corresponding to 1 to 10 LVs for PLS as a function of the position of the spectral window. A figure containing such residual lines provides information about informative spectral regions where residual lines show low values of SSR.
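The moving-window calculation of Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' in-house MATLAB program: PLS-1 is implemented with the NIPALS algorithm, the logarithm is taken as base 10, SSR is the calibration-fit residual sum of squares, and the function names `pls1_ssr` and `mwplsr` are hypothetical.

```python
import numpy as np

def pls1_ssr(X, y, max_lv):
    """Sum of squared calibration residues of y for PLS-1 models with
    1..max_lv latent variables (NIPALS, mean-centred data)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    ssr = []
    for _ in range(max_lv):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm > 1e-12:                      # something left to model
            w = w / norm
            t = Xk @ w                        # score vector
            tt = t @ t
            Xk = Xk - np.outer(t, Xk.T @ t / tt)   # deflate X
            yk = yk - (yk @ t / tt) * t            # deflate y: yk is the residue
        ssr.append(float(yk @ yk))
    return np.array(ssr)

def mwplsr(X, y, window=20, max_lv=10):
    """Residue lines: row k-1 holds log10(SSR) of the k-LV model,
    column i is the window starting at spectral channel i."""
    n_positions = X.shape[1] - window + 1
    log_ssr = np.empty((max_lv, n_positions))
    for i in range(n_positions):
        ssr = pls1_ssr(X[:, i:i + window], y, max_lv)
        log_ssr[:, i] = np.log10(ssr + 1e-300)     # guard against log(0)
    return log_ssr

# Usage with synthetic data: one analyte band centred at channel 35.
rng = np.random.default_rng(0)
conc = rng.uniform(0, 10, 40)
band = np.exp(-0.5 * ((np.arange(60) - 35) / 3.0) ** 2)
spectra = rng.normal(0, 0.05, (40, 60)) + np.outer(conc, band)
lines = mwplsr(spectra, conc, window=10, max_lv=3)
# Window positions where the lines dip mark candidate informative regions.
```

Plotting each row of `lines` against the window position reproduces the kind of residue-line figure described above; the settings used in the study would correspond to `window=20, max_lv=10`.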
Step 2: SCMWPLS Calculation
After the selection of informative regions by MWPLSR, SCMWPLS starts to work on a given informative region with p spectral points by changing the moving window size w from 1 to p. The window is moved from the first spectral point to the (p − w + 1)th point over the informative region to collect all possible sub-windows for every window size. When w = 1, moving the window from the first to the last point collects all possible sub-windows with a window size of one. Similarly, for the other values of w, all sub-windows of size w are obtained. Therefore, this algorithm considers all possible spectral intervals in the range of the informative region. For every window, a PLS model with a selected number of LVs is constructed, and the root mean square error of calibration (RMSEC) is calculated. Comparing the RMSEC values of all sub-regions, the sub-region with the smallest RMSEC is considered the optimized informative region.
In this study, MWPLSR suggested more than one informative region, so the optimized combination of informative regions was searched by taking the optimized sub-region of the first informative region as the base region. SCMWPLS is then performed for the second informative region: the base region is combined with each of the possible spectral intervals selected from the second informative region to build PLS models and calculate their RMSEC values. The combination showing the smallest RMSEC is selected as the new base region. This calculation procedure is repeated to look for a new base region in the next informative region, until the last informative region is reached. After the calculations for all informative regions are finished, the final base region is considered the optimized combination. In SCMWPLS, the maximum number of LVs is constrained, and the number of LVs selected by the validation method must not be larger than this maximum. The number of LVs of the PLS model for an informative region can easily be estimated by regressing the spectra in the region against the concentrations. It is taken as the number at which the RMSEC begins to decrease insignificantly as the number of LVs increases, and this number is considered the maximum number of LVs. All these calculations were carried out using in-house written programs in MATLAB (version 2020b: The MathWorks Inc., Natick, MA, USA).
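The sub-window search and the region-by-region combination described in Step 2 can be sketched as below. It is an illustrative reading of the procedure, not the authors' MATLAB code. The assumptions are: PLS-1 via NIPALS, RMSEC computed as sqrt(SSR/m), the number of LVs capped at the number of selected spectral points, and regions given as hypothetical (start, stop) channel indices with stop exclusive.

```python
import numpy as np

def pls1_rmsec(X, y, n_lv):
    """RMSEC of a PLS-1 (NIPALS) calibration fit; assumed sqrt(SSR / m)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    for _ in range(min(n_lv, X.shape[1])):    # cannot use more LVs than columns
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break
        w = w / norm
        t = Xk @ w
        tt = t @ t
        Xk = Xk - np.outer(t, Xk.T @ t / tt)
        yk = yk - (yk @ t / tt) * t
    return float(np.sqrt(yk @ yk / len(y)))

def search_subwindow(X, y, region, n_lv, base=()):
    """Try every sub-window (all sizes w, all positions) inside `region`,
    each combined with the already selected `base` columns, and return
    (smallest RMSEC, (start, stop)) for the best sub-window."""
    start, stop = region
    base = list(base)
    best_err, best_win = np.inf, None
    for w in range(1, stop - start + 1):
        for s in range(start, stop - w + 1):
            cols = base + list(range(s, s + w))
            err = pls1_rmsec(X[:, cols], y, n_lv)
            if err < best_err:
                best_err, best_win = err, (s, s + w)
    return best_err, best_win

def scmwpls(X, y, regions, n_lv):
    """Optimize the informative regions one after another; the columns chosen
    so far act as the base region for the next search."""
    selected, cols, err = [], [], np.inf
    for region in regions:
        err, (s, e) = search_subwindow(X, y, region, n_lv, base=cols)
        selected.append((s, e))
        cols += list(range(s, e))
    return selected, err

# Usage on synthetic data with an informative band in channels 10-14.
rng = np.random.default_rng(1)
conc = rng.uniform(0, 10, 30)
spectra = rng.normal(0, 0.05, (30, 40))
spectra[:, 10:15] += np.outer(conc, np.ones(5))
windows, rmsec = scmwpls(spectra, conc, [(5, 20), (25, 35)], n_lv=2)
```

An informative region of p points contains p(p + 1)/2 candidate sub-windows, so the exhaustive search grows quadratically with the width of each region; this is one reason the MWPLSR pre-selection of a few regions is useful before this step.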

Calibration Development
PLS-1 (Unscrambler software) was applied to the spectral regions to develop the calibration models for the simultaneous quantitative determination of ethanol, TSS, TA, and TVA in the samples. The saturated NIR spectral region of 5248-4984 cm⁻¹ was not included in the model development, as this spectral range is beyond the linear response region of the detector [19,25]. Two fermentation batches of pineapple wine samples were employed as the calibration set (n = 198) and one batch as the test set (n = 99). The full cross-validation method was used to find the optimum number of LVs for PLS, taken as the number giving the lowest root mean square error of cross-validation, beyond which the error increased [24,30,31]. The performance of the established calibration equations was further validated using the test set. To investigate the benefit of SCMWPLS, the PLS prediction results of the calibration models developed on the spectral regions found by SCMWPLS were compared with those of models developed on the full spectral region by the general PLS method.

Evaluation of the Predictive Ability of PLS and SCMWPLS Models
The prediction abilities of the models built on the whole NIR spectral region and on the informative NIR regions found by SCMWPLS were investigated and compared on the test set using the coefficient of determination (R²), root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), and residual predictive deviation (RPD). An acceptable NIR model should present high values of R² and RPD and low values of RMSEC and RMSEP. In addition, the accuracy of the best model was evaluated using the bias confidence limits (T_b) and the unexplained error confidence limits (T_UE), following the guidelines for the application of NIR spectrometry described in ISO 12099 (2017) [32]. By this verification method, model performance is accepted when the standard error of prediction (SEP) and the bias fall within the confidence limits. Several earlier reports employed the standard ISO method, which has been detailed previously [33,34]. The statistics employed in this study are defined in Table 2.

Table 2. Summary of statistical computations used to estimate NIR model performance.
Statistical terms listed: coefficient of determination (R²); root mean square error (RMSE), reported as RMSEC in the calibration set and RMSEP in the test set; unexplained error confidence limits (T_UE).
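For orientation, the point statistics named above can be computed as in the following sketch. The formulas are the conventional ones and are assumptions here, since they may differ in detail from those given in Table 2: R² is taken as 1 − SS_res/SS_tot, SEP as the bias-corrected standard deviation of the residuals, and RPD as the standard deviation of the reference values divided by SEP. The function name `nir_model_stats` and the example values are hypothetical, and the ISO 12099 confidence limits T_b and T_UE are not covered.

```python
import numpy as np

def nir_model_stats(y_ref, y_pred):
    """Common NIR validation statistics for reference vs. predicted values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_pred - y_ref
    n = y_ref.size
    rmse = np.sqrt(np.mean(resid ** 2))                    # RMSEC or RMSEP
    bias = resid.mean()                                    # mean prediction error
    sep = np.sqrt(np.sum((resid - bias) ** 2) / (n - 1))   # bias-corrected error
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_ref - y_ref.mean()) ** 2)
    rpd = np.std(y_ref, ddof=1) / sep                      # assumed: SD(reference) / SEP
    return {"R2": r2, "RMSE": rmse, "bias": bias, "SEP": sep, "RPD": rpd}

# Usage with made-up reference and predicted ethanol values (% v/v).
reference = [0.1, 2.4, 5.0, 7.9, 10.6]
predicted = [0.3, 2.2, 5.3, 7.6, 10.9]
stats = nir_model_stats(reference, predicted)
```

Applied to the calibration set the RMSE entry corresponds to RMSEC, and applied to the independent test set it corresponds to RMSEP.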

Measured Chemical Characteristics of Pineapple Wines during Fermentation by Reference Methods
The results of the chemical analysis of pineapple wine samples during fermentation are listed in Table 3. Each row shows the averages of multiple measurements from two sample batches collected on the same day of fermentation. During the fermentation, the samples had an ethanol content of 0.0590 to 10.7592%, TSS in the range of 23.70 to 10.25 °Brix, TA of 0.2925 to 0.4558%, and TVA of 0.0013 to 0.0018%. The concentrations of ethanol, TA, and TVA in the samples increased with days of fermentation, whereas the concentration of TSS decreased. Among the four analytes, ethanol and TSS values varied more than TA and TVA values, as shown in Figure 2. The amount of TSS decreases rapidly because Saccharomyces cerevisiae var. burgundy produces the enzyme invertase, which breaks down sucrose into glucose and fructose [35]. Glucose is then converted into ethanol and carbon dioxide by other enzymes of the Embden-Meyerhof-Parnas (EMP) pathway. Thus, yeast uses sugar to grow and produce ethanol at the same time (Figure 2A). Both the TA and TVA contents increased slightly during fermentation (Figure 2B). An increase in the acid content of wine during the fermentation period resulted in suitable conditions for yeast growth [36]. Table 4 summarizes the distribution of the ethanol, TSS, TA, and TVA reference values in the samples of the calibration and test sets. The ranges of all chemical reference values in the calibration set covered the ranges found in the test set. Consequently, the variability of the sample data in both the calibration and prediction sets was considered appropriate for developing reliable NIR calibration models for ethanol, TSS, TA, and TVA predictions.

NIR Spectra of Pineapple Wines from the Fermentation Process
One hundred and ninety-eight of the original NIR spectra in the 11,536-3952 cm −1 region of pineapple wine samples obtained during fermentation using a liquid probe, and the eleven averaged spectra of the fermentation samples from 0 to 10 days in the whole spectral region, are shown in Figure 3A,B, respectively. A major component of pineapple wine is water. Therefore, a strong absorption band near 6900 and a saturated feature around 5000 cm −1 are assigned to the combination of OH symmetric and antisymmetric stretching modes, and the combination mode of the OH stretching and bending vibrations of water, respectively [15,37]. It is noted in Figure 3A,B that the saturated spectral region in a grey bar is excluded for model development. However, the spectral changes of the samples during different days of fermentation were not clearly visible in the original NIR spectra. Thus, the second derivatives (SD) were applied to reveal the significant NIR regions in the 11 averaged spectra of the fermentation samples from 0 to 10 days. Figure 3C,D presents the SD pretreated spectra in the 9500-5500 cm −1 range of pineapple wines for different fermentation dates. In Figure 3C, the SD pretreated spectra reveal the changes in NIR absorption bands around 8400, 6800, 5900, 5750, and 5650 cm −1 increased with fermentation time. Moreover, two dominant absorption bands can be seen near 4450 and 4340 cm −1 in the SD pretreated spectra of 4600-4000 cm −1 region that changed by increasing the fermentation time ( Figure 3D). The absorption bands from ethanol production during the wine fermentation were previously reported in the 6060-5715 and 4545-4350 cm −1 spectral regions [13,38,39]. The former was due to the C-H stretch first overtones of the CH 3 and CH 2 groups, and the latter was assigned to the C-H stretch and C-H deformation combination from the CH 3 group of ethanol [13,38,39]. 
The changes in the characteristic absorption bands observed in this study are similar to those reported by others. Therefore, they are related to characteristic bands for ethanol production from wine fermentation.
There are absorption bands around 7056 and 5610 cm⁻¹ that decreased with fermentation time (Figure 3C). They were assigned to the O-H stretch first overtone and C-H stretch first overtone, respectively, which are characteristic bands for sugars [40,41]. The sugar contents are expressed by means of the TSS value because sugars are the most abundant of the soluble solids dissolved in a pineapple wine sample. The characteristic bands for sugars decrease with fermentation time, corresponding to the process by which yeast converts sugars to ethanol.
Furthermore, the functional groups of sugars and starch for the O-H stretch first overtone (6500 to 6300 cm⁻¹), C-H stretch first overtone (5903 to 5650 cm⁻¹), O-H stretch and C-O stretch combinations, and C-H combinations of stretch and deformation (4504 to 4250 cm⁻¹) are expected to appear in the NIR spectra of pineapple wine (Table 5) [40,41]. In Figure 3C,D, the bands in these expected regions are found to increase in absorption over the time of fermentation. Although the sugar contents should be greatly reduced by yeast for ethanol production in pineapple winemaking, the contents of glucose and fructose are also increased by the enzyme invertase found in the growth phase of the yeast [35]. Therefore, the NIR spectra may convey two opposite directions of sugar changes due to the fermentation pathway of the yeast.
Acidity in wine is expressed as the concentration of acids present, namely citric acid (TA) and acetic acid (TVA). From the literature, the bands of both acids for the C-H stretch second overtone, C-H stretch first overtone, and C-H stretch and C=O stretch combinations are expected to appear in the spectral regions of 8504 to 8304, 5952 to 5600, and 4504 to 4200 cm⁻¹, respectively [40,41]. Although the TA and TVA values increased over the fermentation time, as shown in Figure 2B, the NIR bands of the functional groups of the major constituents in pineapple wine, i.e., water, ethanol, and sugars, also appear in these areas. The major constituents of wine exhibit dominant NIR bands that may overlap with the acid bands, because pineapple wine in the fermentation process has very low concentrations of citric acid (<0.5%) and acetic acid (<0.002%) (Table 3). Hence, the individual spectral regions associated with the citric and acetic acids cannot be clearly identified in the original and SD pretreated NIR spectra of pineapple wines.
The NIR band assignments from the SD pretreated spectra of pineapple wine during the fermentation process are summarized in Table 5.
Notes to Table 5: a = the spectral regions of 9500-5500 and 4600-4000 cm⁻¹; b = the intensity changes according to the fermentation date; all substances in Table 5 are referred to in references [40,41]; additional references for some substances are annotated by superscript reference numbers.

SCMWPLS Analysis
The NIR data after SD pretreatment were employed in the MWPLSR calculations to search for the informative spectral regions for ethanol, TSS, TA, and TVA in the pineapple wine spectra. The residue lines for ethanol, TSS, TA, and TVA obtained by MWPLSR for the whole NIR spectral region of 11,536-3952 cm⁻¹ are shown in Figure 4A-D, respectively. In the residue line plots, each line represents a certain number of LVs. The top line shows the log(SSR) values of the one-LV model, and the number of LVs increases by one for each following line. In this study, the maximum number of LVs was set to 10, resulting in a total of 10 residue lines. Note that the saturated spectral region, marked by a grey bar in Figure 4, is excluded from model development.

Figure 4A shows the four obtained informative spectral regions of 9200-8000 (a), 7800-6800 (b), 6720-5256 (c), and 4976-4008 (d) cm⁻¹ for ethanol calculated by MWPLSR. They correspond to the second (a) and first (b, c) overtones, and combination bands (d) from the functional groups of ethanol, respectively (Table 5). These informative spectral regions discovered by MWPLSR can easily be seen to encompass those bands assigned for ethanol from the SD pretreated spectra of pineapple wine samples (Figure 3C,D).
Four informative spectral regions of 9200-8000 (a), 7800-6904 (e), 6848-5256 (f ), 4976-4008(d) cm −1 for TSS found by MWPLSR, are shown in Figure 4B. All informative spectral regions of a, e, f, and d for TSS cover the band assignments for sugars and related compounds given in Table 5. In Figure 4A,B, the third overtone bands for the 11,536 to 9800 cm −1 spectral region for ethanol and TSS show obviously high residue values (approximately > 2.3) from the residual spectral lines of two LVs. This line is the starting point for the suitability of the model dimensions built in this region, i.e., the fitness of residual lines is considering the line, showing the residue values decrease insignificantly as the number of LVs increases. Therefore, this third overtone spectral region was omitted in the optimization by SCMWPLS due to less spectral information of ethanol and TSS for the model developments. Figure 4C,D presents the same for four informative spectral regions of 9400-7904 (g), 7896-6808 (h), 6800-5256 (i), and 4976-4008 (d) cm −1 for TA and TVA obtained by MWPLSR. The NIR band assignments that fall in these four informative spectral regions found by MWPLSR are described in Table 5. Although the individual spectral regions associated with both acids cannot be identified in the original and SD pretreated NIR spectra of pineapple wines, MWPLSR can suggest using the informative spectral regions of g, h, i, and d for both acids with the low SSR values. The sharp peaks around 11,536-9800 cm −1 of the residual line spectra for TA and TVA show the residue values at the last line (10 LVs) close to those values obtained from the four informative spectral regions (g, h, i, d). However, the residual lines can be fitted from two LVs in this region where they have higher SSR values than those given by the selected informative regions of g, h, i, and d ( Figure 4C,D). 
Therefore, the spectral region of 11,536-9800 cm −1 was not chosen as an informative spectral region for TA and TVA and was excluded from the optimization by SCMWPLS. The SCMWPLS algorithm was then applied to the informative spectral regions of ethanol, TSS, TA, and TVA obtained by MWPLSR to search for the optimized spectral regions.

Comparison of PLS Calibration Models
Statistical results for the ethanol, TSS, TA, and TVA models developed using the whole spectral region of both the original and SD pretreated NIR data and the optimized informative regions obtained from SCMWPLS are compared in Table 6. In all cases, the spectral region from 5248 to 4984 cm −1, where the saturated water band is located, was removed.
Table 6. Statistical results for PLS calibration models of ethanol, TSS, TA, and TVA contents for pineapple wine in fermentation developed using the uncorrected spectrum or the second derivative corrected spectrum in the whole regions and in the regions selected by SCMWPLS.
Acceptable NIR models should show high R 2 and RPD values and low RMSEC and RMSEP values. In addition, the best model for each analyte is identified after validation with an external test set: the model giving the lowest RMSEP and the highest RPD is considered the better model. Based on the R 2 and RPD values, a model qualifies as good for screening with an R 2 of 0.66 to 0.81 or RPD > 3, good for quality control with an R 2 of 0.83 to 0.90 or RPD > 5, and excellent for all analytical tasks with an R 2 > 0.91 or RPD > 8 [41].
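As a minimal illustration of the validation statistics discussed above, the following Python sketch computes RMSEP, bias, SEP, R 2, and RPD from paired reference and NIR-predicted values. The function name `prediction_stats` and the example values are hypothetical. RPD is assumed here to be the standard deviation of the reference values divided by SEP; some authors divide by RMSEP instead, and the exact convention behind Table 6 is not stated in this section.

```python
import math


def prediction_stats(reference, predicted):
    """Return RMSEP, bias, SEP, R2 and RPD for paired reference/predicted values."""
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least two values")

    # Residuals are defined as predicted minus reference.
    residuals = [p - r for r, p in zip(reference, predicted)]
    bias = sum(residuals) / n
    ss_res = sum(e * e for e in residuals)
    rmsep = math.sqrt(ss_res / n)

    # SEP: standard deviation of the residuals after removing the bias.
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))

    mean_ref = sum(reference) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("reference values must not all be identical")
    r2 = 1.0 - ss_res / ss_tot

    # Assumed RPD convention: SD of the reference values divided by SEP.
    sd_ref = math.sqrt(ss_tot / (n - 1))
    rpd = sd_ref / sep if sep > 0 else float("inf")

    return {"rmsep": rmsep, "bias": bias, "sep": sep, "r2": r2, "rpd": rpd}


# Hypothetical example values (not taken from the study).
example_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
example_predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
example_stats = prediction_stats(example_reference, example_predicted)
for name, value in example_stats.items():
    print(f"{name}: {value:.4f}")
```

Because RPD scales with the spread of the reference values, a narrow concentration range lowers RPD even when the absolute prediction error is small.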
It can be seen from Table 6 that the PLS calibration model for ethanol obtained using the whole spectral region yields the lowest predictability among the prediction results for ethanol. PLS prediction results for ethanol using the whole spectral region of the original NIR spectral data are an RMSEP of 0.466% and an RPD of 7.36 at four LVs, while the model based on the whole spectral region after SD pretreatment gives a better prediction, with a lower RMSEP of 0.406% and a higher RPD of 8.44 at four LVs. Moreover, SCMWPLS coupled with the SD pretreatment provides the optimized combination of the 9104-7984, 7752-6704, 6600-5256, and 4976-4008 cm −1 regions. This optimized combination provides very good prediction results, with the lowest RMSEP of 0.393%, the highest RPD of 8.72, and a high R 2 of 0.984 with three LVs. These results are better than those calculated using the whole spectral region.
For TSS, the PLS prediction results using the whole spectral region of the original NIR spectral data are an RMSEP of 0.441% and an RPD of 10.47 at five LVs, while the whole spectral region after SD pretreatment shows significant improvements, with a lower RMSEP of 0.219% and a higher RPD of 21.08 at two LVs. MWPLSR suggested four individual informative spectral regions for TSS in the SD pretreated NIR spectra (Figure 4B). Performing SCMWPLS on these four informative regions revealed one spectral region, 6800-5360 cm −1, that provided the best prediction results, with the lowest RMSEP of 0.166 °Brix and the highest RPD of 27.82 with two LVs. Compared with the whole-region model built on the original spectra, SCMWPLS improves the RMSEP and RPD values significantly, and the number of LVs is clearly reduced.
A comparison of the results listed in Table 6 shows that the predictive performance of the models for TA and TVA is lower than that of the models for ethanol and TSS. The quality of the models for TA and TVA can be classified as good for quality control (R 2 > 0.88; averaged RPD = 3.07) and good for screening (R 2 > 0.75; averaged RPD = 2.69), respectively. This was caused by the low concentrations of citric acid (TA) and acetic acid (TVA) in pineapple wine from the fermentation process. In addition, the concentration range and standard deviation of both acids are narrow: 0.2880-0.4757% with an SD of 0.0514 for citric acid (TA), and 0.0011-0.0019% with an SD of 0.0002 for acetic acid (TVA). Nevertheless, the best calibration model for TA is obtained from the optimized combination of the 9200-5408 and 4976-4008 cm −1 regions generated by SCMWPLS. It achieves an improvement, with the lowest RMSEP of 0.0181% and the highest RPD of 3.17 at two LVs. As for TVA, the optimized combination generated by SCMWPLS is composed of the 6504-5280 and 4504-4248 cm −1 regions. This combination provides the best prediction result, with the lowest RMSEP of 0.000105% and the highest RPD of 2.86 with two LVs.
Table 6 also shows that the best models obtained by SCMWPLS reduce the number of NIR spectral data points needed for model development. The smallest number of spectral points was 181, for the TSS calibration model, and the largest was 597, for the TA model. Simultaneous monitoring of all four chemical changes could be performed by setting the spectral acquisition of the FT-NIR spectrometer to 616 spectral points (9200-5256 and 4976-4008 cm −1), because these wavenumber variables cover the optimized regions found by SCMWPLS for all constituents. The measurement would then be faster than collecting the NIR spectral data for the whole region (915 spectral points). Figure 5 shows the NIR predicted and reference values of the independent test set versus the fermentation time using the best NIR models for ethanol, TSS, TA, and TVA obtained from SCMWPLS. The best predictive result is obtained from the calibration model for TSS, where the NIR-predicted values did not differ from the reference values determined by a conventional method. This can be seen from the cross symbols showing the NIR prediction values, which overlie the circle symbols showing the reference values (Figure 5B). The NIR prediction results calculated by the best model for ethanol, built using the optimized spectral regions found by SCMWPLS, also yield accurate results.
However, a distinct difference between the reference values and the NIR prediction occurs during 234 to 240 h of fermentation (Figure 5A). This was almost the end of the fermentation process, when ethanol production should be nearly constant. The NIR prediction value seems to show a more realistic change than the reference at this point. For TA and TVA, the prediction results obtained from the best models showed lower accuracy than the TSS and ethanol prediction results.
Figure 5C,D shows apparent differences between the reference and NIR predicted values of TA and TVA, which occurred approximately from 48 to 114 h and around 18 and 69 to 96 h of fermentation, respectively. The reason may be that this period shows a high rate of ethanol production. As can be seen in Figure 5A, the ethanol content gradually increases after 18 h and then increases rapidly from 24 to 114 h. Conversely, the TSS values show a corresponding decrease over the same period (Figure 5B). Both ethanol and CO2 therefore rapidly become abundant in the fermentation sample. They can interfere with the observed NIR information arising from the citric and acetic acids, and this may result in low accuracy for TA and TVA predictions during this period.
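The spectral point counts quoted above can be checked with simple arithmetic. The sketch below assumes a uniform data spacing of 8 cm −1 between adjacent spectral points; this spacing is not stated in this section and is inferred from the region limits, all of which are multiples of 8. The helper names (`n_points`, `total_points`, `covered`) are hypothetical.

```python
STEP = 8  # cm^-1; assumed spacing between adjacent spectral points


def n_points(high, low, step=STEP):
    """Number of spectral points in the inclusive range high..low (cm^-1)."""
    if high < low or (high - low) % step != 0:
        raise ValueError("region limits must be ordered and lie on the grid")
    return (high - low) // step + 1


def total_points(regions, step=STEP):
    """Total number of points over a list of (high, low) regions."""
    return sum(n_points(high, low, step) for high, low in regions)


def covered(regions, acquisition):
    """True if every region lies inside one of the acquisition windows."""
    return all(
        any(a_high >= high and low >= a_low for a_high, a_low in acquisition)
        for high, low in regions
    )


# Optimized regions reported for each constituent (cm^-1).
OPTIMIZED = {
    "ethanol": [(9104, 7984), (7752, 6704), (6600, 5256), (4976, 4008)],
    "TSS": [(6800, 5360)],
    "TA": [(9200, 5408), (4976, 4008)],
    "TVA": [(6504, 5280), (4504, 4248)],
}

# Proposed acquisition windows for simultaneous monitoring (cm^-1).
ACQUISITION = [(9200, 5256), (4976, 4008)]

# Whole region with the saturated water band (5248-4984 cm^-1) removed.
WHOLE_POINTS = n_points(11536, 3952) - n_points(5248, 4984)

for name, regions in OPTIMIZED.items():
    print(name, total_points(regions), covered(regions, ACQUISITION))
print("acquisition", total_points(ACQUISITION))
print("whole region", WHOLE_POINTS)
```

Under this assumed spacing, the sketch reproduces the reported 181 points for TSS, 597 points for TA, 616 points for the combined acquisition windows, and 915 points for the whole region, and it confirms that the two acquisition windows contain every optimized region.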
To assure the predictive performance of the best NIR models built from the regions optimized by SCMWPLS, the bias confidence limits (T b ) and the unexplained error confidence limits (T UE ) were also employed as indicators of the NIR predictions in this study. The validation through an independent test set provided the SEP and bias values, which were compared with the calculated T UE and T b values, respectively. When both the SEP and bias values are below these two confidence limits (SEP < T UE ; bias < ±T b ), the NIR model is considered acceptable in its performance. The statistical results for the performance evaluation of the best models are summarized in Table 7. The results above leave no doubt about the accurate predictive performance of the ethanol and TSS models. This statistical analysis was employed mainly because the efficiency of the best models for TA and TVA needed to be verified. Table 7 shows that the statistical results obtained from the best models for TA and TVA also met the criteria. This result means that the SEP value is low enough to be practically acceptable, being lower than the calculated T UE value, and that the bias is not significantly different from zero, being within the calculated ±T b .
Table 7. Statistics for assessment of the model performance.
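The acceptance rule described above can be sketched as follows. This section does not give the formulas for the two limits, so the sketch assumes the forms commonly used for NIR validation (as in ISO 12099): T b = t × SEP / √n for the bias limit and T UE = SEC × √F for the unexplained error limit, where n is the number of test samples and SEC is the standard error of calibration. The critical values t and F must be supplied by the user from statistical tables; the function names and the example numbers are hypothetical and are not taken from Table 7.

```python
import math


def bias_confidence_limit(sep, n_test, t_crit):
    """Assumed form: T_b = t_crit * SEP / sqrt(n_test)."""
    return t_crit * sep / math.sqrt(n_test)


def unexplained_error_limit(sec, f_crit):
    """Assumed form: T_UE = SEC * sqrt(f_crit)."""
    return sec * math.sqrt(f_crit)


def model_accepted(sep, bias, sec, n_test, t_crit, f_crit):
    """Apply the rule SEP < T_UE and |bias| < T_b."""
    t_b = bias_confidence_limit(sep, n_test, t_crit)
    t_ue = unexplained_error_limit(sec, f_crit)
    return sep < t_ue and abs(bias) < t_b


# Hypothetical illustration only: these numbers do not come from the study.
accepted = model_accepted(
    sep=0.020, bias=0.005, sec=0.018, n_test=20, t_crit=2.09, f_crit=2.0
)
print("model accepted:", accepted)
```

Passing the critical values in as arguments keeps the sketch free of any statistics library; in practice they would be looked up for the chosen significance level and the degrees of freedom of the test and calibration sets.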

Conclusions
The results of the present study demonstrated the potential of NIR spectroscopy coupled with SCMWPLS to enhance the predictive performance of NIR calibration models for simultaneously monitoring the changes in ethanol, total soluble solids, total acidity, and total volatile acids in pineapple fruit wine during the fermentation process. SCMWPLS could select and optimize informative spectral regions from the second derivative spectra, obtained by the FT-NIR fibre optic probe, of very complicated mixtures such as wine. The optimized informative regions are the combination of the 9104-7984, 7752-6704, 6600-5256, and 4976-4008 cm −1 regions for ethanol, the 6800-5360 cm −1 region for TSS, the combination of the 9200-5408 and 4976-4008 cm −1 regions for TA, and the combination of the 6504-5280 and 4504-4248 cm −1 regions for TVA. The quality of the resulting PLS calibrations is improved in comparison with those obtained using the whole region. Furthermore, the present study has verified the advantages of the NIR liquid probe in combination with SCMWPLS for direct NIR measurements of pineapple wines from the fermentation process without sample preparation. The best models obtained with these tools provided good prediction results with acceptable statistics and, in particular, used a small number of spectral data points, which will make faster NIR measurement possible. However, further cases or device designs for liquid probe measurement should be considered to protect the probe from being disturbed by the CO2 and microparticles (if the interfering particles are smaller than the probe slit, < 1 mm) found in the fermentation system, in order to stabilize the NIR signal and improve the prediction of low-concentration constituents.
Author Contributions: Conceptualization, funding acquisition, supervision and writing-original draft preparation, S.K.; conceptualization, fermentation methodology, and investigation, A.B.; K.N. is an engineer who contributed to the setup of the NIR experiment, validation process, and software; review and editing original draft preparation and project administration, S.J.; formal analysis and performed methodology, W.A., P.J. and J.M.; resource and visualization, P.V. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The data presented in this study are available in the article.