Estimating Sensory Properties with Near-Infrared Spectroscopy: A Tool for Quality Control and Breeding of ‘Calçots’ (Allium cepa L.)

Using trained panelists to evaluate sensory attributes is unfeasible when many samples must be evaluated, such as in quality control or breeding programs. Near-infrared spectroscopy (NIRS) is a rapid, inexpensive method often used in food quality evaluation. We assessed the feasibility of using NIRS to estimate sweetness, fiber perception, and off-flavors, the most important sensory attributes in cooked ‘calçots’ (the immature floral stems of second-year onion resprouts). The best results were achieved through models using interval partial least squares (iPLS) variable selection on spectra from pureed cooked ‘calçots’, which yielded values of the ratio of performance to deviation (RPD) greater than 1.4 in all cases. Therefore, it would be feasible to use NIRS to estimate sensory properties in ‘calçots’. This approach would be useful in initial screening to discard samples that differ substantially from the ideotype; thus, sensory analysis by trained panels could be reserved for finer discriminations.


Introduction
'Calçots' (Allium cepa L.) are the immature floral stems of second-year 'Blanca Tardana de Lleida' (BTL) onion landrace resprouts. This crop is typical of Catalonia (Northeast Spain), where 'calçots' are usually prepared by roasting on a hot open fire. Although official economic data are lacking, the market volume for 'calçots' is estimated at €20 million [1]. However, the economic importance of 'calçots' lies not only in their production but also in associated agro-tourism, which boosts the regional economy and has increased the demand for 'calçots' worldwide.
'Calçots' from the traditional cultivation area have been awarded the European Union's food-quality label 'Protected Geographical Indication' (PGI) [2]. To date, the PGI's regulating board has focused the quality control of 'calçots' on morphological traits (length and width of the edible part) but producers aim to highlight the internal quality, especially organoleptic attributes.
The sensory ideotype representing consumer preferences of the ideal 'calçot' is sweet, with low fiber perception, and without off-flavors [3]. The evaluation of sensory attributes requires trained panelists to apply standardized methods and a considerable amount of the sample. Since only a few samples can be evaluated in each session, panelists must meet several times, making it impracticable to analyze large numbers of samples [4]. These limitations discourage breeders from including sensory traits in breeding programs and limit the scope of the regulating authority's efforts at quality control. Therefore, other approaches are needed to replace or complement traditional sensory analysis.
Chemical analysis has been used to estimate sensory attributes in fruit and vegetables, including melons (Cucumis melo L.) [5], onions [6,7], tomatoes (Solanum lycopersicum L.) [8] or red raspberry (Rubus idaeus L.) [9], among others. Correlations between sensory attributes and chemical parameters have been established also for 'calçots' [10]. Nevertheless, standard chemical analyses are expensive, laborious, and time-consuming. Besides, organoleptic properties can be difficult to predict from isolated chemical data because these properties result from interactions among various compounds [11].
In recent years, near-infrared spectroscopy (NIRS) has attracted attention because it is fast, easy to apply, and inexpensive, and because it enables the simultaneous estimation of different properties from a single spectrum. NIRS has been widely used in the agri-food industry, for example, to evaluate food quality [12], to control food safety [13], and to detect food adulterations [14]. Albeit to a lesser extent, NIRS has been used to estimate sensory attributes in products such as wine [15], green tea [16], coffee [17,18], chicory [19], beans [20], potatoes [21], and meat and fish products [22,23], among others. These studies have shown that NIRS is a useful tool for sensory evaluation when combined with chemometrics.
In 'calçots', NIRS has been applied to determine some chemical properties (dry matter content, soluble solid content, titratable acidity, and ash content), providing good results [24]. The present study aims to investigate the usefulness of NIRS in the determination of sensory attributes of cooked 'calçots' as a further step in quality evaluation of 'calçots'. To this end, we developed models to estimate sweetness, fiber perception, and off-flavors, the sensory attributes included in the ideotype of 'calçots' promoted by the PGI.

Samples
The experiment used 85 samples of cooked 'calçots'. Each sample comprised a set of 80 commercial 'calçots' (PGI regulations define commercial 'calçot' as having a compact white edible base measuring 15-25 cm in length and 1.7-2.5 cm in diameter 5 cm from the root). To ensure variation in sensory attributes and to take into account the influence of environmental factors on quality traits [10], samples were harvested in different environments (inside and outside the PGI area) at different harvesting times during three consecutive seasons: 2014-15, 2015-16, and 2016-17.

Sample Preparation
Samples were prepared as described by Simó et al. [3]. Leaves were cut 4 cm above the ligule and roots were removed. Then 'calçots' were cleaned with tap water to remove adhering soil. Samples were roasted at 270 °C for 18 min using a convection oven (SALVA Kwik-co). After cooking, the two most external leaves were removed, and the lower, edible part was cut off and pureed with a mixer (Taurus BAPI 850). Half the pureed samples were dried for 72 h at 60 °C and then ground to an average particle size <0.4 mm to obtain ground dried puree. The remaining pureed samples were frozen with liquid nitrogen and stored at −20 °C until sensory analysis and NIR measurement.

Sensory Analysis
A panel of 8 trained judges used previously reported protocols for quantitative descriptive sensory analysis [3] to analyze samples of 'calçot' puree. Briefly, in each session, judges used a semi-structured visual scale labeled from 0 to 10 to evaluate organoleptic descriptors (sweetness, fiber perception, and off-flavors) in 5 different samples. All tests were carried out in a room designed for sensory tests that fulfilled the standards specified by the International Organization for Standardization [25]. All samples were evaluated in duplicate.

Spectral Measurement
NIR spectra of cooked 'calçots' were recorded from two types of preprocessed samples, puree and ground dried puree, as described by Sans et al. [24]. Spectra were registered with a spectrophotometer (Foss NIRSystems model 5000, Silver Spring, MD, USA) equipped with a rapid content analyzer module and Vision software, version 2.51. Spectra were recorded every 2 nm from 1100 nm to 2500 nm and averaged from 32 scans. Puree samples were measured in reflectance mode and ground dried puree was measured in transflectance mode. The spectrum was expressed as log (1/R). Three spectra were registered for each sample and the average spectrum was used for computations.
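As a rough illustration of the acquisition settings above, the following Python sketch builds the wavelength axis, converts reflectance to log (1/R), and averages replicate spectra. The function names are hypothetical, and averaging the replicates after the log (1/R) conversion is an assumption; the text does not say at which stage the instrument software averages.

```python
from math import log10

# Wavelength axis: one point every 2 nm from 1100 to 2500 nm, inclusive.
wavelengths = list(range(1100, 2501, 2))

def absorbance(reflectance):
    """Convert reflectance values R into apparent absorbance log10(1/R)."""
    return [log10(1.0 / r) for r in reflectance]

def average_spectrum(replicates):
    """Average replicate spectra point by point (assumed to be in log(1/R))."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]
```

With these settings each spectrum has 701 data points, which is the number of variables available to the regression models before any variable selection.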

Data Analysis
Data were analyzed with the PLS_Toolbox v.8.21 (Eigenvector Research Inc., Wenatchee, WA, USA) and in-house routines running under MATLAB R2017a (The MathWorks™ Inc., Natick, MA, USA). In all cases, spectra from puree and ground dried puree were treated independently.
To ensure that significant variation had been detected for the three sensory attributes (sweetness, fiber perception, and off-flavors) evaluated by the tasting panel, sensory data were analyzed using analysis of variance (ANOVA) according to the following linear model:
$$x_{ijk} = \mu + s_i + p_j + (sp)_{ij} + e_{ijk}$$
where s is the sample factor, p the panelist factor, sp their interaction, and e the residual error. Both factors were considered fixed.
Prediction models were built using partial least squares (PLS) regression with the NIPALS algorithm as implemented in PLS_Toolbox v.8.21 software. After exploring spectra by principal component analysis (PCA) to detect outliers and clustering due to season or origin, samples were randomly divided into 2 groups so that about 75% of the samples could be used for calibration and 25% for external validation.
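For readers unfamiliar with the regression step, the sketch below shows a textbook NIPALS-style PLS fit for a single response (PLS1) in plain Python. It is not the PLS_Toolbox implementation used in the study; the function names `pls1_fit` and `pls1_predict` are hypothetical, and mean centering is done inside the fit for convenience.

```python
from math import sqrt

def pls1_fit(X, y, n_factors):
    """Fit a PLS1 model on mean-centred copies of X (rows = samples) and y."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    factors = []
    for _ in range(n_factors):
        # Weight vector: direction in X of maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next factor.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors

def pls1_predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, factors = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in factors:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat
```

In this sketch a model would be fitted on the calibration spectra only and then applied unchanged to the validation spectra, mirroring the 75%/25% division described above.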
To obtain the best PLS models, the following spectral pretreatments were tested to reduce unwanted variation due to sources unrelated to the properties of interest: multiplicative scatter correction (MSC), standard normal variate (SNV), and Savitzky-Golay (SG) first- and second-order derivatives with second-order polynomial approximation and different window sizes. The pretreated spectra and the values of the sensory attributes were mean-centered before being submitted to the regression algorithm. To increase the predictive accuracy of the models, the results using both the full spectrum and specific spectral regions were compared. To select variables, interval PLS (iPLS) was used [26], configuring the iPLS algorithm in stepwise forward mode, with an interval size of 1 variable and using between 10 and 50 intervals.
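Two of the scatter-correction pretreatments named above can be sketched in a few lines of Python. This is a minimal illustration rather than the PLS_Toolbox routines: the function names are hypothetical, SNV is assumed to use the sample standard deviation (n − 1 denominator), and MSC is assumed to use the mean spectrum of the set as the reference.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum on its own."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    if reference is None:
        reference = [mean(col) for col in zip(*spectra)]  # mean spectrum
    ref_mean = mean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s = intercept + slope * reference.
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected
```

Both corrections remove additive and multiplicative offsets between spectra; SNV does so spectrum by spectrum, whereas MSC depends on the reference, so the reference computed on the calibration set would normally be reused for validation samples.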
PLS regression models were evaluated using venetian blind cross-validation with 10 data splits. Combinations of data pretreatments and different numbers of factors were tried out with the aim of constructing a model with a good compromise among a low root mean square error of calibration (RMSEC), low root mean square error of cross-validation (RMSECV), high coefficient of determination (R²), and low bias. The optimal PLS models were finally tested with the external validation set (25% of the original samples) that had not been used for calibration. To estimate the performance of the calibration model, the root mean square error of prediction (RMSEP) evaluated with these samples was used. Also, the model's predictive ability was assessed with the ratio of performance to deviation (RPD) and the relative ability of prediction (RAP), calculated as follows:
$$\mathrm{RPD} = \frac{SD_x}{\mathrm{RMSEP}}$$
$$\mathrm{RAP} = \frac{SD_x^2 - \mathrm{RMSEP}^2}{SD_x^2 - S_{\mathrm{ref}}^2}$$
where $SD_x$ is the standard deviation of the validation reference data and $S_{\mathrm{ref}}$ is the standard error of the reference method, which indicates the uncertainty of the analysis due to the panelists. The RPD is a dimensionless index widely used to evaluate NIRS models in agricultural products [27]. The RAP takes into account both the error of NIRS prediction and the uncertainty of the panelists' evaluations; it has a value between 0 and 1 [28].
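The validation scheme and figures of merit described above can be sketched as follows. The function names are hypothetical, and two points are assumptions rather than details given in the text: the venetian blind pattern assigns every tenth sample to the same split (one sample per blind), and the RAP is computed in its usual form, (SDx² − RMSEP²) / (SDx² − Sref²).

```python
from math import sqrt
from statistics import mean, stdev

def venetian_blinds(n_samples, n_splits=10):
    """Test indices per split: sample i is left out in split i % n_splits."""
    return [[i for i in range(n_samples) if i % n_splits == s]
            for s in range(n_splits)]

def rmse(reference, predicted):
    """Root mean square error (RMSEC, RMSECV, or RMSEP depending on the set)."""
    return sqrt(mean([(p - r) ** 2 for r, p in zip(reference, predicted)]))

def bias(reference, predicted):
    """Mean difference between predicted and reference values."""
    return mean([p - r for r, p in zip(reference, predicted)])

def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of reference data over RMSEP."""
    return stdev(reference) / rmse(reference, predicted)

def rap(reference, predicted, s_ref):
    """Relative ability of prediction, accounting for reference-method error."""
    var_x = stdev(reference) ** 2
    return (var_x - rmse(reference, predicted) ** 2) / (var_x - s_ref ** 2)
```

For example, panel scores of 2, 4, 6, and 8 predicted as 3, 3, 7, and 7 give an RMSEP of 1 and an RPD of about 2.58; with a reference standard error of 0.5, the RAP is about 0.88.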

Sensory Analysis
In the ANOVA, both the sample and panelist factors, but not the interaction between them, were significant (p < 0.05) for the three sensory attributes. The significance of the panelist factor indicates that panelists were using the scales differently in their evaluations; this finding is common in descriptive sensory analyses, and it is related to slight differences in the reference values that panelists learn [29]. However, the lack of a significant interaction between the panelist and sample factors indicates that the panel adequately discriminated between phenotypic differences. Table 1 shows the means, standard deviations, and ranges of the sensory attributes scored by the panel and divided into calibration and validation sets. To develop robust calibration models, it is critical to obtain a wide range of values for each attribute to be correlated with the NIR measurements. The values of the attribute sweetness were widely dispersed, reflecting the variability commonly found in 'calçots'. The values of the attributes fiber perception and off-flavors were mostly at the lower end of the scale, resulting in a narrower range, especially for the attribute off-flavors; however, these findings were expected because 'calçots' from BTL varieties usually have low values of these attributes in comparison with 'calçots' from other onion varieties [3].

NIRS to Estimate Sensory Attributes
PCA of SNV-pretreated spectra showed no clustering due to season or origin in the score plots (results not shown). The first two principal components explained 92.09% and 89.70% of the variation for puree and ground dried puree samples, respectively. No outliers were detected. Figure 1 shows the raw and pretreated spectra measured from puree and ground dried puree samples. As stated in a previous study [24], it is difficult to assign specific absorption bands to specific functional groups because of the complex composition of vegetables. The main difference between the spectra from puree and ground dried samples is that the spectra from puree are strongly influenced by water bands, with two characteristic absorption peaks around 1450 nm (stretch of the O-H bonds, first overtone) and 1940 nm (stretch of the O-H bonds and O-H deformation). Water is a major constituent of 'calçots'; its content ranged from 79.9% to 87.3% in the samples registered. Since the high water content in the samples could limit the use of NIRS because of the strong absorption bands that predominate in the spectrum, we considered the alternative of using the spectra from ground dried puree samples.
First, PLS regression models to estimate the sensory attributes were developed using the entire spectral range, separately for spectra from puree and from ground dried puree samples. The optimal number of PLS factors was established as the number beyond which adding further factors did not significantly reduce the RMSECV. Nevertheless, to prevent overfitting, the upper limit of optimal PLS factors was set at one PLS factor per ten calibration samples, plus two [30]. The performance of the models varied for each sensory attribute and for the two sample preparations. In general, using the entire spectral range, the best prediction was found for the attribute sweetness using spectra from puree (R²_pred = 0.66 and RMSEP = 0.78). By contrast, PLS models yielded poor results for the attribute off-flavors, both using puree spectra (R²_pred = 0.31 and RMSEP = 0.93) and ground dried puree spectra (R²_pred = 0.27 and RMSEP = 0.96), and for the attribute fiber perception using ground dried puree spectra (R²_pred = 0.26 and RMSEP = 0.87) (Table 2).
Abbreviations used in Tables 2 and 3: LVs: number of latent variables; R²_cal: coefficient of determination of calibration; RMSEC: root mean square error of calibration; R²_CV: coefficient of determination of cross-validation; RMSECV: root mean square error of cross-validation; R²_pred: coefficient of determination of prediction; RMSEP: root mean square error of prediction; GD puree: ground dried puree; m.c.: mean centering; SNV: standard normal variate; SG-1D: Savitzky-Golay first-order derivative; SG-2D: Savitzky-Golay second-order derivative; between parentheses: window size.
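The rule used to choose the number of PLS factors can be illustrated with a short Python sketch. The cap of one factor per ten calibration samples plus two follows the text; everything else is an assumption made for illustration: the integer (floor) division, the function names, and in particular the 2% relative-improvement threshold standing in for the unspecified test of a "significant" RMSECV reduction.

```python
def max_pls_factors(n_calibration):
    """Upper limit: one factor per ten calibration samples, plus two."""
    return n_calibration // 10 + 2  # floor division is an assumption

def choose_n_factors(rmsecv, n_calibration, min_gain=0.02):
    """Pick the number of factors from a list where rmsecv[k] belongs to
    the model with k + 1 factors.

    A further factor is accepted only while it lowers RMSECV by more than
    `min_gain` (relative); this threshold is hypothetical.
    """
    limit = min(len(rmsecv), max_pls_factors(n_calibration))
    chosen = 1
    for k in range(1, limit):
        if rmsecv[chosen - 1] - rmsecv[k] > min_gain * rmsecv[chosen - 1]:
            chosen = k + 1
        else:
            break
    return chosen
```

With 64 calibration samples (an illustrative count close to 75% of the 85 samples), the cap is 8 factors; an RMSECV sequence of 1.0, 0.8, 0.7, 0.695, 0.69 would lead this sketch to keep 3 factors.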
To improve the models developed using the entire spectral range, iPLS variable selection was used (Figure 2, Table 3). Once again, sweetness was the best-predicted attribute, in both puree (R²_pred = 0.66 and RMSEP = 0.76) and ground dried puree samples (R²_pred = 0.72 and RMSEP = 0.73). In general, iPLS improved the prediction of all the attributes, especially sweetness (ground dried puree) and off-flavors (puree) (Tables 2 and 3).
Figure 3 plots NIRS-predicted values for puree and ground dried puree versus reference values of sensory attributes for the models developed using iPLS variable selection, which generally yielded better predictions than the models calculated from the entire spectra. In general, for the sensory attributes, PLS models developed from puree spectra yielded better predictions than those developed from ground dried puree, with the exception of sweetness, where scant differences were found between the two (Table 3). As stated before, we used spectra from ground dried puree samples to develop the regression models as an alternative, considering the high water content of puree samples. In a previous study, we found better prediction models using ground dried puree spectra for some chemical parameters [24]. In this case, the better performance of the models developed from puree can be explained by the fact that panelists evaluated the samples as a puree, and the process of drying and grinding probably changed the properties of the samples.
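The stepwise forward selection behind the iPLS models discussed above can be sketched generically in Python. This is a simplified greedy search, not the PLS_Toolbox iPLS routine: the function names are hypothetical, and `cv_error` stands for any user-supplied function that returns a cross-validation error (for example RMSECV of a PLS model) for a given list of spectral columns.

```python
def make_intervals(n_variables, interval_size):
    """Split column indices 0..n_variables-1 into consecutive intervals."""
    return [list(range(start, min(start + interval_size, n_variables)))
            for start in range(0, n_variables, interval_size)]

def forward_interval_selection(intervals, cv_error, max_intervals):
    """Greedily add the interval that most lowers the cross-validation error.

    Stops when no remaining interval improves the error or when
    `max_intervals` intervals have been selected.
    """
    selected, best_error = [], float("inf")
    remaining = list(range(len(intervals)))
    while remaining and len(selected) < max_intervals:
        trial_errors = {}
        for k in remaining:
            columns = sorted(c for i in selected + [k] for c in intervals[i])
            trial_errors[k] = cv_error(columns)
        best_k = min(trial_errors, key=trial_errors.get)
        if trial_errors[best_k] >= best_error:
            break  # no further improvement
        selected.append(best_k)
        best_error = trial_errors[best_k]
        remaining.remove(best_k)
    return selected, best_error
```

Under the configuration reported in the methods, the interval size would be one variable and `max_intervals` would lie between 10 and 50.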
The dimensionless parameter RPD is commonly used to evaluate the predictive ability of NIRS, and the reliability of the model is commonly classified into three quality categories: excellent (RPD > 2), fair (1.4 < RPD < 2), or poor (RPD < 1.4) [27,31]. According to these thresholds, the models developed could be considered useful for predicting sweetness from NIRS on puree and ground dried puree and for predicting fiber perception and off-flavors from NIRS on puree (Table 3). However, these thresholds are not based on statistical analyses, and some researchers have used much higher thresholds [27]. On the other hand, RPD values for sensory attributes are usually lower than those for chemical or physical properties. For example, in the application of NIRS to estimate sensory attributes in common beans, the best models developed presented RPD values between 1.19 and 1.90 [20]. In different meats, RPD values were also lower than 1.5 in almost all cases [32]. Higher values of RPD (i.e., RPD > 2) were reported for some properties in sensory evaluation using NIRS in cheese [33,34], wine [15], and chicory hybrids [19]. However, in some cases, RPD values were calculated as the ratio of the SD to RMSECV, rather than by external validation of the models.
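The three RPD categories quoted above translate directly into a small helper, sketched here in Python. The function name is hypothetical, and because the cited thresholds leave the exact boundary values 1.4 and 2 unassigned, placing both in the "fair" class is an assumption.

```python
def classify_rpd(rpd):
    """Map an RPD value onto the three commonly used quality categories."""
    if rpd > 2.0:
        return "excellent"
    if rpd >= 1.4:  # boundary handling is an assumption
        return "fair"
    return "poor"
```

Applied to the range reported for common beans, an RPD of 1.19 would be classed as poor and an RPD of 1.90 as fair.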
It is also important to remember that the use of RPD assumes that the errors of the reference method are negligible, which is not the case with sensory analysis [15]. For this reason, previous studies used the parameter RAP to relate the predictive ability of NIRS to the precision of the panelists' evaluation [20]. The models developed using iPLS variable selection for sweetness showed values of RAP greater than 0.70 (Table 3), suggesting these models are reliable in predicting this sensory attribute. The best models developed to estimate fiber perception and off-flavors had RAP values greater than 0.5, which are comparable to RAP values reported for other products, such as beans [20], peas [35], or rice [36].
On balance, these results demonstrate the potential of NIRS in the evaluation of complex sensory properties in cooked 'calçots', with a pretreatment as simple as pureeing the samples, which makes it possible to homogenize many specimens representative of a stock, thus enabling a good average evaluation with a limited number of spectral measurements. Although models developed to estimate sensory attributes are less accurate than those developed to estimate chemical properties [24], NIRS used together with PLS regression promises to be useful for evaluating the sensory attributes of cooked 'calçots' in plant breeding or quality control.

Conclusions
Quality control and plant breeding programs need to analyze large numbers of samples, and rapid, inexpensive phenotyping methods are needed to enable the analysis of sensory attributes. Our results show that it is feasible to use NIRS to estimate the most important sensory attributes in cooked 'calçots': sweetness, fiber perception, and off-flavors. The best approach to predict these attributes was to use iPLS variable selection to develop predictive models from spectra of pureed cooked 'calçots', which yielded RPD values greater than 1.4 in all cases.
Although NIRS models are less robust for sensory attributes than for other properties such as chemical composition, they can be used in the initial screening of samples of cooked 'calçots', allowing the more time-consuming and costly panel sensory analysis to be reserved for only when more accuracy is needed. In the same way, NIRS can help detect materials that would clearly fail to meet the standards of the PGI label, facilitating quality control and helping ensure customer loyalty. In summary, NIRS promises to be a key tool to enable the analysis of sensory properties in 'calçots'.