Authentication of Sorrento Walnuts by NIR Spectroscopy Coupled with Different Chemometric Classification Strategies

Abstract: Walnuts have been widely investigated because of their chemical composition, which is particularly rich in unsaturated fatty acids, responsible for different benefits in the human body. Some of these fruits, depending on the harvesting area, are considered a high value-added food, thus commanding a higher selling price. In Italy, walnuts are harvested throughout the national territory, but the fruits produced in the Sorrento area (South Italy) are commercially valuable for their peculiar organoleptic characteristics. The aim of the present study is to develop a non-destructive and shelf-life compatible method capable of discriminating common walnuts from those harvested in Sorrento (a town in Southern Italy), which are considered a high-quality product. Two-hundred-and-thirty-seven walnuts (105 from Sorrento and 132 grown in other areas) were analyzed by near-infrared spectroscopy (both whole and shelled) and classified by Partial Least Squares-Discriminant Analysis (PLS-DA). Eventually, two multi-block approaches were exploited in order to combine the spectral information collected on the shell and on the kernel. One of these latter strategies provided the best results (98.3% correct classification rate in external validation, corresponding to 1 misclassified object out of 60). The present study suggests that the proposed strategy is a suitable solution for the discrimination of Sorrento walnuts.

Author Contributions: [...] and F.M.; software, A.B. and F.M.; validation, A.B. and F.M.; formal analysis, L.A. and P.F.; investigation, A.B. and L.A.; resources, R.B. and F.M.; data curation, R.B. and L.A.; writing - original draft preparation, L.A. and A.B.; writing - review and editing, A.B. and F.M.; visualization, A.B. and F.M.; supervision, A.B. and F.M.; project administration, F.M. and R.B.; funding acquisition, R.B. and F.M. All authors have read and agreed to the published version of the manuscript.


Introduction
Walnut is the fruit of the Juglans regia L. tree. It is an economically interesting arboreal species, appreciated for its wood and edible fruits, which grows in temperate climate areas. Its seed is an important source of phospholipids, tocopherol, proteins, and mono- and polyunsaturated fatty acids. It is also a notable source of some microelements, such as iron, copper, selenium, and zinc. In addition, as it has been observed that their consumption can reduce the incidence of coronary diseases, walnuts have been studied in depth in the last few years [1]. In Italy, walnuts are harvested throughout the national territory; nevertheless, the fruits produced in the Sorrento area (South Italy) are commercially valuable for their peculiar organoleptic characteristics, confirmed by genetic criteria established by Foroni et al. [2]. For this reason, it is necessary to characterize Sorrento walnuts in order to discriminate them from common fruits (having lower market value), preventing possible [...]

For each sample, NIR spectra were collected on the nutshell (two replicates, one per side) and on the kernel (two replicates, one per side), for a total of 4 spectra per walnut. This procedure led to a total of 948 (237 × 4) NIR spectra. These measurements were organized into two data matrices: the first one (X1, of dimensions 237 × 3112) contains all the spectra collected on the shell (averaged over the two replicates), and the second one (X2, of dimensions 237 × 3112) contains the spectra collected on the kernel (averaged over the two replicates). NIR spectra were collected by the OMNIC software (Thermo Scientific Inc., Madison, WI) and imported into MATLAB 2015b (The Mathworks, Natick, MA) for calculations. Prior to the creation of classification models, samples were divided into training and test sets by using the Duplex algorithm [36] to allow external validation of the models.
In order to divide samples into subsets taking into account the variability of both X1 and X2, the procedure described in [37] was applied. Eventually, the training set included 177 samples (77 belonging to the "Sorrento" class and 100 to the "non-Sorrento" class), while the test set comprised 60 objects (28 Sorrento walnuts and 32 non-Sorrento); the same division was used for both X1 and X2.
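As a rough illustration of how a Duplex-style split works, the following minimal Python sketch alternately assigns to the training and test sets the sample lying farthest from the samples already in that set. It is only a sketch under stated assumptions: the function name `duplex_split` and the `n_test` argument are hypothetical, Euclidean distance on a single data matrix is assumed, and the two-block adaptation described in [37] is not reproduced.

```python
import math


def duplex_split(X, n_test):
    """Return (train, test) row indices of X using a Duplex-style selection."""
    n = len(X)
    # Pairwise Euclidean distances between all samples (rows of X).
    dist = [[math.dist(a, b) for b in X] for a in X]
    remaining = set(range(n))

    def farthest_pair():
        # The two unassigned samples that are farthest apart.
        pairs = [(i, j) for i in remaining for j in remaining if i < j]
        return max(pairs, key=lambda p: dist[p[0]][p[1]])

    def farthest_from(subset):
        # Unassigned sample whose nearest neighbour in `subset` is farthest away.
        return max(remaining, key=lambda i: min(dist[i][s] for s in subset))

    train, test = [], []
    # Seed each set with the two most distant unassigned samples.
    for subset in (train, test):
        i, j = farthest_pair()
        subset.extend([i, j])
        remaining.difference_update((i, j))

    # Alternate between the sets; once the test set is full,
    # every leftover sample goes to the training set.
    while remaining:
        for subset in (train, test):
            if not remaining:
                break
            if subset is test and len(test) >= n_test:
                continue
            pick = farthest_from(subset)
            subset.append(pick)
            remaining.discard(pick)
    return sorted(train), sorted(test)
```

With the sample counts reported above, such a function would be called with `n_test=60`, leaving the remaining 177 samples for training.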

Partial Least Squares Discriminant Analysis (PLS-DA)
Partial Least Squares Discriminant Analysis (PLS-DA) [38,39] is a widely applied tool in the context of discriminant classification. One of its major benefits is that it allows handling ill-conditioned data matrices (a condition often encountered when working with spectral data) [40]. This approach, despite being a classifier, starts from the resolution of a regression problem defined between a predictor data matrix X and a dummy response y [41]. The dummy y has a key role in the application of the PLS-DA algorithm; in fact, it encodes class information through binary coding, and it allows solving the classification problem by estimating the regression equation represented by Equation (1):

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e} \qquad (1)$$

The solution is achieved by Partial Least Squares (PLS) [42,43]. In Equation (1), b represents the regression coefficients and e the residuals. When the investigated problem involves only two classes (as in the present work), y is a binary vector whose elements indicate whether the corresponding sample belongs to one class (y = 1) or to the other (y = 0). For example, in a two-category case with six samples equally distributed between the two classes (the first three objects belonging to the first class and the remaining ones to the second category), the dummy response would be y = [1 1 1 0 0 0]ᵀ. Once the calibration model is built on the training samples (i.e., a set of objects whose class membership is known), it is possible to apply it to unknown samples (Xnew) and estimate the predicted response ŷnew. Nevertheless, ŷnew will be made of real numbers and, consequently, the class membership cannot be directly deduced. Different classification rules have been proposed to address this issue (see, e.g., [44–47]); in the present work, the solution suggested by Indahl and collaborators, i.e., to apply linear discriminant analysis on the predicted response, has been adopted [46]. In the present work, PLS-DA has been (separately) applied to the spectra collected on the shell (X1) and on the kernel (X2).
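The dummy coding and the final classification rule can be illustrated with a short Python sketch. It does not reproduce the PLS step or the authors' MATLAB code: the predicted responses are assumed to be already available, the helper names (`dummy_response`, `lda_threshold`, `classify`) are hypothetical, and the threshold is the boundary of a one-dimensional LDA with pooled variance and priors estimated from the training class sizes.

```python
import math


def dummy_response(labels, positive):
    """Binary coding: 1 for the class of interest, 0 otherwise."""
    return [1 if lab == positive else 0 for lab in labels]


def lda_threshold(y_hat, y):
    """Decision boundary of a 1-D LDA fitted on the predicted responses.

    Assumes the mean prediction of class 1 is larger than that of class 0.
    """
    g1 = [v for v, c in zip(y_hat, y) if c == 1]
    g0 = [v for v, c in zip(y_hat, y) if c == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    # Pooled within-class variance (n - 2 degrees of freedom).
    ss = sum((v - m1) ** 2 for v in g1) + sum((v - m0) ** 2 for v in g0)
    pooled_var = ss / (len(y_hat) - 2)
    # Class priors estimated from the training class sizes.
    p1, p0 = len(g1) / len(y_hat), len(g0) / len(y_hat)
    return (m1 + m0) / 2 + pooled_var * math.log(p0 / p1) / (m1 - m0)


def classify(y_hat_new, threshold):
    """Assign class 1 to predictions above the threshold, class 0 otherwise."""
    return [1 if v > threshold else 0 for v in y_hat_new]
```

For the six-sample example above, `dummy_response(["S", "S", "S", "N", "N", "N"], "S")` gives the vector [1, 1, 1, 0, 0, 0]; with balanced classes the threshold reduces to the midpoint between the two class means of the predicted response.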

Sequential and Orthogonalized Partial Least Squares Linear Discriminant Analysis (SO-PLS-LDA)
Sequential and Orthogonalized-PLS (SO-PLS) is a multi-block regression approach developed to handle several data matrices while removing the redundant information possibly present among them [48]. For two predictor blocks (X1 and X2) and a response y, the algorithm can be summarized in four steps:

1. X1 is used to estimate y by PLS. Scores T_X1 and y-residuals e are calculated.
2. X2 is orthogonalized with respect to T_X1, obtaining X2,orth.
3. X2,orth is used to estimate e by PLS.
4. The equation y = X1 b + X2 c + f is solved (b and c being the regression coefficients and f the residuals).
Sequential and Orthogonalized Partial Least Squares Linear Discriminant Analysis relies on SO-PLS, exploiting it for feature reduction. In fact, in order to apply SO-PLS-LDA [30], it is sufficient to create the SO-PLS model and then apply LDA on the predicted y or on the concatenated scores. For more details on SO-PLS-LDA, the reader is referred to the works in [30,49]. Calculations were made using in-house functions written in MATLAB, which are freely downloadable at [50].
In the present work, SO-PLS-LDA has been used to distinguish Sorrento and Non-Sorrento walnuts, simultaneously handling data collected on the shell (X 1 ) and on the kernel (X 2 ).
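The orthogonalization in step 2 of the SO-PLS algorithm is usually written as a least-squares projection. The expressions below are a sketch in that standard form, not a transcription from the source: the symbol ŷ₁ (the fit of y obtained from X1 in step 1) is introduced here only for illustration.

```latex
\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}}_{1},
\qquad
\mathbf{X}_{2,\mathrm{orth}}
  = \left(\mathbf{I} - \mathbf{T}_{X_1}
      \left(\mathbf{T}_{X_1}^{T}\mathbf{T}_{X_1}\right)^{-1}
      \mathbf{T}_{X_1}^{T}\right)\mathbf{X}_{2}
```

By construction, the columns of X2,orth are orthogonal to the scores of the first block, so the second PLS model can only describe information that X1 has not already captured.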

Sequential and Orthogonalized Covariance Selection Linear Discriminant Analysis (SO-CovSel-LDA)
Sequential and Orthogonalized Covariance Selection Linear Discriminant Analysis [31] is a multi-block classifier based on the combination of the regression approach called SO-CovSel and LDA. The SO-CovSel algorithm is similar to the one for SO-PLS; the main difference is that the feature reduction operated by PLS (in SO-PLS) is replaced by the variable selection achieved by CovSel [51]. Briefly, considering the two predictor blocks X1 and X2 and the response y, the SO-CovSel algorithm can be summarized as follows.

1. Variables in X1 are selected by CovSel, obtaining the reduced matrix X1,sel.
2. X1,sel is used to predict y by ordinary least squares.
3. X2 is orthogonalized with respect to X1,sel, obtaining X2,orth.
4. Variables in X2,orth are selected by CovSel, obtaining the reduced matrix X2,orth,sel.
5. X2,orth,sel is used to estimate the residuals from step 2.
6. The equation y = X1 b + X2 c + f is solved (b and c being the regression coefficients and f the residuals).
Eventually, if the aim is to create a classification model, LDA can be applied to the y predicted at step 6. For more details on SO-CovSel-LDA, the reader is referred to the work in [31]. Calculations were made using in-house functions written in MATLAB, which are freely downloadable at [52].
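The variable selection performed in steps 1 and 4 can be sketched in a few lines of Python. This is only an illustration of a CovSel-style rule for a single response, assuming the usual formulation (pick the variable with the largest squared covariance with the response, then deflate both the predictors and the response with respect to it); the function name `covsel` and its helpers are hypothetical and are not the authors' MATLAB implementation.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def _deflate(v, z, zz):
    """Remove from v its projection onto z (zz is the squared norm of z)."""
    coef = _dot(z, v) / zz
    return [a - coef * b for a, b in zip(v, z)]


def covsel(X, y, n_select):
    """Return the indices of n_select columns of X chosen by a CovSel-style rule."""
    p = len(X[0])
    # Mean-centred copies of the predictors (stored column-wise) and of the response.
    cols = [_center([row[j] for row in X]) for j in range(p)]
    resid = _center(y)
    selected = []
    for _ in range(n_select):
        candidates = [j for j in range(p) if j not in selected]
        # Pick the variable with the largest squared covariance with the response.
        best = max(candidates, key=lambda j: _dot(cols[j], resid) ** 2)
        selected.append(best)
        # Deflate predictors and response with respect to the selected variable.
        z = cols[best]
        zz = _dot(z, z)
        cols = [_deflate(c, z, zz) for c in cols]
        resid = _deflate(resid, z, zz)
    return selected
```

Because of the deflation, a variable that is strongly correlated with one already selected carries little residual covariance and is unlikely to be picked again, which is what makes the selected subset compact.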

Results
After the division into training and test sets described in Section 2.1, data collected on the shell (X1) and on the kernel (X2) were analyzed by PLS-DA. The outcomes of these analyses are reported in Sections 3.1 and 3.2, respectively. In both cases, different spectral pretreatments were tested on the spectra in order to remove spurious information possibly present. The tested preprocessing approaches are 1st and 2nd derivatives calculated according to the Savitzky-Golay approach (19-point window, and second- or third-order interpolating polynomial, respectively) [53], Standard Normal Variate (SNV) [54], and their combinations. These pretreatments were chosen because derivatives are expected to remove both additive and multiplicative effects from spectra, whereas SNV was conceived to attenuate the artifacts caused by scattering. Moreover, the width of the interpolation window was selected on the basis of our previous experience with similar NIR data, as the one providing the best compromise between noise reduction and excessive (artifact) smoothing.
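For readers unfamiliar with these pretreatments, the Python sketch below shows SNV and a Savitzky-Golay first derivative on a single spectrum stored as a list of absorbances. It is a simplified illustration, not the authors' code: the function names are hypothetical, the derivative is expressed per data point (unit spacing assumed), and the half-window of points at each edge is simply dropped. For a second-order polynomial, the centre-point first-derivative weights reduce to k / Σk², which is what the sketch uses.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: centre and scale each spectrum individually."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mean) / std for v in spectrum]


def savgol_first_derivative(spectrum, window=19):
    """Savitzky-Golay first derivative (second-order polynomial, unit spacing).

    Only points with a full window around them are returned, so the output
    is shorter than the input by window - 1 points.
    """
    half = window // 2
    offsets = range(-half, half + 1)
    norm = sum(k * k for k in offsets)  # 570 for a 19-point window
    return [
        sum(k * spectrum[c + k] for k in offsets) / norm
        for c in range(half, len(spectrum) - half)
    ]
```

Applied to a perfectly linear signal, the derivative function returns its constant slope, which is a quick sanity check of the weights.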
Eventually, a multi-block strategy has been exploited for the joint analysis of both sets of spectra. The results of this latter analysis are reported in Section 3.3. In all the classification models described in these three sections, the optimal data pretreatment and model parameters were selected as the ones leading to the lowest classification error in a 7-fold cross-validation procedure on the training samples. Regardless of the pretreatment used, blocks were always mean-centered prior to the creation of any model.
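The selection procedure can be sketched as follows. The source does not state how samples were assigned to the seven folds, so the interleaved ("venetian blinds") assignment below is an assumption, as are the function names; `select_best` simply returns the option with the lowest cross-validated error.

```python
def interleaved_folds(n_samples, n_folds=7):
    """Assign sample indices to folds in an interleaved fashion (assumed scheme)."""
    return [list(range(f, n_samples, n_folds)) for f in range(n_folds)]


def cv_splits(n_samples, n_folds=7):
    """Yield (training indices, held-out indices) for each fold."""
    for held_out in interleaved_folds(n_samples, n_folds):
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        yield train, held_out


def select_best(cv_errors):
    """Return the option (e.g., pretreatment name) with the lowest CV error."""
    return min(cv_errors, key=cv_errors.get)
```

With the 177 training samples used in this work, each fold would hold out 25 or 26 samples, and the model is refitted on the remaining ones before the classification errors are averaged.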

PLS-DA Analysis of NIR Spectra Collected on the Shell
As mentioned, NIR spectra were collected on the whole nuts (i.e., on the nutshell); the average spectra for samples belonging to Class Sorrento (red line) and Class Non-Sorrento (blue line) are reported in Figure 1. From the plot, it is clear that the two spectra are very similar.

Different pretreatments were tested on the data. The average classification error in cross-validation (reported in Table 2) was used to select the optimal preprocessing.

Table 2. Partial Least Squares-Discriminant Analysis (PLS-DA) modeling of the spectra collected on the nutshell: results of cross-validation (LVs: latent variables).

| Pre-Treatment | LVs | Average Classification Error (%CV) |
|---|---|---|
| Mean Centering (MC) | 9 | 2.8 |
| 1st derivative (+ MC) | 11 | 1.6 |
| 2nd derivative (+ MC) | 9 | 2.9 |
| SNV (+ MC) | 12 | 3.4 |
| SNV + 1st derivative (+ MC) | 11 | 2.8 |
| SNV + 2nd derivative (+ MC) | 9 | 2.9 |

The model providing the lowest classification error was the one built on data preprocessed by 1st derivative. The application of this model to the test set led to 92.9% sensitivity and 96.9% specificity for Class Sorrento; naturally, due to the symmetry of the classification results for a two-class problem, these values are reversed in the case of Class Non-Sorrento, for which sensitivity was 96.9% and specificity 92.9%. Altogether, two Sorrento and one Non-Sorrento test objects were misclassified. A graphical representation of this outcome is reported in Figure 2, where the predicted y is displayed as a function of the sample index. In the figure, training objects are represented by empty symbols, while the test ones are displayed as filled symbols. The black dashed line in the plot is the threshold: samples falling above it are assigned by the model to Class Sorrento, whereas those below the line are predicted as belonging to Class Non-Sorrento. From the representation, it is easy to spot the three misclassified test samples: one object belonging to Class Non-Sorrento (blue square) and two samples belonging to Class Sorrento (red diamonds).

Eventually, in order to understand which spectral variables contribute the most to the discrimination between the two categories, Variable Importance in Projection (VIP) [55] analysis was performed. Applying this approach, it is possible to obtain a VIP index for each predictor (i.e., spectral variable), reflecting its contribution to the discriminant model. Customarily, a variable presenting a VIP index higher than 1 is counted as relevant. When handling spectral data, the outcomes of VIP analysis can be straightforwardly inspected through a graphical representation such as the one reported in Figure 3.

In the plot, variables presenting a VIP index higher than 1 are highlighted in red over the mean spectrum of the samples. From the figure, it can be noticed that the area between 4000 and 4200 cm⁻¹ is selected. These variables correspond to the combination bands of C-H bonds and are probably due to the presence of fatty acids in walnuts. The spectral features constituting the peak at 5199 cm⁻¹ (approximately from 4840 cm⁻¹ to 5363 cm⁻¹) also present a VIP index > 1. These variables are linked to the CC and CH combination modes of unsaturated fatty acids [54]. A VIP index higher than 1 is also shown by variables in the spectral range 7000 to 7200 cm⁻¹; this area can be associated with the presence of carbohydrates, or with the absorption of non-bonded O-H groups in fatty acids [56]. Eventually, some variables between 8800 cm⁻¹ and 8900 cm⁻¹ are selected by VIP analysis. In this range, the absorptions of the second overtone of the C-H bonds and the combination bands of the O-H bond take place [57].
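For reference, the VIP index of a PLS model is commonly written as below. This is the standard textbook form, given here as a sketch rather than as the exact expression used in [55]; the symbols (p spectral variables, A latent variables, w_ja the weight of variable j on component a, SS_a the sum of squares of y explained by component a) are introduced only for this illustration.

```latex
\mathrm{VIP}_j =
\sqrt{\frac{p \sum_{a=1}^{A} \mathrm{SS}_a
      \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
     {\sum_{a=1}^{A} \mathrm{SS}_a}}
```

With this definition, the squared VIP values average to 1 over the p variables, which is why a VIP index greater than 1 is customarily read as an above-average contribution.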

PLS-DA Analysis of NIR Spectra Collected on the Kernel
As discussed before, after measuring the nutshell, each walnut was opened and NIR spectra were collected on the kernels. Mean spectra for samples belonging to Class Sorrento (red line) and Class Non-Sorrento (blue line) are reported in Figure 4. Also in this case, it is not possible to appreciate a significant difference between the spectra collected on samples belonging to the two categories.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 7 of 14
PLS-DA analysis was carried out following the same procedure used for the spectra collected on the nutshell. Consequently, different pretreatments were tested, and the optimal pretreatment and model complexity for the final calibration model were selected on the basis of the smallest average classification error in cross-validation. The results from this part of the analysis are summarized in Table 3.

Table 3. PLS-DA modeling of the spectra collected on the kernel: results of cross-validation (LVs: latent variables).

Pre-Treatment LVs Average Classification Error (%CV)

The PLS-DA model providing the lowest average classification error is the one built on data preprocessed by 1st derivative. Consequently, this pretreatment was considered the most suitable for the investigated data. When the calibration model was used to predict test samples, it correctly classified all Class Sorrento objects (i.e., 100% sensitivity) with a specificity of 93.8%, as it misclassified two (out of 32) Non-Sorrento samples; due to the symmetry of the classification results for a two-class problem, this corresponds to a sensitivity of 93.8% and a specificity of 100% for Class Non-Sorrento.

The predicted y is displayed as a function of the sample index in Figure 5. As before, samples associated with a predicted y higher than the threshold are assigned to Class Sorrento (otherwise, they are predicted as Class Non-Sorrento). From the figure, it is clear that only two samples are misclassified: two Non-Sorrento samples (blue squares) predicted as belonging to Class Sorrento (red diamonds).

VIP analysis was also run on this model. The variables identified as important were in agreement with the ones discussed in Section 3.1; consequently, they will not be discussed again here, but they are shown in Figure A1 in Appendix A.
Comparing the predictions provided by the PLS-DA models built on spectra collected on the nutshell and on the kernel, it can be observed that one of the misclassified samples (belonging to Class Non-Sorrento) was wrongly predicted by both models. This is not completely surprising because, as detailed in Table 1, some Non-Sorrento walnuts are Italian, so they could have been harvested in an area near Sorrento or in a town presenting similar pedoclimatic conditions.
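The sensitivity and specificity values reported for the two PLS-DA models follow directly from the test-set counts (28 Sorrento and 32 Non-Sorrento walnuts). The short Python sketch below reproduces them, taking Class Sorrento as the positive class; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) in percent from confusion-matrix counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Nutshell model: 2 Sorrento and 1 Non-Sorrento test samples misclassified.
shell = sensitivity_specificity(tp=26, fn=2, tn=31, fp=1)
# Kernel model: all Sorrento correct, 2 Non-Sorrento test samples misclassified.
kernel = sensitivity_specificity(tp=28, fn=0, tn=30, fp=2)

for name, (sens, spec) in (("shell", shell), ("kernel", kernel)):
    print(f"{name}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

Swapping the roles of the two classes exchanges the two figures, which is the symmetry mentioned in the text for a two-class problem.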

SO-PLS-LDA Analysis
The sequential data fusion model was built using data preprocessed by the optimal pretreatment selected in the individual analyses: first derivative. In building the SO-PLS-LDA models, the optimal number of latent variables, six for X1 and seven for X2, was selected by cross-validation. The corresponding optimal classification model provided a sensitivity of 98.7% and a specificity of 98.0% for Class Sorrento, and vice versa for Class Non-Sorrento (i.e., 98.0% sensitivity and 98.7% specificity). When this SO-PLS-LDA model was used to predict validation samples, the classification rates were extremely satisfying. In fact, it correctly classified all test samples except one belonging to Class Non-Sorrento. In Figure 6, the scores of the training and test samples along the canonical variate are displayed both as a scatterplot (panel a) and as histograms (panels b and c). From this graphical representation of the results, it appears that samples belonging to Class Sorrento (red bars) present negative values of the canonical variate score; on the contrary, Non-Sorrento samples present positive or slightly negative values of CV1. The misclassified test sample is the Non-Sorrento object whose score on the canonical variate is around −0.1. VIP analysis has also been carried out on this model, following the procedure described in [58]. The selected variables are approximately the same as those highlighted before, so the discussion is not repeated here; the graphical representation is displayed in Figure A2 in Appendix A.

SO-CovSel-LDA Analysis
Similarly to the procedure described in Section 3.3.1, the SO-CovSel-LDA model was built using both predictor blocks preprocessed by 1st derivative. The optimal number of selected variables (defined on the basis of cross-validation) was 1 for X1 and 20 for X2. The cross-validated calibration model provided sensitivities of 96.1% and 96.0% for Class Sorrento and Class Non-Sorrento, respectively. The application of this model to the test samples led to the correct classification of 58 out of 60 validation objects. This outcome is good and comparable to those obtained by PLS-DA, but less satisfying than that of the SO-PLS-LDA analysis.


Discussion
All the discussed models proved to be suitable tools for the discrimination of Sorrento walnuts from all the other inspected samples. This outcome was not obvious because, among the investigated walnuts, there are fruits produced on the Italian territory, not necessarily far from Sorrento and/or grown in particularly different soils and climatic conditions. Ideally, the best solution for the problem under consideration would be to achieve discrimination using only the spectra collected on the nutshell, because this avoids any loss of product and therefore has a lower economic impact. The results described in Section 3.1 demonstrate that this is actually possible, with a relatively low total classification error (5%, corresponding to 3 misclassified test samples out of 60). It should be noted that, among the misclassified samples, two belong to Class Sorrento and only one to Class Non-Sorrento, indicating that the rate of false positives (i.e., Non-Sorrento walnuts predicted as Sorrento) is acceptably low (1 sample out of 32, corresponding to ~3% error).
Although the results obtained in the individual analyses were satisfying, the application of the multi-block strategies and, in particular, of SO-PLS-LDA provided an improvement from the prediction point of view. Consequently, if the aim is to maximize the efficiency of the analysis, even considering the possibility of losing part of the product (which could anyhow be sold as shelled walnuts), SO-PLS-LDA represents a definitely suitable solution, with a rather low total error rate in prediction (1 misclassified test sample out of 60, i.e., ~1.7%).

Conclusions
Two-hundred-and-thirty-seven walnuts have been investigated by NIR spectroscopy coupled with chemometrics in order to understand whether it is possible to discriminate fruits harvested in the Sorrento area from other walnuts. NIR spectra were collected on the whole fruit (i.e., with shell) and on the kernels, and then classified by PLS-DA. As hoped, the PLS-DA model built on data collected on the shells provided satisfying results (5% total classification error rate in external validation, i.e., 3 misclassified samples out of 60), indicating that the proposed strategy is a suitable solution to discriminate Sorrento samples while avoiding any loss of product (walnuts can be sold as they are after NIR analysis on the shell). Nevertheless, if a more accurate solution is required, the multi-block strategy represents the ideal approach. In fact, SO-PLS-LDA led to a total classification error rate of ~1.7% in external validation (1 misclassified sample out of 60).

Conflicts of Interest:
The authors declare no conflicts of interest.

Appendix A
In Figure A1, the graphical representation of VIP analysis on spectra collected on the kernels is displayed.
Figure A1. VIP Analysis: Mean spectrum (black line). Variables presenting a VIP index > 1 are highlighted in red.
In Figure A2, the graphical representation of VIP analysis on the SO-PLS-LDA model is displayed.