Application of ATR-FT-MIR for Tracing the Geographical Origin of Honey Produced in the Maltese Islands

Maltese honey has been produced, marketed, and sold as an exclusive local gourmet food product for countless years. Yet, thus far, no study has evaluated the individuality of this local food product. The evaluation of the parameters and properties which characterise the provenance and floral source of honey have been the subject of various studies worldwide, owing to the price and potential beneficial properties of this food product. Models analysing the potential of attenuated total reflection mid-infrared (ATR-FT-MIR) spectroscopy in discriminating and classifying local honey from that of foreign origin were investigated using 21 Maltese honey samples and 49 honey samples collected from abroad (Sicily, Greece, Sweden, Italy, France, Estonia and other samples of mixed geographical origin). Through a combination of spectroscopic techniques, spectral transformations, variable selection and partial least squares discriminant analysis (PLS-DA), chemometric models which successfully classified the provenance of local and non-local honey were developed. The results of these models were also corroborated with other classification and pattern recognition techniques, such as linear discriminate analysis (LDA), support vector machines (SVM) and feed-forward artificial neural networks (FF-ANN).


Introduction
There is a considerable number of apiaries in Malta and Gozo, producing honey which is sought by locals and tourists alike, and valued for its unique taste and characteristics. At the moment, there is a sizable number of beekeepers selling Maltese honey directly or through local markets. However, the vast amount of honey being sold is raising suspicion that there might be cases of fraud, where the honey is either being mislabelled as Maltese honey, or else adulterated with sugar syrup and/or non-local honey.
The sugars fructose and glucose account for about 85% of honey solids, given that floral nectar is the source of honey sugars. Glucose and fructose are reported to be the only monosaccharides in honey, with an average concentration of 38% w/w for fructose and 31% w/w for glucose [1,2]. Oligosaccharides represent about 10% of the total honey weight [3] and are composed of several units, generally two to six units of glucose and fructose, with glycosidic linkages in different positions. Siddiqui [2] characterised 14 disaccharides and 11 trisaccharides, while Doner [4] showed that there is satisfactory evidence for the presence of 10-13 disaccharides and 8-9 trisaccharides. More recent studies have shown that 25 trisaccharides and 10 tetrasaccharides have been found in honey samples from Spain and New Zealand [5]. The composition of oligosaccharides in honey is related to the floral source, however, it is when applied to IR data, in order to remove spectral artefacts such as baseline shifts and multicollinearity and can also reveal 'hidden' information by emphasising small spectral variations. The aim of this research is the analysis of Maltese and foreign honey by ATR-FT-MIR, alongside several data treatment and pattern recognition techniques. The study also aims to identify which spectral transformation or combination of them are more adequate for their discrimination.

Honey Samples
A total of 70 samples were collected and 21 local samples were directly collected from Maltese and Gozitan beekeepers post honey harvest, between the period of 2015 and 2016. Further details are presented in the Supplementary Materials Figure S1 and Table S1. Overall, 49 foreign samples were collected from different Mediterranean countries directly from various international beekeepers associations. Furthermore, foreign samples sold from local supermarkets were also included. All samples were kept in the dark at 20 • C until analysis.

FTIR Method
Prior to scanning, honey samples were homogenized after heating to 30 • C for one hour, followed by stirring. A Shimadzu IR-Affinity 1 equipped with a Silver Gate Zn/Se ATR was used for spectral acquisition. The instrument was set to acquire 32 scans per spectrum at a resolution of 4 cm −1 in the range of 4000−550 cm −1 . In order to obtain a spectrum with a high signal to noise ratio and to reduce the error in the baseline, the instrument was blanked before each sample and each spectrum was run in triplicate. The mean of these replicates was then used in the following data analysis procedures.

Chemometric Analysis
Initial data treatment included first removing the region between 2800 and 1800 cm −1 in which no peaks arise. The honey sample matrix contains a negligible amount of chemicals which have active bands in this region [29]. The spectra were also trimmed at the ends to a range of 740-3600 cm −1 , in order to remove regions which contained a significant amount of noise and no relevant chemical data ( Figure 1). The spectra obtained were subjected to different spectroscopic signal processing techniques which were evaluated and compared. These include subtraction of a linear baseline, multiplicative scatter correction (MSC), orthogonal signal correction (OSC), standard normal variate (SNV), and first and second derivative Savitzky-Golay transformations. The effect of the different spectral transformations on the final classification outcomes was compared to those obtained without any signal processing.
Several spectral transformations were applied prior to statistical analysis using the Unscrambler X (CAMO A/S, Oslo, Norway). Smoothing was the first transformation applied to the IR spectra. There are a variety of smoothing algorithms which can be applied to spectra, including moving average, Gaussian, median and Savitzky-Golay. The spectrum with maximum smoothness and minimum distortion from the original signal was selected, thus, a compromise between noise reduction and retention of information was evaluated. This was also further confirmed by PLS-DA analysis, which showed that the best improvement and highest explained variance out of all smoothed spectra was obtained when using a median filter with a gap size of three.
MSC, OSC, detrending, deresolving, SNV, along with a combination of SNV and detrending filters were then applied to the smoothed spectral data, in order to determine their effect on the misclassification rate and the RMSE error. Furthermore, first and second Savitzky-Golay derivative transformations were applied to the spectra in the region with a gap size of 7 points and a polynomial order of two. Several spectral transformations were applied prior to statistical analysis using the Unscrambler X (CAMO A/S, Oslo, Norway). Smoothing was the first transformation applied to the IR spectra. There are a variety of smoothing algorithms which can be applied to spectra, including moving average, Gaussian, median and Savitzky-Golay. The spectrum with maximum smoothness and minimum distortion from the original signal was selected, thus, a compromise between noise reduction and retention of information was evaluated. This was also further confirmed by PLS-DA analysis, which showed that the best improvement and highest explained variance out of all smoothed spectra was obtained when using a median filter with a gap size of three.
MSC, OSC, detrending, deresolving, SNV, along with a combination of SNV and detrending filters were then applied to the smoothed spectral data, in order to determine their effect on the misclassification rate and the RMSE error. Furthermore, first and second Savitzky-Golay derivative transformations were applied to the spectra in the region with a gap size of 7 points and a polynomial order of two.
Principal component analysis (PCA) was carried out on the data, in order to hint at possible outliers or any possible clustering of the Maltese samples present within the data set. The supervised chemometric treatment was performed using PLS-DA, in order to classify the geographical origin with regards to Maltese and non-Maltese samples. The former samples were assigned a dummy variable of 1, while the latter were assigned a value of 0. Samples with a predicted value of >0.5 were thus labelled as foreign, while the remaining samples were labelled as local. PLS-DA analysis was carried out on the whole data set, using leave one out cross-validation (LOOCV), after which PLS-DA was repeated using excluded rows validation (ERV), with the exclusion of one third of the Principal component analysis (PCA) was carried out on the data, in order to hint at possible outliers or any possible clustering of the Maltese samples present within the data set. The supervised chemometric treatment was performed using PLS-DA, in order to classify the geographical origin with regards to Maltese and non-Maltese samples. The former samples were assigned a dummy variable of 1, while the latter were assigned a value of 0. Samples with a predicted value of >0.5 were thus labelled as foreign, while the remaining samples were labelled as local. PLS-DA analysis was carried out on the whole data set, using leave one out cross-validation (LOOCV), after which PLS-DA was repeated using excluded rows validation (ERV), with the exclusion of one third of the samples from each class to assess for model overfitting. The RMSE for the model was calculated as shown below, in order to further assess the accuracy of the model. Where y pred corresponds the value between 0 and 1 generated by the model, whilst y ref corresponds to the dummy variable to which the honey was assigned.
The optimum model for each transformation was chosen after an assessment of the PLS-DA model parameters. The classification accuracy of the LOOCV and ERV models, explained as X and Y variance, RMSE and number of factors were used to evaluate the performance of the chemometric models.

Variable Selection
Once the optimum number of factors is determined, the data points which had a VIP (variable importance in projection) >0.8 were then used to develop subsequent PLS-DA models. The VIP score is a measure of a variable's importance in the PLS model. It represents the contribution of a variable to the PLS model and is determined through a weighted sum of the squared correlations between the model components and the original variable. A value of less than 0.8 is typically considered to be a small VIP, and thus, a candidate for deletion from the model [43]. VIP scores are useful in understanding X space predictor variables that best explain y variance. VIP scores give an estimate of the contribution of a given predictor to a PLS regression model [44].
Furthermore, stepwise linear canonical discriminant analysis (SLC-DA), as implemented within JMP, was also used, in order to reduce the number of variables used in PLS-DA models. A stepwise analysis allows for the manual selection of variables used to build the linear model up to a maximum number of entries (n-1), where n is the number of samples in the sample set. The model containing the most discriminant variables was selected on the basis of a low F-ratio and a high p-Value.

Statistical Analysis
Feed-forward artificial neural networks (FF-ANN), support vector machines (SVM) and linear discriminant analysis (LDA) were implemented as a further corroboration and validation to the PLS-DA models. FF-ANN, LDA and SVM analysis were carried using a Python script and the 'scikit-learn' Machine Learning toolbox for Python [45]. FF-ANN models were implemented on data without variable selection, whilst SVM and LDA classification methods were applied on data with SLC-DA model selection.
SVM models have no limit on the number of variables which can be used in a model. Nonetheless, SVM models require a computationally intensive grid search and thus analysis were performed on SLC-DA selected variables. Models for LDA were also performed on the SLC-DA selected variable, as this classification technique is usually limited to small number of variables, which must be less than the number of samples in each class. On the other hand, FF-ANN models are suited for modelling data with a large number of variables, and thus were used to model data with no variable selection. In all cases the models were validated both using ERV in a similar fashion to PLS-DA models.
The aforementioned statistical analysis and variable selection steps were also carried out on the fingerprint region (760-1400 cm −1 ), in order to determine the effect of using this portion of the spectrum only on the prediction rate and RMSE of the models.

Geographical Classification Using ATR-FT-MIR
Monosaccharides, water, and other sugars are the main components in honey, thus, most of the spectral peaks observed in honey IR spectra appertain to vibrational modes exhibited from sugars and water [23]. Water in honey shows up as a very distinct broad peak between 3500−3000 cm −1 in the honey IR spectrum (Figure 1). Additionally, a peak is observed between 3000−2800 cm −1 , which arises from vibrational modes of carbohydrates [32], carboxylic acids [46] and amino acids [29]. The region between 1700−1600 cm −1 shows the vibrational modes from water [47], carbohydrates [32] and the amide I band [48]. The peaks within the fingerprint region (1500−700 cm −1 ) are attributed to various vibrational modes of carbohydrates and ketones. The vibrations that occur between 1200 and 1300 cm −1 are attributed to the presence of -C-O bonds, whilst that at 1750 cm −1 accounts for the carboxylic acid functionalities (C=O) of various carbohydrates. The bands observed in the range between 1150 and 995 cm −1 are attributed to the stretching and bending vibrations of C-O, C-H and C-OH vibrations arising from carbohydrates. [34,36].
The first data handling stage involved the removal of the region between 2700 cm −1 and 1800 cm −1 , as it contained no IR bands which are expected to show up from the honey sample matrix; for simplicity's sake, this region will be referred to as the 'whole spectrum' henceforth. The median filter transformation was particularly effective in removing any noise generated by the ATR-FTIR while leaving any slight spectral variations intact. Further spectral transformations were then applied to median filter smoothed MIR spectra, since the application of some spectral transformations such as derivative transformations tended to accentuate any noise present.
A visual inspection of the MIR spectra ( Figure 1) and the resulting spectral transformations revealed no regions which offer discrimination between Maltese and non-Maltese samples. Furthermore, PCA identified no outliers within the dataset for the untreated spectra, or when the spectra were subjected to spectral transformations. Through PCA, no samples were observed to cluster according to geographical origin. This result is not unexpected, since the honey samples being tested do not differ only by geographical origin, since other sources of variability, such as botanical origin, were present. PCA analysis is presented in Supplementary Materials Figures S2 and S3. 3.1.1. PLS-DA and Variable Selection PLS-DA was used as the primary statistical model for sample classification and the prediction of Maltese and non-Maltese samples (Table 1). In the majority of the cross validated PLS models on the 'whole' MIR spectra (Table 1), no samples were misclassified, except for the second derivative transformation model, which exhibited an accuracy of 98.6%. The RMSEs for all the LOOCV models were considerably low, wherein the RMSE will effectively describe the average distances of the predicted sample towards the dummy classification system used.   ERV PLS-DA models exhibited a decrease in the % accuracy, accompanied with an increase in the RMSE; this is expected, since the models relied on smaller number of samples. Ideally, well-calibrated models should exhibit little or no change on moving from cross-validated models to excluded rows validated models, whereas a significant drop in accuracy and an increase in the RMSE often suggests that the model is over-fitted. An over-fitted model will not solely describe the systematic variation in a model, but will also describe some of the random variation within the dataset and will give inaccurate predictions. For most of the models (Table 1), the drop in prediction accuracy was not very large, since most models show an accuracy >95%, which suggests that these models were well-calibrated.
The use of VIP scores for data reduction in PLS shows a notable improvement with regards to the classification rate in both the cross validated and excluded rows models. Furthermore, a markedly larger improvement was also observed when using SLC-DA for variable selection (Table 1), where all the internally validated PLS-DA models exhibited no misclassifications and considerably lower RMSEs than the PLS models without variable selection and VIP variable selection. An example of such a plot of VIP and SLCDA scores for the media transformation is included in Figure 2b,c.
The externally validated PLS-DA models also showed a significant improvement when SLCDA is used, with most models showing no misclassifications. The PLS-DA model improvement can be attributed to the large amount of variable reduction, from around 550 variables to around 20-40 variables, wherein the amount of redundant and collinear variables is reduced (Figure 2). Generally, there is a marked improvement when only the fingerprint region was used to develop PLS-DA model (Table 1), when compared to models on the 'whole' spectra. A similar trend towards a drop in the prediction accuracy for the excluded rows validation is also observed in this case. The OSC, SNV and a combination of SNV and detrending transformations generated PLS-DA models, which correctly classify all the samples through external validated models, without any variable selection as show in Figure 3. This highlighted the effectivity of these transformations when used in conjunction with PLS-DA for the geographical profiling of Maltese and non-Maltese samples. Generally, there is a marked improvement when only the fingerprint region was used to develop PLS-DA model (Table 1), when compared to models on the 'whole' spectra. A similar trend towards a drop in the prediction accuracy for the excluded rows validation is also observed in this case. The OSC, SNV and a combination of SNV and detrending transformations generated PLS-DA models, which correctly classify all the samples through external validated models, without any variable selection as show in Figure 3. This highlighted the effectivity of these transformations when used in conjunction with PLS-DA for the geographical profiling of Maltese and non-Maltese samples.
The removal of the variables which had VIP scores less than 0.8 on the fingerprint region generally decreased the RMSE in the internally validated PLS-DA models, which maintained their classification accuracy. The PLS-DA models using external validation were also shown to generally improve, whereas the models using OSC and median filter transformations showed an increase in the number of misclassifications. Moreover, PLS-DA models using the variables selected by SLC-DA (Table 1) generally showed a marked improvement on the models using no variable selection and VIP variable selection, except for the model using a de-resolve transformation.
Furthermore, apart from a higher accuracy and lower RMSE, PLS-DA models on spectral transformations of the fingerprint region generally exhibited a higher % explained variance for the predictor matrices when compared to the 'whole' spectra. This was generally true in the case of PLS-DA models without variable selection and models which used VIP scores for variable selection. Nonetheless, the performance of the PLS models using the SLC-DA variables was similar in both cases ( Table 1). The lack of improvement over using the fingerprint region versus the whole spectrum for analysis in the case of SLC-DA, is due to the fact that SLC-DA is very effective at variable selection, thus removing any redundant variables present in the region from 3600−2800 cm −1 .
In light of these findings, it can be concluded that the fingerprint region is more suited for the differentiation of local and foreign samples using PLS-DA analysis, since it carries more relevant information and still gives very good classification accuracy without the need of variable selection. Lastly, while the model parameters give a good indication of the performance of the PLS-DA models, at this stage, they should not be used to single out the best performing transformation method for classifying Maltese and non-Maltese honey using PLS-DA. This is because the samples only represent a small set of local honey and an even smaller set of non-local honey, and thus different samples might be better represented using different transformations. Nevertheless, the high classification rates and low error values highlight the potential application of ATR-FT-MIR spectroscopy and spectral transformations in combination with PLS-DA for the routine classification of local and foreign honey. In light of these findings, it can be concluded that the fingerprint region is more suited for the differentiation of local and foreign samples using PLS-DA analysis, since it carries more relevant information and still gives very good classification accuracy without the need of variable selection. Lastly, while the model parameters give a good indication of the performance of the PLS-DA models, at this stage, they should not be used to single out the best performing transformation method for classifying Maltese and non-Maltese honey using PLS-DA. This is because the samples only represent a small set of local honey and an even smaller set of non-local honey, and thus different samples might be better represented using different transformations. Nevertheless, the high classification rates and low error values highlight the potential application of ATR-FT-MIR spectroscopy and spectral transformations in combination with PLS-DA for the routine classification of local and foreign honey.

Other Models
Excluded row FF-ANN models ( Table 2) on spectral transformations of the whole MIR spectra showed a slight improvement in the classification rate when compared to their respective PLS-DA models. Conversely, the excluded rows validated FF-ANN models performed on spectral transformations of the fingerprint region showed an increase in the number of misclassifications for nearly all the spectral transformations, except for models derived from MSC and SNV transformed

Other Models
Excluded row FF-ANN models ( Table 2) on spectral transformations of the whole MIR spectra showed a slight improvement in the classification rate when compared to their respective PLS-DA models. Conversely, the excluded rows validated FF-ANN models performed on spectral transformations of the fingerprint region showed an increase in the number of misclassifications for nearly all the spectral transformations, except for models derived from MSC and SNV transformed spectra. This infers that FF-ANN models were more effective at extracting information from the 'whole' spectra for classification than from the fingerprint region. A similar trend was observed in the excluded rows SVM model results (Table 3), where the classification rate was generally higher for SVM models of spectral transformation on the 'whole' spectra than for spectral transformation in the fingerprint region; thus, the same inference can be made. Nevertheless, the high classification rates obtained by most excluded rows validated FF-ANN, SVM and PLS-DA models, further corroborating the use of MIR spectra and spectral transformations as a method for the classification of local and foreign samples. The PLS-DA model results were also further corroborated by the LDA model results (Table 3), which showed no misclassifications in most instances. In fact, LDA models commonly showed better classification rates than the corresponding PLS-DA models. These results highlight the potential use of open source pattern recognition packages for further development and implementation in chemometric applications.

Conclusions
Most spectral transformations on ATR-FT-MIR data in combination with PLS-DA were shown to be very effective in classifying local and non-local honey samples. Furthermore, the use of the fingerprint region for classifying samples was shown to be more effective in PLS-DA models using no variable selection and VIP variable selection. The use of SLC-DA for variable selection was also shown to be significantly effective in decreasing the number of misclassifications, both when using the 'whole' spectrum and when using the fingerprint region.
FF-ANN, SVM and LDA models were shown to offer similar classification rates to PLS-DA models and this thus corroborates the results obtained from the PLS-DA models and places confidence in the use of ATR-FT-MIR methods in conjunction with spectral transformations, for the classification of Maltese and foreign honey samples. These results highlight the potential of these methods to be further developed, for the detection of adulteration and for more in-depth profiling and classification of Maltese honey. Furthermore, the results obtained highlight the effectiveness of chemometric and pattern recognition-based approaches, in order to quickly and reliably test the authenticity of honey samples. These promising results should thus serve as an incentive for more research to be done on developing a more extensive model, using other techniques such as fluorescence spectroscopy and NMR.
Supplementary Materials: The following are available online at http://www.mdpi.com/2304-8158/9/6/710/s1, Figure S1: Map of Maltese islands highlight locality of honey samples in Table S1, Figure S2: Two component plot from PCA on Median filtered data in the region between 760 and 1400 cm −1 , Figure S3: Loading plot for the first two component of PCA on Median filtered data in the region between 760 and 1400 cm −1 , Table S1: Sample code, locality and date of harvest for local samples.
Author Contributions: J.P.F., data acquisition, research paper conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing-original draft preparation, writing-review and editing, and funding acquisition; F.L., data acquisition, methodology, software, validation, writing-review and editing, conceptualization, writing-original draft preparation, D.M., writing-review and editing, and supervision; C.F., conceptualization, software, writing-original draft preparation, formal analysis supervision and project administration. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the ENDEAVOR Scholarships scheme (ESF.01.015). The scholarship is part-financed by the European Union-European Social Fund.