Salento Honey (Apulia, South-East Italy): A Preliminary Characterization by 1 H-NMR Metabolomic Fingerprinting

: Honey is a natural sweet substance produced by honeybees from the nectar of flowers, plant secretions or plant-sucking insect excretions. Sugars and water constitute the major components, other minor components characterize the organoleptic and nutritional properties. To date, Salento (Apulia region, Italy) honey production is considerably threatened due to the suggested use of neonicotinoids in order to control the insect-vectored bacterium Xylella fastidiosa (subsp. pauca ). Metabolomics based on Nuclear Magnetic Resonance (NMR) spectroscopy was used to describe, for the first time, the composition of honey samples from different Salento producers. Exploratory Principal Component Analysis (PCA) showed, among the observed clustering, a separation between light and dark honeys and a discrimination according to producers, both further analyzed by supervised multivariate analysis. According to the obtained data, although limited to small-scale emerging production, Salento honey shows at the molecular level, a range of specific characteristic features analogous to those exhibited by similar products originating elsewhere and appreciated by consumers. The impact on this production should therefore be carefully considered when suggesting extensive use of pesticides in the area.


Introduction
Honey is a natural sweet substance produced by honeybees from the nectar of flowers, plant secretions or plant-sucking insect excretions (honeydew honey). Botanical and geographic origins determine both sensory properties (colour, flavor, and texture) and physicochemical parameters (viscosity and crystallinity) of this product [1]. Honey is one of the most complex natural foods and it is considered the only sweetening agent that can be used by humans without processing [2]. Sugars constitute the major component (95% of dry weight) responsible for the energy value of honey: two monosaccharides, fructose, and glucose are the dominant constituents [3]. Besides the two main sugar components, honey contains about 25 oligosaccharides (tri-and tetra-saccharides). Several studies reported the characterization of the carbohydrate profile of honey [3,4]. Water is the second most abundant honey constituent (12-23%). Other minor components such as organic acids, minerals, vitamins, enzymes, proteins, amino acid, and volatile and phenolic compounds characterize the organoleptic and nutritional properties of honey [5]. Besides the high nutritive value, minor components are also responsible for the healthy properties of honey such as antibacterial, anti-inflammatory, antioxidant, and immune system-stimulating activities, as reported in the literature [2,6]. Many factors such as geographical, botanical or floral origin, together with climatic and seasonal variations, influence honey chemical composition and quality [7]. Moreover, other external factors such as the environment, honey treatment methods, and storage conditions used by beekeepers as well as possible deliberate adulteration should also be taken into account [8]. According to the current standards of the Codex Alimentarius [9] and the European Union (EU) [10], several physical and chemical measurements are required for honey quality control, all very important for honey producers, the food industry, consumers, and regulatory authorities. To date, several different botanical varieties of honey are available on the market, namely, monofloral or polyfloral, with strong differences in composition and physical, chemical, and organoleptic characteristics [11,12]. Honey colour strongly depends on its age and the kind of flowers that supplied the nectar to the producing bees. Honey colour is also related to its flavor. Light-coloured honeys are mild-flavoured, while the dark ones have a stronger flavor [13,14]. Darker honeys such as honeydew honeys are reported to contain more phenolic acid derivatives but fewer flavonoids than light coloured ones [13,15,16]. Several studies have reported on the assessment of the quality, geographical, and botanical origins of honey, also with the aim to detect any possible adulteration [3,8,17]. Application of Nuclear Magnetic Resonance (NMR) spectroscopy for the analysis of honey offers some advantages compared to other conventional analytical methods (such as GC and GC-MS) [17,18]. These include the availability of a wealth of information in a single measurement, simultaneous detection of various components, high reproducible and comparable data with a high statistical confidence level, and minimal requirements for sample amount and pre-processing [19,20]. In particular, the metabolomic approach based on NMR spectroscopy, in combination with chemometrics, is a powerful fingerprinting technique that is successfully employed for biomarker detection, food quality control, and/or origin discrimination [11,[21][22][23][24][25][26]. This approach is used to analyze metabolite profiles and identify the most important discriminating compounds that differentiate honeys. Indeed, several studies proved the NMR-based screening techniques are a suitable tool for the rapid authenticity analysis of honey [19].
In this paper, we present an investigation of combined NMR and a chemometric data analysis approach to describe the variability in the composition of honey samples from different local Salento (Lecce and Taranto Provinces, Apulia region, South-East Italy) producers. Although some papers describing NMR characterization of Italian honey have already been published [3,11,12,21,27,28], the present study appears to be the first work focused on honey from Salento. On the other hand, although it can be considered one of the most important Italian regions as a foodstuff source, nowadays, Apulia only represents an emerging honey producer with about 290 t, less than 2% of the total Italian production (23,344 tons) [29]. Moreover, the Salento contribution to the Apulia production only relies on a minor part of active beehives in the region [29,30]. Nevertheless it should be considered that Salento honey (part of the Apulia production) is considerably threatened due to the suggested use of neonicotinoids in order to control the insect-vectored bacterium Xylella fastidiosa (subsp. pauca) [31]. The suggested use of insecticides in order to control X. fastidiosa vector(s) in the Salento area has been recently reviewed in an EFSA (European Food Safety Authority) report [32]. Neonicotinodis, the insecticides that are most widely used worldwide, are substances known to be carcinogenic to people and considered responsible for the deaths of bees, representing a serious risk for biodiversity and ecosystems [33][34][35]. After monitoring by the EFSA in 2013 [36], the EU decided to prohibit the use of neonicotinoids in open fields [37]. Nevertheless, as a consequence of the "National emergency plan for X. fastidiosa management in Italy", on the 13 of March 2018, the Italian Minister of Agriculture, issued a decree determining in the Apulia Region a specific area (Salento) where farmers may be forced to use these pesticides [31,38,39]. On the other hand, Salento honeys, under treat due to the possible deaths of bees related to pesticide use, do not appear to have been previously characterized. Thus, this could be one of the main reasons to focus, for the first time, on the characterization of honey produced in Salento (an Apulia sub-region where X. fastidiosa diseases first appeared). In particular, thirteen honey samples from different provinces of the Salento area were subjected to spectroscopic fingerprints in order to characterize them qualitatively and also to identify potential molecular components responsible for discrimination among sample clusters.

1 H-NMR Fingerprinting and Metabolite Identification
A total of 13 samples of mono-and multi-floral honeys and honeydew supplied in commercial jars by four trusted Salento honey producers were analyzed. Three technical replicates were obtained from a single specific commercial honey jar in order to minimize possible data inhomogeneity. The studied honeys included eight multi-floral, three mono-floral (one acacia, one orange, and one citrus sample), and two honeydew samples. The limited size of the reported sample set was chosen to provide preliminary indications of Salento honey characteristics. However, several studies used a very limited number of samples for much wider productions and geographic areas, such as the quality and bio-functional properties characterization of seven honey samples of different botanical and geographical origins collected from different regions of Portugal [40]; the 1 H-NMR profiling and chemometric analysis of ten selected honeys from South Africa (five), Slovakia (three), and Zambia (two) [8]; the chemometric analysis of three stingless bee honey samples from different botanical origins collected in Malaysia [41]; the 1 H NMR characterization of twenty Finnish honeys [42]. The screenings based on 1 H-NMR spectroscopy allowed us to extract, in a single and quick analysis, all sample molecular information while maintaining the specific ratios between metabolites present in a complex matrix, without previous separation. The assignments of honey signals identified in the samples, prepared according to the procedures described in the experimental section, are reported in Table 1. A typical 1 H-NMR spectrum of honey sample dissolved in deuterated water as indicated in the Material and Methods section is reported in Figure 1.   From visual inspection of the expansions, three regions could be identified, corresponding to the region of amino acids (0-3 ppm) (Figure 1a), the region of sugars (3-6 ppm) (Figure 1b), and the aromatic region (6-10 ppm) (Figure 1c). The most intense and dominant signals in the spectrum are represented by sugars, typically α-and β−glucose and fructose. Moreover, as already reported in the literature, signals of other sugars, e.g., disaccharides such as sucrose, maltose, isomaltose, and turanose, and trisaccharides such as raffinose, could be identified [3]. These signals were identical in all the honey samples analyzed with minimal intensity variations, as already observed in the literature [19,22]. The resonances of the minor components, which play important roles in honey differentiation were less intense (for example, amino acids, organic acids, etc.). In particular, the minor signals were ascribable to organic carboxylic acids (such as citric acid, acetic acid, and succinic acid) and both aliphatic (alanine, valine, leucine, isoleucine, lysine, proline) and aromatic (phenylalanine and tyrosine) amino acids. Proline, which might originate from bees, is the prevalent honey amino acid and makes up to 50-85% of the amino acid fraction [44]. Organic acids are present in honey at low concentrations (< 0.5%) and they are related to colour, flavor, and other honey physical-chemical properties, such as pH, acidity, and electrical conductivity. Moreover, organic acids can synergistically enhance the antioxidant action of phenolic compounds by chelating metals [1,6]. Acetic acid and ethanol are known to be used as fermentation indicators whereas formic acid is used for the treatment of Varroa infestation [45]. Interestingly, specific resonance (1.38 ppm, s) of methyl groups from a monoterpenoid acid, cyclohexa-1,3-diene-1-carboxylic acid, and as very minor component its (1-O-gentiobiosyl) ester derivative were observed in both light and dark polyfloral honeys. The proton and carbon assignments, obtained by 2D hsqc and hmbc NMR experiments confirmed that the (1.38 ppm, s) signal was ascribable to the known linden honey monoterpenoid acid marker [20,21,27]. Furthermore, a significant number of other compounds (formic acid, ethanol, trigonelline) and other aromatic signals could be observed by NMR, with variations in intensity according to the honey samples. Trigonelline is a plant hormone typical of herbaceous species, identified for the first time in Corsican honeys [46]. Another important observed signal is related to hydroxymethylfurfural (HMF), a furan derivative produced by sugar degradation. HMF is considered an indicator of overheating and long storage conditions [1,22], being an intermediate in the Maillard reaction [47], which links the concentration of HMF to aging and heating processes [48]. HMF levels are used to evaluate honey freshness, although its presence could naturally occur in honey of warm climatic areas, such as tropical and subtropical countries [12]. Typical signals of kynurenic acid (KYNA) (8.22, 7.86, 7.55, 6.94 ppm), often associated with chestnut honeys [21], were also observed in the analyzed samples, although only in the dark honey from a specific producer. Kynurenic acid is known to have beneficial properties in various diseases of the gastrointestinal tract [49], although other authors suggested that an increase in the levels of kynurenic acid in the brain could be linked to some neuropathologies [50]. Further studies regarding the effects of kynurenic acid (KYNA) are needed in order to better understand its role in human health [12]. Honeydew 1 H NMR spectra showed intense aliphatic resonances in the range 0.90-1.65 ppm and a diagnostic doublet at 5.70 ppm. These signals are according to the reported resonances characteristic of honeydew honeys [21] Finally, a not intense but clear signal at 5.90 ppm was observed only in the orange and citrus honey 1 H NMR spectra and it could be ascribable to a diagnostic resonance of 8-hydroxylinalool, a specific marker of citrus honey [20,27]. In all the studied samples, none of the representative signals indicating the presence of neonicotinoid pesticides [51] could be detected.

Unsupervised and Supervised Discriminant Analyses
A preliminary unsupervised multivariate analysis (PCA) (three components, R 2 X = 0.891; Q 2 = 0.826) was performed on the bucket-reduced spectra for the studied samples to obtain an overview of the data and reveal a possible data grouping of observations without any a priori-defined class. No specific outliers were detected in the scores plot and a general natural tendency of the samples clustering according not only to technical replicates ( Figure S1a) but also to specific honey features (dark, light, honeydew) and producers was observed. In particular, the same unsupervised PCA showed a "naturally" clear clustering [52] of the honey samples into two groups according to the declared colour (dark-and light-coloured honeys) (Figure S1b). A summer honeydew, a honeydew, a summer polyfloral honey, and two summer polyfloral honeys clustered into a group characterized by dark colour. On the other hand, citrus, acacia, orange, three spring polyfloral, and two polyfloral honey samples grouped together in the light-coloured class. The distinctive observed colour and clustering showed the occurrence of all the summer-and spring-collected samples in the dark-and light-coloured groups, respectively. This is probably ascribable to a specific relation between collection season and colour already reported in the literature [53]. We refined the separation analysis between the two observed macro-classes, dark and light samples, by pair-wise supervised OPLS-DA aimed to specify the discriminating molecular components responsible for the observed clear discrimination between the light-and dark-coloured honey groups- Figure 2a,b.
Sustainability 2020, 12 2 of 21 two polyfloral honey samples grouped together in the light-coloured class. The distinctive observed colour and clustering showed the occurrence of all the summer-and spring-collected samples in the dark-and light-coloured groups, respectively. This is probably ascribable to a specific relation between collection season and colour already reported in the literature [53]. We refined the separation analysis between the two observed macro-classes, dark and light samples, by pair-wise supervised OPLS-DA aimed to specify the discriminating molecular components responsible for the observed clear discrimination between the light-and dark-coloured honey groups- Figure 2   We obtained a model with good fit and prediction parameters (1 + 2 + 0 gave R 2 X 0.874; R 2 Y 0.956; Q 2 = 0.939; p[cv]-ANOVA = 4.87 × e −18 ). As observed from the S-line for the model, the loadings ascribable to the sugar signals characterized both macro groups. Interestingly, the signals in the aliphatic region (proline, succinate, lactate, monoterpenoid acid) and those related to aromatic compounds (tyrosine, phenylalanine, KYNA, formate) characterized only dark-coloured honeys according to the literature data [10,18,26]. Thus, discriminating information between dark and light honeys could be found specifically in the signals of the minor components [19]. Moreover, in the OPLS-DA score plot ( Figure 3) the light and dark honeys were also separated into different subgroups, according to the producers, along the first orthogonal component (intra-class variation). We obtained a model with good fit and prediction parameters (1 + 2 + 0 gave R 2 X 0.874; R 2 Y 0.956; Q 2 = 0.939; p[cv]-ANOVA = 4.87 × e −18 ). As observed from the S-line for the model, the loadings ascribable to the sugar signals characterized both macro groups. Interestingly, the signals in the aliphatic region (proline, succinate, lactate, monoterpenoid acid) and those related to aromatic compounds (tyrosine, phenylalanine, KYNA, formate) characterized only dark-coloured honeys according to the literature data [10,18,26]. Thus, discriminating information between dark and light honeys could be found specifically in the signals of the minor components [19]. Moreover, in the OPLS-DA score plot ( Figure 3) the light and dark honeys were also separated into different subgroups, according to the producers, along the first orthogonal component (intra-class variation).
This observed clustering clearly indicates that the honey intraclass variability was producer dependent and related to both dark and light products obtained from the same farm. Moreover, the OPLS-DA data of Figure 3 confirmed the hint of further grouping according to producers observed in the PCA scores plot ( Figure S1). Therefore, the whole honey sample set was further studied with supervised analysis aimed at discrimination of the different producers. A supervised PLS-DA was performed, according to different producer classes, considering dark and light honeys separately. Furthermore, chemometric methods were applied to the different spectral regions of the aliphatic (from 0 to 3.14 ppm) and the aromatic (6 to 10 ppm) areas in order to identify and characterize the functional groups of the molecules (organic acids, amino acids, aromatic molecules) responsible for samples discrimination. The possible presence of low-intensity signals of potential discriminating markers for the analyzed honeys and their contribution to samples differentiation were evaluated by excluding the sugar signals region in the statistical analysis [19]. This observed clustering clearly indicates that the honey intraclass variability was producer dependent and related to both dark and light products obtained from the same farm. Moreover, the OPLS-DA data of Figure 3 confirmed the hint of further grouping according to producers observed in the PCA scores plot ( Figure SM1). Therefore, the whole honey sample set was further studied with supervised analysis aimed at discrimination of the different producers. A supervised PLS-DA was performed, according to different producer classes, considering dark and light honeys separately. Furthermore, chemometric methods were applied to the different spectral regions of the aliphatic (from 0 to 3.14 ppm) and the aromatic (6 to 10 ppm) areas in order to identify and characterize the functional groups of the molecules (organic acids, amino acids, aromatic molecules) responsible for samples discrimination. The possible presence of low-intensity signals of potential discriminating markers for the analyzed honeys and their contribution to samples differentiation were evaluated by excluding the sugar signals region in the statistical analysis [19].

Aliphatic Region
In the case of light honeys, the supervised PLS-DA resulted in a model with very good descriptive and predictive parameters (four components gave R 2 X = 0.915; R 2 Y = 0.945, Q 2 = 0.889; p[cv] -ANOVA= 1.015 × e −12 ) (Figure 4).

Aliphatic Region
In Honeydew honey samples from Producer C were located at t [2] negative values and values close to 0 of the main component t [1]. The discriminating metabolites, identified from the loading scatter plot for the model were lactate (1.32 ppm) and succinate (2.56 ppm) for producer A and D and monoterpenoid acid derivatives (1.38 ppm) for Producer B. The observed higher relative content of organic acid in honeydew samples from Producer D was already reported in the literature [42]. Honeydew samples from producer C were characterized by a higher relative content of proline (2.34 ppm) (Figure 5b) as already observed for honeydews from different geographical origins [54,55]. Interestingly, the proline content was recently reported as a possible indicator of honey ripeness as it constantly decreases during storage [54].

Aromatic Region
The multivariate statistical analysis was then applied to all 39 honey samples considering the spectral region between 6 and 10 ppm, characteristic of aromatic protons. The supervised discriminant analysis (PLS-DA) was therefore carried out for the aromatic region, considering separately the light and dark honeys, according to the membership class (Producer). The PLS-DA applied to the light honey samples gave a four-component model with descriptive and predictive parameters equal to R 2 X= 0.936, R 2 Y = 0.708, Q 2 = 0.514; p[cv] -ANOVA= 2.35 × e −6 ( Figure 6). Once again from the score plot (Figure 6a), a separation of the samples based on the class producer could be observed. The loading scatter plot for the model described the molecular components underlying this separation. Specifically, the samples of Producer B, placed at negative values of component t1, were characterized by the highest relative content of formate (8.42-8.46 ppm). Although this compound is normally present in honeys, a possible use of this organic acid in the treatment against an ectoparasitic mite (Varroa) could not be excluded [45]. Samples from Producer A were characterized by a higher phenylalanine content (7.42, 7.38, and 7.32 ppm) while the samples of Producers C and D were basically grouped in a single cluster, with positive values of the t[2] component, characterized by the presence of higher tyrosine (7.19 and 6.90 ppm) and other unidentified aromatic molecules (8.62 ppm). Interestingly, the latter (bucket at 8.62) was also observed but not identified in Finnish lingonberry honey as a specific signal at 8.60 ppm [42]. A possible attribution to niacin could be suggested according to the established presence of this metabolite in honey [56].
Finally, the PLS-DA was conducted on the aromatic region of dark honey samples, considering the different producers (Figure 7). Once again from the score plot (Figure 6a), a separation of the samples based on the class producer could be observed. The loading scatter plot for the model described the molecular components underlying this separation. Specifically, the samples of Producer B, placed at negative values of component t1, were characterized by the highest relative content of formate (8.42-8.46 ppm). Although this compound is normally present in honeys, a possible use of this organic acid in the treatment against an ectoparasitic mite (Varroa) could not be excluded [45]. Samples from Producer A were characterized by a higher phenylalanine content (7.42, 7.38, and 7.32 ppm) while the samples of Producers C and D were basically grouped in a single cluster, with positive values of the t[2] component, characterized by the presence of higher tyrosine (7.19 and 6.90 ppm) and other unidentified aromatic molecules (8.62 ppm). Interestingly, the latter (bucket at 8.62) was also observed but not identified in Finnish lingonberry honey as a specific signal at 8.60 ppm [42]. A possible attribution to niacin could be suggested according to the established presence of this metabolite in honey [56].
Finally, the PLS-DA was conducted on the aromatic region of dark honey samples, considering the different producers (Figure 7).  (8.22, 7.86, 7.55, 6.94 ppm) [21]. Signals of phenylalanine (7.42, 7.38, and 7.32 ppm), an aromatic amino acid, characterized the dark honeys of producers A and D. Honeydew honey from Producer C was characterized by variables (loadings) corresponding to the NMR signals at 7.28 and 6.18 ppm typical of nucleoside derivatives (Figure 7b).

Metabolites Comparison
The variation in discriminating metabolites content for light and dark honeys, among the observed groups (Producer classes) was calculated by the integration of selected distinctive unbiased NMR signals. In particular, signals corresponding to proline, monoterpenoid acid derivative, lactate, ethanol, phenylalanine and lysine were integrated for light honey samples, whereas formate, monoterpenoid acid derivatives, proline, lactate, phenylalanine, nucleoside derivatives, succinate, and KYNA were integrated for dark honeys. Metabolites that showed a significant variation among groups were validated by one-way ANOVA with the HSD post-hoc test and are reported as the mean and standard deviation of integrals for each group in Table 2 ( Figure SM2 and SM3).  [2] component and could be ascribed to the intrinsic sample inhomogeneity (the reason for preparing and analyzing technical replicates). Nevertheless, the residual variability was associated with decreases in the standard deviation as observed in the contribution plot (data not shown) and therefore the Sample 4 remained in the model. On the other hand, the three Sample 4 replicates appeared closely related to each other in the aliphatic region ( Figure 5), and in the overall samples PCA ( Figure S1), Producer B group was characterized by a higher content of formate (8.42-8.46 ppm) as already observed for light samples from the same Producer B and typical signals of kynurenic acid (KYNA) (8.22, 7.86, 7.55, 6.94 ppm) [21]. Signals of phenylalanine (7.42, 7.38, and 7.32 ppm), an aromatic amino acid, characterized the dark honeys of producers A and D. Honeydew honey from Producer C was characterized by variables (loadings) corresponding to the NMR signals at 7.28 and 6.18 ppm typical of nucleoside derivatives (Figure 7b).

Metabolites Comparison
The variation in discriminating metabolites content for light and dark honeys, among the observed groups (Producer classes) was calculated by the integration of selected distinctive unbiased NMR signals. In particular, signals corresponding to proline, monoterpenoid acid derivative, lactate, ethanol, phenylalanine and lysine were integrated for light honey samples, whereas formate, monoterpenoid acid derivatives, proline, lactate, phenylalanine, nucleoside derivatives, succinate, and KYNA were integrated for dark honeys. Metabolites that showed a significant variation among groups were validated by one-way ANOVA with the HSD post-hoc test and are reported as the mean and standard deviation of integrals for each group in Table 2 ( Figures S2 and S3).  According to the obtained data, although limited to a relatively low sample number, but representing small-scale emerging production, the Salento honey shows at the molecular level, a range of specific characteristics features. These features are analogous to those exhibited by similar products originating elsewhere and well established in the literature [1,8,11,22,42,46]. Moreover, metabolic fingerprinting allowed us to clearly differentiate the studied production according not only to the macroscopic season-related character but also to the specific producer. High levels of proline, representing a quality criterion with respect to sugar adulteration, were also observed, although not equally distributed in the samples [54]. Therefore, the suggested use of neonicotinoids [32] to control the insect-vectored bacterium Xylella fastidiosa (subsp. pauca) should also take into account the need for preserving this emerging local foodstuff product.

Sampling
A total of 13 samples of monofloral and polyfloral honeys and honeydew characterized by different colours were analyzed. Samples were obtained from thirteen different commercial jars supplied by four trusted honey producers located in the Apulia Region (Lecce and Taranto provinces). The studied honeys included eight polyfloral, three monofloral (one acacia, one orange, and one citrus sample), and two honeydew samples. The botanical origins were assigned according to the trusted producers' declarations. Since polyfloral honeys can be considered a miscellaneous pool of samples of various botanical origins [57], whenever possible, the reliability of the declarations for monofloral samples was confirmed by the identification of 1 H NMR signals of specific markers of botanical origin [20,21,23,27]. Nevertheless further melissopalynological analysis was necessary in order to characterize the pollen types in the studied samples. The different honey colours were assigned according to the trusted producers' declarations. The different colours were also checked by visual comparison supported by unsupervised PCA without any a priori-defined class (including colour differentiation) scores plot ( Figure S1a,b,) and pair-wise supervised analysis (OPLS-DA) ( Figure 3). All honeys were stored at room temperature and in the dark before spectral analysis. In order to minimize possible sample inhomogeneity, each honey sample consisted of three technical replicates obtained from a single specific commercial honey jar. Honey descriptions and pesticide-free area production according to the producers (A, Margarito; B, Selvaggi; C, Greco; D, Salento Miele) are reported in Table 3. In order to check any possible variability in the sugar content for the different honey productions, the caloric content (as Kcal/100 g) of different producers' honey samples ( Figure S4) was determined using an adiabatic calorimeter bomb (IKA C7000, Staufne, Germany). Although with a similar description supplied by the producer, these samples refer to different production batches.

Sample Preparation for NMR Analysis
Each of the 13 supplied honeys was used to prepare three technical replicates for a total of 39 samples. Each sample was obtained by dissolving 100 mg of honey in 600 µl of deuterated water (D 2 O) containing the standard 3-trimethylsil-2,2,3,3-d4 propionic acid (TSP), 0.5 mM. The pH of aqueous solution honey samples was not corrected for slight deviations as the buckets could be adjusted in the processing step in order to include possible chemical-shift deviations [12,58]. The final solution was placed in an NMR tube (0.5 mm diameter).

1 H-NMR Spectra Acquisition and Processing
All spectra were acquired at a constant temperature (300 K) on a Bruker Avance III 600 MHz Ascend NMR Spectrometer (Bruker Italia, Milano, Italy), operating at 600.13 MHz, equipped with a TCI cryoprobe (inverse Triple Resonance Cryoprobe Prodigy), incorporating a z axis gradient coil and automatic tuning-matching (ATM). Experiments were acquired in automation mode after loading individual samples on an integrated Bruker Automatic Sample Changer, interfaced with IconNMR software (Bruker). For each sample, a 1 H NMR spectrum was acquired with water signal suppression (Bruker pulseprogram zgcppr), in a spectral window of 20.0276 ppm (12019.230 Hz), with 64 scans and a 90 • pulse of 7.620 µsec. After the acquisition, the standard FID processing procedures were carried out, by using TopSpin 3.5 (Bruker, Biospin, Italy), such as Fourier transform (mathematical operation that converts signals into a frequency spectrum), phase and baseline correction, and 0.3 Hz line broadening. All NMR spectra were calibrated with respect to the internal standard TSP (δ = 0.00 ppm). The characterization of the metabolites was determined by the analysis of two-dimensional homoand heteronuclear NMR spectra (2D 1 H J-resolved, 1 H COSY, 1 H-13 C HSQC, and HMBC) and by comparison with the literature data [1,3,11,18,21,22,24,46]. The NMR spectra were converted to a suitable form for multivariate analysis by Amix 3.9.15 (Analysis of Mixture, Bruker BioSpin GmbH, Rheinstetten, Germany) software. Specifically, each NMR spectrum was segmented into areas or histograms, with a fixed base width of 0.04 ppm normal rectangular bucketing"). The bucket tables thus obtained were subjected to a standardization procedure, in order to minimize the possible differences in the concentration of the various metabolites due to sample preparation and/or acquisition conditions. Subsequently, the data matrices (buckets) were subjected to centering and scaling operations: The Pareto scaling method, obtained by dividing each variable by the square root of the variable standard deviation centered around the mean value, was applied [59]. The total sum normalization was applied to minimize small differences due to metabolites concentration and/or experimental conditions among samples [59]. The data table, generated by all aligned buckets row-reduced spectra, was used for further multivariate data analysis. Each bucket row represents the entire NMR spectrum, with all the molecules in the sample. Moreover, each bucket, in a buckets row-reduced spectrum, is labeled with the value of the central chemical shift for its specific 0.04 ppm width. The variables used as descriptors for each sample in chemometric analyses are the buckets.

Multivariate Statistical Analysis
After the data processing step, an exploratory and discriminating analysis was performed, using a multivariate statistical approach, with the help of Simca-P version 14 (Sartorius Stedim Biotech, Umeå, Sweden) software. In particular, Principal Components Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA) were performed. Unsupervised methods such as principal component analysis (PCA) represent the first step in data analysis. Principal component analysis is a chemometric technique aimed at extracting the maximum possible information from a multivariate data structure, summarizing it in a few linear combinations of the variables themselves [60]. PCA is frequently used in the first data processing step in order to obtain a general description of the samples distribution and possible grouping in homogeneous clusters [61]. The possible correlation between the clusters distribution of the analyzed samples and the considered classes is carried out with the subsequent analyses. By PCA, it is also possible to identify outliers (samples showing characteristics of variability of the data particularly different from the others). The assessment of the correlation between the clusters distribution of the analyzed samples (observed by PCA) and the considered classes (such as variety and/or geographical origin) is therefore carried out by using supervised multivariate statistical analyses such as PLS-DA(Projections to Latent Structures Discriminant Analysis, PLS -DA) and OPLS-DA (Orthogonal Partial Least Squares Discriminant Analysis). In the present case, discriminating analysis of the PLS-DA type was performed to classify the honey samples and find indications for maximizing the separation among the classes (inter-class variability) while minimizing the dispersion within each class (intra-class variability). The PLS-DA technique is currently the most widely used for the discrimination of samples with different characteristics (by treatment, species, origin). The PLS-DA is performed in order to refine the separation between groups of observations, rotating the main components, i.e., the axes that express the variance of the data, so as to obtain a maximum separation between the classes and information on the variables responsible for this separation [62]. OPLS-DA is a modification of the PLS-DA method which filters out variation not directly related to the focused discriminating response. This is accomplished by separating the portion of the variance useful for predictive purposes from the non-predictive variance (which is made orthogonal). The result is a model with improved interpretability [63]. Validation of statistical models was performed and further evaluated by using the internal cross-validation default method (7-fold) and with permutation test (20 permutations) available in SIMCA-P software [64]. R 2 , Q 2 , and p[CV -ANOVA] parameters was used to describe the quality of the model. The first (R 2 ) is a cross validation parameter defined as the portion of data variance explained by the models and indicates the goodness of fit. The second (Q 2 ) represents the portion of variance in the data predictable by the model. The minimal number of components required can be easily defined since R 2 (cum) and Q 2 (cum) parameters display completely diverging behavior as the model complexity increases [65]. Cross-validated analysis of variance (p[CV -ANOVA]) provides a p-value indicating the level of significance of group separation in PLS-DA and OPLS-DA [63,65]. The variables responsible for the observed discrimination were identified by using the statistical tool loading scatter plot. This tool creates a scatter plot of the loading vectors for the first two components. The change in discriminating metabolite content among the observed groups was determined by analyzing the integrals of selected distinctive unbiased NMR signals after spectra normalization (to the total spectrum excluding the residual water region) [24,25]. Results, as mean intensities and standard deviation of the selected NMR signals, were validated by analysis of variance (one-way ANOVA) with Tukey's honestly significant difference (HSD) post-hoc test by using MetaboAnalyst software [66]. Statistical significance was set at least at an adjusted p-value < 0.05. The use of chemometric methods, both of an exploratory nature and of a discriminating or classifying nature, of the metabolic components present in the samples, to verify the "clustering" of the samples. The metabolomics based on Nuclear Magnetic Resonance (NMR) spectroscopy, allows to simultaneously detect a wide range of structurally different metabolites, bringing useful information to the discrimination of the samples.

Conclusions
In this work, the application of chemometric methods to 1 H NMR spectra allowed us to preliminary characterize and discriminate the honeys produced by four different producers in a sub region (Salento, Lecce and Taranto Provinces), of Apulia, in South-East Italy. To the best of our knowledge, this research, based on an 1 H-NMR fingerprinting approach, represents the first study applied to honeys from the Salento area within the Apulia Region. The suggested use of harmful neonicotinoids in the Xylella fastidiosa-infected area represents a serious risk for biodiversity and the ecosystem; these chemicals are highly toxic to insects such as bees, species of vital importance to humans [33]. Thus, the present preliminary study of Salento honeys demonstrated the need to protect this local natural food which could be clearly and profitably characterized as similar products reported in the literature. The unsupervised PCA showed among the other feature-related natural clustering, a clear separation of the samples into two macro classes according to the light and dark colour of the samples. We refined the separation analysis between the two observed macro-classes, dark and light samples, by a pair-wise supervised OPLS-DA aimed to specify the molecular components responsible for the observed differences. Interestingly, in accordance with literature data, the signals in the aliphatic region and those related to the phenolic compounds characterized specifically only the dark samples. The two macroclasses were analyzed separately, by discriminant analysis (PLS-DA), considering aliphatic and aromatic regions, in order to observe the distribution of the samples according to the specific local producer. Signals of molecules responsible for the discrimination among the different local producers were clearly identified, and the differences among discriminant metabolites were quantified and statistically validated. Among these, high levels of proline, representing a quality criterion with respect to sugar adulteration, were also observed, although not equally distributed in the samples. This approach, based on the combination of NMR spectroscopy with unsupervised (PCA) and supervised analysis (PLS-DA) was confirmed as a high efficiency tool to characterize naturally complex honey samples. Thanks to the possibility of automation and the low cost per analysis required for screening, 1 H-NMR profiling has already confirmed the potential for foodstuff traceability and authenticity assessment for commercial use. Although further investigations such as melissopalynological analysis are needed to better characterize this local product, the obtained data provide useful information to gain knowledge about Salento honeys. Composition features arising from this NMR-chemometric study as those exhibited by similar products originating elsewhere could be proficiently used as a starting point for a complete characterization of local honey production. Therefore, the impact on this production should be carefully considered when suggesting extensive use of pesticides for Xylella fastidiosa vectors fighting purposes.
Funding: This research received no external funding.