Characterization of Chinese Unifloral Honeys Based on Proline and Phenolic Content as Markers of Botanical Origin, Using Multivariate Analysis

The phenolic and proline contents were determined in honey samples of different floral origins (rapeseed, sunflower, buckwheat and Codonopsis) from five regions of China. The phenolic and proline profiles of these samples were used to construct a statistical model to distinguish honeys of different floral origins. Multivariate chemometric methods identified significant differences among the studied honey samples. The proline content varied among the four types of honey, decreasing in the order buckwheat > Codonopsis > sunflower > rapeseed. Rapeseed honeys contained a high level of benzoic acid, while rutin, p-coumaric acid and p-hydroxybenzoic acid were present at relatively high levels in buckwheat honeys. Principal component analysis (PCA) revealed that rapeseed honey could be distinguished from the other three unifloral honeys, and that benzoic acid, proline and kaempferol could serve as potential floral markers. Using 18 phenolic compounds and proline, linear discriminant analysis (LDA) classified the honey samples according to floral origin with 94% correct prediction. The results indicate that phenolic compounds and proline are useful for identifying the floral origin of the four types of honey.


Introduction
Honey is popular for its nutritional and medicinal values. As a natural sweetening agent, honey is consumed directly, widely applied in the food industry, and also used as a food preservative [1]. Recent studies demonstrate that honey possesses many health benefits, including antimicrobial and anti-inflammatory effects, and heart disease and cancer risk reduction [2,3]. Honey has a complex composition consisting of a high concentration of sugars combined with minerals, free amino acids, enzymes, vitamins, phenolic compounds and numerous volatile compounds [4]. These components highlight both physical properties and nutraceutical characteristics of the product itself [5]. Different unifloral honeys may have different functional properties due to their different constituents.
In general, unifloral honeys are regarded as more valuable because of their quality and pure flavour. Market prices are therefore determined by botanical origin, and the higher value of some honey types stimulates the adulteration of honeys of certain botanical origins. Honey adulteration not only defrauds consumers but also affects a country's bee product exports. Identification of unifloral honey has therefore attracted great attention from researchers. Floral markers have been identified for certain unifloral honeys by analyzing physicochemical parameters such as sugar content, diastase activity, amino acids, phenolic acids, flavonoids and volatiles [6,7].
Phenolic compounds, mainly phenolic acids and flavonoids, comprise one of the most important constituents of honey, and have been considered as indicators of its antioxidant activity. The honey phenolic compounds originate as plant secondary metabolites, and their content in plants varies according to the plant species, variety, physiological stage, and environmental factors such as climate [4]. Phenolic compounds have been used to determine the botanical origin of unifloral honeys [8][9][10][11]. In previous research studies, several phenolic compounds have been reported to serve as floral origin markers for different unifloral honeys. For example, quercetin was suggested as a floral marker for sunflower honey [12]; gallic acid was the main phenolic acid in Manuka honey [13]; chlorogenic acid and ellagic acid were possible markers of acacia and rapeseed honeys, respectively [14]; ferulic acid, morin and kaempferol could distinguish chaste honey from rapeseed honey [15].
Proline is the dominant amino acid in honey and has been considered an indicator of honey quality [16]. The proline content of honey depends on the time the nectar is processed by the bees. Indirectly, proline levels also reflect botanical origin [17]. Previous studies found that the proline content of honey was associated with its floral and geographical origin [18]. Some studies indicated that sunflower honey contained slightly higher proline levels than rapeseed and acacia honeys [18,19].
China has a long history of beekeeping owing to its vast geography, suitable climatic conditions, and rich abundance of plant species. A variety of honey types are produced in China. The most common are rapeseed, acacia, Vitex, and jujube honeys. Other special types of honey, such as buckwheat, sunflower, and Codonopsis, are used mainly for commercial purposes. Even though China is one of the world's largest honey exporters, the chemical composition of Chinese honey has not been comprehensively investigated. Previous studies on Chinese unifloral honeys have focused mainly on minerals, amino acids, and phenolic content [11,14,20,21,22]; however, no data have been reported on the phenolic and proline profiles of certain Chinese unifloral honeys, such as sunflower and Codonopsis.
For these reasons, the principal aim of the present research was to investigate the proline and phenolic composition of four different unifloral honeys (rapeseed, sunflower, buckwheat and Codonopsis honey) collected from different regions in China. In addition, chemometric methods were applied to identify potential floral markers of different honey samples using phenolic compounds and proline as variables. These results provide insight into the composition of major phenolic compounds and proline profile of Chinese honeys and could help improve the market perception of Chinese honeys by lending credibility to their claims of authenticity.

Proline Profile of the Different Unifloral Honeys
The measured proline content of the honeys of different floral origins is shown in Figure 1. Of the four unifloral honey types, the buckwheat honeys exhibited the highest proline content (average 610.16 mg/kg) (p < 0.05), followed by Codonopsis honeys (494.49 mg/kg), sunflower honeys (400.75 mg/kg), and rapeseed honeys (201.61 mg/kg). The observation that proline content varied with the type of unifloral honey is consistent with previous analyses of European and Serbian unifloral honeys [18]. Our results confirmed that rapeseed honey exhibits low proline levels [17,18]. In addition, we observed large variation in proline content within each unifloral honey type, which may result from regional differences in the floral sources. For example, the proline content ranged from 122.50 to 336.02 mg/kg in rapeseed honeys, from 214.06 to 601.11 mg/kg in sunflower honeys, from 412.56 to 874.62 mg/kg in buckwheat honeys, and from 380.37 to 699.53 mg/kg in Codonopsis honeys. These results are in accordance with other reports of variation in proline levels among unifloral honeys from different geographical sources [21,23]. The proline content of the three multifloral honey samples (S17, C10, C11) was 612.55, 497.74 and 219.77 mg/kg, respectively (Supplementary Table S1). Although proline level has been considered a useful parameter for unifloral honey classification, the observed variance suggests that discrimination by proline alone is incomplete and that this measurement should be used together with other parameters, including sugar and mineral content and the levels of phytochemicals [24]. Proline content has also been proposed as an indicator of honey quality [16], and the significant differences in proline content among botanical types of honey may therefore suggest variation in quality.
Molecules 2017, 22, 735

Phenolic Compounds in Four Different Honey Types
The distribution and levels of the eighteen phenolic compounds in the investigated honey samples are listed in Table 1. The phenolic content of the three multifloral honey samples (S17, C10, C11) is shown in Supplementary Table S1. Phenolic acids, including protocatechuic acid, caffeic acid, gallic acid, p-coumaric acid, p-hydroxybenzoic acid, ferulic acid and benzoic acid, and flavonoids such as quercetin, kaempferol, pinocembrin, caffeic acid phenethyl ester (CAPE), chrysin and rutin were present in differing amounts in the four types of unifloral honey. Morin, myricetin, and naringenin were detected at relatively high levels only in rapeseed honey. Galangin was detected in rapeseed and sunflower honey. Morin was previously found in rapeseed honey by Zhou et al. [15]. Apigenin and CAPE were detected at low levels in the four unifloral honeys. Our analysis identified p-hydroxybenzoic, p-coumaric, caffeic, ferulic, and protocatechuic acid as the predominant phenolic acids in the four unifloral honeys. The dominant flavonoids were quercetin, kaempferol and chrysin. These findings are in agreement with previous reports [25][26][27].
The levels of individual compounds can be strongly affected by floral and geographical origin. For example, gallic acid was found at the highest concentrations in Codonopsis honeys compared to the other three types of honey (p < 0.05). The amount of gallic acid found in sunflower honey was lower than that reported for Serbian sunflower honey [28]. The amounts of p-hydroxybenzoic acid and p-coumaric acid determined in our buckwheat honey samples were higher than those of previous studies [25,29]. The amount of p-hydroxybenzoic acid was, however, lower than that reported for buckwheat honey from Shanxi Province in China [30], possibly reflecting regional differences in floral sources. The highest level of benzoic acid was found in rapeseed honeys. The average content of benzoic acid in rapeseed honey (9.475 mg/kg) was dozens of times higher than the levels in sunflower, buckwheat, and Codonopsis honey (p < 0.05). Heather honey has been reported to contain a high level of benzoic acid, which is thought to contribute to its aroma [31]. The highest average contents of caffeic acid and ferulic acid were found in sunflower honey (p < 0.05). Rutin (flavonol 3-O-rutinoside), which is presumed to be responsible for the antioxidant activity of buckwheat [32], was detected at the highest level in buckwheat honey (0.225 mg/kg) compared to the other three types of honey. Rutin has also been detected in Polish rapeseed and multi-flower honeys [29], but was not found in previous studies of buckwheat honeys, which explained its absence by suggesting that the phenolic pattern of buckwheat nectar, and of the corresponding honey, might be quite different from that of other plant tissues [25,33]. In contrast, rutin was detected in our buckwheat honey samples for the first time; this may be due to variety and regional differences in floral sources, or to differences in the analytical method. The phenolic compound analysis method (chromatography-multiple reaction monitoring-MS method) was validated by multiple parameters [15]. Chrysin has been found in many types of honey, including rapeseed and multifloral honeys [34], and here we found high levels of chrysin in sunflower and buckwheat honey.
More specifically, the phenolic content showed great variability among honey samples: caffeic acid ranged from 0.004 to 0.872 mg/kg in buckwheat honey, p-coumaric acid from 0.002 to 0.051 mg/kg in Codonopsis honey, and quercetin from 0.022 to 5.184 mg/kg in rapeseed honey and from 0.002 to 0.659 mg/kg in sunflower honey. Highly variable amounts of phenolic compounds were also found in other similar studies [8,27], in agreement with Karabagias et al., who reported similarly variable contents (mg/kg) of quercetin, syringic acid, kaempferol, chrysin and myricetin in fir, pine, orange and thyme honey samples from the Greek market [6].

Principal Component Analysis (PCA)
Multivariate statistical analysis is an important feature of modern analytical approaches, allowing complex matrices to be characterized by extracting information from multivariate chemical data. Three honey samples (S17, C10 and C11) were excluded from the following statistical analysis on the basis of the pollen analysis results. The proline and phenolic profiles of the four unifloral honeys were analyzed using PCA to investigate subtle differences among the different types of honey samples. The first three principal components (PCs) explained 60.6% of the total variance, with PC1, PC2 and PC3 explaining 23.8%, 21.7% and 15.1%, respectively. The score plot of the first two PCs (PC1 and PC2), showing the classification of honey samples according to their botanical origin, is given in Figure 2. The 95% confidence interval for each set of honey samples is indicated by a confidence ellipse in the PCA score plot (Figure 2). The rapeseed honey samples, located in the positive region of PC1, were well separated from the buckwheat, sunflower, and Codonopsis honey samples. The other honey groups clustered in the negative region of PC1. Codonopsis honeys were distributed around the origin of the score plot. The three clusters of buckwheat, sunflower and Codonopsis honey samples overlapped partially. Sunflower honey samples were separated from rapeseed honey samples by PC2. The highest loadings on PC1 were those of kaempferol, myricetin, quercetin, morin and protocatechuic acid; the highest loadings on PC2 were those of benzoic acid, caffeic acid, galangin, chrysin and quercetin; and PC3 was explained by p-coumaric acid, p-hydroxybenzoic acid, rutin, caffeic acid and myricetin (Table 2). With respect to floral origin, the rapeseed, buckwheat and sunflower honey clusters were each separated from the other honey types by an obvious boundary in the PC1-PC2 score plot.
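To make the PCA step concrete, the following is a minimal sketch in pure Python; it is not the MetaboAnalyst implementation used in the study. It autoscales a small table (mean-centring each variable and dividing by its sample standard deviation), builds the resulting correlation matrix, and extracts components by power iteration with deflation. The six rows and three columns are illustrative values invented for this example, not measured data, and all function names are hypothetical.

```python
import math
import statistics

# Illustrative values only (not measured data): rows are samples,
# columns are proline, benzoic acid and rutin in mg/kg.
samples = [
    [200.0, 9.00, 0.01],
    [220.0, 9.50, 0.02],
    [400.0, 0.30, 0.03],
    [410.0, 0.20, 0.02],
    [600.0, 0.25, 0.22],
    [620.0, 0.30, 0.23],
]

def autoscale(rows):
    """Mean-centre each column and divide by its sample standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def covariance(rows):
    """Covariance matrix of already centred data (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]

def mat_vec(mat, vec):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]

def top_eigenpair(mat, iterations=1000):
    """Largest eigenvalue and its eigenvector for a symmetric matrix."""
    vec = [float(i + 1) for i in range(len(mat))]  # arbitrary non-zero start
    for _ in range(iterations):
        w = mat_vec(mat, vec)
        norm = math.sqrt(sum(x * x for x in w))
        vec = [x / norm for x in w]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(mat, vec)))
    return eigenvalue, vec

def pca(rows, n_components):
    """Return explained-variance ratios and loadings of the leading PCs."""
    mat = covariance(autoscale(rows))
    size = len(mat)
    total = sum(mat[i][i] for i in range(size))  # trace = total variance
    ratios, loadings = [], []
    for _ in range(n_components):
        lam, vec = top_eigenpair(mat)
        ratios.append(lam / total)
        loadings.append(vec)
        # Deflation: subtract the component just found.
        mat = [[mat[i][j] - lam * vec[i] * vec[j] for j in range(size)]
               for i in range(size)]
    return ratios, loadings

ratios, loadings = pca(samples, 3)
for k, ratio in enumerate(ratios, start=1):
    print(f"PC{k}: {100 * ratio:.1f}% of total variance")
```

With 19 variables, as in the study, the same procedure applies; the loadings returned for each component correspond to the per-compound contributions summarized in Table 2.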
Hierarchical clustering was used to show the phenolic and proline differences among honeys of different floral origins. The dendrograms of the hierarchical clustering and the heatmap are shown in Figure 3. The honey samples clustered clearly into two main groups. One group contained mainly rapeseed honeys and the other included buckwheat, sunflower, and Codonopsis honeys, indicating similarities between these three types of honey, which was consistent with the PCA results. All samples of the four honey types were arranged in homogeneous clusters, indicating that honeys of different floral origins may be distinguished on the basis of their phenolic profiles and proline content. The heatmap visualization of phenolic compounds revealed that the buckwheat honey samples grouped together, showing high concentrations of p-hydroxybenzoic acid, p-coumaric acid, and rutin. The rapeseed honey samples also grouped together, showing high concentrations of benzoic acid, kaempferol, naringenin, and pinocembrin. The sunflower and Codonopsis honey samples showed high concentrations of galangin, CAPE, and caffeic acid, and grouped closely. They were nevertheless positioned separately, with prominent levels of quercetin and chrysin in sunflower honeys and a relatively high level of gallic acid in Codonopsis honeys.

Discrimination of Honey Samples Based on Linear Discriminant Analysis (LDA)
LDA was used to categorize the honey samples. The grouping variable was the honey type (rapeseed, sunflower, buckwheat or Codonopsis) of the sixty-seven samples, and the independent variables were the 18 phenolic compounds and proline. Three statistically significant discriminant functions were formed: Wilks' lambda = 0, χ² = 462.611, df = 57, p < 0.05 for the first function; Wilks' lambda = 0.011, χ² = 244.139, df = 36, p < 0.05 for the second; and Wilks' lambda = 0.176, χ² = 94.630, df = 17, p < 0.05 for the third. The first discriminant function accounted for 73.8% of the total variance and the second for 19.8%; together they accounted for 93.6% of the total variance. Figure 4 shows the scatter plot of the honey samples defined by the discriminant functions. The rapeseed and buckwheat honeys were well differentiated from the sunflower and Codonopsis honeys. More specifically, the first discriminant function clearly separated buckwheat honey from all other honey types, while the second discriminant function clearly separated rapeseed honey from all other honey types. The overall correct classification rate was 98.5% using the original method and 94.0% using the cross-validation method. Correct classification of 100.0% was obtained for Codonopsis honey samples, followed by sunflower, rapeseed, and buckwheat honey samples (94.7%, 92.6% and 90.9%, respectively). These results are in agreement with those of Zhao et al., who indicated that phenolic compounds could be used for the determination of floral origin [11]. When phenolic compounds are used in conjunction with other physicochemical parameters, multivariate statistical analysis is expected to give more reliable predictions [6,8,24].
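The reported LDA figures can be cross-checked with a little arithmetic. The sketch below assumes the standard degrees-of-freedom formula for the sequential tests of discriminant functions, (p - k)(g - k - 1) after removing k functions, with p = 19 variables and g = 4 groups. It also infers the number of correctly classified samples per class from the group sizes given in the pollen analysis (27, 19, 11 and 10); these counts are an inference chosen to match the reported percentages, not values stated in the text.

```python
# Degrees of freedom for the sequential chi-square tests of the
# discriminant functions: (p - k) * (g - k - 1) after removing k functions.
n_variables = 19  # 18 phenolic compounds + proline
n_groups = 4      # rapeseed, sunflower, buckwheat, Codonopsis

dfs = [(n_variables - k) * (n_groups - k - 1) for k in range(n_groups - 1)]
print("df per test:", dfs)

# Group sizes come from the pollen analysis. The numbers of correctly
# classified samples are ASSUMED, chosen to match the reported percentages.
group_sizes = {"rapeseed": 27, "sunflower": 19, "buckwheat": 11, "Codonopsis": 10}
assumed_correct = {"rapeseed": 25, "sunflower": 18, "buckwheat": 10, "Codonopsis": 10}

# Per-class and overall correct classification rates, in percent.
rates = {name: round(100 * assumed_correct[name] / size, 1)
         for name, size in group_sizes.items()}
overall = round(100 * sum(assumed_correct.values()) / sum(group_sizes.values()), 1)

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"overall: {overall}%")
```

Under these assumed counts, 63 of 67 samples are classified correctly, which matches the 94.0% cross-validated rate; this suggests, but does not prove, that the per-class percentages refer to the cross-validation results.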

Honey Samples and Pollen Analysis
A total of seventy honey samples of four different floral types (rapeseed, sunflower, buckwheat and Codonopsis honeys) were collected from Hubei, Jiangsu, Sichuan, Inner Mongolia and Xinjiang, China (marked in Figure 5) in the flowering season of 2015. Honey samples were collected from beekeepers accompanied by one of our researchers and were stored in the dark at 4-5 °C until analyzed. Information about season, hive location and available floral sources was collected from the beekeepers to ensure the authenticity of the botanical origin. Detailed information about the honey samples is summarized in Supplementary Table S2. Moreover, the floral origins were confirmed by melissopalynological analysis [35]. Briefly, ten grams of each honey was dissolved in 20 mL of warm water (40 °C). The solution was centrifuged for 5 min at 4000 r/min, the supernatant was decanted, and the centrifugation step was repeated twice to remove excess water. The sediments were blended with glycerin. Two slides were prepared from each sample and photographed under a DM2500 light microscope (Leica, Heidelberg, Germany). Pollen types were identified by comparison with reference slides of pollen collected directly from the plants in the study and with reference images in Pollen and Apicultural Plants in the literature [36]. About 500 pollen grains were counted from each sample, and the percentage frequency of the four characteristic pollen types in the honey samples was calculated. Twenty-seven rapeseed honeys and eleven buckwheat honeys were classified as unifloral according to the melissopalynological analysis, with Brassica campestris L. pollen shares in the range from 74 to 96% and Fagopyrum esculentum pollen shares in the range from 66 to 96%, respectively; nineteen sunflower honey samples were unifloral, with Helianthus annuus L. pollen shares in the range from 48 to 80%; and ten Codonopsis honey samples were unifloral, with Codonopsis pilosula pollen shares in the range from 50 to 65% (Supplementary Table S2). Three honey samples (S17, C10 and C11), which the beekeepers had described as sunflower honey and Codonopsis honey, were classified as multifloral (characteristic pollen type <45%) [35,36] according to the pollen analysis and were excluded from the statistical analysis.
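The pollen-based decision described above can be sketched as a short calculation. This is only an illustration: the text states that samples with a characteristic pollen share below 45% were treated as multifloral, and the sketch assumes that everything at or above that cutoff counts as unifloral. The grain counts in the usage lines are hypothetical, and the function names are not from the original method.

```python
MULTIFLORAL_CUTOFF = 45.0  # percent; threshold quoted in the text

def pollen_share(characteristic_grains, total_grains):
    """Percentage frequency of the characteristic pollen type."""
    return 100.0 * characteristic_grains / total_grains

def classify(characteristic_grains, total_grains):
    """Label a sample from its characteristic pollen share.

    Assumption: shares at or above the cutoff are treated as unifloral.
    """
    share = pollen_share(characteristic_grains, total_grains)
    return "multifloral" if share < MULTIFLORAL_CUTOFF else "unifloral"

# Hypothetical counts out of roughly 500 grains per sample.
print(pollen_share(240, 500), classify(240, 500))
print(pollen_share(200, 500), classify(200, 500))
```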
L-Proline (Pro) was supplied by Aladdin (Shanghai, China). Methanol and acetonitrile (HPLC grade) were purchased from MREDA (MREDA Technology Inc., Columbia, TN, USA). Ultrapure water was produced using a Milli-Q system (Millipore, MA, USA). Stock solutions of single phenolic compounds (1-10 mg/mL) were prepared by dissolving appropriate amounts of phenolic standards in methanol. Mixed intermediate solutions were prepared by mixing appropriate stock solutions and diluting with methanol. A series of working standard solutions was prepared on the day of use by diluting directly to the required concentrations with pure water, based on the sensitivity of detection and the linearity range of the study.

Solid-Phase Extraction and HPLC-ESI-MS/MS Analysis
A solid-phase extraction (SPE) method was used to extract phenolic compounds from honey according to a previous method with slight modifications [15]. Honey samples (10.0 g) were mixed with 30 mL of acidulated water (pH = 2.0) and vortexed until completely blended. The solution was then centrifuged for 10 min at 10,000 rpm to remove impurities. Prior to extraction, the Oasis HLB cartridge column (Waters, Wexford, Ireland) was activated with methanol (10 mL) and acidulated water (pH 2.0, 10 mL). After the supernatant was loaded, the column was washed with distilled water and the phenolic compounds were eluted with methanol into a 10 mL flask. The solvent was evaporated to dryness at 40 °C and the residue was dissolved in 1.0 mL of 0.1% formic acid:acetonitrile (70:30, v/v). The sample was then mixed on a vortexer for 2 min and filtered through a filter membrane with 0.25 µm pore size. All measurements were performed in triplicate.
The extracts were analysed on an Agilent C18 column (100 mm × 2.1 mm, 2.7 µm; Agilent, Wilmington, DE, USA) within 50 min. The column temperature was maintained at 30 °C. The mobile phase consisted of 0.1% formic acid in water (solvent A) and methanol (solvent B), and the flow rate was 0.2 mL/min. The HPLC-MS conditions were as described in our previous study [15]. Typical total ion current (TIC) chromatograms of the standard phenolic compounds and of a buckwheat honey sample are given in Figure 6A,B. The MS parameters used to measure phenolic acids and flavonoids and the transitions from precursor to product ions are shown in Supplementary Table S3. The identification and quantification of phenolic compounds were performed with MassHunter software (Agilent Technologies). A detailed description of the calibration curves is presented in Supplementary Table S4.

HPLC Analysis of Proline
The extraction, derivatization and chromatographic separation of proline in the honey samples were performed as described by Li et al. [37]. Briefly, twenty milliliters of 0.1 M boric acid buffer was added to 1.0 g of honey, and the mixture was subjected to ultrasound treatment for approximately 10 min. After extraction, the solution was diluted to a final volume of 50 mL with borate buffer solution (pH = 8.0). The honey solution was filtered through a 0.25 µm nylon filter membrane and transferred to vials for refrigerated storage prior to derivatisation. All measurements were performed in triplicate.
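The sample preparation steps above (10.0 g of honey reconstituted in 1.0 mL for phenolics; 1.0 g of honey diluted to 50 mL for proline) imply a simple back-calculation from the concentration measured in the final solution to the content in honey. The sketch below illustrates that unit conversion only. It assumes complete recovery and no additional dilution during derivatisation, neither of which is stated in the text, and the concentrations in the usage lines are hypothetical.

```python
def content_mg_per_kg(solution_conc_ug_per_ml, final_volume_ml, sample_mass_g):
    """Convert a measured solution concentration to content in honey.

    ug/mL * mL / g = ug/g, which is numerically equal to mg/kg.
    Assumes complete recovery and no further dilution (illustrative only).
    """
    return solution_conc_ug_per_ml * final_volume_ml / sample_mass_g

# Hypothetical phenolic extract: 2.25 ug/mL in 1.0 mL from 10.0 g of honey.
print(content_mg_per_kg(2.25, 1.0, 10.0))

# Hypothetical proline solution: 8.0 ug/mL in 50 mL from 1.0 g of honey.
print(content_mg_per_kg(8.0, 50.0, 1.0))
```

The two preparations differ by a factor of 500 in effective dilution, which fits the very different concentration ranges of phenolic compounds (below roughly 10 mg/kg) and proline (hundreds of mg/kg) reported in the results.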

Data Statistical Analysis
Principal component analysis (PCA) and hierarchical clustering analysis (HCA) were performed using MetaboAnalyst 2.0 through the 'Statistical Analysis' tool [38]. Prior to statistical analysis, all data were normalized using 'Autoscaling' in the MetaboAnalyst program to prevent highly abundant components from dominating components present in much smaller quantities. Unsupervised hierarchical clustering, using Ward's method of agglomeration and Pearson distances to evaluate the similarity between samples, was used to provide an overview of the clustering of honeys of different floral origins. Means were compared by one-way analysis of variance (ANOVA). Linear discriminant analysis (LDA) was applied to explore the possibility of classifying honey samples according to floral origin. ANOVA and LDA were performed using SPSS 20.0 statistical software (International Business Machines Corporation, Armonk, NY, USA).
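As an illustration of the distance measure named above, the following sketch computes a Pearson distance between sample profiles using one common definition, d = 1 - r. Whether MetaboAnalyst uses exactly this form is not stated in the text, so it should be read as an assumption. The sketch covers only the distance step and not Ward's agglomeration, and the three autoscaled profiles are invented for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pearson_distance(x, y):
    """Assumed definition: 0 for identical shape, 2 for opposite shape."""
    return 1.0 - pearson_r(x, y)

def distance_matrix(profiles):
    """Pairwise Pearson distances between all sample profiles."""
    return [[pearson_distance(a, b) for b in profiles] for a in profiles]

# Invented autoscaled profiles (four variables per sample), for illustration.
profile_a = [1.2, -0.5, 0.3, -1.0]
profile_b = [1.0, -0.4, 0.5, -1.1]
profile_c = [-1.1, 0.6, -0.2, 0.9]

matrix = distance_matrix([profile_a, profile_b, profile_c])
for row in matrix:
    print([round(d, 3) for d in row])
```

Samples with similar compound patterns (here the first two profiles) receive a small distance regardless of absolute scale, which is why this measure is typically paired with prior autoscaling.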

Conclusions
This study characterized the phenolic and proline profiles of four types of unifloral honey (rapeseed, sunflower, buckwheat, and Codonopsis honey) collected from beekeepers in China. The proline and phenolic contents differed significantly among the four types of honey. Buckwheat honey contained the highest content of proline, p-coumaric acid, and p-hydroxybenzoic acid; rapeseed honey had the lowest content of proline and the highest content of benzoic acid. PCA and hierarchical clustering results showed that rapeseed honey could be separated from buckwheat, sunflower and Codonopsis honey by the selected phenolic compounds and proline. LDA results showed that phenolic compounds and proline, in combination with chemometrics, may differentiate the floral origin of the four Chinese honeys with a classification rate of 94.0%. In addition, determining the chemical composition of Chinese unifloral honeys could lead to the formation of a database that can be used to classify honeys from different sources.
Supplementary Materials: The following are available online: Table S1: Content of phenolic compounds and proline in the three multifloral honey samples (mg/kg), Table S2: Description of Sampling Regions and characteristic pollen type frequency for different types of honey in China, Table S3: The mass spectrum parameters applied for phenolic compounds, Table S4: Regression equation, r2, Linear range, limit of detection (LOD) and limit of quantification (LOQ) of phenolic compounds.