The Use of UV Spectroscopy and SIMCA for the Authentication of Indonesian Honeys According to Botanical, Entomological and Geographical Origins

As a functional food, honey is a food product that is exposed to the risk of food fraud. To mitigate this, the establishment of an authentication system for honey is very important in order to protect both producers and consumers from possible economic losses. This research presents a simple analytical method for the authentication and classification of Indonesian honeys according to their botanical, entomological, and geographical origins using ultraviolet (UV) spectroscopy and SIMCA (soft independent modeling of class analogy). The spectral data of a total of 1040 samples, representing six types of Indonesian honey of different botanical, entomological, and geographical origins, were acquired using a benchtop UV-visible spectrometer (190–400 nm). Three different pre-processing algorithms were simultaneously evaluated; namely an 11-point moving average smoothing, mean normalization, and Savitzky–Golay first derivative with 11 points and second-order polynomial fitting (ordo 2), in order to improve the original spectral data. Chemometrics methods, including exploratory analysis of PCA and SIMCA classification method, was used to classify the honey samples. A clear separation of the six different Indonesian honeys, based on botanical, entomological, and geographical origins, was obtained using PCA calculated from pre-processed spectra from 250–400 nm. The SIMCA classification method provided satisfactory results in classifying honey samples according to their botanical, entomological, and geographical origins and achieved 100% accuracy, sensitivity, and specificity. Several wavelengths were identified (266, 270, 280, 290, 300, 335, and 360 nm) as the most sensitive for discriminating between the different Indonesian honey samples.


Introduction
According to the Codex Alimentarius Commission [1], honey is defined as "the natural sweet substance produced by honey bees from the nectar of plants or from secretions of living parts of plants, which the bees collect, transform by combining with specific substances of their own, deposit, dehydrate, store and leave in the honey comb to ripen and mature". The main components of honey are carbohydrates (e.g., glucose and fructose in nearly 75% w/w), organic acids, amino acids, vitamins, volatile oils, and minerals [2][3][4]. Moreover, its minor components and appearance are highly affected by different sources of nectar (botanical origin), different types of honeybees, and geographical factors, such as beekeeping practices, climate, and storage conditions [5]. In terms of botanical origin, honey can be categorized into two broad types, namely monofloral and multifloral honey. Generally, due to limited production and availability, monofloral honey is more valued and consequently has a higher market price than multifloral honey [3,6]. Currently, Indonesian objective is to implement a simple, inexpensive, and chemical-free analytical method based on UV spectroscopy and SIMCA (soft independent modeling of class analogy) for the classification of Indonesian honey according to its botanical, entomological, and geographical origins.

Honey Samples from Different Botanical, Entomological, and Geographical Origins
Four types of honey collected by Apis dorsata bees were used: monofloral acacia (Acacia mangium), monofloral durian (Durio zibethinus), multifloral Muara Enim, and multifloral Jambi honey. Two types of honey collected by Apis mellifera bees were also used: monofloral longan (Euphorbia longan) and monofloral rubber tree (Hevea brasiliensis) honey. The samples were harvested during 2018 and 2019 from different origins in Indonesia (Table 1 and Figure 1). Apis mellifera honey from rubber tree and longan samples were harvested from a relatively homogeneous plantation of Hevea brasiliensis and Euphorbia longan in Batang, Central Java, Indonesia. All Apis dorsata honey samples were collected from a forest in Sumatra island, Indonesia. In Sumatra, there are five types of forest bioregion: protected forest, conservation forest, limited production forest, permanent production forest, and conversion forest. Monofloral acacia in Riau was harvested from a limited production forest bioregion with predominant vegetation of Acacia mangium, Acacia crassicarpa, and Eucalyptus sp. In Jambi and Muara Enim, multifloral honey samples were collected from the protected forest with various large plants such as Koompassia excelsa, Bouea macrophylla Griffith, Lansium parasiticum, and several small plants such as Imperata cylindrica. Monofloral durian honey from Jambi was harvested from a conversion forest. In this forest, farmers planted several fruit plants such as durian (Durio zibethinus), oil palm (Elaeis guineensis), and longan (Euphorbia longan), and woody plants such as Hevea brasiliensis and Tectona grandis L.f . A total of 2.5 kg of honey (non-filtered) was obtained for each type of honey and kept in plastic bottles at room temperature (20-25 • C) until analysis. The total number of samples of each type of honey was 120 for longan and rubber tree, and 200 for acacia, durian, Muara Enim, and Jambi; a total of 1040 honey samples were collected for analysis. These samples were randomly divided into three sample sets, namely calibration (524 samples), validation (344 samples), and prediction (172 samples).

UV Spectra Data Acquisition
The collected honey samples crystallized during storage in the laboratory. Prior to spectral measurement, honey samples were heated using a water bath at 60 • C for 30 min to liquefy the crystallized honey and obtain a homogenized honey sample, and were then kept at room temperature [32]. To take UV spectral measurements each honey sample was diluted 1:20 (mL:mL) using distilled water. A total of 2 mL of the diluted honey was pipetted into a 10 mm quartz cuvette. The UV spectra (190-400 nm) at 1 nm intervals of the honey samples were obtained using a low-cost benchtop UV-Vis spectrometer (Genesys™ 10S UV-Vis, Thermo Scientific, Waltham, MA, USA) in transmittance mode. The spectral acquisition was performed at room temperature. Three different pre-processing algorithms, namely 11 points of moving average smoothing (MAS), mean normalization (MN), and Savitzky-Golay first derivative with 11 points, and second-order polynomial fitting (ordo 2) (SG 1d) were simultaneously used in sequence to improve the obtained raw spectral data.
Griffith, Lansium parasiticum, and several small plants such as Imperata cylindrica. Monofloral durian honey from Jambi was harvested from a conversion forest. In this forest, farmers planted several fruit plants such as durian (Durio zibethinus), oil palm (Elaeis guineensis), and longan (Euphorbia longan), and woody plants such as Hevea brasiliensis and Tectona grandis L.f. A total of 2.5 kg of honey (non-filtered) was obtained for each type of honey and kept in plastic bottles at room temperature (20-25 °C) until analysis. The total number of samples of each type of honey was 120 for longan and rubber tree, and 200 for acacia, durian, Muara Enim, and Jambi; a total of 1040 honey samples were collected for analysis. These samples were randomly divided into three sample sets, namely calibration (524 samples), validation (344 samples), and prediction (172 samples).

Chemometrics Analysis
To explore the spectral data, PCA (principal component analysis) and SIMCA (soft independent modeling of class analogy) were used for chemometric analysis. PCA is an unsupervised pattern recognition technique, which provides results about similarities and differences between samples without knowing anything about them [33]. In general, PCA is used to decompose highly correlated and complex spectral data by projecting variables into simpler and fewer uncorrelated new variables called principal components (PCs). For each PC, the PCA calculated its score and loading. In most reported papers, the score plot of the first two PCs (PC1 × PC2) or the first three PCs (PC1 × PC2 × PC3) are used to visualize the cluster formation of the samples and investigate the possible occurrence of outliers. Moreover, the plot of loadings visualizes the contribution of PCs and determines important variables. Further explanation of the concepts and details of PCA have been described in previous reports [34,35].
SIMCA is a supervised pattern recognition technique and belongs to a family of modeling classifiers that distinguishes between members and non-members of different classes [36]. SIMCA enables samples to be classified into existing classes or groups, assigning new samples to one class, more than one class, or no class according to similarities with these classes. First, a PCA model was developed for each class in the calibration sample set. The developed SIMCA model was then validated using a leave-one-out cross-validation method for the validation sample set. The prediction sample set (unknown samples) was then compared to the class models and its membership assigned to classes according to two criteria: the distance from the model center (leverage) and the distance to the model (residual). The sample-to-model distance (Si) is a measure of how far the sample lies from the modeled class. It is computed as the square root of the sample residual variance. SIMCA results can be visualized in a Cooman's plot (Si vs. Si) and confusion matrix. A Cooman's plot typically has four quadrants (Q1-Q4) and each predicted sample is assigned to one of the four possible quadrants: Q1 is the upper left rectangle (for samples that belong to model class 1), Q2 is the lower right rectangle (for samples that belong to model class 2), Q3 is the lower-left rectangle (for samples that belong to both model classes 1 and 2), and Q4 is the upper-right rectangle (for samples that do not belong to either model class 1 or 2).
The sensitivity, specificity, and accuracy were calculated using the equations according to Xu et al. [37]. The Unscrambler X version 10.4 (64-bit) (Camo Software AS, Oslo, Norway) was used to perform spectra pre-processing, PCA, and SIMCA calculations. Figure 2 shows the averaged original or raw (a) and pre-processed spectra (b) of Indonesian honey of different botanical, entomological, and geographical origins. As it can be seen in Figure 2, in the raw UV spectra of 6 types of honey samples, large variations in the absorbance spectra were observed among different types of flora/botanical (monofloral versus multifloral), among the different geographical origins of the honey samples (Sumatra versus Java) as well as the different types of honeybees (Apis dorsata versus Apis mellifera). It was difficult to directly extract significant information from the raw spectra. For this reason, we improved the quality of the raw spectral data by applying spectral data pre-processing. Mean-normalization (MN) was performed as one of the spectral data pre-processing methods in this study. As was mentioned by Xing et al. [38], mean-normalization is one of the most classical normalization methods. It is equivalent to replacing the raw absorbance values by a profile centered on unity: only the relative absorbance values are used to describe the sample, and the information carried by their absolute levels is dropped. Savitzky-Golay first derivative with a second-order polynomial and a window size of 11 points (SG 1d) was used to cancel the baseline drifts and to enhance small spectral differences [39]. Due to similarity in honeybees (entomological), and geographical and botanical origins, especially for Apis dorsata multifloral from Jambi and Apis dorsata monofloral from Jambi, it was expected that the spectral differences within those honey samples were small. This is the main reason to use SG 1d-to enhance those small spectral differences. However, at the same time, as a consequence of derivation, the noises were also enhanced. To avoid this, the spectra were first smoothed using 11 points of moving averaging smoothing pre-processing (MAS) as recommended by previous work [39]. Therefore, in this present study, we utilized three sequential spectral data pre-processing methods: MAS, MN, and SG 1d (MAS + MN + SG 1d). A similar approach was previously used by Zhang et al. [39] and Shawky and Selim [40]. Overall, the shape of the spectral curves was quite similar, especially between 250-400 nm, with sharp differences in peak absorbance intensity at 270 and 300 nm. The two distinctive peaks observed at around 270 nm and 300 nm are associated with the absorbance of benzoic, salicylic, and aryl-alyphatic acids in honey [41]. These spectral results are consistent with previously reported work. Previously, UV-Vis absorption spectra of sixteen bulk Tuscany honey samples were reported, including acacia, clover, etc. [42]. Distinct peak absorbance intensities were observed around 270-280 nm depending on the type of honey. Minor peaks were also observed between 300-335 nm. The typical feature of original or raw UV spectral data is high noise with a very high absorbance (more than 2) especially in the interval of 190-250 nm (high-frequency noise). This raw spectral data is rich in unrelated information such as background information and systematic noise coming from the influences of light scattering, differences in path length, sample particle size, low lamp intensity at the start of spectral acquisition, and other factors [38]. Therefore, to achieve an acceptable result, in this present study for further chemometrics calculation we utilized relatively low noise spectral data using pre-processed spectral data in the interval of 250-400 nm. Figure 3 shows the results of PCA analysis in a two-dimensional score plot of the first two PCs (PC1 × PC2) in the original UV spectra (a) and pre-processed UV spectra of the honey samples (b). PCA was calculated using 1040 honey samples (including all spectra) from both the original and pre-processed spectral data (250-400 nm). The cumulative informative variance (CIV) for the two PCs was 98% and 97% for original and pre-processed spectra, respectively. This indicates that most of the variance in the original dataset was contained in these two principal components. While both the original and pre-processed spectra could be used to separate the honey samples, when using the original spectra some overlapping samples were observed between the clusters, especially in the PC1 direction along the x-axis. For example, the durian cluster had a very similar PC1 to that of the Jambi cluster. Clearer separation was obtained using the pre-processed spectra. Even so, the durian and Jambi honey clusters were still very close to each other. Both these honeys were harvested from the same geographical origin (Jambi), as shown in Table 1. In the PC1 direction, which accounted for 93% of the explained variance, all clusters could be discriminated with no overlapping samples being observed. It is evident that rubber tree, longan, durian, Jambi, Muara Enim, and acacia honey could be correctly classified. Therefore, further chemometrics analysis of SIMCA was performed using the pre-processed spectra (250-400 nm) to classify the Indonesian honey samples according to their botanical, entomological, and geographical origins.  Figure 3 shows the results of PCA analysis in a two-dimensional score plot of the first two PCs (PC1 × PC2) in the original UV spectra (a) and pre-processed UV spectra of the honey samples (b). PCA was calculated using 1040 honey samples (including all spectra) from both the original and pre-processed spectral data (250-400 nm). The cumulative informative variance (CIV) for the two PCs was 98% and 97% for original and pre-processed spectra, respectively. This indicates that most of the variance in the original dataset was contained in these two principal components. While both the original and pre-processed spectra could be used to separate the honey samples, when using the original spectra some overlapping samples were observed between the clusters, especially in the PC1 direction along the x-axis. For example, the durian cluster had a very similar PC1 to that of the Jambi cluster. Clearer separation was obtained using the pre-processed spectra. Even so, the durian and Jambi honey clusters were still very close to each other. Both these honeys were harvested from the same geographical origin (Jambi), as shown in Table 1. In the PC1 direction, which accounted for 93% of the explained variance, all clusters could be discriminated with no overlapping samples being observed. It is evident that rubber tree, longan, durian, Jambi, Muara Enim, and acacia honey could be correctly classified. Therefore, further chemometrics analysis of SIMCA was performed using the pre-processed spectra (250-400 nm) to classify the Indonesian honey samples according to their botanical, entomological, and geographical origins.   Figure 3 shows the results of PCA analysis in a two-dimensional score plot of the first two PCs (PC1 × PC2) in the original UV spectra (a) and pre-processed UV spectra of the honey samples (b). PCA was calculated using 1040 honey samples (including all spectra) from both the original and pre-processed spectral data (250-400 nm). The cumulative informative variance (CIV) for the two PCs was 98% and 97% for original and pre-processed spectra, respectively. This indicates that most of the variance in the original dataset was contained in these two principal components. While both the original and pre-processed spectra could be used to separate the honey samples, when using the original spectra some overlapping samples were observed between the clusters, especially in the PC1 direction along the x-axis. For example, the durian cluster had a very similar PC1 to that of the Jambi cluster. Clearer separation was obtained using the pre-processed spectra. Even so, the durian and Jambi honey clusters were still very close to each other. Both these honeys were harvested from the same geographical origin (Jambi), as shown in Table 1. In the PC1 direction, which accounted for 93% of the explained variance, all clusters could be discriminated with no overlapping samples being observed. It is evident that rubber tree, longan, durian, Jambi, Muara Enim, and acacia honey could be correctly classified. Therefore, further chemometrics analysis of SIMCA was performed using the pre-processed spectra (250-400 nm) to classify the Indonesian honey samples according to their botanical, entomological, and geographical origins.  To detect the possible occurrence of outliers, a Hotelling's T 2 versus Q-residual from the PCA calculation at the 95% probability level was plotted (Figure 4) using the third principal component (PC3). Hotelling's T 2 is a common approach to determine the significance of multivariate distances, while Q-residual represents how well samples are in accordance with the model. Samples with both a high Q-residual and Hotelling's T 2 are more likely to be outliers. According to the Hotelling's T 2 and Q residuals tests, three Muara Enim samples were identified as outliers and removed before further analysis.

PCA Analysis
To detect the possible occurrence of outliers, a Hotelling's T 2 versus Q-residual from the PCA calculation at the 95% probability level was plotted (Figure 4) using the third principal component (PC3). Hotelling's T 2 is a common approach to determine the significance of multivariate distances, while Q-residual represents how well samples are in accordance with the model. Samples with both a high Q-residual and Hotelling's T 2 are more likely to be outliers. According to the Hotelling's T 2 and Q residuals tests, three Muara Enim samples were identified as outliers and removed before further analysis. An x-loadings plot of the first three PCs of the pre-processed spectra was used to evaluate the most influential variables to classify honey samples ( Figure 5). The x-loadings plot of the PC1, which accounts for 93% of the variance, has a positive peak at 266 nm and a negative peak at 300 nm. The PC2 and PC3, which account for 7% of the total variance, had a negative peak at 280 nm and a positive peak at 290 nm.

Supervised Classification of SIMCA
A SIMCA model for each class was created using calibration and validation samples as shown in Table 2 (total 868 samples for 6 classes). The optimum number of principal components (PCs) used for each class was determined by using a leave-one-out crossvalidation method. As seen in Table 2, a SIMCA model for each class was constructed An x-loadings plot of the first three PCs of the pre-processed spectra was used to evaluate the most influential variables to classify honey samples ( Figure 5). The x-loadings plot of the PC1, which accounts for 93% of the variance, has a positive peak at 266 nm and a negative peak at 300 nm. The PC2 and PC3, which account for 7% of the total variance, had a negative peak at 280 nm and a positive peak at 290 nm.
the PCA calculation at the 95% probability level was plotted (Figure 4) using the third principal component (PC3). Hotelling's T 2 is a common approach to determine the significance of multivariate distances, while Q-residual represents how well samples are in accordance with the model. Samples with both a high Q-residual and Hotelling's T 2 are more likely to be outliers. According to the Hotelling's T 2 and Q residuals tests, three Muara Enim samples were identified as outliers and removed before further analysis. An x-loadings plot of the first three PCs of the pre-processed spectra was used to evaluate the most influential variables to classify honey samples ( Figure 5). The x-loadings plot of the PC1, which accounts for 93% of the variance, has a positive peak at 266 nm and a negative peak at 300 nm. The PC2 and PC3, which account for 7% of the total variance, had a negative peak at 280 nm and a positive peak at 290 nm.

Supervised Classification of SIMCA
A SIMCA model for each class was created using calibration and validation samples as shown in Table 2 (total 868 samples for 6 classes). The optimum number of principal components (PCs) used for each class was determined by using a leave-one-out crossvalidation method. As seen in Table 2, a SIMCA model for each class was constructed

Supervised Classification of SIMCA
A SIMCA model for each class was created using calibration and validation samples as shown in Table 2 (total 868 samples for 6 classes). The optimum number of principal components (PCs) used for each class was determined by using a leave-one-out crossvalidation method. As seen in Table 2, a SIMCA model for each class was constructed with a different number of optimum PCs. Three PCs were used to construct class rubber tree, longan, Jambi and acacia SIMCA models with the obtained CIV in calibrations of 99.489, 99.520, 99.113, and 99.059%, respectively. Two PCs were used to develop class durian and Muara Enim SIMCA models with the obtained CIV in calibration of 98.054 and 98.667%, respectively. To evaluate the classification performance of the analyses investigated between all the different honey samples collected in terms of botanical, entomological, and geographical origins, we considered six classes of honey: rubber tree, longan, durian, Jambi, Muara Enim, and acacia. The developed SIMCA models were used to verify the membership of the prediction sample sets into six available classes. The prediction results are presented in Table 3. According to Xu et al. [37], the accuracy, sensitivity, and specificity were 100% for all classes. Table 3. Confusion matrix of for SIMCA prediction result using prediction sample sets. The model distances for the six classes obtained by the SIMCA classification are presented in Table 4. These values are usually utilized to estimate the distance between the models to quantify possible differences between the models. It should be noted that a model distance and discrimination power (dp) larger than 3 indicates good class separation and that the models are significantly different with a low risk of misclassification in the model [43][44][45]. As can be seen in Table 4, model distances were larger than 3 for all classes, indicating that the developed SIMCA models were significantly different between the six honey types collected. To evaluate the most influential wavelength for discriminating between the models, discrimination power values of the SIMCA models were also plotted against wavelength (Figure 6). For the spectra data from 250-400 nm, the discrimination power was greater than 3. This confirms the spectral window selected (250-400 nm) was appropriate for classifying the different types of Indonesian honey collected. Specific wavelengths had a very high discrimination power: at 270 nm (dp = 120), 280 nm (dp = 200), 300 nm (dp = 175), 335 nm (dp = 275), and 360 nm (dp = 200). These wavelengths are extremely important in the discrimination of the types of Indonesian honey collected.  Figure 6. The discrimination power plot of wavelengths obtained by prediction sample sets using pre-processed spectra over the range of 250-400 nm (the black dashed line at discrimination power = 3 is a threshold for selecting important wavelengths). SIMCA prediction results are also presented in the form of a Cooman's plot ( Figure  7). There are six pairwise classes to be tested. The Cooman's plots for these are shown in Figure 7: acacia versus Jambi (a), acacia versus longan (b), acacia versus durian (c), acacia versus rubber tree (d), Jambi versus Muara Enim (e), and longan versus durian (f). The red dashed lines in Cooman's plot are membership lines with a 95% confidence limit that divide the membership area into four quadrants. In Figure 7a, the x-axis and y-axis denote the distance to acacia and Jambi samples. All acacia samples appeared in the upper left quadrant and were properly classified into the acacia class (33 samples). The Jambi samples were located in the lower right quadrant (32 samples), which indicates correct classification into the Jambi class, with only one Jambi sample falling in the upper right quadrant (belong to neither class). However, no samples were plotted in the lower left quadrant (belongs to both classes). Rubber tree (20 samples), durian (33 samples), Muara Enim (33 samples), and longan (20 samples) were successfully rejected as belonging either to the acacia or Jambi classes and were plotted in the upper right quadrant. Similar results were observed for other pairwise classes, as can be seen in Figure 7b-f. This demonstrates the capability of the developed SIMCA model to classify Indonesian honeys according to their botanical, entomological, and geographical origins. SIMCA prediction results are also presented in the form of a Cooman's plot (Figure 7). There are six pairwise classes to be tested. The Cooman's plots for these are shown in Figure 7: acacia versus Jambi (a), acacia versus longan (b), acacia versus durian (c), acacia versus rubber tree (d), Jambi versus Muara Enim (e), and longan versus durian (f). The red dashed lines in Cooman's plot are membership lines with a 95% confidence limit that divide the membership area into four quadrants. In Figure 7a, the x-axis and y-axis denote the distance to acacia and Jambi samples. All acacia samples appeared in the upper left quadrant and were properly classified into the acacia class (33 samples). The Jambi samples were located in the lower right quadrant (32 samples), which indicates correct classification into the Jambi class, with only one Jambi sample falling in the upper right quadrant (belong to neither class). However, no samples were plotted in the lower left quadrant (belongs to both classes). Rubber tree (20 samples), durian (33 samples), Muara Enim (33 samples), and longan (20 samples) were successfully rejected as belonging either to the acacia or Jambi classes and were plotted in the upper right quadrant. Similar results were observed for other pairwise classes, as can be seen in Figure 7b-f. This demonstrates the capability of the developed SIMCA model to classify Indonesian honeys according to their botanical, entomological, and geographical origins.

Conclusions
In this study, PCA and SIMCA were performed on UV spectral data from Indonesian honey samples of different botanical, entomological, and geographical origins. The subsequent evaluation of these models demonstrated that this UV spectroscopy along with chemometrics can be used as a simple, chemical-free (no toxic waste), and low-cost analytical method for the authentication of Indonesian honeys from differing botanical, entomological, and geographical origins. PCA of the spectra, together with SIMCA, confirmed In this research, a satisfactory classification of the different types of Indonesian honey collected from different botanical, entomological, and geographical origins was achieved using pre-processed UV spectral data (250-400 nm). Supervised classification using SIMCA reported an accuracy, sensitivity, and specificity of 100%. Several specific wavelengths were identified for accurate classification of Indonesian honey types, including 266, 270, 280, 290, 300, 335, and 360 nm. It should be noted that the obtained spectra were consistent with the results reported in previous studies. A spectral window between 200-400 nm provided the means to accurately classify and distinguish between monofloral genuine Sidr honey from different geographical origins, the season of harvest, and non-Sidr honey. Using HCA and PCA, obvious grouping patterns between the genuine Sidr and non-Sidr honeys were achieved [26]. However, there was no further explanation on specific wavelengths that were responsible for the discrimination. Using fluorescence spectroscopy, an excitation wavelength with a spectral window between 260-290 nm had strong emission spectra between 330-360 nm. Fluorescence spectra in this region originate from the aromatic amino acids and are commonly used for the determination of the botanical origin of honey [46,47]. The excitation at 360 nm was associated with the fluorescent Maillard reaction products present in honey, such as furosine and hydroxymethylfurfural, and exhibited emission peaks at 440 and 425 nm [47]. In another study, excitation at 280 nm was used to classify raw Ethiopian honeys using front-face fluorescence spectroscopy [48]. Recently, excitations at 270-290 nm and 320-340 nm were also reported to be appropriate for detecting emission spectra of minor components in honey such as amino acids (mainly tryptophan), proteins, and some phenolic acids [42].

Conclusions
In this study, PCA and SIMCA were performed on UV spectral data from Indonesian honey samples of different botanical, entomological, and geographical origins. The subsequent evaluation of these models demonstrated that this UV spectroscopy along with chemometrics can be used as a simple, chemical-free (no toxic waste), and low-cost analytical method for the authentication of Indonesian honeys from differing botanical, entomological, and geographical origins. PCA of the spectra, together with SIMCA, confirmed that the different honey types could be distinguished based on specific, sensitive wavelengths within the 200 to 400 nm range, and potential outliers were detected. The prediction performance of these developed models had an accuracy, sensitivity, and specificity of 100% for all models. Several specific wavelengths were identified as being highly sensitive and specific fingerprints of the Indonesian honey types were investigated. These wavelengths are closely associated with the optical properties of important chemical components in these honeys. The present study provides a foundation for developing robust models for distinguishing between the botanical, entomological, and geographical origins of Indonesian honey types, as well as between different harvest years.