Essential oils (EOs) and their volatile fraction have been known since ancient times to have broad applications in prevention and therapy for human health [1
]. Undiluted EOs are sold at a high price on the international aromatherapy, perfume, and cosmetic markets [2
EOs from the Citrus genus are the most popular EOs and account for the largest proportion of natural flavors and fragrances [4
]. Some orange varieties, such as bitter orange (Citrus aurantium), are grown primarily for their peel and for the industrial production of the associated essential oil for citrus flavor applications [5
]. As an example, EOs from the dried peel of unripe bitter orange fruits flavor drinks and liquors, like Curaçao, Cointreau, and Triple Sec [2
]. In the food industry, there is strong interest in their antimicrobial activity linked to the main constituents of their volatile fraction [6
]. Fruit ripening has the most important impact on the development of flavor and other chemical quality attributes, such as nutritional and biochemical compositions, in citrus fruits. Many researchers have pointed out the importance of ripening on quality attributes of citrus peels and their products such as essential oils [6
]. Owing to the commercial importance of bitter orange EO, its characterization and analysis have been extensively developed [10]. For quality assessment, two main types of methods can be used: subjective (sensorial analysis) and objective (analytical methods). Sensorial analysis has several drawbacks, such as the need for a group of trained panelists, which can be problematic for routine analysis [11] because panelists can suffer adaptation or fatigue [12]. Sensorial analysis is also subject to large sources of variation, has low throughput [12], and can be costly [13
]. For this reason, alternative analytical techniques such as separation techniques based on chromatography in tandem with mass spectrometry could be very beneficial; they are non-subjective, highly repeatable, and reproducible. Additionally, the possibility to identify the compounds from mass spectral data is a major factor in favor of this technique [14
]. Several studies have already investigated the composition of the volatile compounds of bitter orange EO by this technique. They were conducted to determine chemical families present in the EO [15
], authenticity [16
], olfactive properties [17
] or the effects of several factors including geographical location [19
], season [21
and variety, on composition [8
]. While most research focuses on direct injection of the EO, some authors propose sampling the headspace instead, which targets the more volatile fraction [23
In recent years, there has been an increasing number of studies applying metabolomics techniques to foods. This approach has been named ‘Foodomics’ [24
]. In this strategy, the results of the chromatographic analysis are in many cases further analyzed using chemometrics or machine learning techniques. However, alternative technologies, such as direct injection mass spectrometry and proton transfer reaction time-of-flight mass spectrometry (PTR-TOF-MS) combined with chemometrics methods have recently been used for prediction of sensory profiles [25
]. For our purpose, the underlying rationale is that EOs can be characterized by a composition fingerprint. While this chromatographic fingerprint [24
] will have a certain degree of normal variability, anomalous departures from the normal cluster may indicate deliberate or accidental adulteration. In fact, there is a rich literature on the use of analytical techniques in conjunction with chemometrics for fraud detection or quality determination [19
]. For further information, the reader is referred to the review by Cubero-Leon et al. [31
] and the references therein. In the case of citrus essential oils, Parastar et al. used GC-MS fingerprinting in combination with principal component analysis (PCA) and k-Nearest Neighbor classifier (k-NN) to determine from which citrus species (sweet oranges, bitter oranges, lemons, or bergamots) the EO was produced. Additionally, they used counter propagation artificial neural networks (CP-ANN) to determine the chemotypes responsible for this differentiation [32
A few studies have reported the evolution of the chemical composition of the EO in bitter oranges during the ripening stage [8
]. They showed that the composition and other properties of the bitter orange EO are subject to important alterations during ripening. The variation levels (%) of chemical classes (monoterpene hydrocarbons, oxygenated monoterpenes, sesquiterpene hydrocarbons, and oxygenated sesquiterpenes) of bitter orange EO do not evolve linearly with the ripening stages [8
], and lead to drastic variations in EO quality [33
]. For this reason, it becomes important to ensure that the EO has been produced with fruits collected at the optimum ripening point. At this point, we should remark that the optimum ripening point may depend on the desired application of the EO; the optimum ripening condition being different for uses such as antioxidant, antifungal, antimicrobial, anti-inflammatory, antiparasitic, green solvent, etc. [2
The aim of this work is to explore the possibility of determining the ripening stage of the citrus fruit from the headspace of its EO using objective methods, avoiding the use of costly human panels. We propose to use untargeted chemical profiling by headspace gas chromatography–mass spectrometry (HS-GC-MS) combined with non-linear classifiers to classify the ripening stage into four classes. Additionally, we use a feature selection method to determine the most relevant volatile organic compounds (VOCs) for this differentiation.
3. Results and Discussion
After applying an appropriate threshold to the data, 22 peaks were consistently detected across all chromatograms (see Table 1
). The heights of the detected peaks of bitter orange EOs were used as the feature vector for the classification.
Principal component analysis (PCA) was applied to the feature vector (all 22 peak heights) to explore the data distribution and to estimate the intrinsic dimensionality of the dataset, since we expected a high correlation among the peak heights. The projection of the data onto the first principal components (PCs) did not show a distinct separation among the fruit-ripening stages (Figure 3). This indicates that separating the classes is not trivial. The first three PCs explained 90% of the total variance. Previous research on changes in peel EO composition during ripening is consistent with our results [8]. In other words, the compounds do not share a common trend during ripening: some increase, some decrease, and others vary non-monotonically.
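As an illustration of this exploratory step, the following minimal sketch computes the PCA scores and the fraction of variance explained by each PC. It is not the authors' implementation: the data matrix is synthetic (60 hypothetical chromatograms by 22 peaks, generated from three latent factors to mimic correlated peak heights), and autoscaling of the peak heights is an assumption, since the preprocessing is not stated here.

```python
import numpy as np

def pca_explained_variance(X):
    """Return PC scores and the fraction of variance explained by each PC."""
    # Autoscale each peak (mean-center, unit variance). This is an assumption.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the scaled matrix; the rows of Vt are the PC loadings.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U * s
    explained = s**2 / np.sum(s**2)
    return scores, explained

# Synthetic stand-in for the peak-height matrix (NOT the study data):
# 3 latent factors mixed into 22 strongly correlated peaks plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 3))
mixing = rng.normal(size=(3, 22))
X = latent @ mixing + 0.1 * rng.normal(size=(60, 22))

scores, explained = pca_explained_variance(X)
print("Cumulative variance of the first 3 PCs:", explained[:3].sum())
```

A high cumulative variance in the first few PCs only indicates that the peaks are correlated. As in the study, it does not by itself guarantee that the classes separate in the score plot.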
Regarding the training algorithm, resilient backpropagation (RP) outperformed Levenberg–Marquardt (LM) for the classification of ripening stages. The hyperbolic tangent sigmoid and logistic sigmoid activation functions showed no significant difference in the optimization criterion. The best ANN architecture in internal cross-validation was 22-4-13-4, that is, 4 and 13 neurons in the first and second hidden layers, respectively. With all 22 peak intensities as input, the overall accuracy of the model was 70 ± 6%.
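To make the 22-4-13-4 architecture concrete, the sketch below builds a network with these layer sizes and runs a forward pass. It only illustrates the structure: the weights are random and untrained, the training procedure is not reproduced, and the softmax output layer is an assumption introduced for the example.

```python
import numpy as np

LAYER_SIZES = [22, 4, 13, 4]  # inputs, hidden layer 1, hidden layer 2, classes

def init_params(sizes, seed=0):
    """Random (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass: tanh hidden layers, softmax output (assumed)."""
    a = x
    for W, b in params[:-1]:
        a = np.tanh(a @ W + b)
    W, b = params[-1]
    z = a @ W + b
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

params = init_params(LAYER_SIZES)
n_weights = sum(W.size + b.size for W, b in params)
probs = forward(params, np.zeros((5, 22)))  # 5 dummy input vectors
print("Trainable parameters:", n_weights)
```

The parameter count is (22·4 + 4) + (4·13 + 13) + (13·4 + 4) = 213, which is small relative to typical ANN models and helps explain why reducing the number of inputs can matter for a dataset of limited size.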
The response of the optimum ANN to perturbations made to individual predictor variables (peaks) while locking in all other parameters to their mean value is illustrated in Figure 4
. The higher the response (sensitivity) is, the more effective the peak is for predicting each class. Using a sensitivity analysis, the peaks were ranked based on their relevance to predicting each class in Table 2
. Peaks number 1, 11, and 19 are the most effective ones for predicting ripening classes. As detailed in the Materials and Methods Section, feature selection was carried out based on the rankings provided by sensitivity analysis combined with prediction accuracy in internal validation based on random subsampling. The classification accuracy in cross-validation is shown in Figure 5
. As can be seen, step viii, which includes 15 peaks (Table 3), showed the highest correct classification rate (CCR). It was therefore selected as the best set of inputs, including peaks No. 1 (ND), 3 (α-pinene), 4 (β-pinene), 8 (ocimene), 9 (Cyclopropane,1,2-dibutyl-), 14 (ND), 17 (linalyl butyrate), 18 (α-terpineol), 19 (3-carene), and 20 (nerol). The lower sensitivity of the other compounds means that changes in them do not contribute significantly to the output prediction, so they may be disregarded. Such pruning reduces network complexity and assists in network interpretation [51
]. The fifteen peaks selected as the most effective for modeling the output classes are shown in Figure 1.
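The one-at-a-time sensitivity analysis described above can be sketched as follows. This is a simplified illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the trained ANN, the perturbation sweeps each input over its observed range, and sensitivity is measured as the range of each output. The exact perturbation scheme and sensitivity measure used in the study may differ.

```python
import numpy as np

def sensitivity(model, X, n_steps=11):
    """Range of each model output when one input sweeps its observed range
    while all other inputs are held at their mean values."""
    mean = X.mean(axis=0)
    n_features = X.shape[1]
    n_outputs = model(mean[None, :]).shape[1]
    sens = np.zeros((n_features, n_outputs))
    for j in range(n_features):
        grid = np.tile(mean, (n_steps, 1))
        grid[:, j] = np.linspace(X[:, j].min(), X[:, j].max(), n_steps)
        out = model(grid)
        sens[j] = out.max(axis=0) - out.min(axis=0)
    return sens

def toy_model(X):
    """Hypothetical model with two outputs; only inputs 0 and 2 matter."""
    z = np.stack([3.0 * X[:, 0], -2.0 * X[:, 2]], axis=1)
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))          # synthetic inputs, not study data
sens = sensitivity(toy_model, X)
ranking = np.argsort(-sens.max(axis=1))  # most influential inputs first
print("Inputs ranked by sensitivity:", ranking)
```

Ranking inputs this way and then retraining on progressively smaller subsets, keeping the subset with the best internal-validation accuracy, corresponds to the feature selection idea described in the text.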
The input layer of the final ANNs consisted of the heights of the 15 peaks selected by sensitivity analysis (SA). The output layer specified the four classes of ripening stages. Figure 6
shows the performance of the constructed ANNs on the external validation data. For each random subsampling (RS) iteration, the external validation data consisted of 20 samples with three replicates each, stratified per class.
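A stratified random subsampling split of this kind can be sketched as follows. The dataset size (15 samples per class), the sample identifiers, and the month labels used as class names are hypothetical. Five validation samples per class is inferred from 20 stratified samples over four classes. Splitting by sample ID keeps all replicates of a sample on the same side of the split.

```python
import random
from collections import defaultdict

def stratified_split(sample_classes, n_val_per_class, rng):
    """Split sample IDs into training and validation lists, drawing the same
    number of validation samples from every class."""
    by_class = defaultdict(list)
    for sample_id, cls in sample_classes.items():
        by_class[cls].append(sample_id)
    val = []
    for cls in sorted(by_class):
        val.extend(rng.sample(sorted(by_class[cls]), n_val_per_class))
    val_set = set(val)
    train = [s for s in sample_classes if s not in val_set]
    return train, val

# Hypothetical dataset: 4 ripening classes x 15 samples (sizes are made up).
classes = ["Sep", "Oct", "Nov", "Dec"]
sample_classes = {f"{c}-{i:02d}": c for c in classes for i in range(15)}

rng = random.Random(0)
train, val = stratified_split(sample_classes, n_val_per_class=5, rng=rng)
print(len(train), "training samples,", len(val), "validation samples")
```

Repeating the split with different seeds and averaging the validation accuracy gives a mean and spread comparable in form to the accuracies reported here.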
Despite the complexity of the dataset, the overall accuracy of the model was 82 ± 1%, which is 12 percentage points higher than the 70 ± 6% average accuracy of the models that use all peaks as inputs. Feature selection therefore not only reduced the complexity of the models but also increased their accuracy. According to the confusion matrix, the mean classification accuracy of the ANN models for the first, second, third, and fourth classes was 80 ± 7%, 86 ± 3%, 84 ± 2%, and 76 ± 4%, respectively. The confusion matrix (see Figure 6
) shows that the model could classify the ripening stages of bitter orange EOs based on features extracted from the chromatograms obtained by HS-GC-MS analysis. As shown in the confusion matrix, the second and third stages were easier to classify. September samples were partially classified as October samples (12% of cases), and December samples were partially classified as November samples. The errors thus appear only between consecutive months, at the beginning and at the end of the ripening process. This may indicate a sigmoidal time evolution of the ripening process, so that the most important changes occur in the central months, while the evolution is slower at the beginning and at the end. In fact, it is known that the evolution of volatile production can be strongly non-linear during the ripening process. For example, the intensity of β-pinene slightly decreases at first and then increases, while myrcene increases in the initial stages of ripening but later decreases, so that its amount in the initial and final stages of ripening is similar [8
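For readers less familiar with confusion matrices, the following sketch shows how per-class and overall accuracy are derived from one. The labels and predictions are invented for illustration (with errors placed only between adjacent months, as in the pattern described above) and are not the study's results.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {lab: i for i, lab in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        cm[index[t]][index[p]] += 1
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(cm)]

labels = ["Sep", "Oct", "Nov", "Dec"]
# Made-up example: one Sep sample predicted as Oct, one Dec sample as Nov.
y_true = ["Sep"] * 5 + ["Oct"] * 5 + ["Nov"] * 5 + ["Dec"] * 5
y_pred = (["Sep"] * 4 + ["Oct"] + ["Oct"] * 5 + ["Nov"] * 5
          + ["Nov"] + ["Dec"] * 4)

cm = confusion_matrix(y_true, y_pred, labels)
acc = per_class_accuracy(cm)
overall = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
print(cm, acc, overall)
```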
To confirm the statistical significance of the obtained results, a permutation test was used. The histograms of the two sets of classification results (with real and permuted labels) are shown in Figure 7
. The Mann–Whitney U test [50
] was used as a nonparametric test for equality of the population medians of the two samples. The test rejects the null hypothesis at the 1% significance level, so the results obtained with real labels differ significantly from those obtained with permuted labels.
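The comparison between the two sets of results can be illustrated with a small self-contained Mann-Whitney U test. This sketch uses the normal approximation without tie correction of the variance, which may differ from the exact procedure of the statistical software used in the study. The two samples of accuracies are simulated: the real-label values are drawn around the reported 82%, and the permuted-label values around 25%, the chance level that would be expected for four balanced classes (an assumption).

```python
import math
import random

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, average ranks
    for ties, no tie correction of the variance). Returns (U_a, p_value)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_a - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return u_a, p_value

# Simulated accuracies (NOT the study data).
rng = random.Random(0)
real = [rng.gauss(0.82, 0.01) for _ in range(50)]
permuted = [rng.gauss(0.25, 0.05) for _ in range(50)]

u, p = mann_whitney_u(real, permuted)
print("U =", u, "p =", p, "reject at 1%:", p < 0.01)
```

In an actual permutation test, each permuted-label accuracy would come from retraining and validating the classifier after shuffling the class labels, which is not shown here.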
Thus, the ripening stages of bitter orange EOs were successfully classified based on chromatogram features obtained from HS-GC-MS analyses. This could help differentiate EOs from different ripening stages for their distinct applications. On the basis of sensitivity analysis, the peak heights of 15 compounds were found to be effective for this purpose. These compounds are among those previously reported to play an important role in characterizing bitter orange EO [18
]. However, limonene and myrcene, reported as the major compounds of bitter orange EO [7
], were not among the effective peaks for the classification. In fact, Ahmed et al. [52
] reported that both compounds have a higher odor threshold than, for example, α-pinene. From this point of view, the peaks selected by our analysis appear to follow the perceptual impact of the different compounds. That is, the procedure tends to select compounds that are relevant from a perceptual point of view.
Overall, the developed approach can be an effective step toward an appropriate alternative to subjective methods of quality assessment of EOs. This highly repeatable and reproducible method can overcome the disadvantages of sensory evaluation mentioned above, such as its susceptibility to large sources of variation and its time-consuming nature. Unlike sensory evaluation, it does not require a trained panel. The combination of headspace chromatography and machine learning for classification increases both the speed and the efficiency of the proposed quality assessment. Moreover, the need for an expert to design and manage the sensory evaluation and to interpret the obtained data is eliminated. As for the limitations of the current approach, the prediction models could be location specific, given the growing conditions of the cultivars used. The present study did not assess whether the model works for bitter orange samples from diverse geographical origins. This would require sampling bitter oranges in different parts of the world and remains outside the scope of this initial study.
In light of these results, future work could explore classifying bitter orange EOs by ripening degree according to consumer preference or to other applications of their constituents.
The obtained results show that the ripening degree can be predicted from key volatile compounds in the bitter orange EO headspace using gas chromatography–mass spectrometry, avoiding the need for sensorial analysis by trained human panels. The proposed methodology combines chromatographic fingerprinting with data analysis using machine learning techniques, in particular ANNs. A non-linear classifier (an MLP) was used because the PCA of the data matrix did not show a simple separation of the classes using three PCs, which accounted for 90% of the total variance. Additionally, sensitivity analysis was used to determine the chemotypes responsible for the observed differentiation. The analysis shows that the intensities of fifteen compounds are enough to classify the ripening degree into four classes with a moderate accuracy of 82 ± 1%. The statistical significance of this result was confirmed using a permutation test. The main compounds for predicting the ripening degree of bitter orange are α-pinene, β-pinene, ocimene, Cyclopropane,1,2-dibutyl-, linalyl butyrate, myrcenol, linalool, α-terpineol, 3-carene, and nerol. These results open a new approach to quality control for bitter orange EOs.