3.1. Odor Intensity Predictive Performance of the SVR Model
As shown in Figure 1, the olfactory measured odor intensity and the SVR predicted odor intensity were compared in scatter plots. A sample point close to the diagonal (red line) indicates an accurate prediction by the SVR model. For most of the training and test samples, the SVR models made predictions close to the measured values, which demonstrated the feasibility and good fitting ability of the SVR algorithm in regular odor data analysis. In addition, the similar predictive accuracy for the training samples and test samples indicated that these SVR models were not overfitted. An overfitted model usually corresponds too closely to the training set and may therefore fail to predict future observations (such as the test samples) reliably. Overfitting is usually caused by the strong fitting ability of machine learning algorithms combined with improper parameter settings, and it is one of the key issues to avoid when applying machine learning methods [34]. Table 2 lists the coefficient of determination (R²) and mean absolute error (MAE) between the olfactory measured odor intensity and the SVR predicted odor intensity for each odor mixture. The R² values of the training and test samples also confirmed that the SVR models had good predictive accuracy and were not overfitted. Unlike the other mixtures, mixture T+E had lower R² values. This was probably caused by the relatively poor accuracy of the olfactory measured results: because the SVR algorithm is very sensitive to noise in the training data [35], noise arising from the error of olfactory evaluation in the training samples can easily affect the fit of the SVR model. Nevertheless, the MAE results still showed that the prediction error of the SVR models was small. In regular olfactory evaluation tests, an error of about 0.4 OIRS levels is usually observed and widely accepted [25]. Thus, the optimized SVR models were considered useful and accurate for predicting the odor intensity of these binary odor mixtures.
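The two metrics discussed above, and the comparison of training and test accuracy used as an overfitting check, can be sketched in a few lines. This is a minimal illustration using only the standard definitions of R² and MAE; the function names, the `tolerance` parameter, and the sample values are hypothetical and are not taken from Table 2.

```python
def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(measured, predicted):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def within_panel_error(measured, predicted, tolerance=0.4):
    """True if the MAE does not exceed the ~0.4 OIRS error cited for olfactory tests."""
    return mean_absolute_error(measured, predicted) <= tolerance


# Hypothetical odor intensities (OIRS levels), for illustration only.
train_measured = [2.0, 3.0, 4.0, 5.0]
train_predicted = [2.1, 2.9, 4.2, 4.8]
test_measured = [2.5, 3.5, 4.5]
test_predicted = [2.7, 3.4, 4.2]

for name, measured, predicted in [
    ("train", train_measured, train_predicted),
    ("test", test_measured, test_predicted),
]:
    print(name, r_squared(measured, predicted), mean_absolute_error(measured, predicted))
```

A large gap between the training and test values of either metric would be one sign of overfitting; similar values on both sets are consistent with the interpretation given above.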
Odor intensity prediction models are considered promising techniques in the field of odor evaluation. First, the prediction models can perform odor intensity evaluation directly, based on the composition and concentration information measured by analytical equipment, instead of relying on human assessors. The influences of assessor number, age, gender, and testing environment can thus be avoided [36]. Second, it has been reported that an e-nose can directly perform odor intensity evaluation [21]. However, such a device directly correlates the sensor signal with odor intensity and does not fully consider the composition of the gas mixture, so it is more suited to specific target gases. In contrast, e-noses and online monitoring devices capable of gas identification and concentration detection are more common and more mature [38]. Combining the odor intensity prediction model with these e-noses and online monitoring devices would significantly improve their olfactory assessment capacity and extend their applicable scope.
3.2. SVR-Assisted Visual Analysis of Odor Interaction
Compared with traditional odor intensity prediction models, machine learning methods such as SVR have significant advantages but also a notable disadvantage: their black-box mechanism severely limits their ability to explain the related mechanisms and laws. Although many studies have established empirical models to explain the odor interaction phenomenon and have drawn conclusions, we still hope to develop more analytical methods through the reasonable use of machine learning. Since it has been proved that there is a simple linear relationship between the OI and lnOAV of an individual substance, we think that using the lnOAV value to represent the content of a component is also helpful in odor interaction studies [23]. As illustrated in Figure 2a, c, and e, the relationship between each component’s content (expressed as lnOAV) and the odor interaction degree (expressed as the OI reduction value, i.e., the color of each dot) was plotted as scatter plots. Based on the definition of OI reduction in Equation (6), a larger OI reduction value means a stronger antagonism effect [40]. When the content of both components was high, the antagonism effect was more intense. Because the amount of actual olfactory measured odor data was limited, these scatter plots provided only little information and the results were not intuitive enough.
To obtain more odor evaluation results, we used the SVR model instead of olfactory measurement, which saved much time and cost. Since the core idea of machine learning is to find the mathematical relationship between the chemical composition and the corresponding mixture’s odor intensity, a sufficient number of training samples can usually guarantee the modeling effect, and the optimized model is then also valid for other similar samples. This strategy has been fully demonstrated and widely applied in many studies [41]. As shown by the above R² results for the training and test samples, the SVR model had obtained the correct mapping relationship. In this case, the optimized model should also be valid for other samples of the corresponding mixture with different chemical concentrations. As shown by the black dots in Figure 2b, d, and f, we predicted the odor intensity of many binary mixtures with different chemical contents. Results from actual olfactory measurements in our previous studies were plotted as red dots. To distinguish the data sources, the color of the dots here no longer indicates the OI reduction value as it does in Figure 2a, c, and e. Based on these olfactory measured and SVR predicted results, contour maps of the OI reduction degree against the mixtures’ composition were plotted. Through these diagrams, the interaction of odor substances became more intuitive. The degree of antagonism was usually weak if the lnOAV value of either component was low. When the content of one component was constant, the antagonism degree increased as the content of the other component increased. In addition, the antagonism became more intense as the lnOAV values of the two components approached each other.
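The workflow described above (predict the mixture's odor intensity over a grid of lnOAV values, then convert each prediction to an OI reduction value for contouring) can be sketched as follows. Equation (6) is not reproduced in this section, so the sketch assumes OI reduction = OI_sum − OI_mix, which matches the sign convention in the text (positive for antagonism, negative for synergism). The two `toy_*` functions are hypothetical stand-ins for the trained SVR model and for the OI_sum calculation; they are not the authors' models.

```python
def oi_reduction(oi_sum, oi_mix):
    """Assumed form of Equation (6): positive -> antagonism, negative -> synergism."""
    return oi_sum - oi_mix


def linspace(start, stop, count):
    """Evenly spaced values from start to stop, inclusive (count >= 2)."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def reduction_grid(predict_oi_mix, oi_sum_fn, ln_oav_a_values, ln_oav_b_values):
    """Return one row per lnOAV_b value; each row holds OI reduction per lnOAV_a value."""
    grid = []
    for b in ln_oav_b_values:
        row = [oi_reduction(oi_sum_fn(a, b), predict_oi_mix(a, b)) for a in ln_oav_a_values]
        grid.append(row)
    return grid


# Hypothetical stand-ins, for illustration only.
def toy_oi_sum(ln_oav_a, ln_oav_b):
    # Pretends each component's OI equals its lnOAV (a linear OI-lnOAV relation).
    return ln_oav_a + ln_oav_b


def toy_predict_oi_mix(ln_oav_a, ln_oav_b):
    # Placeholder for a trained regression model's prediction.
    return max(ln_oav_a, ln_oav_b) + 0.2 * min(ln_oav_a, ln_oav_b)


axis = linspace(1.0, 5.0, 5)
grid = reduction_grid(toy_predict_oi_mix, toy_oi_sum, axis, axis)
```

The resulting `grid` has the row/column layout that most contour-plotting routines expect, with lnOAV of one component on each axis.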
In the same way, the odor interaction of binary aldehyde mixtures was also analyzed (Figure 3). The interaction pattern of the binary aldehyde mixtures was almost the same as that of the ester mixtures. However, there were still some differences in the details of their contour maps. For instance, in the ester contour maps the antagonism degree of mixture BA+EB (Figure 2d) was weaker when the lnOAV_BA values were close to 2–3, as was that of mixture EA+BA when the lnOAV_EA values were close to 4.5–5.5 (Figure 2b). A notable feature of these areas was that they contained almost no olfactory measured samples. Because there were not enough samples to reflect the real odor interaction in these areas, the SVR predictions there were easily influenced by samples elsewhere. In machine learning, this phenomenon is commonly observed when the sample size is insufficient or the data lack representativeness [42]. The samples of the aldehyde mixtures were more evenly dispersed, so the odor interaction in each area was fully reflected. Therefore, reasonable sampling should also be considered when using machine learning methods in odor research.
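One simple way to check the sampling issue raised above is to count how many measured samples fall in each cell of the lnOAV plane and flag cells with too few. The sketch below is only an illustration of that idea; the function name, the bin edges, and the sample coordinates are hypothetical and do not come from the study's data.

```python
def sparse_cells(samples, a_edges, b_edges, min_count=1):
    """Return (a_low, a_high, b_low, b_high) for cells with fewer than min_count samples.

    samples: list of (lnOAV_a, lnOAV_b) pairs for olfactory measured mixtures.
    a_edges, b_edges: increasing bin edges along each lnOAV axis.
    """
    sparse = []
    for i in range(len(a_edges) - 1):
        for j in range(len(b_edges) - 1):
            count = sum(
                1
                for a, b in samples
                if a_edges[i] <= a < a_edges[i + 1] and b_edges[j] <= b < b_edges[j + 1]
            )
            if count < min_count:
                sparse.append((a_edges[i], a_edges[i + 1], b_edges[j], b_edges[j + 1]))
    return sparse


# Hypothetical measured samples, for illustration only.
measured = [(1.5, 1.5), (1.2, 3.5), (3.5, 3.5)]
print(sparse_cells(measured, [1, 3, 5], [1, 3, 5]))
```

Model predictions in the flagged cells rest mainly on samples from neighboring regions, so contour features there deserve extra caution or additional olfactory measurements.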
The contour maps of the binary aromatic hydrocarbon mixtures are plotted in Figure 4. Unlike the binary mixtures of esters and aldehydes, the overall antagonism degree of the binary aromatic hydrocarbon mixtures was relatively low. When the lnOAV values of both components were higher than 2.5, a relationship between antagonism degree and mixing ratio similar to that of the esters and aldehydes was observed (Figure 4b, d, and f). When their lnOAV values were smaller than this critical value, we observed a synergism effect (i.e., a negative OI reduction value, meaning that OI_mix was higher than OI_sum). Because there were no actual olfactory measured samples in this area, the reliability of this phenomenon should be verified by further olfactory evaluation tests. In all these contour maps, the odor interaction of samples with very small lnOAV values (i.e., the blank lower-left corner of the contour plot) was not considered, because odor samples in this area usually had an odor intensity lower than 2.0 on the OIRS (as can be seen from Figure 1). For those odor samples, the error of olfactory evaluation was usually higher [25]. It was also more meaningful to analyze the odor interaction of odor mixtures with distinct olfactory stimulation.
3.3. Similarity of Binary Odor Interaction Pattern
To further verify the odor interaction pattern observed above, we also analyzed the influence of the sample’s odor intensity level. As depicted in Figure 5, all the olfactory measured data and SVR predicted data were used, and the colors represent the odor intensity of each sample. We defined the OI reduction ratio (Equation (7)) and the mixing ratio of the binary mixture (i.e., x_a). First, we reached the same conclusion as from the above contour maps: when the two components had similar lnOAV values (i.e., a balanced mixing ratio), the antagonism was the most obvious (i.e., the OI reduction ratio was highest). Second, most of the odor samples followed the same odor interaction pattern regardless of their specific odor intensity level. No significant correlation was observed between the OI reduction ratio and the sample’s odor intensity value. This was consistent with the phenomenon observed in our previous research on the PDE (partial differential equation) model [25]. Compared with the sample’s odor intensity value, the components’ mixing ratio was the main factor affecting the odor interaction degree.
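The quantities used in this comparison can be illustrated with a short sketch. Equation (7) and the definition of x_a are not reproduced in this section, so both forms below are assumptions: the OI reduction ratio is taken as (OI_sum − OI_mix) / OI_sum, and x_a as the lnOAV share of component a. The Pearson coefficient is one common way to test for the kind of correlation mentioned above; the text does not state which statistic was actually used.

```python
import math


def oi_reduction_ratio(oi_sum, oi_mix):
    """Assumed form of Equation (7): OI reduction relative to OI_sum."""
    return (oi_sum - oi_mix) / oi_sum


def mixing_ratio(ln_oav_a, ln_oav_b):
    """Assumed definition of x_a: share of component a in terms of lnOAV."""
    return ln_oav_a / (ln_oav_a + ln_oav_b)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

For example, `mixing_ratio(2.0, 2.0)` gives 0.5 for a balanced mixture, and `pearson_r` applied to the samples' OI reduction ratios and odor intensity values would return a value near zero if, as reported above, the two are not correlated.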
In this study, we mainly employed the SVR algorithm as a tool for data analysis. Owing to its strong regression ability, additional reliable data were obtained, which helped to explore the odor interaction more intuitively. Although odor interaction has been widely studied with many empirical models that provide accurate explanations [5], the machine learning method still has distinct advantages such as visual analysis and low time and economic cost. As we found in this study, sufficient olfactory measured data are essential to guarantee the accuracy of machine learning models. When applying machine learning methods, attention should also be paid to sample representativeness and to the objective analysis of simulation results. On the other hand, the observed odor interaction pattern was verified only for several substances from the ester, aldehyde, and aromatic hydrocarbon groups. To further establish the reliability and applicable scope of the currently observed odor interaction pattern, more odor substances need to be tested.