3.1. Odor Intensity Predictive Performance of the SVR Model
As shown in Figure 1, the olfactory measured odor intensity and the SVR predicted odor intensity were compared in scatter plots. A sample point close to the diagonal (red line) indicates an accurate prediction by the SVR model. For most of the training and test samples, the SVR models made accurate predictions. This demonstrates the feasibility and good fitting ability of the SVR algorithm in regular odor data analysis. In addition, the similar predictive accuracy for training and test samples indicates that these SVR models were not overfitted. An overfitted model usually corresponds too closely to the training set and may therefore fail to predict future observations (such as the test samples) reliably. Overfitting is usually caused by the strong fitting ability of machine learning algorithms combined with improper parameter settings, and it is one of the key issues to avoid when applying machine learning methods [34].
Table 2 lists the coefficient of determination (R²) and mean absolute error (MAE) between the olfactory measured and SVR predicted odor intensity for each odor mixture. The R² values of the training and test samples also confirm that the SVR models had good predictive accuracy and were not overfitted. Unlike the other mixtures, the R² values of mixture T+E were lower. This was probably caused by the relatively poor accuracy of the olfactory measured results: the SVR algorithm is very sensitive to noise in the training data [35], so noise arising from the error of olfactory evaluation in the training samples can easily affect the fit of the SVR model. Nevertheless, the MAE results still showed that the prediction error of the SVR models was very limited. In regular olfactory evaluation tests, an error of 0.4 OIRS levels is commonly observed and widely accepted [25]. Thus, the optimized SVR models were considered useful and accurate for predicting the odor intensity of these binary odor mixtures.
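The comparison of R² and MAE between training and test samples described above can be sketched as follows. This is a minimal illustration, not the computation used in this study: it assumes the common definition R² = 1 − SS_res/SS_tot, the function names are hypothetical, and the numbers are made-up placeholders rather than values from Table 2.

```python
def r2_score(measured, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(measured, predicted):
    """Mean absolute difference between measured and predicted intensity."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def summarize(measured, predicted):
    """Return both metrics for one set of samples."""
    return {
        "r2": r2_score(measured, predicted),
        "mae": mean_absolute_error(measured, predicted),
    }


# Hypothetical odor intensity values for illustration only (not study data).
train_measured = [2.0, 3.0, 4.0, 5.0]
train_predicted = [2.1, 2.9, 4.2, 4.8]
test_measured = [2.5, 3.5, 4.5]
test_predicted = [2.7, 3.4, 4.2]

train_summary = summarize(train_measured, train_predicted)
test_summary = summarize(test_measured, test_predicted)

# A large gap between training and test metrics would hint at overfitting.
r2_gap = train_summary["r2"] - test_summary["r2"]
print(train_summary, test_summary, r2_gap)
```

A small gap between the two sets of metrics, as in the placeholder data here, is the pattern that the text interprets as the absence of overfitting.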
Odor intensity prediction models are considered promising techniques in the field of odor evaluation. First, the prediction models can perform odor intensity evaluation directly, based on the composition and concentration information measured by analytical equipment, instead of relying on human assessors. The influences of assessor quantity, age, gender, and testing environment can thus be avoided [36]. Second, it has been reported that the e-nose can directly perform odor intensity evaluation [21,37]. However, such a device directly correlates the sensor signal with odor intensity and does not fully consider the gas mixture's composition, so it is more focused on specific target gases. In contrast, e-noses and online monitoring devices capable of gas identification and concentration detection are more common and more mature [38,39]. Combining the odor intensity prediction model with these e-noses and online monitoring devices would significantly improve their olfactory assessment capacity and extend their applicable scope.
3.2. SVR-Assisted Visual Analysis of Odor Interaction
In comparison with traditional odor intensity prediction models, machine learning methods such as SVR have a significant advantage in fitting ability but also a notable disadvantage: their black-box mechanism severely limits their use in explaining the related mechanisms and laws. Although many studies have established empirical models to explain the odor interaction phenomenon and have drawn conclusions, we still hope to develop more analytical methods through the reasonable use of machine learning. Since it has been shown that there is a simple linear relationship between OI and ln OAV of an individual substance, we consider the ln OAV value a helpful way to represent the content of a component in odor interaction studies [23,24]. As illustrated in Figure 2a,c,e, the relationship between each component's content (expressed as ln OAV) and the odor interaction degree (expressed as the OI reduction value, i.e., the color of each dot) was plotted as scatter plots. Based on the definition of OI reduction in Equation (6), a larger OI reduction value means a stronger antagonism effect [40]. It can be seen that the antagonism effect was more intense when the content of both components was high. Because the amount of actual olfactory measured odor data was limited, these scatter plots provided only limited information and the results were not intuitive enough.
To obtain more odor evaluation results, we used the SVR model instead of olfactory measurement, which saved considerable time and cost. Since the core idea of machine learning is to find the mathematical relationship between the chemical composition and the corresponding mixture's odor intensity, a sufficient number of training samples can usually guarantee the modeling effect. The optimized model is then also valid for other similar samples. This strategy has been demonstrated and widely applied in many studies [41]. As shown by the R² and MAE results of the training and test samples above, the SVR model had obtained the correct mapping relationship. The optimized model is therefore also expected to be valid for other samples of the corresponding mixture with different chemical concentrations. As shown by the black dots in Figure 2b,d,f, we predicted the odor intensity of many binary mixtures with different chemical contents. Results from actual olfactory measurements in our previous studies are plotted as red dots. To distinguish the data sources, the color of the dots here no longer indicates the OI reduction value as in Figure 2a,c,e. Based on these olfactory measured and SVR predicted results, contour maps of the OI reduction degree against the mixtures' composition were plotted. These diagrams make the interaction of odor substances more intuitive. It can be concluded that the antagonism effect was usually weak if the ln OAV value of either component was low. When the content of one component was constant, the antagonism degree increased as the content of the other component increased. In addition, the antagonism became more intense as the ln OAV values of the two components approached each other.
In the same way, the odor interaction of binary aldehyde mixtures was also analyzed (Figure 3). The interaction pattern of the binary aldehyde mixtures was almost the same as that of the ester mixtures. However, there were still some differences in the details of their contour maps. For instance, the antagonism degree of mixture BA+EB (Figure 2d) was weaker when the ln OAV_BA and ln OAV_EB values were close to 2–3, as was that of mixture EA+BA when the ln OAV_EA and ln OAV_BA values were close to 4.5–5.5 (Figure 2b). A very obvious difference was that there were almost no olfactory measured sample data in the areas mentioned above. Because there were not enough samples to reflect the real odor interaction in these areas, the SVR predictions there were easily affected by samples from other areas. In machine learning, this phenomenon is generally attributed to an insufficient number of samples and a lack of data representativeness [42]. The samples of the aldehyde mixtures were more evenly dispersed, so the odor interaction in each area was fully reflected. Therefore, reasonable sampling should also be considered when using machine learning methods in odor research.
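One simple way to check sample coverage before trusting predictions in a given region is to divide the (ln OAV_a, ln OAV_b) plane into cells and list the cells that contain no measured samples. The sketch below is only an illustration of that idea and was not part of this study; the function names, the cell size, and the sample coordinates are all hypothetical.

```python
import math
from collections import Counter


def cell_of(ln_oav_a, ln_oav_b, cell_size=1.0):
    """Map a sample to the index of the square grid cell that contains it."""
    return (math.floor(ln_oav_a / cell_size), math.floor(ln_oav_b / cell_size))


def empty_cells(samples, lo, hi, cell_size=1.0):
    """List grid cells within [lo, hi) on both axes that hold no samples."""
    counts = Counter(cell_of(a, b, cell_size) for a, b in samples)
    first = math.floor(lo / cell_size)
    last = math.ceil(hi / cell_size)
    return [
        (i, j)
        for i in range(first, last)
        for j in range(first, last)
        if counts[(i, j)] == 0
    ]


# Hypothetical measured samples as (ln OAV_a, ln OAV_b) pairs.
measured_samples = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5)]

uncovered = empty_cells(measured_samples, lo=0.0, hi=2.0, cell_size=1.0)
print(uncovered)  # [(1, 1)]
```

Cells reported as empty mark regions where model predictions rest on extrapolation from neighboring samples, which is the situation described for the areas with almost no measured data.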
The contour maps of binary aromatic hydrocarbon mixtures are plotted in Figure 4. Unlike the binary mixtures of esters and aldehydes, the overall antagonism degree of the binary aromatic hydrocarbon mixtures was relatively low. When the ln OAV values of both components were higher than 2.5, a relationship between antagonism degree and mixing ratio similar to that of the esters and aldehydes was observed (Figure 4b,d,f). When their ln OAV values were smaller than this critical value, we observed a synergism effect (i.e., a negative OI reduction value, meaning that OI_mix was higher than OI_sum). Because there were no actual olfactory measured sample data in this area, this phenomenon should be verified by further olfactory evaluation tests. In all these contour maps, the odor interaction of samples with very small ln OAV values (i.e., the blank lower-left corner of the contour plot) was not considered, because samples in this area usually had an odor intensity lower than 2.0 on the OIRS (as can be seen from Figure 1). For those samples, the error of olfactory evaluation was usually higher [25]. Moreover, it is more meaningful to analyze the odor interaction of odor mixtures with distinct olfactory stimulation.
3.3. Similarity of Binary Odor Interaction Pattern
To further verify the odor interaction pattern observed above, we also analyzed the influence of the sample's odor intensity level. As depicted in Figure 5, all the olfactory measured data and SVR predicted data were used, and colors represent the odor intensity of each sample. We defined the OI reduction ratio (Equation (7)) and the mixing ratio of the binary mixture (i.e., x_a = ln OAV_a/(ln OAV_a + ln OAV_b)). First, we reached the same conclusion as from the contour maps above: when the ln OAV values of the two components were close (i.e., x_a was near 0.5), the antagonism was most obvious (i.e., the OI reduction ratio was higher). Second, most of the odor samples followed the same odor interaction pattern regardless of their specific odor intensity level. No significant correlation was observed between the OI reduction ratio and the sample's odor intensity value. This is consistent with the phenomenon observed in our previous research on the PDE (partial differential equation) model [25]. Compared with the sample's odor intensity value, the components' mixing ratio was the main factor affecting the odor interaction degree.
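The mixing ratio defined above, x_a = ln OAV_a/(ln OAV_a + ln OAV_b), can be computed as in the following minimal sketch. The function name and the example OAV values are hypothetical. The guard against a zero denominator is an added assumption for robustness, and the ratio is only easy to interpret when both ln OAV values are positive, which matches the region analyzed in the contour maps.

```python
import math


def mixing_ratio(oav_a, oav_b):
    """Return x_a = ln(OAV_a) / (ln(OAV_a) + ln(OAV_b)) for a binary mixture."""
    ln_a = math.log(oav_a)
    ln_b = math.log(oav_b)
    total = ln_a + ln_b
    if total == 0:
        # Assumption: treat a zero denominator as undefined rather than guess.
        raise ValueError("mixing ratio is undefined when ln OAV_a + ln OAV_b = 0")
    return ln_a / total


# Hypothetical OAV values chosen so that ln OAV is 3 and 3, then 1 and 3.
balanced = mixing_ratio(math.exp(3), math.exp(3))
unbalanced = mixing_ratio(math.exp(1), math.exp(3))
print(balanced, unbalanced)
```

Equal ln OAV values give x_a = 0.5, the region where the text reports the most obvious antagonism, while unequal contents move x_a toward 0 or 1.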
In this study, we mainly employed the SVR algorithm as a tool for data analysis. Its strong regression ability allowed more reliable data to be obtained, which helped to explore the odor interaction more intuitively. Although odor interaction has been widely studied with many empirical models that provide very accurate explanations [5,6,11,43], the machine learning method still has distinct advantages such as visual analysis and low time and economic cost. As we found in this study, sufficient olfactory measured data are essential to guarantee the accuracy of machine learning models. When applying machine learning methods, attention should also be paid to sample representativeness and to the objective analysis of simulation results. In addition, the observed odor interaction pattern was only verified with several substances from the ester, aldehyde, and aromatic hydrocarbon groups. To further establish the reliability and applicable scope of the currently observed odor interaction pattern, it is necessary to test more odor substances.