Odor Fingerprint Analysis Using Feature Mining Method Based on Olfactory Sensory Evaluation

In this paper, we use odor fingerprint analysis to identify and detect various odors. We obtained the olfactory sensory evaluation of eight different brands of Chinese liquor with a lab-developed intelligent nose. Features were extracted from both the time domain and the frequency domain to describe the samples comprehensively. However, combining time domain and frequency domain features introduces redundant information that degrades performance. Therefore, we processed the data with Principal Component Analysis (PCA) and Variable Importance of Projection (VIP) scores to remove redundant information and construct a more precise odor fingerprint. Random Forest (RF) and Probabilistic Neural Network (PNN) models were then built on the resulting feature sets. Results showed that the VIP-based models achieved better classification performance than the PCA-based models. In addition, the peak classification rate of the VIP-RF model (92.5%) was higher than that of the VIP-PNN model (90%). In conclusion, odor fingerprint analysis using a feature mining method based on the olfactory sensory evaluation can be applied to monitor product quality in actual industrial processes.


Introduction
Owing to its uniqueness and reliability, a fingerprint can provide the basis for distinguishing between samples [1]. Odor fingerprint analysis relies on intelligent instruments that are sensitive to odor stimulation to produce data on the volatile characteristic components. Odor fingerprint analysis is widely used in the food field. For example, the maturity of fruits can be expressed by odor intensity [2], and the degree of freshness [3] and diseases [4,5] can be determined by odor fingerprint analysis. Thus, the use of odor as a biometric recognition method is feasible [6].
Chinese liquor is a distilled liquor that is popular for its strong aromatic odor. As a traditional fermented beverage, Chinese liquor uses daqu, xiaoqu, bran koji and yeast wine as saccharifying ferments, is produced with cereal grains as the main raw materials, and is processed by distilling, saccharifying and fermenting [7]. The microconstituents of liquors are organic compounds which directly influence the flavor and quality of the liquor. These organic compounds make up 1% to 2% of the liquor and include acids, esters, alcohols, aldehydes, and so on. Depending on the different brewing techniques and raw [...] domain and frequency domain features, can be used to mine information that reflects different odor fingerprint features of the samples [35,36]. However, this method causes information redundancy: as the number of dimensions increases, the training time and prediction time of the model grow. Therefore, it is of great importance to find a more reasonable and effective feature mining method to extract efficient features.
Taking eight different brands of Chinese liquor as an example, this paper uses odor fingerprint analysis, simulates human olfaction through experiments with the lab-developed intelligent nose, and adopts a feature mining method to detect and identify various odors. From the raw experimental data of the 16 sensors of the lab-developed intelligent nose, we extracted time domain and frequency domain characteristics to construct the odor fingerprint. The odor fingerprints were then analyzed by PCA and VIP scores to select characteristic features. Next, we used Random Forest (RF) and Probabilistic Neural Network (PNN) models to dynamically characterize the interactions among the feature variables, and thereby obtained the best variable combination and the highest classification accuracy. This is a significant study for the detection and identification of Chinese liquors through odor fingerprint analysis based on the olfactory sensory evaluation. Figure 1 shows the flow chart of odor fingerprint analysis for this article.

Liquor Samples
In this paper, eight different brands of Chinese liquor purchased at a local liquor store were selected as samples. The samples differed in brand, alcohol content, flavor, raw materials, and origin. Details are listed in Table 1.

Intelligent Nose
As shown in Figure 2, the lab-developed intelligent nose system contains three units: the air flow velocity and direction control unit (consisting of air purification, valves, gas flowmeters, and air pumps), the sensor unit (including the sensor arrays and chamber), and the data acquisition and analysis unit (containing a data acquisition card (DAQ) and a PC with the self-made test software). The two major functions (gas injection and system cleaning) were carried out by adjusting the valves. The air purification consists of activated carbon, molecular sieve and allochroic silica gel. Notably, allochroic silica gel, a high-grade drying agent, visually signals the relative humidity of the environment through its color change (from blue to red). It is usually used for instruments, equipment and other closed conditions. Air pumps 1 and 2 clean the system and collect gas, respectively. In addition, the two air pumps are used together to raise the gas flow rate during the cleaning process. The chamber is 10.5 cm long, 8.2 cm wide and 5 cm high, with a volume of about 431 cm³. The chamber is made of cardboard covered with polytetrafluoroethylene (PTFE). PTFE has weak adsorption and strong leakproofness, so that no interfering substances affect the test results in the air chamber. The sensor arrays contain a temperature sensor, a humidity sensor and 16 independent sensors. An LM35CZ type temperature sensor by National Semiconductor, Santa Clara, CA, USA and an HIH-4000-003 type humidity sensor by Yi Jiajie Electronic Technology CO., LTD, ShenZhen, China, placed in the air chamber, are used to monitor the internal temperature and humidity. The sixteen independent sensors are sensitive to different substances.
These sensors can detect odor fingerprint data and consist of two systems: the TGS-8 system by FIGARO, Japan and the MQ/MP system by ZhengZhou Winsen Electronics Technology CO., LTD, ZhengZhou, China. Details of the sensors used in the experiment are listed in Table 2. The NI USB-6211 type data acquisition card by National Instruments, Austin, TX, USA, was selected to collect data. It has eight analog input channels and two analog output channels, and the sample rate reaches 48 kS/s.
The static head-space sampling method was adopted in this experiment. The laboratory environment was best kept at a temperature of 23 ± 2 °C and a relative humidity of 60 ± 5%. The experimental procedure was performed as follows:
(1) Open air pump 2 and valve 1, and put a clean and empty Erlenmeyer flask in the defined location. Then observe the zero value of each sensor and compare it with the standard value.
(2) Put twenty milliliters of the sample in a 100 mL Erlenmeyer flask, seal it and leave it to sit for 5 min.
(3) Close air pump 2 and adjust air pump 1 so that gas flowmeters 1 and 2 (by Qihai Electromechanical Manufacturing CO., LTD, Chengdu, China) display 2 L/min to clean the gas pipes for 10 s. Then open air pump 2 to clean the entire device. This process lasts 5 min to eliminate the influence of other gases.
(4) Place the test sample in the defined location and adjust air pumps 1 and 2 so that gas flowmeters 1 and 2 display 0.5 L/min to let the gas enter the chamber. Ten seconds later, close air pump 2 and keep the gas flowing into the chamber continuously. At the same time, observe the signals and record the test data.
(5) For generality, repeat the experiment 10 times for each sample by repeating Steps (2)-(4). Note that the relative humidity does not change in the course of the experiment. In the end, a total of 80 sets of data were obtained.
In this paper, we extracted time domain and frequency domain features to construct an odor fingerprint map. The time-domain feature is the average value (AV) of intelligent nose response signals of sensors. The frequency domain feature is the mean of variance (MV) of the eight wavelet packet coefficients obtained by three layers of wavelet packet decomposition with db6 wavelet [37].
The time domain feature of the ith sensor of the TGS-8 system was defined as:
$$AV_{Ti} = \frac{1}{5940} \sum_{j=1}^{5940} x_{Tij}$$
where $x_{Ti1}, x_{Ti2}, \ldots, x_{Ti5940}$ are the response values of the ith sensor of the TGS-8 system of the intelligent nose. The time domain feature of the ith sensor of the MQ/MP system was defined as:
$$AV_{Mi} = \frac{1}{5940} \sum_{j=1}^{5940} x_{Mij}$$
where $x_{Mi1}, x_{Mi2}, \ldots, x_{Mi5940}$ are the response values of the ith sensor of the MQ/MP system of the intelligent nose.
The frequency domain feature of the ith sensor of the TGS-8 system was defined as:
$$MV_{Ti} = \frac{1}{8} \sum_{k=1}^{8} S_{Tik}$$
where $S_{Ti1}, S_{Ti2}, \ldots, S_{Ti8}$ are the variances of the wavelet packet coefficients of the ith sensor of the TGS-8 system; the response measured by the intelligent nose was decomposed into wavelet packet components based on the db6 wavelet, and the coefficients of each wavelet packet were then extracted.
The frequency domain feature of the ith sensor of the MQ/MP system was defined as:
$$MV_{Mi} = \frac{1}{8} \sum_{k=1}^{8} S_{Mik}$$
where $S_{Mi1}, S_{Mi2}, \ldots, S_{Mi8}$ are the variances of the wavelet packet coefficients of the ith sensor of the MQ/MP system, obtained by the same db6 wavelet packet decomposition.
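The two features can be illustrated with a minimal Python sketch. It assumes that the eight level-3 wavelet packet coefficient bands have already been computed (for example with a wavelet library such as PyWavelets; the decomposition itself is not reproduced here), and that the variance is the population variance, which the text does not specify. The function names and the synthetic signals are hypothetical.

```python
import numpy as np

def average_value(response):
    """Time domain feature AV: mean of one sensor's sampled response."""
    return float(np.mean(np.asarray(response, dtype=float)))

def mean_of_variance(bands):
    """Frequency domain feature MV: mean of the per-band coefficient variances.

    `bands` is a sequence of coefficient arrays, one per wavelet packet
    (eight arrays for a three-level decomposition). Population variance
    (ddof=0) is an assumption of this sketch.
    """
    variances = [np.var(np.asarray(b, dtype=float)) for b in bands]
    return float(np.mean(variances))

# Synthetic stand-ins, not measured data: one response of 5940 samples and
# eight coefficient bands such as a level-3 decomposition would yield.
rng = np.random.default_rng(0)
response = 1.0 + 0.1 * rng.standard_normal(5940)
bands = [rng.standard_normal(750) for _ in range(8)]

av = average_value(response)
mv = mean_of_variance(bands)
```

With 16 sensors, computing one AV and one MV per sensor gives the 32 fused features used in the rest of the analysis.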

Data Processing of Odor Fingerprint Analysis
Sensor sensitivity has a great influence on the performance of the intelligent nose system, so it should be taken into account to achieve the best performance.
As shown in Figure 3, in the drive circuit of the sensor, $R_p$ is the resistance of the sensor, $R_l$ is the resistance of the load resistor, and the output voltage of the sensor is the voltage across the load resistor. The relationship between the output voltage $V_{out}$ and the reference voltage $V_{ref}$ is as follows:
$$V_{out} = \frac{R_l}{R_p + R_l} V_{ref}$$
Differentiating with respect to $R_p$ gives a sensitivity of magnitude $V_{ref} R_l / (R_p + R_l)^2$, which for a given $R_p$ is largest when $R_l = R_p$. This shows that when $R_l$ is equal to $R_p$, the sensor has the greatest response sensitivity, which improves the performance of the intelligent nose system.
As shown in Figure 4, taking the TGS-821 sensor as an example, the best output response was studied by changing the value of $R_l$. Other sensors have the same characteristics. As shown in Figure 5, in order to find the appropriate resistance of the load resistor in practice, we performed an experiment that monitored the zero value of the sensors for 127 days. In this experiment, when $R_l$ was about one-fifteenth of $R_p$, the output response of the sensors was obvious.
Signal processing, an important step in improving the performance of the intelligent nose, refers to the preprocessing of the sensor array response signals. Standardization is the most popular method; it translates raw data into a dimensionless index and thus avoids pattern recognition failure caused by the large magnitude of some sensors. We chose the relative difference method to suppress sensor drift:
$$x = \frac{x_s(t) - x_s(0)}{x_s(0)}$$
where $x_s(t)$ is the response value of the sensor at time $t$ and $x_s(0)$ is the zero response value of the sensor.
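The claim that the response sensitivity peaks when the load resistance equals the sensor resistance can be checked numerically. The sketch below assumes the drive circuit is a simple series voltage divider with the output taken across the load resistor; the reference voltage and the sensor resistance are hypothetical values chosen only for illustration.

```python
import numpy as np

def output_voltage(r_p, r_l, v_ref=5.0):
    """Voltage across the load resistor of a series divider (v_ref is assumed)."""
    return v_ref * r_l / (r_p + r_l)

def sensitivity(r_p, r_l, v_ref=5.0):
    """Magnitude of d(V_out)/d(R_p): output change per unit change in sensor resistance."""
    return v_ref * r_l / (r_p + r_l) ** 2

r_p = 10_000.0  # hypothetical sensor resistance in ohms
candidates = np.linspace(100.0, 100_000.0, 9991)  # load resistances, 10 ohm steps
best_r_l = candidates[np.argmax(sensitivity(r_p, candidates))]
```

In this idealized model `best_r_l` coincides with `r_p`. The load value actually used in the experiment was chosen empirically from the long-term zero-value measurements, so it need not match this idealized optimum.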
Then, in order to speed up the convergence of the model, the odor fingerprint information obtained by different sensors should be converted to the same dimension and the same order of magnitude. We normalized the fused feature sets to the interval (0, 1). After the above processing (relative difference method and normalization), the additive drift and response drift of the sensors are suppressed. Figure 6a,b shows radar plots for the time domain and frequency domain features, respectively. Since each sensor detects cross-information of the odor, it is difficult to determine which features are the characteristic values that affect the olfactory information of the liquors. It can be seen that sensors T3 and T4 differ markedly in AV value. Does this prove that these two values are the main factors affecting the olfactory information of Chinese liquors? Meanwhile, sensors M1 and M2 differ only slightly in MV value. Does this prove that these two characteristics have little effect on the olfactory information of Chinese liquors? It is therefore indispensable to find a suitable feature mining method that deletes the redundant information and selects the characteristic features that affect the olfactory information of Chinese liquors. In addition, the best combination of variables and fusion methods has to be chosen to reduce the complexity of the model and achieve the best classification performance.
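The two preprocessing steps can be sketched as follows. The sketch assumes that the relative difference is the response minus the zero response, divided by the zero response, and that normalization is a per-feature min-max scaling; the text names the methods and the target interval but not the exact normalization formula, so both function bodies are assumptions.

```python
import numpy as np

def relative_difference(x, x0):
    """Baseline-correct a response x against the zero response x0 (assumed form)."""
    x = np.asarray(x, dtype=float)
    return (x - x0) / x0

def min_max_normalize(features):
    """Scale each feature column to the range 0..1 (assumed min-max scaling)."""
    f = np.asarray(features, dtype=float)
    lo = f.min(axis=0)
    hi = f.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return (f - lo) / span

corrected = relative_difference([2.0, 3.0], 2.0)
scaled = min_max_normalize([[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]])
```

Dividing by the zero response removes a multiplicative drift, and subtracting it removes an additive one, which is why both kinds of drift are suppressed by this step.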

Feature Extraction and Filtering
Principal Component Analysis (PCA) is a widely used multivariate statistical method. It converts multiple variables into a few comprehensive variables through a linear transformation. These comprehensive variables, the principal components, reflect most of the information of the original variables. The principal components are uncorrelated with each other and mutually orthogonal. In this paper, PCA was used to process the original features that fused the time domain and frequency domain, so that the principal components express the characteristic features of the olfactory information of Chinese liquors.
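As an illustration of this step, the following sketch performs PCA through a singular value decomposition and keeps the smallest number of components whose cumulative contribution reaches a target. The 99% target and the random input matrix are assumptions for demonstration only; they are not the experimental data.

```python
import numpy as np

def pca(X, variance_target=0.99):
    """Return (scores, explained_ratio, k) for the first k principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # contribution of each component
    cumulative = np.cumsum(explained)
    # first index whose cumulative contribution reaches the target
    k = int(np.searchsorted(cumulative, variance_target)) + 1
    k = min(k, len(s))
    scores = Xc @ Vt[:k].T                       # project onto the first k components
    return scores, explained, k

# Placeholder matrix with the same shape as the fused feature set (80 x 32).
rng = np.random.default_rng(0)
X = rng.standard_normal((80, 32))
scores, explained, k = pca(X)
```

The columns of `scores` are mutually orthogonal, which is the property that lets a few components replace many correlated original variables.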
In Partial Least Squares (PLS), the Variable Importance of Projection (VIP) scores were used to create a new data space in a lower dimensional system [38]. The VIP scores express the explanatory power of the independent variables for the dependent variables: a higher score means a greater contribution to the covariance and a stronger distinguishing ability. Each variable of the original feature set was evaluated and assigned a score. The variables were then sorted by VIP score and selected to form the new feature space. The feature fusion strategy is as follows:
(1) The original features that fused the time domain and frequency domain were sorted based on their VIP scores.
(2) Variable subsets $K = [k_1, k_2, \ldots, k_m]$ were generated based on the best VIP scores, where $k_i$ denotes the subset containing the top $i$ variables and $m$ is the number of all variables.
In this paper, we analyzed the original features and generated 32 subsets based on the VIP scores to express the interaction between different variables.
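The strategy above can be sketched in Python. The PLS fit below is a small deflation-based implementation written for this illustration, and the VIP formula is the commonly used one (each weight squared, weighted by the response variance explained by its component); the number of PLS components, the one-hot class coding and the random data are assumptions, since the text does not state them.

```python
import numpy as np

def pls_vip(X, Y, n_components):
    """VIP score of each column of X from a PLS fit with n_components components."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    Xk = X - X.mean(axis=0)
    Yk = Y - Y.mean(axis=0)
    p = Xk.shape[1]
    W = np.zeros((p, n_components))
    ssy = np.zeros(n_components)
    for a in range(n_components):
        # weight vector: dominant left singular vector of X'Y (unit norm)
        u, _, _ = np.linalg.svd(Xk.T @ Yk, full_matrices=False)
        w = u[:, 0]
        t = Xk @ w                       # X scores
        tt = t @ t
        p_load = Xk.T @ t / tt           # X loadings
        q = Yk.T @ t / tt                # Y loadings
        Xk = Xk - np.outer(t, p_load)    # deflate X and Y
        Yk = Yk - np.outer(t, q)
        W[:, a] = w
        ssy[a] = tt * (q @ q)            # response variance explained by component a
    return np.sqrt(p * (W ** 2 @ ssy) / ssy.sum())

def nested_subsets(vip):
    """Subset i holds the indices of the top-i variables by VIP score."""
    order = np.argsort(vip)[::-1]
    return [order[:i].tolist() for i in range(1, len(order) + 1)]

# Placeholder data: 80 samples, 32 features, 8 classes coded one-hot.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(8), 10)
X = rng.standard_normal((80, 32))
X[:, 0] += labels                         # make one feature informative
vip = pls_vip(X, np.eye(8)[labels], n_components=3)
subsets = nested_subsets(vip)
```

Because the squared VIP scores average to one across all variables, a score above one marks a variable that contributes more than average, which is why one is the usual threshold.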

Multivariate Analysis
In this paper, the 80 sets of data were divided into two parts based on the Kennard-Stone (KS) algorithm, half as the training set and the rest as the testing set. The former was used to construct the classification models and the latter was used to test their classification performance.
The KS algorithm is commonly used as an effective method to select a training set. In the KS algorithm, all samples are considered as candidates for the training set and are selected in order. The KS algorithm can be summarized as follows: (1) Calculate the distance between every two samples and select the two samples with the largest distance. (2) Calculate the distance between each remaining sample and each of the selected samples, and add the remaining sample whose shortest distance to the selected samples is the largest. (3) Repeat step (2) until the number of selected samples is equal to the predetermined number [39].
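The steps above can be sketched as follows, assuming Euclidean distance (the text does not name the distance measure). The function name and the toy data are hypothetical.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    n_select = min(n_select, len(X))
    # pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # step 1: start with the two samples that are farthest apart
    i, j = np.unravel_index(np.argmax(d), d.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    # steps 2-3: repeatedly add the sample farthest from its nearest selected sample
    while len(selected) < n_select:
        min_d = d[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

points = [[0.0], [1.0], [2.0], [10.0], [5.0]]
train_idx = kennard_stone(points, 3)
```

For the toy data, the two extremes (0 and 10) are picked first and the midpoint 5 follows, so the selected samples spread evenly over the feature space. Applied to the 80 measurements with `n_select=40`, the unselected indices would form the testing set.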
Random Forest (RF) is an ensemble of classification and regression trees (CART). It was first proposed by Ho in 1995 [40] and studied intensively by Breiman [41]. In essence, RF is a nonlinear classifier that contains multiple decision trees that are built independently of each other. When a testing sample enters the random forest, it is classified by each decision tree, and the final result is the class that receives the most votes across all trees.
With its fast training and simple implementation, RF is widely used in bioinformatics [42], ecology [43], medicine [44], economics and finance [45], computer vision [46], speech [47], data mining [48], remote sensing geography [49] and other fields. The execution procedure of RF is:
(1) Assuming that the number of attributes of the sample is $M$, $T$ training sets $S_1, S_2, \ldots, S_T$ are generated by resampling based on the Bootstrap method.
(2) The corresponding decision trees $C_1, C_2, \ldots, C_T$ are generated from each training set. Before the attribute is selected at each internal node, $m$ attributes randomly selected from the $M$ attributes are taken as the split attribute set of the current node.
(3) Each tree grows completely without pruning.
(4) For a testing sample $X$, every decision tree is evaluated to obtain the corresponding categories $C_1(X), C_2(X), \ldots, C_T(X)$.
(5) By voting, the category output by the most of the $T$ decision trees is taken as the category of the testing sample.
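The five steps can be illustrated with a compact sketch. It is a simplified stand-in rather than the implementation used in the experiments: trees split on the Gini impurity (an assumption, as the split criterion is not stated), and the data are synthetic. All function names are hypothetical.

```python
import numpy as np
from collections import Counter

def gini(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - float(np.sum(p ** 2))

def grow_tree(X, y, m, rng):
    """Grow an unpruned tree; each node considers m randomly chosen attributes."""
    if len(np.unique(y)) == 1:
        return int(y[0])                                  # pure leaf
    best = None
    for j in rng.choice(X.shape[1], size=m, replace=False):
        for thr in np.unique(X[:, j])[:-1]:               # both sides stay non-empty
            left = X[:, j] <= thr
            score = left.mean() * gini(y[left]) + (~left).mean() * gini(y[~left])
            if best is None or score < best[0]:
                best = (score, int(j), float(thr), left)
    if best is None:                                      # chosen attributes are constant
        return int(Counter(y.tolist()).most_common(1)[0][0])
    _, j, thr, left = best
    return (j, thr, grow_tree(X[left], y[left], m, rng),
            grow_tree(X[~left], y[~left], m, rng))

def predict_tree(node, x):
    """Follow the splits until a leaf (an int label) is reached."""
    while isinstance(node, tuple):
        j, thr, left, right = node
        node = left if x[j] <= thr else right
    return node

def fit_forest(X, y, n_trees, m, seed=0):
    """Steps (1)-(3): one bootstrap sample and one fully grown tree per round."""
    rng = np.random.default_rng(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)                  # bootstrap resampling
        forest.append(grow_tree(X[idx], y[idx], m, rng))
    return forest

def predict_forest(forest, x):
    """Steps (4)-(5): every tree votes, the most frequent label wins."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]

# Synthetic two-class data, for illustration only.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(8.0, 1.0, (20, 4))])
y = np.array([0] * 20 + [1] * 20)
forest = fit_forest(X, y, n_trees=15, m=2)
```

Restricting each node to `m` random attributes is what decorrelates the trees; with `m` equal to the total number of attributes the procedure reduces to plain bagging.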
The Probabilistic Neural Network (PNN) is a supervised classifier that was first put forward by D. F. Specht in 1990 [50]. It is a parallel algorithm based on the Bayes classification rule and Parzen window estimation of the probability density function. With its simple learning process, fast training speed, good compatibility and strong nonlinear ability, PNN has been applied to image recognition [51], chemical detection [52] and stereo vision matching [53]. A PNN generally consists of four layers: the input layer, the pattern (model) layer, the summation layer, and the output layer. The steps for using a PNN are as follows: (1) Collect sample data and divide them into a training set and a testing set. (2) Create the PNN and train it on the training set. (3) Test the network performance on the testing set.
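The four layers map directly onto a few array operations, as the following sketch suggests. It assumes a Gaussian Parzen kernel with a single smoothing parameter `sigma` and equal class priors; the value of `sigma` and the toy data are hypothetical.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma=0.3):
    """Classify X_test with a probabilistic neural network (Gaussian Parzen kernel)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # pattern layer: one Gaussian kernel per (test sample, training sample) pair
    sq_dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # summation layer: average kernel response within each class
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    # output layer: pick the class with the largest estimated density
    return classes[np.argmax(scores, axis=1)]

X_train = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
y_train = [0, 0, 1, 1]
predicted = pnn_predict(X_train, y_train, [[0.05, 0.0], [1.0, 0.95]])
```

Training amounts to storing the training samples, which is why PNN training is fast; the cost is paid at prediction time, where every test sample is compared with every stored sample.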

Dimension Reduction by PCA
The odor fingerprint information obtained in the experiment was analyzed by the PCA algorithm. The first three principal components account for 42.59%, 34.16%, and 11.95% of the variance, respectively. Figure 7a shows the PCA results for the different brands of Chinese liquor. In the scree plot in Figure 7b, the curve levels off when the number of principal components reaches 10. The cumulative contribution of these principal components reaches 99.368%, which represents essentially all of the characteristic data. Therefore, we extracted the first 10 principal components as a new feature data set to substitute for the original variables. Results showed that this provides a reliable method to construct a somewhat more concise odor fingerprint map.
Figure 8 shows the VIP scores for each feature variable of the original fusion dataset, measured by PLS discriminant analysis. As shown, the VIP scores of $AV_{M5}$, $AV_{M4}$, $AV_{M7}$, $MV_{M5}$, $MV_{M4}$, $MV_{M7}$, $AV_{M8}$, $MV_{M2}$, $AV_{T6}$, $AV_{M2}$, $MV_{M8}$, $MV_{T6}$, $MV_{M1}$ and $AV_{M1}$ are greater than 1, indicating that these variables are significant in the odor fingerprint of Chinese liquors. The VIP scores of the rest are less than 1, which means that these variables have less effect on the classification of Chinese liquors. However, VIP scores alone cannot give a verdict on the classification performance of the models. Therefore, we built a series of fusion matrices based on the VIP scores as inputs to the models. Each subset includes the top-ranked variables: subset #1 contains $AV_{M5}$, subset #2 contains $AV_{M5}$ and $AV_{M4}$, and the last subset, #32, contains all variables. We can select the best variable combination by dynamically observing the classification performance of the RF and PNN models. Results showed that selecting the best combination of variables provides a reliable method to construct a much more concise odor fingerprint map.

Figure 8. Relative variable importance based on calculated VIP.
Table 3 shows the accuracy rates achieved by the RF and PNN models. As the number of variables increases, the classification accuracy rates show an upward tendency. Specifically, with subset #11 the classification accuracies of RF and PNN already reach those obtained with the original fusion dataset. This indicates that the original fusion dataset contains a large amount of redundant information. As the number of variables increased further, the RF model reached its highest accuracy rate of 92.5% with subset #15 and the PNN model reached its highest accuracy rate of 87.5% with subset #16. When we continued to add variables, the accuracy rate of neither model exceeded these maximum values. These results are consistent with the VIP scores shown in Figure 8. That is, the performance of the models increased as variables with VIP scores greater than one were added, while it decreased as the remaining variables with VIP scores less than one were added. Based on the above, we chose subset #15 as the best combination.

Classification Using Random Forest
In RF models, the value of mtry and the number of decision trees are the main parameters that determine generalization performance. The default mtry value is the square root of the total number of variables, so the value of mtry in the experiment was four. We varied the number of decision trees from 2 to 100 in steps of two. The training accuracy rate and the prediction accuracy rate were used as the evaluation criteria. In this way, we can focus on the influence of the number of decision trees on the classification performance of the RF models.

The three feature sets (original, PCA-optimized, and VIP-optimized, for which the 15th variable subset was extracted based on the VIP scores) combined with the RF model achieved the classification of the olfactory information of Chinese liquors. To reduce the impact of randomness, 100 prediction models were established, and their accuracy rates were averaged to give the classification accuracy rate of the current model. As shown in Figure 9a-c, the training accuracy rate reaches 100% when the number of decision trees is greater than 8, 4, and 12, respectively. In the RF model based on the VIP-optimized feature set, the testing accuracy reaches 92.5% when the number of decision trees exceeds 72, and the system remains stable as the number of decision trees increases further. Results showed that the original features contain redundant olfactory information and that the VIP-based feature mining method can extract effective features.
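The evaluation protocol described above (tree counts from 2 to 100 in steps of two, with 100 models averaged per setting) can be sketched as a small harness. Applying the 100 repetitions at every tree count is one reading of the text. The `train_and_score` callable is hypothetical: it stands for any routine that trains one RF model with a given number of trees and random seed and returns its accuracy. The constant scorer below is only a placeholder so that the sketch runs.

```python
from statistics import mean

def sweep_tree_counts(train_and_score, tree_counts=range(2, 101, 2), repeats=100):
    """Average the accuracy of `repeats` models for every number of trees."""
    results = {}
    for n_trees in tree_counts:
        scores = [train_and_score(n_trees, seed) for seed in range(repeats)]
        results[n_trees] = mean(scores)
    return results

def placeholder_score(n_trees, seed):
    # Stand-in for real training and testing; always reports the same accuracy.
    return 0.5

curve = sweep_tree_counts(placeholder_score)
```

Averaging over seeds separates the effect of the number of trees from the run-to-run variation caused by bootstrap sampling and random attribute selection.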


Classification Using PNN
The three feature sets (original, PCA-optimized, and VIP-optimized, for which the 16th variable subset was selected based on the VIP scores) combined with the PNN model work well in classifying the olfactory information of Chinese liquors. As shown in Figure 10a, c, and e, all 40 training samples were classified correctly. For the 40 test samples, the PNN predictions gave test accuracy rates of 65%, 77.5%, and 87.5% for the original, PCA-optimized, and VIP-optimized feature sets, respectively. (The vertical axis shows the category label; labels 1 to 8 correspond to the eight brands of Chinese liquor.)
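For readers unfamiliar with the PNN, its decision rule can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the implementation used here: the smoothing parameter `sigma`, the equal class priors, the function names, and the toy data are all assumptions introduced for the example.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def pnn_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[int],
    x: Sequence[float],
    sigma: float = 0.5,
) -> int:
    """Classify x with a PNN (Gaussian Parzen-window classifier, equal priors)."""
    # Pattern layer: one Gaussian kernel per training sample, grouped by class.
    activations: Dict[int, List[float]] = defaultdict(list)
    for sample, label in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(sample, x))
        activations[label].append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    # Summation layer: average kernel response within each class.
    scores = {label: sum(vals) / len(vals) for label, vals in activations.items()}
    # Decision layer: choose the class with the largest estimated density.
    return max(scores, key=scores.get)


def pnn_accuracy(train_x, train_y, test_x, test_y, sigma: float = 0.5) -> float:
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(pnn_predict(train_x, train_y, x, sigma) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Toy two-class data, made up purely for illustration.
toy_train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
toy_train_y = [1, 1, 2, 2]
toy_test_x = [[0.1, 0.1], [1.0, 1.0]]
toy_test_y = [1, 2]
toy_acc = pnn_accuracy(toy_train_x, toy_train_y, toy_test_x, toy_test_y)
```

Because every training sample contributes its own kernel, each training point tends to be classified correctly when `sigma` is small, so the test accuracy is the more informative figure.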
The PNN models based on the PCA-optimized and VIP-optimized feature sets are superior to the model based on the original features, which indicates that the original features contain a lot of redundant information. The VIP-optimized model also outperformed the PCA-optimized model, which indicates that the VIP-based feature mining method can extract effective features and improve the accuracy rate.
Table 4 shows the classification accuracies under different data processing and pattern recognition methods. As shown in Table 4:

Discussion
(1) The classification accuracy of the RF model was better than that of the PNN model for every feature method. Thus, the RF model has stronger processing power in this experiment.
(2) Compared with the original features, the PCA-optimized features did not significantly improve classification performance in either the RF model or the PNN model. The PCA-based data processing method cannot obtain the best combination of variables for identifying various odors accurately.
(3) Compared with the original and PCA-optimized feature sets, the features selected by the best VIP scores clearly improved classification performance. The classification accuracies of the RF model with subset #15 and the PNN model with subset #16 were 92.5% and 87.5%, respectively, so the RF model with subset #15 showed the best classification performance. Based on the VIP scores, AVM5, AVM4, AVM7, MVM5, MVM4, MVM7, AVM8, MVM2, AVT6, AVM2, MVM8, MVT6, MVM1, AVM1, and AVM6 were considered the characteristic features.
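The selection of a variable subset from VIP scores can be sketched as below. This is a hedged illustration only: it assumes that subset #k consists of the k features with the highest VIP scores, which is our reading of the 15 features listed above. The function name `top_k_by_vip`, the feature names `f1` to `f5`, and their scores are hypothetical and are not values from this study.

```python
from typing import Dict, List


def top_k_by_vip(vip_scores: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest VIP scores, best first."""
    ranked = sorted(vip_scores, key=vip_scores.get, reverse=True)
    return ranked[:k]


# Made-up VIP scores for five placeholder features (illustration only).
example_scores = {"f1": 0.4, "f2": 1.8, "f3": 1.0, "f4": 1.6, "f5": 1.4}
subset_3 = top_k_by_vip(example_scores, 3)
```

Growing `k` one feature at a time and re-evaluating the classifier on each subset is one way to locate the subset with the best accuracy.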

Conclusions
In conclusion, taking eight brands of Chinese liquor as an example, our work applied odor fingerprint analysis based on olfactory sensory evaluation, together with a feature mining method combining the time domain and frequency domain, to simulate human olfaction and identify various odors. Variable selection using VIP scores is especially suitable for extracting features from a mass of data, and the VIP-based models achieved better prediction accuracies than the PCA-based models. The results demonstrate that VIP coupled with the RF or PNN model is effective in extracting and analyzing features of the odor fingerprint, with the RF model achieving slightly higher accuracy than the PNN model. Compared with traditional statistical methods and simple extraction, this feature mining method used the fewest characteristic variables and the best fusion method, and it can capture hidden patterns and variables inside the odor fingerprint. Odor fingerprint analysis using the feature mining method based on olfactory sensory evaluation can be applied in the food and drinks industry for product discrimination, classification, and quality control. In addition, the lab-developed intelligent nose can be used in industrial production to monitor product quality.
Author Contributions: H.M. and J.L. conceived and designed experiments. Y.J. analyzed the data and wrote the paper. Y.S. and F.G. performed the experiment to obtain the olfactory information. Y.C. and H.F. extracted the olfactory characteristic information.

Conflicts of Interest: The authors declare no conflict of interest.