
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Biologically inspired models and algorithms are considered promising sensor-array signal processing methods for electronic noses. Feature selection is one of the most important issues for developing robust pattern recognition models in machine learning. This paper investigates the classification performance of a bionic olfactory model as the dimension of the input feature vector (an outer factor) and the number of its parallel channels (an inner factor) increase. The principal component analysis (PCA) technique was applied for feature selection and dimension reduction. Two data sets were used for the experiments: three classes of wine derived from different cultivars, and five classes of green tea derived from five different provinces of China. In the former case, the average correct classification rate increased as more principal components were added to the feature vector. In the latter case, the results showed that sufficient parallel channels should be reserved in the model to avoid pattern space crowding. We conclude that 6∼8 channels of the model, with a principal component feature vector covering at least 90% cumulative variance, are adequate for a classification task of 3∼5 pattern classes, considering the trade-off between time consumption and classification rate.

Bionic analytical instruments such as the electronic nose (eNose) and the electronic tongue (eTongue) generally consist of an array of cross-sensitive chemical sensors and an appropriate pattern recognition (PARC) method for automatically detecting and discriminating target analytes [

In addition, feature selection or feature extraction is an important issue for developing robust PARC models in machine learning. The outputs of sensor arrays are usually time series, from which features must be extracted to represent the original signals for PARC systems. In some cases, not all elements of the feature vector obtained from the preprocessing stage are essential for classification, due to its high dimensionality and redundancy [

In our previous work [

Over the last decades of study of the mammalian olfactory system, Freeman and his colleagues [

The dynamics of every node can be described by a 2nd-order ordinary differential equation (ODE) as follows:
$$\frac{1}{ab}\left[\frac{d^{2}x_{i}(t)}{dt^{2}} + (a+b)\,\frac{dx_{i}(t)}{dt} + ab\,x_{i}(t)\right] = \sum_{j \neq i} W_{ij}\,Q\!\left(x_{j}(t)\right) + I_{i}(t)$$

where x_{i}(t) is the state of the i-th node, x_{j}(t) is the state of the j-th node connected to it, W_{ij} is the connection weight from the j-th to the i-th node, I_{i}(t) is the external input to the i-th node, a and b are rate constants, and Q(·) is the static nonlinear sigmoid function.

The dynamics of the whole KIII system can be mathematically expressed by a set of such ODEs. The parameters, including the connection weights W_{ij}, were optimized in previous studies so that the model reproduces the dynamics observed in electrophysiological experiments.
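For intuition, the following is a minimal numerical sketch of a layer of such nodes, integrating the second-order ODE above with an explicit Euler scheme. The rate constants, sigmoid parameter, weights, and inputs are illustrative placeholders rather than the tuned KIII parameters, and the sigmoid follows one commonly published form of Freeman's asymmetric nonlinearity (the paper's implementation was in MATLAB; Python is used here only for illustration).

```python
import numpy as np

# Illustrative constants -- NOT the tuned KIII parameters.
a, b, q = 0.22, 0.72, 5.0

def Q(x):
    """One published form of Freeman's asymmetric sigmoid nonlinearity."""
    x0 = np.log(1.0 - q * np.log(1.0 + 1.0 / q))  # lower saturation point
    return np.where(x > x0, q * (1.0 - np.exp(-(np.exp(x) - 1.0) / q)), -1.0)

def simulate(W, I, steps=2000, dt=0.05):
    """Explicit-Euler integration of
    (1/(a*b)) [x_i'' + (a+b) x_i' + a*b*x_i] = sum_j W_ij Q(x_j) + I_i."""
    x = np.zeros(len(I))   # node states
    v = np.zeros(len(I))   # first derivatives
    for _ in range(steps):
        drive = W @ Q(x) + I                  # weighted input to each node
        acc = a * b * (drive - x) - (a + b) * v
        x = x + dt * v
        v = v + dt * acc
    return x
```

With zero coupling, the steady state of each node equals its external input, which gives a quick sanity check on the integration.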

The operation of learning and memorizing in the KIII model can be described as follows. With no stimuli, the system is in a high-dimensional state of spatially coherent basal activity, presenting a chaotic global attractor in phase space; with an external stimulus, the system soon shifts to a quasi-periodic burst in the γ range, presenting a local basin of an attractor wing in phase space. By using learning rules to adjust the lateral connection weights in the OB layer, the model is able to remember a number of odor patterns. When a familiar odor is presented again, its spatiotemporal pattern soon settles into the corresponding local attractor.

Several learning rules have been proposed for the KIII model, such as the Hebbian reinforcement rule, the global habituation rule, the anti-Hebbian learning rule and the local habituation rule [

When the activities of the i-th and j-th M1 nodes both exceed the mean activity σ_{m} of the whole OB layer, the lateral connection weight between them is strengthened by the Hebbian reinforcement coefficient K_{Heb} (K_{Heb} > 1); otherwise, the weight decays through the habituation coefficient K_{hab} (K_{hab} < 1). The values of K_{hab} and K_{Heb} jointly determine the stability of learning, so K_{Heb} must be chosen carefully.
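A minimal sketch of this weight update is given below, assuming co-activity is judged against the mean OB-layer activity; K_HEB and K_HAB are hypothetical stand-ins for the paper's coefficients (the original code was MATLAB; Python is used for illustration).

```python
import numpy as np

# Hypothetical coefficients; the paper's tuned values are not reproduced here.
K_HEB, K_HAB = 1.5, 0.9   # reinforcement (>1) and habituation (<1)

def update_ob_weights(W, activity):
    """Strengthen lateral weights between co-active M1 nodes (activity
    above the OB-layer mean); habituate all other lateral weights."""
    active = activity > activity.mean()
    coactive = np.outer(active, active)          # both endpoints active
    W_new = np.where(coactive, W * K_HEB, W * K_HAB)
    np.fill_diagonal(W_new, 0.0)                 # no self-connection
    return W_new
```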

At the end of learning, the connection weights are fixed and the cluster centroid of every pattern is determined. When a new sample from the testing set is input, the Euclidean distances from the corresponding activity vector to the training-pattern cluster centroids are calculated, and the minimum distance determines the classification.
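The minimum-Euclidean-distance decision can be sketched as follows (illustrative Python; the function names are ours):

```python
import numpy as np

def class_centroids(vectors, labels):
    """Cluster centroid (mean activity vector) of each trained pattern."""
    vectors, labels = np.asarray(vectors), np.asarray(labels)
    return {c: vectors[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(test_vec, centroids):
    """Label of the centroid at minimum Euclidean distance."""
    return min(centroids, key=lambda c: np.linalg.norm(test_vec - centroids[c]))
```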

All model implementations in this paper were carried out in MATLAB 7.5 (The MathWorks^{®} Inc., Natick, MA, USA) on a Lenovo computer (Pentium^{®} Dual CPU 1.86 GHz and RAM 2 GB) running Windows XP (Microsoft^{®} Corp., Redmond, WA, USA).

A homemade electronic nose system has been developed for data acquisition, as illustrated in

To demonstrate the classification performance of the KIII model with respect to the varying dimension of the input feature vector, a wine data set provided by the UCI Machine Learning Repository [

Selecting a set of features that is optimal for a given classification task is one of the most important issues in machine learning. Techniques such as principal component analysis (PCA) and independent component analysis (ICA) produce a mapping from the original feature space to a lower-dimensional feature space, and are commonly used for dimension reduction and feature selection.

In this paper, the PCA technique is used for feature selection. The aim is to pick out patterns in multivariate data and to reduce the dimensionality of the input vector without significant loss of information. PCA also helps to give an overall view of the data by providing an appropriate visual representation with fewer dimensions.
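As a sketch of this selection step, the following picks the smallest number of principal components whose cumulative variance percentage reaches a target such as 90% (SVD-based; the function name and default target are illustrative, not taken from the paper):

```python
import numpy as np

def pca_reduce(X, target_cvp=0.90):
    """Project X onto the fewest PCs whose cumulative variance
    percentage (CVP) reaches target_cvp; returns (scores, k, cvp)."""
    Xc = X - X.mean(axis=0)                      # center the data
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    cvp = np.cumsum(s ** 2) / np.sum(s ** 2)     # cumulative variance ratio
    k = int(np.searchsorted(cvp, target_cvp)) + 1
    return Xc @ Vt[:k].T, k, float(cvp[k - 1])
```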

Five samples in each class of the wine data set were randomly chosen for the training set and the others were used for testing. The first several principal components were selected as the input feature vector for the KIII model, with the habituation coefficient K_{hab} and the Hebbian reinforcement coefficient K_{Heb} held fixed across all trials.
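The split-and-average protocol can be sketched as follows, with a simple nearest-centroid step standing in for the full KIII simulation (all names are illustrative; the paper averages over ten trials):

```python
import numpy as np

def average_rate(X, y, n_train=5, n_trials=10, seed=0):
    """Average correct classification rate over repeated random splits:
    n_train samples per class for training, the rest for testing."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    rates = []
    for _ in range(n_trials):
        train, test = [], []
        for c in classes:
            idx = rng.permutation(np.flatnonzero(y == c))
            train.extend(idx[:n_train])
            test.extend(idx[n_train:])
        # Stand-in classifier: nearest class centroid of the training set.
        cents = {c: X[[i for i in train if y[i] == c]].mean(axis=0)
                 for c in classes}
        correct = sum(
            min(cents, key=lambda c: np.linalg.norm(X[i] - cents[c])) == y[i]
            for i in test)
        rates.append(correct / len(test))
    return float(np.mean(rates))
```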

Five brands of commercial green tea derived from five different provinces of China were used as samples,

Following the experience of Section 3.1, it appeared that in this case only the first two principal components would be sufficient as selected features to train the KIII model. A verification experiment was carried out as follows. Five samples in each class of the tea data set were randomly chosen for the training set and the others were used for testing. The first two principal components were selected as the input feature vector for a 2-channel KIII model. The parameters of the model and the implementation method were the same as described above. The average correct classification rates of ten trials are presented in

As described in Section 2.2, the KIII model acts as an associative memory capable of storing previously trained patterns by means of the Hebbian rule at the OB layer, and of recovering incomplete or corrupted patterns. From a neurodynamics viewpoint, there is a global chaotic attractor in the KIII system composed of a central part and multiple wings, and the functions above work through transitions back and forth between the central part and one wing, or between the wings. When the parallel channels of the KIII model are reduced, there is not enough state space to represent that kind of attractor, which leads to pattern space crowding,

In this paper, the classification performance of a bionic olfactory model called KIII with respect to an outer factor (the dimension of the input feature vector) and an inner factor (the number of its parallel channels) was investigated through the recognition of three classes of wine derived from three different cultivars and five classes of green tea derived from five different provinces of China. In the first case, the PCA plot implied that the first three principal components do not carry sufficient information for the database, so more PCs were added to the feature vector and the corresponding classification rates were calculated. The results showed that the classification performance of the KIII model gradually improves as the number of PCs increases. Considering the trade-off between time consumption and classification rate, a PC feature dimension of seven (about 90% cumulative variance percentage) is appropriate for this particular task. In the second case, the first two PCs (about 97% cumulative variance percentage) were selected as the input feature vector for a 2-channel KIII model. The poor results suggested that cumulative variance percentage is not the only factor affecting the KIII model's classification performance. Classification performance gradually improved as the parallel channels were extended, so we conclude that 6∼8 channels of the model, with a principal component feature vector covering at least 90% cumulative variance, are adequate for a classification task of 3∼5 pattern classes, considering the trade-off between time consumption and classification rate. This study may help determine the input feature dimension and the number of parallel channels when applying the KIII model to pattern recognition tasks (not limited to eNose applications). Future work will address the analysis of higher-dimensional MS-based electronic nose data using the model.

This work was supported by Zhejiang Provincial Natural Science Foundation of China under Grant No. Y1110074 and the Scientific Research Fund of Zhejiang Provincial Education Department under Grant No. Y201010012. It was also financially supported in part by the Research Fund of the Science and Technology Department of Zhejiang Province (2008C14100) and Commonweal Technology Research Foundation of Zhejiang Province (2011C23068). The authors would like to thank UCI Machine Learning Repository for providing the wine data set.

The topological structure of the KIII olfactory model [

Notes: The lateral connection weights of M1 nodes (red arrows) are adjustable for learning, while the others (black arrows) are usually fixed for system stability.

The trajectory in phase-space. (

Binary experiments for (

Photo of the experimental set-up with the customized electronic nose system.

The 3D-PCA plot of three classes of wine derived from three different cultivars.

Note: Symbols: ○ - Class 1; * - Class 2; □ - Class 3.

The correct classification rate of three classes with respect to PC numbers.

The average correct classification rate with respect to cumulative variance percentage of principal components.

The 2D-PCA plot of five classes of green tea derived from five different provinces of China.

Note: Symbols: ○ - class 1; * - class 2; □ - class 3; ♦ - class 4; × - class 5.

The classification performance of the KIII model with increasing dimension of the input feature vector.

Class | 2 PCs | 3 PCs | 4 PCs | 5 PCs | 6 PCs | 7 PCs |
Class 1 | 79.19 | 85.45 | 95.14 | 95.56 | 98.34 | 98.82 |
Class 2 | 73.08 | 87.82 | 93.29 | 96.96 | 95.91 | 96.73 |
Class 3 | 81.25 | 84.35 | 92.03 | 93.83 | 92.07 | 94.14 |
Average | 77.84 | 85.87 | 93.49 | 95.45 | 95.44 | 96.57 |
CVP | 73.60 | 80.16 | 85.10 | 89.34 | 92.02 | 94.24 |

Note: All classification rates and cumulative variance percentage (CVP) values in the table are expressed as percentages.

The classification performance of the KIII model for five kinds of green tea with increasing number of parallel channels.

Class | 2 channels | 3 channels | 4 channels | 5 channels | 6 channels | 7 channels |
Class 1 | 10.59 | 14.12 | 30.59 | 37.06 | 87.06 | 95.88 |
Class 2 | 10.00 | 18.24 | 25.88 | 35.29 | 86.47 | 94.12 |
Class 3 | 7.06 | 15.29 | 27.65 | 44.12 | 92.94 | 93.53 |
Class 4 | 12.35 | 21.76 | 27.06 | 41.76 | 87.06 | 94.12 |
Class 5 | 9.41 | 22.35 | 34.71 | 48.24 | 85.88 | 97.65 |
Average | 9.88 | 18.35 | 29.18 | 41.29 | 87.88 | 95.06 |
CVP | 97.26 | 99.22 | 99.75 | 99.91 | 99.96 | 99.99 |

Note: All classification rates and cumulative variance percentage (CVP) values in the table are expressed as percentages.