Sensors 2014, 14(4), 6938-6951; doi:10.3390/s140406938
Abstract: We present a composite vector selection method for an effective electronic nose system that performs well even in noisy environments. Each composite vector generated from an electronic nose data sample is evaluated by computing its discriminant distance. By quantitatively measuring the amount of discriminative information in each composite vector, composite vectors containing informative variables can be distinguished, and the final composite features for odor classification are extracted using the selected composite vectors. Using only the informative composite vectors, rather than all the generated composite vectors, also helps to extract better composite features. Experimental results with different volatile organic compound data show that the proposed system has good classification performance even in a noisy environment compared to other methods.
1. Introduction
An electronic nose is an instrument intended to identify the specific components of an odor. Whereas human olfaction fatigues easily, an electronic nose can detect odors consistently, including those harmful to the human body [1–4]. Electronic nose systems are used for various purposes, such as quality control in the food and cosmetics industries, the detection of disease-specific odors for medical diagnosis, and the detection of gas leaks for environmental protection [3,5–9].
An electronic nose consists of a sensor array for chemical detection, which is made of polymer carbon composite materials, and a classifier based on various pattern recognition techniques. Hence, the sensitivity of the sensor array and the design of the classifier are crucial factors for the improvement of electronic noses. There are several types of sensor arrays for electronic noses [10–15]. Among them, conducting polymer composites, intrinsically conducting polymers and metal oxides are the most commonly used sensing materials in conductivity sensors. Once volatile organic compounds (VOCs) are adsorbed on the sensor surface, a specific response is obtained as a numerical variable by an electronic interface.
In classification problems, the process can be decomposed into a few steps: feature selection, feature extraction and the choice of a classifier. Various static or dynamic information for odor classification can be obtained from the sensor response curve [16–18]. In [17,18], five features are extracted from the response curves of six metal oxide sensors: the relative change in resistance, the curve integral over the gas adsorption process and over the desorption process, and the phase space integral, again over adsorption and desorption. The analysis of the dynamic features of metal oxide sensors was presented to classify four types of volatile compounds, namely acetone, acetic acid, acetaldehyde and butyric acid, and active analyses were proposed to deal with gas mixture problems [19,20]. In [21–23], various compensation methods were proposed to solve the drift problem, which causes a random temporal variation of the sensor response under identical conditions.
The features extracted from the sensor array are fed into a classifier such as the NN (Nearest Neighbor rule) or SVM (Support Vector Machine) for prediction of the class label. In order to improve the performance of a classifier, various feature extraction methods can be used for discriminant analysis and dimensionality reduction [24–27]. Since each method has its pros and cons, an appropriate method must be selected considering the properties of the data and the problem that needs to be solved. For instance, the PCA (Principal Component Analysis) method does not utilize the class information of data samples, and finds the projection vectors that correspond to the largest eigenvalues of the total scatter matrix of the data samples. Thus, PCA is more appropriate for data representation than for data classification. On the other hand, the LDA (Linear Discriminant Analysis) method seeks the linear transformation that maximizes the ratio of the between-class scatter matrix (SB) to the within-class scatter matrix (SW). While it gives good performance for classification problems, it suffers from the SSS (Small Sample Size) problem in the case of high-dimensional data.
The above methods extract features based on covariance matrices, which differ depending on their objective functions. In contrast, some methods such as MatFLDA (Matrixized Fisher Linear Discriminant Analysis), 2DFLD (Two-Dimensional Fisher Linear Discriminant), or CLDA (Composite LDA) [32,33] use a different type of covariance matrix, which is called an image covariance matrix. The elements of an image covariance matrix are defined as the expectation of the inner products of predefined vectors. These methods are often effective for data with a large correlation between primitive variables or for high-dimensional data such as electronic nose data, because they utilize information about the statistical dependency among multiple primitive variables and save computational effort.
The composite features are extracted by using the covariance of composite vectors, each composed of a number of primitive variables selected by windows of various shapes. However, the generated composite vectors are likely to be redundant with one another. Moreover, if there are problems in the data collection process, or if the collected primitive variables include attributes that are irrelevant to the classification problem, feature extraction does not yield optimal solutions and the classification performance degrades. Therefore, distinguishing good composite vectors containing informative primitive variables before the feature extraction process is important for extracting better composite features for classification.
In this paper, we propose a method to select the composite vectors which contain informative variables in an electronic nose data sample measured by a sensor array. We measure the amount of discriminative information that each composite vector has, based on the discriminant distance for each composite vector, and rank the composite vectors in descending order of their discriminant scores so that the top ncf can be selected. The informative composite vectors are distinguished before the process of feature extraction, and then the composite features to be used for the classifier are extracted from only the selected composite vectors. This selection process has potential benefits such as reductions in computation, storage and processing time, in addition to improved prediction performance. In the process of extracting composite features, the computational effort increases in the order of υ² as the number of composite vectors (υ) increases. This implies that the computational complexity can be significantly reduced by the proposed method. By using a classifier in an electronic nose with the extracted composite features, we design an electronic nose system that is robust to noisy environments (Figure 1). The experimental results show that the proposed method gives very good classification results even in a noisy environment.
The rest of this paper is organized as follows. Section 2 introduces a discriminant distance and presents how to select composite vectors based on their discriminant scores. Section 3 explains the acquisition of electronic nose data and how composite features are extracted using the selected composite vectors for odor classification. Section 4 describes the experimental results and the conclusions follow in Section 5.
2. Composite Vector Selection Based on Discriminant Distance
Composite vectors can be defined in various ways depending on the shape of a window. The data acquired from a sensor array is stored in an n-dimensional vector, and a composite vector xi ∈ ℝ^l consists of l (l < n) primitive variables. Composite vectors are generated by shifting a window by a step size s, which is usually smaller than the length of a composite vector, and thus composite vectors overlap with each other, as shown in Figure 2. Using the covariance of composite vectors makes better use of the correlation between neighboring variables. The number of composite vectors is υ = ⌊(n − l)/s⌋ + 1, where ⌊·⌋ is the floor operator, which gives the largest integer that is not greater than the value inside the operator. Then, the k-th data sample is represented by X(k) = [x1(k), ..., xυ(k)]^T ∈ ℝ^(υ×l), which is a set of composite vectors. The final composite features for classification are extracted by using the covariance of these composite vectors.
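As a minimal sketch of the window shifting described above, the following Python functions count and generate overlapping composite vectors from a flat sample. The function names are hypothetical, and the count assumes υ = ⌊(n − l)/s⌋ + 1, which reproduces the 159 composite vectors reported in Section 4 for n = 32,000, l = 400 and s = 200.

```python
def num_composite_vectors(n, l, s):
    """Number of windows of length l, shifted by s, that fit in n variables.

    Assumes the count floor((n - l) / s) + 1.
    """
    if not (0 < l <= n) or s <= 0:
        raise ValueError("require 0 < l <= n and s > 0")
    return (n - l) // s + 1


def composite_vectors(sample, l, s):
    """Slice a flat sample into overlapping composite vectors."""
    v = num_composite_vectors(len(sample), l, s)
    return [sample[i * s : i * s + l] for i in range(v)]


# Toy example: 10 primitive variables, window length 4, step 2.
windows = composite_vectors(list(range(10)), l=4, s=2)
```

With s < l, neighboring windows share l − s variables, which is the overlap illustrated in Figure 2.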
However, the overlapping composite vectors shown in Figure 2 may introduce redundancy when composite features are extracted. Therefore, it is necessary to find the composite vectors that promise good separability among different classes while keeping the samples of the same class as close together as possible. Motivated by the method that selects individual variables based on a distance discriminant, we define the distance within classes and the distance between classes to compute the discriminant distance for the i-th composite vector as follows:
3. Design of Electronic Nose System
3.1. Acquisition of Electronic Nose Data
The sensor array used in our system was implemented by dispensing a carbon black (CB) polymer composite-solvent solution in a micromachined gas sensor array chip. While the polymer composite has some drawbacks, such as sensor drift, limited sensor life, and sensitivity to temperature and humidity, it offers many advantages over other materials when used as a gas sensor, e.g., a wide range of polymeric materials, low cost, stable operation at room temperature, and low power consumption. The sensor array consists of 16 separate sensors with an interdigitated electrode, microheater, and micromachined membrane in each channel for further temperature-controlled measurement applications (Table 1). The resistance change of each polymer composite film was monitored in response to the incorporation of chemical vapor. The resistance change of the polymer composite film was amplified by 20 times and recorded every 0.1 s (Figure 3). Each measurement consisted of three steps: stabilization (30 s), exposure (60 s), and purge (110 s). It was performed after the sensor array was placed into the chamber and the resistance signal had stabilized. Then, the flow control unit in our system lets the vapors flow in at the desired concentration for about 60 s and afterward flushes the remainder with an air flow for about 110 s. The measured data are collected on a PC using a data acquisition (DAQ) board, DAQ6062E, and LabVIEW (National Instruments, USA). The voltage divider operated in the range from −10 V to +10 V, and the gains of the 16 identical amplifiers were set to 10 (output/input voltage) for maximum DAQ resolution.
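The timing above fixes how many points each measurement phase contributes per channel. The short Python sketch below (hypothetical names, assuming the three phases are recorded back to back at the stated 0.1 s period) converts the phase durations into sample-index ranges.

```python
SAMPLE_PERIOD_S = 0.1  # one reading every 0.1 s, as stated in the text
PHASES = [("stabilization", 30), ("exposure", 60), ("purge", 110)]  # seconds


def phase_index_ranges(phases=PHASES, period=SAMPLE_PERIOD_S):
    """Map each phase name to its half-open [start, stop) sample range."""
    ranges = {}
    start = 0
    for name, duration_s in phases:
        n_points = round(duration_s / period)  # round guards against float error
        ranges[name] = (start, start + n_points)
        start += n_points
    return ranges


ranges = phase_index_ranges()
```

Under these assumptions one channel yields 300 + 600 + 1,100 = 2,000 points over 200 s, which agrees with the 2,000 time points per channel used in Section 4.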
3.2. Extraction of Composite Features from Selected Composite Vectors
Patterns are easier to classify when the within-class variance is small and the between-class variance is large. Similar to LDA, a discriminant analysis using the covariance of composite vectors is derived from the between-class covariance matrix (CB) and the within-class covariance matrix (CW). Assume that each training sample belongs to one of c classes, and that there are Ni samples in the class ci. Let X′(k) ∈ ℝ^(ncf×l) denote the set of the selected composite vectors of the k-th sample. Then, CW ∈ ℝ^(ncf×ncf) is defined as
The image covariance can also be interpreted from another point of view, rather than from the view of the composite vectors. Letting χ(k) and m be column vectors of X′(k) and M, respectively, CW and CB can be rewritten as
Composite features are obtained by linear combinations of the composite vectors and each feature is a vector whose dimension is equal to the dimension of the composite vector. For composite feature extraction, the projection matrix W is found by maximizing the following objective function:
The length of the window (l), the number of composite features (m) and the step size of the shift (s) are important parameters that influence the classification performance. We investigated the classification rates with respect to l, m and s. Table 2 shows the classification rates with respect to l and m. In this case, we set s = l/2 as in . As can be seen in Table 2, the classification rates are not sensitive to l if m is chosen properly. We set l and m to 400 and 25, respectively. Then, we investigated the classification rates with respect to s. As can be seen in Table 3, the classification rates are not sensitive to s, and the classification rate for s = 200 was slightly better than those for the other values of s. Therefore, we set s to 200. Also, in order to find the optimal number of selected composite vectors, we checked the classification rates for the electronic nose data while increasing the number of selected composite vectors ncf. As a result, we set the number of selected composite vectors ncf to 150.
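The composite feature extraction described in this subsection can be sketched with NumPy as follows. This is a hedged illustration rather than the authors' implementation. It assumes the usual image-covariance forms CW = Σ_i Σ_{k∈ci} (X′(k) − Mi)(X′(k) − Mi)^T and CB = Σ_i Ni (Mi − M)(Mi − M)^T, and a Fisher-type ratio that is maximized by the leading eigenvectors of CW⁻¹CB. The function names and the small regularization term `reg` are assumptions made for this example.

```python
import numpy as np


def composite_scatter(X, y):
    """X: (N, v, l) stacks of composite vectors; y: (N,) class labels.

    Returns the assumed within- and between-class matrices, each (v, v).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    M = X.mean(axis=0)                         # global mean, shape (v, l)
    v = X.shape[1]
    C_W = np.zeros((v, v))
    C_B = np.zeros((v, v))
    for c in np.unique(y):
        Xc = X[y == c]
        Mc = Xc.mean(axis=0)                   # class mean, shape (v, l)
        D = Xc - Mc
        C_W += np.einsum("kil,kjl->ij", D, D)  # sum over k of D_k D_k^T
        dB = Mc - M
        C_B += len(Xc) * (dB @ dB.T)
    return C_W, C_B


def projection_matrix(C_W, C_B, m, reg=1e-6):
    """Top-m eigenvectors of (C_W + reg * I)^-1 C_B as columns of W."""
    v = C_W.shape[0]
    C_W_reg = C_W + reg * np.trace(C_W) / v * np.eye(v)
    evals, evecs = np.linalg.eig(np.linalg.solve(C_W_reg, C_B))
    order = np.argsort(-evals.real)[:m]
    return evecs[:, order].real


def extract_features(X, W):
    """Composite features Y(k) = W^T X(k), shape (N, m, l)."""
    return np.einsum("vm,kvl->kml", W, np.asarray(X, dtype=float))
```

In the setting of this section, v would be ncf = 150, l = 400 and m = 25, so each sample is reduced from a 150 × 400 array to 25 composite features of length 400.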
The overall procedure of our system can be summarized as follows (Figure 4):
Step 1: Generate υ composite vectors xi(k) ∈ ℝ^l, i = 1, ..., υ, from an e-nose data sample by shifting a window of length l by the step size s.
Step 2: For each composite vector xi(k), compute the within-class and between-class distances.
Step 3: Compute the discriminant distance for the i-th composite vector.
Step 4: Construct the measure vector S ∈ ℝ^υ whose i-th element Si is the discriminant score of the i-th composite vector.
Step 5: Select the ncf composite vectors with the largest Si.
Step 6: Extract the final composite features using only the selected composite vectors.
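The scoring and selection steps of this procedure can be sketched in plain Python as below. The exact discriminant distance is not reproduced here, so the score is an assumption: between-class distance minus β times within-class distance, both measured as squared Euclidean distances to class and global means. The weight `beta` and all function names are hypothetical.

```python
def _mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def _sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def discriminant_scores(X, y, beta=1.0):
    """X[k][i] is the i-th composite vector of sample k; y[k] is its label.

    Assumed score: (d_between - beta * d_within) / N for each composite vector.
    """
    n_samples = len(X)
    scores = []
    for i in range(len(X[0])):
        vecs = [X[k][i] for k in range(n_samples)]
        global_mean = _mean(vecs)
        d_w = d_b = 0.0
        for c in sorted(set(y)):
            members = [vecs[k] for k in range(n_samples) if y[k] == c]
            class_mean = _mean(members)
            d_w += sum(_sqdist(x, class_mean) for x in members)
            d_b += len(members) * _sqdist(class_mean, global_mean)
        scores.append((d_b - beta * d_w) / n_samples)
    return scores


def select_top(scores, n_cf):
    """Indices of the n_cf highest-scoring composite vectors, in window order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:n_cf])
```

Only the selected indices are passed on to feature extraction, so the covariance matrices shrink from υ × υ to ncf × ncf.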
4. Experimental Results
The VOC measurement data consists of 8 classes: acetone, benzene, cyclohexane, ethanol, heptane, methanol, propanol, and toluene. For each class, we obtained 20 samples, and thus the total data set contains 160 samples. Figure 5 shows the distribution of the data samples in the subspace spanned by two principal component axes. The e-nose sensor used in this experiment measures vapors at a rate of 10 Hz, which corresponds to 2,000 points per 200 s. Each data sample was measured through 16 channels over 2,000 time points and was represented as a 16 × 2,000 matrix. Then, the raw data was transformed into a 32,000-dimensional vector by using the lexicographic ordering operator for feature extraction (Figure 2).
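A small Python sketch of the reshaping step follows. It assumes that the lexicographic ordering concatenates the channels one after another (row-major order), which the text does not state explicitly. The helper names are hypothetical.

```python
def flatten_row_major(matrix):
    """Concatenate the rows (channels) of a 2-D list into one flat vector."""
    return [value for row in matrix for value in row]


def flat_index(channel, t, n_time=2000):
    """Position of time point t of a channel in the flattened vector."""
    return channel * n_time + t


# Synthetic 16 x 2,000 matrix whose entries encode their own position.
matrix = [[c * 2000 + t for t in range(2000)] for c in range(16)]
flat = flatten_row_major(matrix)
```

Under this assumption, the first 2,000 entries of the vector are the full response curve of the first channel, the next 2,000 belong to the second channel, and so on.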
When l and s are set to 400 and 200, respectively, a total of 159 composite vectors can be generated from a 32,000-dimensional data sample. We measured the discriminant score of each composite vector using the proposed method. Out of the 159 composite vectors, we marked those with the top 60 and top 120 scores as ‘1’ and the rest as ‘0’ (Figure 6). Figure 6 shows that the ‘stabilization’ and ‘purge’ periods, as well as the ‘exposure’ period, contain discriminative information for odor classification.
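To relate a composite vector index to the measurement periods, the window offsets can be mapped back to channel and time. The sketch below assumes row-major flattening of the 16 × 2,000 matrix and phase boundaries of 300, 900 and 2,000 samples (30 s, 60 s and 110 s at 10 Hz). Both the function name and these boundaries are assumptions for illustration.

```python
PHASE_BOUNDS = [("stabilization", 0, 300), ("exposure", 300, 900), ("purge", 900, 2000)]


def describe_window(i, l=400, s=200, n_time=2000):
    """List (channel, t_start, t_stop, phases) segments covered by window i."""
    start, stop = i * s, i * s + l
    covered = []
    pos = start
    while pos < stop:
        channel, t = divmod(pos, n_time)
        seg_end = min(stop, (channel + 1) * n_time)  # stop at the channel edge
        t_end = t + (seg_end - pos)
        phases = [name for name, a, b in PHASE_BOUNDS if t < b and t_end > a]
        covered.append((channel, t, t_end, phases))
        pos = seg_end
    return covered
```

Under these assumptions, every tenth window (i = 9, 19, ...) straddles two channels, covering the end of one channel's purge period and the start of the next channel's stabilization period.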
We compared the classification performance of the proposed method (CVS) with that of the LDA method, the FF (Feature Feedback) method, the CC-PCA (Component Correction by PCA) method, and the CC-CPCA (Component Correction by Common PCA) method. We applied PCA after CC-PCA and CC-CPCA, which slightly increased their classification rates. Each method was evaluated using an 8-fold cross validation strategy. In this scheme, the data is first randomly partitioned into 8 equally sized folds. Then, 8 iterations of training and testing are performed, in each of which a different fold of the data (20 data samples) is used for testing, while the remaining 7 folds (140 data samples) are used for training. The nearest neighbor rule was used as the classifier and the l2 norm was used to measure the distance between two samples. We repeated this test 8 times and computed the average classification rate. All the data samples were normalized using the mean and the variance of the training set.
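The evaluation protocol can be sketched in plain Python as follows: a random 8-fold split, normalization with training-set statistics only, and a nearest neighbor rule with the l2 norm. The function names are hypothetical, and the sketch classifies whatever feature vectors it is given. In the full system, feature extraction would also be fitted on the training folds.

```python
import math
import random


def standardize(train, test):
    """Scale both sets with the mean and standard deviation of the training set."""
    dims = range(len(train[0]))
    means = [sum(x[d] for x in train) / len(train) for d in dims]
    stds = []
    for d in dims:
        var = sum((x[d] - means[d]) ** 2 for x in train) / len(train)
        stds.append(math.sqrt(var) or 1.0)  # avoid division by zero

    def scale(x):
        return [(x[d] - means[d]) / stds[d] for d in dims]

    return [scale(x) for x in train], [scale(x) for x in test]


def nearest_neighbor(train_x, train_y, query):
    """Label of the training sample closest to query in the l2 norm."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def cross_validate(X, y, n_folds=8, seed=0):
    """Classification rate of 1-NN under random n-fold cross validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    correct = 0
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        train, test = standardize([X[i] for i in train_idx], [X[i] for i in test_idx])
        train_y = [y[i] for i in train_idx]
        correct += sum(nearest_neighbor(train, train_y, q) == y[i]
                       for q, i in zip(test, test_idx))
    return correct / len(X)
```

Repeating the test, as described above, would correspond to calling `cross_validate` with different `seed` values and averaging the returned rates.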
Since noise is likely to occur in sensing data, we added Gaussian noise with a standard deviation of 3 to each data sample and evaluated the robustness of each method to the noise. Figures 7 and 8 show examples of the data with and without Gaussian noise and the classification rates for each case, respectively.
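The noise injection can be sketched as below, assuming zero-mean noise drawn independently for every element of a sample. Whether the noise was added before or after normalization is not stated, so this helper (a hypothetical name) only shows the perturbation itself.

```python
import random


def add_gaussian_noise(sample, sigma=3.0, rng=None):
    """Return a copy of sample with independent N(0, sigma^2) noise added."""
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, sigma) for x in sample]


# A seeded generator makes the noisy copy reproducible.
noisy = add_gaussian_noise([0.0] * 20000, sigma=3.0, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the noisy test sets identical across the methods being compared.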
For the original data, all the methods classified each vapor well, with high classification rates, as can be seen in Figure 8a. When Gaussian noise was added, the classification rates of the other methods decreased rapidly (Figure 8b). In contrast, the proposed method gave consistently high classification rates of 97.3% to 98.4%, which shows that our system performs reliably in a noisy environment.
5. Conclusions
We have presented a method to select useful composite vectors for odor classification. Composite vectors, which are generated from an electronic nose data sample by shifting a window, are likely to contain information that is redundant for extracting discriminant features, as well as noise introduced during measurement with a sensor array. Thus, we evaluated the class separability of each composite vector based on a discriminant distance and selected only the composite vectors with large discriminative information. This selection process has the advantage of viewing the electronic nose response holistically while focusing on the extraction of informative response characteristics. The proposed composite vector selection method not only reduced the computational complexity, but also helped to extract better features. Extracting good features not only reduces the influence of noise in the measured data, but also improves the performance of a classifier such as SVM or NN. When SVM was used without any feature extraction, the classification rate was 98.0% for the original electronic nose data but dropped to 51.2% for the data with Gaussian noise. In contrast, NN with the features extracted by the proposed method gave classification rates of 99.8% and 98.4% for the same data sets, respectively. Hence, the proposed method can be used together with algorithms for other parts of the classification process, such as feature selection or classifier design, to improve the performance of the overall classification system.
In this paper, we focused on classification between gas data classes without interference. It is also important to classify e-nose data that contains combinations of gases, different concentrations, etc. In the near future, we will deal with the interference between gases and with gas combinations.
Acknowledgments
The present research was conducted by the research fund of Dankook University in 2011.
Author Contributions
Sang-Il Choi designed the experiments, and drafted the manuscript. Gu-Min Jeong provided useful suggestions and edited the draft. All authors approved the final version of the manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Chiu, S.-W.; Tang, K.-T. Towards a chemiresistive sensor-integrated electronic nose: A review. Sensors 2013, 13, 14214–14247.
2. Choi, S.-I.; Kim, S.-H.; Yang, Y.; Jeong, G.-M. Data refinement and channel selection for a portable e-nose system by the use of feature feedback. Sensors 2010, 10, 10387–10400.
3. Sayeed, A.; Shameem, M.S. Electronic nose. Adv. Med. Inform. 2011, 1, 6–9.
4. Perera, A.; Pardo, A.; Gutierrez-Osuna, R.; Marco, S. A portable electronic nose based on embedded PC technology and GNU/Linux: Hardware, software and applications. IEEE Sens. J. 2002, 2, 235–246.
5. Khalaf, W.; Pace, C.; Gaudioso, M. Least square regression method for estimating gas concentration in an electronic nose system. Sensors 2009.
6. Wilson, A.D.; Baietto, M. Applications and advances in electronic-nose technologies. Sensors 2009, 9, 5099–5148.
7. Macias, M.M.; Agudo, J.E.; Manso, A.G.; Orellana, C.J.G.; Velasco, H.M.G.; Caballero, R.G. A compact and low cost electronic nose for aroma detection. Sensors 2013, 13, 5528–5541.
8. Wilson, A.D.; Baietto, M. Advances in electronic-nose technologies developed for biomedical applications. Sensors 2011, 11, 1105–1176.
9. Pardo, M.; Sberveglieri, G. Classification of electronic nose data with support vector machines. Sens. Actuators B Chem. 2005, 107, 730–737.
10. Arshak, K.; Moore, E.; Lyon, G.M.; Harris, J.; Clifford, S. A review of gas sensors employed in electronic nose applications. Sens. Rev. 2004, 24, 181–198.
11. Albert, K.J.; Lewis, N.S. Cross reactive chemical sensor arrays. Chem. Rev. 2000, 100, 2595–2626.
12. Dickinson, T.A.; White, J.; Kauer, J.S.; Walt, D.R. Current trends in artificial-nose technology. Trends Biotechnol. 1998, 16, 250–258.
13. Mirmohseni, A.; Oladegaragoze, A. Construction of a sensor for determination of ammonia and aliphatic amines using polyvinylpyrrolidone coated quartz crystal microbalance. Sens. Actuators B Chem. 2003, 89, 146–172.
14. Pearce, T.C.; Schiffman, S.S.; Nagle, H.T.; Gardner, J.W. Handbook of Machine Olfaction: Electronic Nose Technology, 2nd ed.; Wiley-VCH: Weinheim, Germany, 2003.
15. Yang, Y.S.; Ha, S.-C.; Kim, Y.S. A matched-profile method for simple and robust vapor recognition in electronic nose (E-nose) system. Sens. Actuators B Chem. 2005, 106, 263–270.
16. Šetkus, A.; Olekas, A.; Senulienė, D.; Falasconi, M.; Pardo, M.; Sberveglieri, G. Analysis of the dynamic features of metal oxide sensors in response to SPME fiber gas release. Sens. Actuators B Chem. 2010, 146, 539–544.
17. Pardo, M.; Sberveglieri, G. Comparing the performance of different features in sensor arrays. Sens. Actuators B Chem. 2007, 123, 437–443.
18. Falasconi, M.; Pardo, M.; Sberveglieri, G.; Riccò, I.; Bresciani, A. The novel EOS835 electronic nose and data analysis for evaluating coffee ripening. Sens. Actuators B Chem. 2005, 110, 73–80.
19. Huang, J.; Gutierrez-Osuna, R. Active analysis of chemical mixtures with multi-modal sparse non-negative least squares. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, Canada, 26–31 May 2013; pp. 8756–8760.
20. Gosangi, R.; Gutierrez-Osuna, R. Active temperature modulation of metal-oxide sensors for quantitative analysis of gas mixtures. Sens. Actuators B Chem. 2013, 185, 201–210.
21. Padilla, M.; Perera, A.; Montoliu, I.; Chaudry, A.; Persaud, K.; Marco, S. Drift compensation of gas sensor array data by orthogonal signal correction. Chemom. Intell. Lab. Syst. 2010, 100, 28–35.
22. Ziyatdinov, A.; Marco, S.; Chaudry, A.; Persaud, K.; Caminal, P.; Perera, A. Drift compensation of gas sensor array data by common principal component analysis. Sens. Actuators B Chem. 2010, 146, 460–465.
23. Vergara, A.; Vembu, S.; Ayhan, T.; Ryan, M.A.; Homer, M.L. Chemical gas sensor drift compensation using classifier ensembles. Sens. Actuators B Chem. 2012, 166, 320–329.
24. Choi, S.-I.; Oh, J.; Choi, C.-H.; Kim, C. Input variable selection for feature extraction in classification problems. Signal Process. 2012, 92, 636–648.
25. Jeong, G.-M.; Ahn, H.-S.; Choi, S.-I.; Kwak, N.; Moon, C. Pattern recognition using feature feedback: Application to face recognition. Int. J. Control Autom. Syst. 2010, 8, 1–8.
26. Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720.
27. Yu, H.; Yang, J. A direct LDA algorithm for high-dimensional data with application to face recognition. Pattern Recognit. 2001, 34, 2067–2070.
28. Turk, M.; Pentland, A. Eigenfaces for recognition. J. Cogn. Neurosci. 1991, 3, 71–86.
29. Fukunaga, K. Introduction to Statistical Pattern Recognition, 2nd ed.; Academic Press: New York, NY, USA, 1990.
30. Chen, S.; Zhu, Y.; Zhang, D. Feature extraction approaches based on matrix pattern: MatPCA and MatFLDA. Pattern Recognit. Lett. 2005, 26, 1157–1167.
31. Xiong, H.; Swamy, M.; Ahmad, M. Two-Dimensional FLD for face recognition. Pattern Recognit. 2005, 38, 1121–1124.
32. Kim, C.; Choi, C.-H. A discriminant analysis using composite features for classification problems. Pattern Recognit. 2007, 40, 2118–2125.
33. Kim, C.; Choi, C.-H. Image covariance-based subspace method for face recognition. Pattern Recognit. 2007, 40, 1592–1604.
34. Choi, S.-I.; Jeong, G.-M.; Kim, C. Classification of odorants in the vapor phase using composite features for a portable e-nose system. Sensors 2012, 12, 16182–16193.
35. Liang, J.; Yang, S.; Winstanley, A. Invariant optimal feature selection: A distance discriminant and feature ranking based solution. Pattern Recognit. 2008, 41, 1429–1439.
36. Kim, C.; Choi, S.-I.; Turk, M.; Choi, C.-H. A new biased discriminant analysis using composite vectors for eye detection. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2012, 42, 1095–1106.
37. Ha, S.-C.; Kim, Y.S.; Yang, Y.; Kim, Y.J.; Cho, S.-M.; Yang, H.; Kim, Y.T. Integrated and microheater embedded gas sensor array based on the polymer composites dispensed in micromachined wells. Sens. Actuators B Chem. 2005, 105, 549–555.
38. Jeong, G.-M.; Ahn, H.-S.; Choi, S.-I.; Kwak, N.; Moon, C. Pattern recognition using feature feedback: Application to face recognition. Int. J. Control Autom. Syst. 2010, 8, 1–8.
39. Artursson, T.; Eklov, T.; Lundstrom, I.; Martensson, P.; Sjostrom, M.; Holmberg, M. Drift correction for gas sensors using multivariate methods. J. Chemom. 2000, 14, 711–723.
40. Refaeilzadeh, P.; Tang, L.; Liu, H. Cross-validation. In Encyclopedia of Database Systems; Springer: New York, NY, USA, 2009; pp. 532–538.
|Channel|Polymer|
|Ch9|Poly(bisphenol A carbonate)|
|Ch11|Poly(vinyl butyral)-co-vinyl alcohol-co-vinyl acetate|
© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license ( http://creativecommons.org/licenses/by/3.0/).