A Novel Sparse Representation Classification Method for Gas Identification Using Self-Adapted Temperature Modulated Gas Sensors

A novel sparse representation classification (SRC) method, namely SRC based on the Method of Optimal Directions (SRC_MOD), is proposed for electronic nose systems in this paper. By finding both a synthesis dictionary and a corresponding coefficient vector, the i-th class training samples are approximated as a linear combination of a few dictionary atoms. The optimal solutions of the synthesis dictionary and coefficient vector are found by MOD. Finally, testing samples are identified by evaluating which class yields the smallest reconstruction error. The proposed algorithm is evaluated on the analysis of hydrogen, methane, carbon monoxide, and benzene at a self-adapted modulated operating temperature. Experimental results show that the proposed method is efficient and computationally inexpensive while achieving excellent identification of the target gases.


Introduction
Electronic noses are devices that combine a gas sensor array with a pattern recognition system [1]. However, the pattern recognition of electronic noses is, in many cases, plagued with problems. It is quite usual to encounter drift, scattering due to concentration effects, highly correlated features, or non-Gaussian data distributions [2,3]. In addition, due to high calibration costs and complex experimental conditions, the number of training samples is limited. Hence, the performance of the classifier is very important to electronic noses, as a good classifier can improve robustness to the problems mentioned above.
At present, there are many classification methods for gas sensor data [4][5][6][7][8][9] such as deep learning and support vector machine (SVM). Since the concept of deep learning was put forward, it has attracted the attention of many scholars [10][11][12]. Peng et al. proposed a novel Deep Convolutional Neural Network (DCNN) tailored for gas classification [13]. Wei et al. also proposed a new improved LeNet-5 gas identification convolutional neural network structure for electronic noses [14].
Because support vector machine (SVM) has good generalization properties and robustness against the curse of dimensionality [15], SVM has been widely applied to gas identification [16][17][18]. Vergara et al. used Inhibitory Support Vector Machine (ISVM) to detect and identify odor under complex environmental conditions [17]. Sakumura et al. also used SVM to detect respiratory samples, and achieved high detection accuracy [18].

Sparse representation classification (SRC) has been shown to be robust to outliers, noise, and even incomplete measurements [24][25][26], and some scholars have successfully used SRC for gas classification; for example, Guo et al. [27] used sparse representation-based classification to identify breath samples. However, SRC is time consuming, which limits its application.
The contribution of this paper is a novel sparse representation classification method, namely SRC based on the Method of Optimal Directions (SRC_MOD), that improves the classification performance of the electronic nose. In order to improve the learning speed, the training set is divided into subsets according to the sample labels, and the optimal synthesis dictionary submatrix and coefficient submatrix of each subset are solved by MOD. Moreover, the testing phase is separated from the training phase. The structure of the article is as follows: Section 2 briefly introduces the experimental setup and data collection. Section 3 describes the proposed SRC_MOD method. Section 4 discusses comparison results with other classifiers. Section 5 presents the conclusions.

The Measurement Circuit
The selectivity and sensitivity of the metal oxide gas sensors can be improved by optimizing the operation temperature of the sensors [28,29]. Martinelli designed a self-adapted temperature modulation circuit and achieved high detection accuracy. This method implements the concept of self-adapted temperature modulation, and it is based on the evidence that the sensitivity to the gas of the sensor resistance depends on the operating temperature, and, conversely, the sensitivity to the temperature depends on the gas [28].
In this paper, we propose an improved self-adapted temperature modulated measurement circuit to improve the performance of the electronic nose. The measurement circuit is shown in Figure 1. It mainly contains a multivibrator circuit, three gas sensors, and a comparator $C_1$. The resistances of sensor 1 and sensor 2 are part of the multivibrator circuit. $V_{REFL}$ and $V_{REFH}$ denote the low and high reference voltages of the heating voltage, respectively. In this paper, $V_{REFL}$ = 2 V and $V_{REFH}$ = 5 V.
We can see from Figure 1 that if the output voltage $V_{O3}$ of the 555 timer is low ($V_{O3}$ ≈ 0 V), then $V_{O3} < V_{REFL}$, the output voltage of comparator $C_1$ is low, and the transistor $T_1$ is cut off. In this case, the heating voltage is close to $V_{REFL}$. On the other hand, if $V_{O3}$ is high ($V_{O3} ≈ V_{CC}$), then $V_{O3} > V_{REFL}$, the output voltage of comparator $C_1$ is high, and the transistor $T_1$ is turned on. In this case, the heating voltage is close to $V_{REFH}$.
$V_{O3}$ is a square wave signal whose period depends on the capacitor C and the resistances of sensor 1 and sensor 2. The charging time of capacitor C is given by Equation (1) and the discharge time by Equation (2); hence, the period of the square wave signal is given by Equation (3) and the duty cycle of $V_{O3}$ by Equation (4). The output voltage of sensor 3 is given by Equation (5), where $R_{S3}$ is the sensor resistance of the TGS2620 sensor (sensor 3).
Figure 1. The measurement circuit. Sensor 1 is TGS2610, sensor 2 is TGS2610, and sensor 3 is TGS2620. $V_{REFL}$ and $V_{REFH}$ represent the low and high reference voltages of the heating voltage, respectively.
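As a rough numerical illustration of how the sensor resistances set the modulation period, the sketch below uses the textbook timing relations of a 555 timer in astable mode. It assumes, hypothetically, that sensor 1 takes the place of the resistor between the supply and the discharge pin and sensor 2 that of the resistor between the discharge pin and the capacitor; the actual wiring is the one defined in Figure 1. The resistance values are invented for illustration, and only C = 100 µF comes from the text.

```python
import math

def astable_555(r_a_ohm, r_b_ohm, c_farad):
    """Textbook 555 astable timing (assumed topology, not read from Figure 1)."""
    t_charge = math.log(2) * (r_a_ohm + r_b_ohm) * c_farad   # output high
    t_discharge = math.log(2) * r_b_ohm * c_farad            # output low
    period = t_charge + t_discharge
    duty = t_charge / period
    return t_charge, t_discharge, period, duty

C = 100e-6  # capacitor value stated in the text

# Hypothetical sensor resistances (sensor 1, sensor 2) in dry air and in a reducing gas.
for label, r1, r2 in [("dry air", 120e3, 120e3), ("reducing gas", 10e3, 10e3)]:
    t_c, t_d, period, duty = astable_555(r1, r2, C)
    print(f"{label}: T = {period:.2f} s, f = {1e3 / period:.0f} mHz, duty = {duty:.2f}")
```

Under these assumptions, a drop in sensor resistance shortens both the charging and the discharging time, which is the mechanism by which exposure to a reducing gas raises the frequency of the heating waveform.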

The Experimental Set-Up
Figure 2 shows the experimental set-up. The testing system uses two computer-controlled digital mass flow controllers (MFCs). The testing gas at the desired concentration is conveyed by the MFCs to a 300 mL testing chamber with high reproducibility and accuracy. The total flow is kept constant for each test; in this paper, the total flow rate is set to 500 sccm. The above-mentioned measurement circuit, which contains three gas sensors (TGS2610, TGS2610, and TGS2620; Figaro Inc., Japan), is placed in the testing chamber. All experimental samples are collected by a LabVIEW program running on a PC, and the sampling frequency is set to 1 Hz.
The measurement procedure is as follows:
(1) Clean the testing chamber with dry air for 50 s.
(2) Convey the testing gas at the desired concentration to the testing chamber by the MFCs for 100 s.
(3) Clean the testing chamber with dry air for 100 s.
When the three sensors return to their baseline steady-state response, repeat steps 1 to 3 for the next test until all the experiments are completed.
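The three phases of the procedure, sampled at 1 Hz, fix the length of each recorded response. The following sketch simply tabulates the sample indices of each phase; the function and variable names are illustrative and not part of the original acquisition program.

```python
SAMPLE_RATE_HZ = 1  # sampling frequency stated in the text
PHASES = [("clean", 50), ("gas exposure", 100), ("clean", 100)]  # durations in seconds

def phase_boundaries(phases, rate_hz=SAMPLE_RATE_HZ):
    """Return (name, first_sample, last_sample) per phase, 1-based and inclusive."""
    out, start = [], 1
    for name, seconds in phases:
        n = seconds * rate_hz
        out.append((name, start, start + n - 1))
        start += n
    return out

boundaries = phase_boundaries(PHASES)
total_samples = boundaries[-1][2]
for name, first, last in boundaries:
    print(f"{name:>12}: samples {first}-{last}")
print("samples per measurement:", total_samples)
```

The gas exposure window covers samples 51 to 150, and one measurement contains 250 samples in total.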

Data Collection
Four chemical analytes at different concentrations are tested by the electronic nose system. As shown in Table 1, the tested gases are hydrogen, methane, carbon monoxide, and benzene. Each test is repeated 20 times, and 400 samples are collected in total. Figure 3 shows the heating voltage for 30 ppm benzene and the output voltages for the four analytes. As Figure 3a shows, the frequency of the heating waveform in the middle of the measurement is higher than that on both sides. This is caused by the change in the resistances of sensor 1 and sensor 2: the reducing gas is injected into the testing chamber from 51 to 150 s, which decreases the sensor resistances and increases the waveform frequency. With C = 100 µF, the period T of the heating voltage ranges from 2 to 25 s, and the frequency correspondingly ranges from 40 to 500 mHz.
The prediction model is selected by 10-fold cross-validation. 360 samples are randomly divided into 10 subsets of equal size. A single subset is retained as the validation set, and the remaining nine subsets are used as the training set. Hence, the number of validation samples is 36 and the number of training samples is 324. The program runs 10 times, with each of the 10 subsets used exactly once as the validation set. The prediction model with the highest recognition rate is used as the final model to identify the testing gas.
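The splitting scheme described above can be sketched as follows. This is a minimal illustration with NumPy; the function name and the random seed are assumptions, and only the sizes (360 samples, 10 subsets) come from the text.

```python
import numpy as np

def ten_fold_indices(n_samples=360, n_folds=10, seed=0):
    """Yield (train_idx, val_idx); each subset is the validation set exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train_idx, val_idx

splits = list(ten_fold_indices())
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each of the 10 runs trains on 324 samples and validates on the remaining 36, and every sample is used for validation exactly once.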

The Proposed SRC_MOD Algorithm for Gas Identification
Suppose $V_O \in \mathbb{R}^{250 \times 1}$ denotes a sensor signal, that is, a time series with 250 points in total. First, additive noise or drift is removed from the sensor signal by $x = V_O - V_{ref}$, where $V_{ref}$ is the baseline steady-state output voltage in dry air. Then, the sensor sample x is normalized by $x \leftarrow (x - \min(x))/(\max(x) - \min(x))$, where min(·) and max(·) denote the minimum and maximum value of the sample.
The normalized training sample dataset is represented by a matrix $X = [X_1, X_2, \cdots, X_n] \in \mathbb{R}^{250 \times 324}$, where $X_i$, $i = 1, 2, \cdots, n$, is the submatrix of training samples corresponding to class i, each column is a sensor sample, and n is the total number of classes.
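The two preprocessing steps can be written compactly as below. This is a sketch on a synthetic response, not the recorded data; the handling of a completely flat response is an added assumption, since the text does not discuss that case.

```python
import numpy as np

def preprocess(v_out, v_ref):
    """Baseline-subtract and min-max normalize one sensor response."""
    x = np.asarray(v_out, dtype=float) - v_ref
    span = x.max() - x.min()
    if span == 0:                      # flat response: assumed handling, avoids 0/0
        return np.zeros_like(x)
    return (x - x.min()) / span

# Synthetic 250-point response: baseline of 1.2 V, step during gas exposure.
t = np.arange(250)
v_out = 1.2 + 0.8 * ((t >= 50) & (t < 150))
x = preprocess(v_out, v_ref=1.2)
print(x.shape, x.min(), x.max())
```

After normalization every sample lies in the interval [0, 1], so classes are compared by the shape of the response rather than by its absolute amplitude.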
The ith class training samples $X_i$ are approximated as a linear combination of a few of the dictionary atoms. The approximation $X_i^*$ can be written as:

$$X_i^* = D_i W_i, \quad \{D_i, W_i\} = \arg\min_{D_i, W_i} \|W_i\|_0 \quad \text{s.t.} \quad \|X_i - D_i W_i\|_F^2 \le \varepsilon \tag{6}$$

where $\|\cdot\|_0$ is the $l_0$-norm, ε is the maximum allowed error, $W_i$ contains the coefficients of the ith class training samples, with most of its entries equal to zero, and $D_i$ is a synthesis dictionary corresponding to class i. Equation (6) describes each given sensor signal by its sparsest representation $W_i$ over the synthesis dictionary $D_i$, and aims to jointly find the proper representations and the dictionary. If a solution has been found such that every representation has sufficiently few non-zero entries, a candidate feasible model has been found. In this paper, the synthesis dictionary $D_i$ is initialized as a random matrix of size 250 × 250.
Equation (6) can be formulated as an optimization problem with respect to $W_i$ and $D_i$. With a weight γ, we may put it as:

$$\{D_i, W_i\} = \arg\min_{D_i, W_i} \|W_i\|_0 + \gamma \|X_i - D_i W_i\|_F^2 \tag{7}$$

As γ increases, the reconstruction error is penalized more heavily and the solution becomes denser. Solutions of Equation (7) can be found by the Method of Optimal Directions (MOD). MOD is a dictionary learning algorithm [30]. Its aim is to find both a dictionary $D_i$ and a corresponding coefficient matrix $W_i$ such that the representation error $R = X_i - D_i W_i$ is minimized and $W_i$ fulfills a sparseness criterion. The procedure for obtaining the optimal solution of $D_i$ and $W_i$ is summarized in Algorithm 1.

Algorithm 1. Obtain the optimal solution of $D_i$ and $W_i$
Input: The i-th class training samples $X_i$, maximum error ε, k = 1.
1. Initialize the dictionary $D_i$.
2. With $D_i$ fixed, find the sparse coefficients $W_i$ by a pursuit algorithm [31].
3. With $W_i$ fixed, update $D_i$ by MOD [30]; set k = k + 1.
4. Repeat steps 2 and 3 until the representation error $\|X_i - D_i W_i\|_F^2$ is below ε.
Output: $D_i$ and $W_i$.

For every class i, the training samples X can be projected onto a coding coefficient space via $P_i X$, where $P_i$ is an analysis dictionary corresponding to class i. The coding coefficient matrix $W_i$ is given by:

$$W_i = P_i X_i \tag{8}$$

where $P_i$ is a full-rank matrix. If most of the large coefficients generated by $P_i X$ are concentrated in $W_i$, while the coding coefficients of the other classes' training samples over $P_i$ are as small as possible, the discrimination power of $P_i$ is promoted. Hence, we may improve the discrimination power of $P_i$ by $\min \|P_i \bar{X}_i\|_F^2$, where $\bar{X}_i$ is the complementary data matrix of $X_i$ in the whole training set X and $\|\cdot\|_F$ is the Frobenius (Hilbert-Schmidt) norm.
We evaluate the error using the Frobenius norm. The i-th class analysis dictionary $P_i$ can be approximated by

$$P_i = \arg\min_{P_i} J(P_i), \quad J(P_i) = \|W_i - P_i X_i\|_F^2 + \alpha \|P_i \bar{X}_i\|_F^2 + \beta \|P_i\|_F^2 \tag{9}$$

where the first term minimizes the error of the ith class coding coefficients, the second term improves the recognition performance of the analysis dictionary $P_i$, and α is a scalar constant. The third term reduces the risk of overfitting to the training samples, and β is a regularization parameter. We then solve for $P_i$ by least squares. Differentiating Formula (9) with respect to $P_i$ results in:

$$\frac{\partial J}{\partial P_i} = -2 (W_i - P_i X_i) X_i^T + 2 \alpha P_i \bar{X}_i \bar{X}_i^T + 2 \beta P_i \tag{10}$$

Setting Equation (10) equal to zero gives the optimum $P_i$ as:

$$P_i = W_i X_i^T \left( X_i X_i^T + \alpha \bar{X}_i \bar{X}_i^T + \beta I \right)^{-1} \tag{11}$$

Since $W_i = P_i X_i$, Equation (6) can also be given by

$$X_i^* = D_i P_i X_i \tag{12}$$

Define $\Phi_i = D_i P_i$ as a projection matrix; the approximation $X_i^*$ is then rewritten as

$$X_i^* = \Phi_i X_i \tag{13}$$

Using Equation (13), an arbitrary testing sample $x_{test}$ can be reconstructed as:

$$x_i^* = \Phi_i x_{test} \tag{14}$$

We obtain n approximations $x_1^*, x_2^*, \cdots, x_n^*$, and calculate the residual between $x_{test}$ and $x_i^*$ by

$$r_i(x_{test}) = \|x_{test} - x_i^*\|_2, \quad i = 1, 2, \cdots, n \tag{15}$$

where $\|\cdot\|_2$ is the $l_2$-norm. The label corresponding to the minimum residual is the class of the testing sample.
Figure 4 shows an original testing sample $x_{test}$ and the four reconstructed samples $x_i^*$ (i = 1, 2, 3, 4) obtained by Formula (14). From Figure 4, we can see that $x_2^*$ is closest to the original testing sample $x_{test}$. The residuals between $x_{test}$ and $x_i^*$ (i = 1, 2, 3, 4) are shown in Figure 5. From Figure 5, we can see that the second residual is the smallest. Hence, the testing sample $x_{test}$ belongs to the 2nd class, namely methane.
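The least-squares solution for the analysis dictionary can be checked numerically. The sketch below assumes the three-term objective described above, namely the coding error on $X_i$, an α-weighted penalty on the responses to the complementary data $\bar{X}_i$, and a β-weighted Frobenius regularizer on $P_i$. It evaluates the closed form of Equation (11) on random toy matrices and confirms that the gradient vanishes there. The matrix sizes and the values of α and β are arbitrary choices for illustration.

```python
import numpy as np

def analysis_dictionary(W, X, X_bar, alpha, beta):
    """Minimizer of ||W - P X||_F^2 + alpha ||P X_bar||_F^2 + beta ||P||_F^2."""
    d = X.shape[0]
    G = X @ X.T + alpha * (X_bar @ X_bar.T) + beta * np.eye(d)
    # G is symmetric, so P = W X^T G^{-1} follows from solving G P^T = X W^T.
    return np.linalg.solve(G, X @ W.T).T

def gradient(P, W, X, X_bar, alpha, beta):
    """Gradient of the objective with respect to P."""
    return -2 * (W - P @ X) @ X.T + 2 * alpha * P @ X_bar @ X_bar.T + 2 * beta * P

rng = np.random.default_rng(0)
d, m, n_i, n_rest = 8, 6, 5, 15        # toy sizes, not those of the paper
X = rng.normal(size=(d, n_i))          # samples of class i
X_bar = rng.normal(size=(d, n_rest))   # samples of all other classes
W = rng.normal(size=(m, n_i))          # coding coefficients of class i

P = analysis_dictionary(W, X, X_bar, alpha=0.5, beta=0.1)
grad_norm = np.linalg.norm(gradient(P, W, X, X_bar, 0.5, 0.1))
print("gradient norm at the closed-form solution:", grad_norm)
```

Solving the linear system instead of forming the inverse explicitly is a numerical choice made here; it yields the same matrix as Equation (11).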
The proposed SRC_MOD algorithm is summarized in Algorithm 2.

Algorithm 2. The proposed SRC_MOD algorithm
Input: The training samples for n classes, testing samples.
1. for i = 1:n
     obtain $D_i$ and $W_i$ by Algorithm 1;
     obtain $P_i$ by Equation (11);
   end for
2. $\Phi_i = D_i P_i$
3. Reconstruct the testing sample by Equation (14)
4. Identify the testing sample by Equation (15)
Output: The label of the testing sample.
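To make the flow of Algorithms 1 and 2 concrete, the following sketch runs the whole pipeline on synthetic data. Several choices are assumptions rather than details taken from the paper: orthogonal matching pursuit stands in for the pursuit algorithm of [31], the sparsity level is fixed at three atoms per sample, the MOD update is computed with a pseudo-inverse and leaves unused atoms unchanged, and the values of α and β, the signal length, and the class templates are invented for the example.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse code w with x ~= D @ w."""
    w, residual, support = np.zeros(D.shape[1]), x.copy(), []
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < 1e-12:
            break
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        w[:] = 0.0
        w[support] = coef
        residual = x - D @ w
    return w

def sparse_code(D, X, n_nonzero):
    return np.column_stack([omp(D, x, n_nonzero) for x in X.T])

def mod(X, n_atoms, n_nonzero=3, n_iter=10, eps=1e-6, seed=0):
    """Algorithm 1 (sketch): alternate pursuit coding and the MOD dictionary update."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(X.shape[0], n_atoms))       # random initial dictionary
    D /= np.linalg.norm(D, axis=0)
    W = sparse_code(D, X, n_nonzero)
    for _ in range(n_iter):
        if np.linalg.norm(X - D @ W) <= eps:
            break
        used = np.abs(W).sum(axis=1) > 0             # atoms that appear in some code
        D[:, used] = (X @ np.linalg.pinv(W))[:, used]  # least-squares (MOD) update
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)
        W = sparse_code(D, X, n_nonzero)
    return D, W

def analysis_dictionary(W, X, X_bar, alpha, beta):
    """Equation (11): P = W X^T (X X^T + alpha X_bar X_bar^T + beta I)^{-1}."""
    G = X @ X.T + alpha * (X_bar @ X_bar.T) + beta * np.eye(X.shape[0])
    return np.linalg.solve(G, X @ W.T).T

def train(X_by_class, alpha=0.5, beta=0.1):
    """Algorithm 2, steps 1-2: one projection matrix Phi_i = D_i P_i per class."""
    projections = []
    for i, X_i in enumerate(X_by_class):
        X_bar = np.hstack([X for j, X in enumerate(X_by_class) if j != i])
        D_i, W_i = mod(X_i, n_atoms=X_i.shape[0])
        projections.append(D_i @ analysis_dictionary(W_i, X_i, X_bar, alpha, beta))
    return projections

def classify(projections, x_test):
    """Algorithm 2, steps 3-4: label with the smallest reconstruction residual."""
    residuals = [np.linalg.norm(x_test - Phi @ x_test) for Phi in projections]
    return int(np.argmin(residuals)), residuals

# Synthetic stand-in for the gas data: 4 classes, 30-point signals, 20 samples each.
rng = np.random.default_rng(1)
templates = rng.normal(size=(4, 30))
X_by_class = [t[:, None] + 0.05 * rng.normal(size=(30, 20)) for t in templates]
projections = train(X_by_class)
test_set = [(c, templates[c] + 0.05 * rng.normal(size=30))
            for c in range(4) for _ in range(10)]
accuracy = np.mean([classify(projections, x)[0] == c for c, x in test_set])
print("toy accuracy:", accuracy)
```

Note that classification only requires n matrix-vector products with the precomputed $\Phi_i$, which is why the testing phase is cheap once training is finished.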

Comparisons with Other Classifiers
In order to evaluate the performance of the proposed algorithm, we compare it with other algorithms, namely SRC (used in [27]), the dictionary learning (DL) classifier (proposed in [32]), deep learning (used in [14]), and a back-propagation (BP) artificial neural network. All experiments in this paper run on a dual-core processor with a CPU main frequency of 2.4 GHz. Python is used for deep learning and MATLAB for the other algorithms. Accuracy and processing time are averaged over all testing samples.
For SRC, the algorithm is the same as that in [27]. First, the testing sample is approximated as a linear combination of all training samples

$$x_{test} = X_{train} W \tag{16}$$

where W is a sparse coefficient vector and $X_{train}$ is the matrix of all training samples. The sparsest solution of Equation (16) is defined by the following $l_1$-minimization problem:

$$W^* = \arg\min_{W} \|x_{test} - X_{train} W\|_2^2 + \lambda \|W\|_1 \tag{17}$$

where $\|\cdot\|_1$ is the $l_1$-norm and λ is the regularization parameter. The solution to the $l_1$-minimization problem can be obtained using the MATLAB package provided in [33]. Keeping the ith class coefficients and setting the other coefficients to zero, we have

$$W_i^* = [0, \cdots, 0, w_{i,1}^*, \cdots, w_{i,n_i}^*, 0, \cdots, 0]^T \tag{18}$$

where $w_{i,1}^*, \cdots, w_{i,n_i}^*$ are the entries of $W^*$ associated with the $n_i$ training samples of class i. The testing sample is then reconstructed by $x_i^* = X_{train} W_i^*$ (i = 1, 2, 3, 4), and Equation (15) is used to obtain the class of the testing sample.
For the DL classifier, the i-th class analysis dictionary, coding coefficients, and synthesis dictionary are trained together to generate the prediction model. It is a simple and effective dictionary learning algorithm. This is our previous work, and more details of the algorithm are given in [32].
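The SRC baseline can be sketched in a few lines. The code below does not use the MATLAB package of [33]; as a stand-in it solves the $l_1$-regularized problem with plain iterative soft-thresholding (ISTA), using the common 0.5 scaling of the quadratic term, which only rescales λ. The data are synthetic and the value of λ is an arbitrary assumption.

```python
import numpy as np

def ista_l1(A, b, lam, n_iter=500):
    """Minimize 0.5*||b - A w||_2^2 + lam*||w||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = w - step * (A.T @ (A @ w - b))     # gradient step on the quadratic term
        w = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return w

def src_classify(X_train, labels, x_test, lam=0.01):
    """Code x_test over all training samples, then compare class-wise residuals."""
    w = ista_l1(X_train, x_test, lam)
    residuals = {}
    for c in np.unique(labels):
        w_c = np.where(labels == c, w, 0.0)    # keep the class-c coefficients only
        residuals[int(c)] = float(np.linalg.norm(x_test - X_train @ w_c))
    return min(residuals, key=residuals.get), residuals

# Synthetic data: 4 classes, 20 training samples each, 30-point signals.
rng = np.random.default_rng(0)
templates = rng.normal(size=(4, 30))
X_train = np.hstack([t[:, None] + 0.05 * rng.normal(size=(30, 20)) for t in templates])
labels = np.repeat(np.arange(4), 20)
x_test = templates[1] + 0.05 * rng.normal(size=30)

predicted, residuals = src_classify(X_train, labels, x_test)
print("predicted class:", predicted)
```

Because the sparse code is recomputed from the full training matrix for every testing sample, the cost of this baseline grows with the size of the training set at test time.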
For deep learning, as shown in Figure 6, a LeNet-5 convolutional neural network structure is built. C1 and C3 are convolutional layers with kernel sizes of 3 × 3 and 2 × 2, respectively. C1 computes 20 filters over its input: it takes a matrix of size 25 × 10 × 1 and outputs a matrix of size 25 × 10 × 20. The pooling layers P2 and P4 both use 2 × 2 windows. Max pooling consists of extracting windows from the input features and outputting the maximum value of each channel. Before the first max-pooling layer P2, the feature map is 25 × 10, and the pooling operation halves it to 12 × 5. The fully connected layers F5 and F6 have 120 and 84 units, respectively. Finally, the label of the gas sample is obtained. This is also our previous work, and more details are given in [14].
For the BP artificial neural network, the structure of the network is set to 250-501-4. The transfer function 'tansig' is applied for the hidden layer, and 'purelin' for the output layer.
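The feature-map sizes quoted for the convolutional network follow from simple shape arithmetic. The sketch below assumes that C1 uses stride 1 with zero padding that preserves height and width (implied by the unchanged 25 × 10 size) and that pooling drops incomplete border windows (implied by 25 being reduced to 12); the helper names are illustrative.

```python
def conv_same(shape, n_filters):
    """Stride-1 convolution with size-preserving padding: only the depth changes."""
    h, w, _ = shape
    return (h, w, n_filters)

def max_pool(shape, window=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    h, w, c = shape
    return (h // window, w // window, c)

x = (25, 10, 1)          # 250-point sample arranged as 25 x 10 x 1
c1 = conv_same(x, 20)    # first convolutional layer, 20 filters
p2 = max_pool(c1)        # first pooling layer, 2 x 2 windows
print("input:", x, "C1:", c1, "P2:", p2)
```

The last row of the 25-row feature map does not fill a complete 2 × 2 window, which is why the pooled height is 12 rather than 13.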
The experimental results of the proposed SRC_MOD method are shown in Table 2. From Table 2, we can see that the average accuracy of SRC_MOD is about 98.44%, the average training time is about 0.2061 s, and the average testing time is about 3.1 ms. The SRC_MOD classifier thus obtains high accuracy with a short testing time. The experimental results of the other classifiers are also shown in Table 2. The average accuracy of SRC is about 98.52% and the average testing time is about 1987.9 ms, which is about 641 times longer than that of SRC_MOD. The main reason is that training and testing of SRC are conducted simultaneously: solving the $l_1$-minimization problem is time consuming, and repeating it for each test leads to a long processing time.
The average accuracy of the DL classifier is about 96.88% and its testing time is about 6.5 ms. The SRC_MOD method is therefore superior to the DL classifier in both recognition accuracy and testing time, since the alternating direction method of multipliers (ADMM) is more complex than MOD when obtaining the optimal solution of $D_i$.
The average accuracy of deep learning is about 91.87%, the average training time is about 12.84 s, and the testing time is about 23.5 ms. The performance of deep learning is significantly worse than that of the SRC_MOD method. The main reason is that deep learning is more suitable for large training sets; in this experiment, the number of training samples is too small to show its advantage. For the BP artificial neural network, the performance is significantly worse than that of the other classifiers.
In summary, the comparison results show that the proposed SRC_MOD method is efficient and computationally inexpensive while achieving excellent identification of the target gases.

Conclusions
This paper presents an SRC_MOD gas recognition algorithm. First, the i-th class training samples are approximated as a linear combination of the synthesis dictionary atoms. Next, MOD is applied to find the optimal synthesis dictionary and coefficient vector. Finally, we obtain the analysis dictionary and establish the prediction model. Compared with other classical classifiers (SRC, the dictionary learning classifier, deep learning, and the BP artificial neural network), the experimental results show that SRC_MOD offers the best overall performance: its recognition rate is comparable to that of SRC (98.44% versus 98.52%) and higher than those of the DL and deep learning classifiers, while its testing time (3.1 ms) is the shortest among the reported values.