Using Recurrent Neural Network to Optimize Electronic Nose System with Dimensionality Reduction

Zou, Yanan; Lv, Jianhui

doi:10.3390/electronics9122205


by Yanan Zou ¹ and Jianhui Lv ²,*

¹ School of Science, Jilin Institute of Chemical Technology, Jilin 132022, China

² International Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China

* Author to whom correspondence should be addressed.

Electronics 2020, 9(12), 2205; https://doi.org/10.3390/electronics9122205

Submission received: 26 October 2020 / Revised: 20 November 2020 / Accepted: 23 November 2020 / Published: 21 December 2020

(This article belongs to the Section Artificial Intelligence)


Abstract:

An electronic nose is an electronic olfactory system that simulates the biological olfactory mechanism and mainly comprises three modules: gas sensor, data pre-processing, and pattern recognition. In recent years, electronic noses have been widely developed, which shows that they are an important tool. However, most recent studies concentrate on applications of the electronic nose and neglect improvements to its underlying techniques. The few proposals that do address technique improvement usually modify the gas sensor module and barely consider the other two modules. Therefore, this paper optimizes the electronic nose system from the perspective of data pre-processing and pattern recognition. A recurrent neural network (RNN) is used for pattern recognition to guarantee accuracy rate and stability. For pre-processing the high-dimensional data, locally linear embedding (LLE) is used for dimensionality reduction. Experiments on a real sensor drift dataset show that the proposed optimization mechanism not only achieves a higher accuracy rate and stability but also a lower response time than three baselines. In addition, the experimental results show the effectiveness of the RNN model in terms of recall ratio, precision ratio, and F1 value.

Keywords:

electronic nose; recurrent neural network; dimensionality reduction; locally linear embedding

1. Introduction

Just as image processing originates from the sense of sight, electronic nose is inspired by the sense of smell. In fact, electronic nose (e.g., odor sensor, aroma sensor, mechanical nose, flavor sensor, multi-sensor array, artificial nose, odor sensing system, and electronic olfactometry) is an electronic olfactory system constructed to mimic the biological olfactory mechanism, which also belongs to the important scientific field of artificial intelligence (AI) [1,2]. The whole electronic nose system is usually composed of three modules: gas sensor, data pre-processing, and pattern recognition [3]. At present, the field of electronic nose has attracted worldwide attention, which proves that electronic nose has an important influence on the progress of human society [4]. However, currently, most studies concentrate on the applications of electronic nose such as quality inspection of agricultural and food products, dendrobium classification, classification and evaluation of quality grades of organic green teas, early detection of fish degradation, etc., irrespective of the inherent technique improvement of electronic nose. To the best of our knowledge, although there are some proposals to optimize the inherent technique, they usually focus on modifying the gas sensor module (doing measurements on cross-sensitivity of a variety of gases by sensor array) and barely pay attention to the improvement of data pre-processing and pattern recognition.

Electronic nose research also belongs to the AI field, because the data pre-processing and pattern recognition modules rely strongly on AI-related algorithms, which have a natural ability to process and comprehend large amounts of data, calibrate the gas sensor array, and provide accurate classification and recognition results. In other words, the data pre-processing and pattern recognition modules play indispensable roles in the electronic nose system. Without them, it would be very difficult or even impossible for electronic noses to behave intelligently. Furthermore, an AI process is usually divided into four stages: data collection, modeling, training, and evaluation [5]. Among them, the first two stages are of primary importance. However, the two stages that handle data pre-processing and pattern recognition in the electronic nose system face some limitations, such as low accuracy and stability, along with high response time. As a result, it is of great importance to further study the data pre-processing and pattern recognition modules of the electronic nose with the help of AI techniques.

The main responsibility of the data pre-processing module is to reduce the dimensionality of the high-dimensional features extracted from the gas sensor array, which facilitates computation and visualization by discarding redundant information. There are many methods for dimensionality reduction; typical representatives are principal component analysis (PCA) [6], linear discriminant analysis (LDA) [7], Laplacian eigenmaps (LE) [8], t-distributed stochastic neighbor embedding (t-SNE) [9], and locally linear embedding (LLE) [10]. The first two are linear mappings, which adapt poorly because the threshold of cumulative explained variance must be adjusted manually. The last three are nonlinear mappings, which not only overcome this shortcoming of PCA and LDA but also reduce dimensionality more accurately to ensure data atomicity. However, LE and t-SNE have higher computational complexity and consume more computation time than LLE. Response time is an important metric in the electronic nose system, and computational complexity has a great influence on it. Thus, this paper uses LLE to reduce data dimensionality instead of PCA, LDA, LE, or t-SNE.

The pattern recognition module in the electronic nose system usually employs some AI algorithms to realize the related classification function. From the perspective of recognition form, there are four kinds of pattern recognition methods [11]: statistical pattern recognition [12], structural pattern recognition [13], fuzzy pattern recognition [14], and neural network (NN)-based pattern recognition [15]. With the rapid development of AI field, NN-based pattern recognition has overwhelming popularity and attracts much attention from the global researchers including those in the field of electronic nose. Furthermore, according to different structures, NN can usually be divided into artificial NN (ANN) [16], convolutional NN (CNN) [17], deep NN (DNN) [18], graph NN (GNN) [19], and recurrent NN (RNN) [20]. Among them, unlike ANN, CNN, DNN, and GNN, RNN includes the special circulation operation which can handle the input at the same time as storing information. In other words, RNN shows the obvious advantages in learning sequence and tree structure aspects, such as natural scene image and natural language processing. In fact, electronic nose is used to capture the natural environment state to do classification detection, which can be addressed by the technique of natural language processing. Given that, this paper uses RNN to realize the pattern recognition module instead of ANN, CNN, DNN, and GNN.

RNN has many parameters which face the problem of weight assignment. The common methods for weight assignment include traditional exact methods, mathematical heuristic methods, and intelligent bio-inspired methods. The traditional exact methods cannot adapt to the large-scale scenario. The mathematical heuristic methods have good computation efficiency under the large-scale scenario while the obtained solution is usually not optimal. Thus, this paper uses genetic algorithm (GA) [21] to do weight assignment for the involved parameters.

According to the above statements, this paper further investigates data pre-processing and pattern recognition modules of electronic nose, depending on AI technique. Regarding the data pre-processing module, LLE is used to reduce data dimensionality. Regarding the pattern recognition module, RNN is used to realize its function. In terms of the problem that RNN has many parameters, GA is employed to do weight assignment for those parameters. To sum up, the major contributions of this paper are as follows: (1) LLE is used to make dimensionality reduction to avoid information redundancy; (2) RNN is used to do pattern recognition, where GA is employed to adjust weight; and (3) based on the real sensor drift dataset of electronic nose, three metrics, i.e., accuracy rate, response time, and stability, are verified.

The remainder of this paper is organized as follows. Section 2 reviews the related research work from two perspectives. Section 3 introduces dimensionality reduction based on LLE. Section 4 presents RNN-based pattern recognition. Section 5 reports the experimental results. Section 6 concludes this paper.

2. Related Work

There have been a lot of studies on electronic nose, including the related applications and the inherent technologies.

2.1. The Related Applications

As mentioned in the Introduction, the electronic nose has wide applications across many fields. For example, in [22], a model transfer learning framework with a back-propagation neural network was proposed for wine and Chinese liquor detection by electronic nose. In [23], a deep feature mining method for electronic nose sensor data was proposed to identify beer olfactory information. References [22,23] indicate that the electronic nose can be used to detect ethyl alcohol. As a classical application, the electronic nose can also be used for gas recognition. For example, in [24], a novel drift-compensating deep belief classification network was devised to improve gas recognition. In [25], a minimum distance inlier probability feature selection method was presented to improve gas classification for the electronic nose system. In [26], an efficient electronic nose system for odor analysis and assessment was designed, in which the fault detection and alarm design achieved high reliability by constantly monitoring the working status. In addition, the electronic nose performs well in detecting formalin. For example, in [27], the detection of formalin in fresh noodles with an electronic nose based on kernel principal component analysis was introduced. In [28], the detection of formalin on fresh tilapia via electronic nose, together with an assessment of toxicity levels with reference to average adult Filipino weight, was proposed.

Furthermore, electronic nose also has more advanced applications. For example, in [29], the authors made good optimization of extracted features for an explosive-detecting electronic nose by using GA. In [30], tofu shelf life was monitored by using electronic nose based on curve fitting method. In [31], the authors presented an overview of the most important contributions dealing with the quality control in microbial fermentation process by using electronic nose. In [32], an electronic nose-based assistive diagnostic prototype for lung cancer detection with conformal prediction was proposed. In [33], citrus tristeza virus in mandarin orange was detected by using a custom-developed electronic nose system. In [34], feature extraction of citrus juice during storage for electronic nose based on cellular neural network was developed. In [35], a novel quality evaluation method for magnolia bark was proposed by using electronic nose and colorimeter data with the multiple statistical algorithms. In [36], the authors made the comprehensive research on principles and recent advances in electronic nose for quality inspection of agricultural and food products. In [37], an optimized deep CNN for dendrobium classification based on electronic nose was proposed. In [38], on-line assessment of oil quality during deep frying was addressed by using an electronic nose and proton transfer reaction mass spectrometry. In [39], a novel method for rapid quantitative evaluating formaldehyde in squid based on electronic nose was devised. In [40], quality grades of organic green teas was classified and evaluated by using electronic nose based on machine learning algorithms. In [41], the authors made early detection of fish degradation by electronic nose.

2.2. The Inherent Technologies

Even though the above-reviewed applications show nice performance and obtain general acceptance, they usually neglect the inherent technique improvement of electronic nose. To this end, some solutions regarding this have been proposed. For example, in [42], a novel technique to solve shortages of low-concentration samples of electronic nose based on global and local features fusion was presented. In [43], a natural neural learning model inspired electronic nose system was devised. To be specific, a natural on-line training with only one sample, to extract both eigen-weights and eigen-bias, was built to elaborate a natural identifier neural model in a real work environment. The proposed model efficiently could reduce the maximum extent of traditional neural models complexities, namely generic work-laboratory, dimensional data learning, model adaptability complication, time-consuming, heavy experiment materials, and chemical products. In [44], the authors proposed a sensor drift correction method based on discriminative subspace projection to deal with the sensor drift problem. In [45], the authors employed manifold learning algorithms to improve the classification performance of electronic nose. In [46], multi-sensor electronic nose based on conformal sensor chamber was designed. In [47], the adaptive subspace learning was used to make drift compensation for electronic nose. In [48], drift compensation for electronic nose by multiple classifiers system with GA optimized feature subset was solved. In [49], fuzzy c-means clustering based novel threshold criteria for outlier detection in electronic nose was proposed. In [50], the joint distribution adaptation for drift correction in electronic nose type sensor array was presented. In [51], online drift compensation by the adaptive active learning on mixed kernel for electronic nose was proposed, which depended on an assumption that the calibration samples were gained online with uncertain amount. 
It redesigned a hybrid sample-evaluation kernel assessing samples comprehensively by introducing a ranking method to normalize the outputs of kernel. In [52], ANN was used to process electronic nose data. In [53], the authors discussed the training technique of electronic nose by using the labeled and unlabeled samples based on multi-kernel support vector machine (SVM). In [54], the rapid detection approach for enhancing the electronic nose system’s performance was verified by using different deep learning models and SVMs, where three deep learning architecture implementations types were used for the classification tasks. Among them, the first deep learning model was implemented employing machine learning framework; the second architecture implementation type was to perform meta-learning, adjusting the connections between different computing cells by differentiable search to obtain the best graph configuration while training; the final model corresponded to a simple multilayer perceptron with the fully connected layers.

Without a doubt, although these technologies improve the performance of electronic nose system, they still have a great optimization space, such as accuracy rate, response time, and stability. Furthermore, different from the current studies, this paper optimizes the electronic nose system from the perspective of data pre-processing and pattern recognition. The mentioned two aspects motivate this paper.

3. LLE-Based Dimensionality Reduction

In terms of electronic nose system, the dimensionality reduction of data plays an important role to improve computation efficiency and guarantee computation accuracy. The dimensionality reduction of data is defined as follows.

Definition 1. Given $N$ feature vectors $\{x_{1}, x_{2}, \dots, x_{N}\}$, where each $x_{i}$ ($i \in [1, N]$) lies in $d$-dimensional space, i.e., $x_{i} \in \mathbb{R}^{d}$, the feature vectors after dimensionality reduction are $\{y_{1}, y_{2}, \dots, y_{N}\}$ in $m$-dimensional space, satisfying $y_{i} \in \mathbb{R}^{m}$ and $m \ll d$.

Compared with other dimensionality reduction methods, LLE has faster computation speed and more accurate computation result. Therefore, this paper employs LLE for dimensionality reduction, which usually includes three parts: graph construction, weight determination, and data mapping.

3.1. Graph Construction

This paper adopts the K-nearest neighbor (KNN) algorithm [55] to construct the graph over all feature vectors; that is, for every $x_{i}$, its $K$ nearest neighbors (i.e., data points) need to be found. The core idea consists of three steps. First, for every $x_{i}$, the distance between it and each $x_{j}$ ($i \neq j$) is computed, so that $N - 1$ distance values are obtained. Then, these distance values are sorted in ascending order. Finally, the first $K$ data points are regarded as the nearest neighbors of $x_{i}$.

However, the determination of $K$ is key but difficult. To be specific, if $K$ is set relatively small, the whole model becomes complex and easily overfits. On the contrary, if $K$ is set relatively large, the whole model becomes simple and dimensionality reduction cannot reach a satisfactory effect. With such consideration, this paper determines $K$ according to the distribution of the sample data points. Let $d_{i, j}$ denote the distance between $x_{i}$ and $x_{j}$, so that $N - 1$ distances are obtained: $d_{i, 1}, d_{i, 2}, \dots, d_{i, N - 1}$. The mean and variance of these $N - 1$ distances are defined as follows.

$$\mu = \frac{1}{N - 1} \sum_{j = 1}^{N - 1} d_{i, j} \tag{1}$$

$$\sigma^{2} = \frac{1}{N - 1} \sum_{j = 1}^{N - 1} (d_{i, j} - \mu)^{2} \tag{2}$$

Suppose that the distance between the sample data points and the data point currently in use follows the Gaussian distribution, and the improved $K$ is defined as follows.

$$K = f(|\mu - \xi \sigma|) \tag{3}$$

Here, $f(x)$ is the number of feature vectors whose distance is smaller than $x$, and $\xi$ is a parameter. In particular, when $\xi = 3$, the coverage rate in the interval $[\mu - 3\sigma, \mu + 3\sigma]$ can reach the maximum value, i.e., 99.73%.
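The adaptive choice of $K$ in Equations (1)–(3) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `adaptive_k`, the use of Euclidean distance, and the argument `xi` are choices made for this sketch and are not taken from the paper's implementation.

```python
import numpy as np

def adaptive_k(X, i, xi):
    """Return (K, neighbor_indices) for sample i using K = f(|mu - xi * sigma|)."""
    # Assumption: Euclidean distance from x_i to every other sample (N - 1 values).
    dists = np.linalg.norm(np.delete(X, i, axis=0) - X[i], axis=1)
    mu = dists.mean()                    # Equation (1)
    sigma = dists.std()                  # square root of Equation (2)
    threshold = abs(mu - xi * sigma)     # argument of f in Equation (3)
    # f(x): number of feature vectors whose distance is smaller than x.
    k = int(np.sum(dists < threshold))
    # Map positions in `dists` back to row indices of X, closest first.
    others = np.delete(np.arange(len(X)), i)
    neighbors = others[np.argsort(dists)[:k]]
    return k, neighbors
```

For a small toy array such as `X = np.array([[0., 0.], [1., 0.], [0., 1.], [10., 10.], [11., 10.]])`, calling `adaptive_k(X, 0, xi=0.5)` keeps only the two nearby points as neighbors of the first sample.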

3.2. Weight Determination

For every $x_{i}$ and its $K_{i}$, it is required to build a matrix of local weight values while guaranteeing that the corresponding construction error reaches its minimal value. Let $W$ and $\epsilon(W)$ denote this matrix and this construction error, respectively; $\epsilon(W)$ is defined as follows.

$$\epsilon(W) = \sum_{i = 1}^{N} \left\| x_{i} - \sum_{j = 1}^{K} w_{i, j} x_{j} \right\|^{2} \tag{4}$$

where $x_{j}$ is a neighbor of $x_{i}$ and $w_{i, j}$ is the weight between $x_{i}$ and $x_{j}$. In particular, $\sum_{j = 1}^{K} w_{i, j} = 1$ is satisfied. Furthermore, Equation (4) is converted as follows.

$$\epsilon(W) = \sum_{i = 1}^{N} \left\| \sum_{j = 1}^{K} w_{i, j} (x_{i} - x_{j}) \right\|^{2} = \sum_{i = 1}^{N} W_{i}^{T} (x_{i} - x_{j})^{T} (x_{i} - x_{j}) W_{i} \tag{5}$$

Substituting $W_{i} = (w_{i, 1}, w_{i, 2}, \dots, w_{i, d})^{T}$ and $Z_{i} = (x_{i} - x_{j})^{T} (x_{i} - x_{j})$ into Equation (5) simplifies it as follows.

$$\epsilon(W) = \sum_{i = 1}^{N} W_{i}^{T} Z_{i} W_{i} \tag{6}$$

According to the method of Lagrange multipliers, a new equation is obtained as follows.

$$L(W) = \sum_{i = 1}^{N} W_{i}^{T} Z_{i} W_{i} + \lambda (W_{i}^{T} 1_{d} - 1) \tag{7}$$

where $1_{d}$ is a d-dimensional vector whose entries are all 1. Taking the derivative with respect to $W$ and setting it to 0 yields the following equation.

$$2 Z_{i} W_{i} + \lambda 1_{d} = 0 \tag{8}$$

Substituting $W_{i}^{T} 1_{d} = 1$ into Equation (8), $W_{i}$ can be obtained as follows.

$$W_{i} = \frac{Z_{i}^{-1} 1_{d}}{1_{d}^{T} Z_{i}^{-1} 1_{d}} \tag{9}$$

According to the above derivation, the weight values are obtained by minimizing the construction error. They have an important property: translation, rotation, and scaling operations have no influence on the weight determination.
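The closed form in Equation (9) can be sketched for a single sample as below. This is a minimal illustration, not the paper's code: the function name `lle_weights` and the small regularization term `reg` are assumptions added here (regularization is a common practical safeguard when the local matrix is singular), and the all-ones vector in this sketch has one entry per neighbor.

```python
import numpy as np

def lle_weights(x_i, neighbors, reg=1e-3):
    """Reconstruction weights of x_i from its neighbors, following Equation (9)."""
    # Rows of D are the differences x_i - x_j for each neighbor x_j.
    D = x_i - neighbors                        # shape (K, d)
    Z = D @ D.T                                # local matrix Z_i, shape (K, K)
    # Assumption (not in the text): a small ridge term keeps Z invertible.
    Z = Z + reg * np.trace(Z) * np.eye(len(neighbors))
    w = np.linalg.solve(Z, np.ones(len(neighbors)))   # Z^{-1} 1
    return w / w.sum()                         # divide by 1^T Z^{-1} 1

# Two neighbors placed symmetrically around x_i receive equal weights.
w = lle_weights(np.array([0.0, 0.0]), np.array([[1.0, 0.0], [-1.0, 0.0]]))
```

Because only the differences $x_{i} - x_{j}$ enter the computation, shifting all points by the same offset leaves the weights unchanged, which matches the translation invariance noted above.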

3.3. Data Mapping

To keep the topological structure of the data points as consistent as possible between the high-dimensional space and the low-dimensional space, it is required to build a cost function and minimize its value. Let $\Phi(Y)$ denote this cost function and $y_{i}$ denote the output result of $x_{i}$; $\Phi(Y)$ is defined as follows.

$$\Phi(Y) = \sum_{i = 1}^{N} \left\| y_{i} - \sum_{j = 1}^{K} w_{i, j} y_{j} \right\|^{2} \tag{10}$$

where $y_{j}$ is a neighbor of $y_{i}$. In particular, the following two constraint conditions are satisfied.

$$\sum_{i = 1}^{N} y_{i} = 0, \quad \frac{1}{N} \sum_{i = 1}^{N} y_{i} y_{i}^{T} = I \tag{11}$$

where $I$ is the identity matrix. Furthermore, $\Phi(Y)$ can be simplified as follows.

$$\Phi(Y) = tr(Y^{T} (I - W)^{T} (I - W) Y) \tag{12}$$

Similarly, according to the method of Lagrange multipliers, a new equation is obtained as follows.

$$L(Y) = tr(Y^{T} M Y) + \lambda (Y^{T} Y - d I) \tag{13}$$

where $M = (I - W)^{T} (I - W)$ is a symmetric $N \times N$ matrix. Taking the derivative with respect to $Y$ and setting it to 0 yields the following equation.

$$M Y = \lambda Y \tag{14}$$

On this basis, the $m$ smallest eigenvalues of $M$ are computed, and their corresponding eigenvectors give the final solution $y_{1}, y_{2}, \dots, y_{N}$.
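The eigenvalue problem in Equation (14) can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `lle_embedding` is hypothetical, and discarding the first (constant) eigenvector is standard LLE practice that is assumed here rather than stated in the text.

```python
import numpy as np

def lle_embedding(W, m):
    """Embed N points in m dimensions from an N x N LLE weight matrix W."""
    n = W.shape[0]
    I = np.eye(n)
    M = (I - W).T @ (I - W)                  # symmetric matrix from Equation (12)
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues returned in ascending order
    # Assumption: skip the first eigenvector, which is constant when every row
    # of W sums to 1 and therefore carries no information.
    Y = eigvecs[:, 1:m + 1]
    return Y * np.sqrt(n)                    # scale so that (1/N) Y^T Y = I

# Toy weight matrix: six points on a ring, each reconstructed from its two neighbors.
n = 6
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
Y = lle_embedding(W, m=2)
```

With this scaling, the rows of `Y` satisfy both constraints in Equation (11): they are centered and have identity covariance.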

4. RNN-Based Pattern Recognition

In terms of the electronic nose system, the pattern recognition module is the most important part and has a direct influence on accuracy rate and stability. This paper uses RNN to realize the pattern recognition module, including RNN introduction and GA-based weight assignment for the involved parameters.

4.1. LSTM-Based RNN

As is known, the traditional RNN suffers from the vanishing gradient problem, and thus this paper employs long short-term memory (LSTM) [56] for the RNN to address it. Each LSTM unit has a hidden state $h_{t}$, a memory unit $c_{t}$, and three gates (i.e., input gate $i_{t}$, forget gate $f_{t}$, and output gate $o_{t}$). Besides, each gate is activated by the sigmoid function, generating the corresponding values between 0 and 1 as follows.

$$i_{t} = s(W_{i} v_{t} + U_{i} h_{t - 1} + b_{i}) \tag{15}$$

$$f_{t} = s(W_{f} v_{t} + U_{f} h_{t - 1} + b_{f}) \tag{16}$$

$$o_{t} = s(W_{o} v_{t} + U_{o} h_{t - 1} + b_{o}) \tag{17}$$

where $s(\cdot)$ denotes the sigmoid function; $W$ and $U$ are two kinds of weight matrices; and $b$ is the bias. In particular, at time step $t$, the collected attribute of the electronic nose $v_{t}$, the previous hidden state $h_{t - 1}$, and the previous memory unit $c_{t - 1}$ are considered as the inputs of the LSTM as follows.

$$g_{t} = \tanh(W_{g} v_{t} + U_{g} h_{t - 1} + b_{g}) \tag{18}$$

$$c_{t} = f_{t} \odot c_{t - 1} + i_{t} \odot g_{t} \tag{19}$$

$$h_{t} = o_{t} \odot \tanh(c_{t}) \tag{20}$$
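Equations (15)–(20) describe one LSTM time step, which can be written out directly in NumPy. The sketch below is illustrative only: the function names and the dictionary layout for the weights (keys `'i'`, `'f'`, `'o'`, `'g'`) are choices made for this example and are not taken from the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    """Logistic function s(.) used by the three gates."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(v_t, h_prev, c_prev, W, U, b):
    """One LSTM time step following Equations (15)-(20).

    W, U, b are dicts keyed by 'i', 'f', 'o', 'g' (a layout assumed for this sketch).
    """
    i_t = sigmoid(W["i"] @ v_t + U["i"] @ h_prev + b["i"])   # input gate, Eq. (15)
    f_t = sigmoid(W["f"] @ v_t + U["f"] @ h_prev + b["f"])   # forget gate, Eq. (16)
    o_t = sigmoid(W["o"] @ v_t + U["o"] @ h_prev + b["o"])   # output gate, Eq. (17)
    g_t = np.tanh(W["g"] @ v_t + U["g"] @ h_prev + b["g"])   # candidate, Eq. (18)
    c_t = f_t * c_prev + i_t * g_t                           # memory unit, Eq. (19)
    h_t = o_t * np.tanh(c_t)                                 # hidden state, Eq. (20)
    return h_t, c_t
```

The element-wise product $\odot$ maps to `*` on NumPy arrays, and `v_t` would be the feature vector of one sample after dimensionality reduction.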

In fact, LSTM usually has two types, denoted by $\overrightarrow{LSTM}$ and $\overleftarrow{LSTM}$. $\overrightarrow{LSTM}$ denotes the LSTM with the forward calculation and $\overleftarrow{LSTM}$ denotes the LSTM with the reverse calculation. Then, two corresponding hidden states are defined as follows.

$$\overrightarrow{h_{t}} = \overrightarrow{LSTM}(v_{t}, \overrightarrow{h_{t - 1}}) \tag{21}$$

$$\overleftarrow{h_{t}} = \overleftarrow{LSTM}(v_{t}, \overleftarrow{h_{t - 1}}) \tag{22}$$

4.2. GA-based Weight Assignment

As can be seen from Equations (15)–(18), there are many parameters waiting for weight assignment in RNN. Considering that GA has the global optimization performance for these weight parameters, this paper uses GA to do the weight assignment [21]. Regarding this, the objective function can be written as follows.

$$\text{Minimize} \; F = \alpha (W_{i} + W_{f} + W_{o} + W_{g}) + \beta (U_{i} + U_{f} + U_{o} + U_{g}) \tag{23}$$

The next step is to solve Equation (23) via GA. In particular, GA is used to obtain a relatively optimal path, and the nodes on the determined path build a network topology with some designated weight values. These weight values are considered as a weight assignment solution. GA usually consists of a selection operator, a crossover operator, a mutation operator, and a fitness function. Different from the traditional GA, this paper presents an adaptive GA whose selection operator is adjusted automatically. To be specific, this paper designs an online adjustment method for the selection pressure to balance fast convergence against population diversity. For an arbitrary individual $i$, let $p_{i}$ denote its selection probability. When the nonlinear relationship is considered, $p_{i}$ is defined as follows.

$$p_{i} = \gamma (1 - \gamma)^{r_{i}} \tag{24}$$

where $\gamma$ is the coefficient of pressure control and $r_{i}$ is the rank of individual $i$. Furthermore, let $M$ denote the initial population size; the selection probabilities of the best individual (rank $r_{i} = 0$) and the worst individual (rank $r_{i} = M - 1$) are then as follows.

$$p_{best} = \gamma, \quad p_{worst} = \gamma (1 - \gamma)^{M - 1} \tag{25}$$
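The rank-based selection probabilities of Equations (24) and (25) can be sketched as follows. This is a hedged illustration: the function name `selection_probabilities`, the convention that lower fitness is better (chosen because Equation (23) is a minimization), and the renormalized vector returned for sampling are assumptions of this sketch.

```python
import numpy as np

def selection_probabilities(fitness, gamma):
    """Return (raw, normalized) rank-based selection probabilities, Equation (24)."""
    fitness = np.asarray(fitness, dtype=float)
    # Assumption: lower fitness is better, so the smallest value gets rank 0.
    order = np.argsort(fitness)
    ranks = np.empty(len(fitness), dtype=int)
    ranks[order] = np.arange(len(fitness))
    raw = gamma * (1.0 - gamma) ** ranks       # p_i = gamma * (1 - gamma)^{r_i}
    # Assumption: renormalize so the values can be used as sampling probabilities.
    return raw, raw / raw.sum()

raw, probs = selection_probabilities([3.0, 1.0, 2.0], gamma=0.5)
```

In this toy call the individual with fitness 1.0 is ranked best and receives the raw probability $\gamma$, in line with $p_{best}$ in Equation (25).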

It is obvious that the determination of $\gamma$ is very important. Given that, this paper uses the standard deviation with respect to all individual fitness values to determine $\gamma$. First, the standard deviation $sd$ is defined as follows.

$$sd = \sqrt{\frac{1}{M} \sum_{i = 1}^{M} (fit_{i} - fit_{ave})^{2}} \tag{26}$$

where $fit_{i}$ is the fitness of individual $i$ and $fit_{ave}$ is the average fitness. Then, for the $T$th iteration, the corresponding $\gamma$ is defined as follows.

$$\gamma_{T} = \begin{cases} \gamma_{T - 1} - 0.05, & sd < thr_{1} \\ \gamma_{T - 1}, & thr_{1} \leq sd \leq thr_{2} \\ \gamma_{T - 1} + 0.05, & sd > thr_{2} \end{cases} \tag{27}$$

where $thr_{1}$ and $thr_{2}$ are two parameters. In particular, when $sd > thr_{2}$, $\gamma$ needs to be increased so that the selection probability of an individual can be increased.
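The update rule of Equations (26) and (27) amounts to a three-way branch on the fitness standard deviation. The sketch below is illustrative: the function name `update_gamma` and the final clamping step are assumptions added here so that $\gamma$ stays a valid probability, and the threshold values in the usage line are arbitrary placeholders, since the text does not report $thr_{1}$ and $thr_{2}$.

```python
import numpy as np

def update_gamma(gamma_prev, fitness, thr1, thr2, step=0.05):
    """Adapt the pressure coefficient gamma following Equations (26)-(27)."""
    sd = float(np.std(fitness))       # Equation (26): divides by M, not M - 1
    if sd < thr1:
        gamma = gamma_prev - step     # low diversity: reduce selection pressure
    elif sd > thr2:
        gamma = gamma_prev + step     # high diversity: raise selection pressure
    else:
        gamma = gamma_prev
    # Assumption (not in the text): keep gamma strictly inside (0, 1).
    return min(max(gamma, step), 1.0 - step)

# Placeholder thresholds chosen only for this example.
gamma_next = update_gamma(0.5, [0.0, 10.0], thr1=0.1, thr2=1.0)
```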

5. Results

5.1. Dataset Collection

The sensor drift dataset comes from [57]. In total, 1604 samples were collected by using multiple E-nose devices of the same model. The dataset consists of three batches: the master batch, which was collected five years earlier than the other two, and the batches slave 1 and slave 2. There are six kinds of gases to be detected: ammonia, benzene, carbon monoxide, formaldehyde, nitrogen dioxide, and toluene. The detailed information of the dataset is shown in Table 1.

5.2. Experiment Method

The experiments included two parts. The first part was the performance analysis of the RNN, testing recall ratio, precision ratio, and F1 value. The second part was the comparison analysis, testing accuracy rate, response time, and stability. Three benchmarks were selected from the latest research achievements [43,51,54]. Ref. [43] presented a natural neural learning model inspired electronic nose system, called NNL; ref. [51] proposed an online drift compensation by the adaptive active learning on mixed kernel for electronic nose, called AAL; and ref. [54] used different deep learning models and SVMs to enhance the electronic nose system’s performance, called DLS. In addition, the involved parameters were set as follows: $M = 100$, $\alpha = 0.45$, $\beta = 0.55$, and the number of simulation runs was 10. In particular, feature extraction was performed for each sensor, resulting in a six-dimensional feature vector for each sample.

5.3. RNN Performance Analysis

The average recall ratios regarding six kinds of gases are shown in Table 2. The average precision ratios regarding six kinds of gases are shown in Table 3. The average F1 values regarding six kinds of gases are shown in Table 4.

In Tables 2–4, we observe that the LSTM-based RNN shows good recall ratio, precision ratio, and F1 value, as all related values reach 95%. This also indicates that using an RNN to realize the pattern recognition module is feasible.
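For reference, per-class recall, precision, and F1 can be computed from predicted and true labels as sketched below. The function name `per_class_metrics` and the toy labels are hypothetical and serve only to illustrate the three metrics; they are not the paper's data or evaluation code.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes):
    """Return per-class (recall, precision, f1) arrays from integer labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                          # rows: true class, columns: predicted
    tp = np.diag(cm).astype(float)
    pred_count = cm.sum(axis=0)                # samples predicted as each class
    true_count = cm.sum(axis=1)                # samples truly in each class
    precision = np.divide(tp, pred_count, out=np.zeros_like(tp), where=pred_count > 0)
    recall = np.divide(tp, true_count, out=np.zeros_like(tp), where=true_count > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return recall, precision, f1

# Toy labels for three classes (illustrative only).
recall, precision, f1 = per_class_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
```

Classes with no predicted or no true samples are assigned 0 instead of raising a division error, which is a convention chosen for this sketch.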

5.4. Comparison Analysis

The average accuracy rates of the proposed method, NNL, AAL, and DLS, under different groups of experiments are shown in Figure 1. The average response times of the proposed method, NNL, AAL, and DLS, under different groups of experiments, are shown in Figure 2.

We found that our method presents the highest accuracy rate and the lowest response time. This indicates that LLE-based dimensionality reduction and RNN-based pattern recognition can greatly improve the electronic nose system, while obtaining the accurate detection results with low response time.

Furthermore, the standard deviation was used to measure the stability. For two metrics, i.e., accuracy rate and response time, the corresponding standard deviation values in terms of 10 different experiments were computed by Equation (26). A smaller standard deviation value means a higher stability. The experimental results are shown in Figure 3.

It is obvious that the method proposed in this paper has the smallest standard deviation values in terms of both accuracy rate and response time, which further indicates that it has the highest stability.

5.5. Discussion

Since the experimental results are based on virtual simulation rather than an implemented product, the validation is subject to several threats. The intrinsic threats cover three aspects. First, the GA-based weight assignment has different effects on different datasets, i.e., a fixed weight assignment does not mean that the proposed optimization method can obtain the optimal solution for all datasets. Second, the RNN structure can be built dynamically and may be unstable during training. Third, the written code is probably unstable and may even contain redundancy, which has an important influence on computational efficiency. The extrinsic threats cover two aspects. On the one hand, the adopted datasets lack diversity, and the current experimental results can only demonstrate that the proposed optimization method is efficient within a certain range; they cannot guarantee that it is always efficient, since it is not a mass-produced product. On the other hand, different coding styles also have a considerable influence on the experimental results. For example, the RNN is coded in C++ and the electronic nose system is implemented in C.

6. Conclusions

The whole electronic nose system usually includes gas sensor, data pre-processing, and pattern recognition modules. Currently, most studies pay attention to the applications of the electronic nose and neglect improvements to its underlying techniques. Although there are some proposals to optimize the underlying techniques, they usually focus on modifying the gas sensor module and barely pay attention to the data pre-processing and pattern recognition modules, which are addressed in this paper. First, LLE is employed for dimensionality reduction, including graph construction, weight determination, and data mapping. Then, an RNN is used to realize the pattern recognition module. In particular, LSTM is adopted to improve the RNN, and GA is leveraged to do the weight assignment for the involved parameters. The experiments are based on the real sensor drift dataset and include two parts: RNN performance analysis and comparison analysis. The first part tested recall ratio, precision ratio, and F1 value, which reach 95%. The second part tested accuracy rate, response time, and stability. It was found that the proposed method achieves the best accuracy rate, response time, and stability among the compared methods.

Author Contributions

Conceptualization, Y.Z.; methodology, J.L.; software, Y.Z. and J.L.; formal analysis, J.L.; investigation, Y.Z.; resources, Y.Z. and J.L.; writing—original draft preparation, Y.Z.; writing—review and editing, Y.Z. and J.L.; and funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the project of Education Department of Jilin Province in China under grant JJKH20190833KJ.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

NN	Neural Network
ANN	Artificial NN
CNN	Convolutional NN
DNN	Deep NN
GNN	Graph NN
RNN	Recurrent Neural Network
LLE	Locally Linear Embedding
AI	Artificial Intelligence
PCA	Principal Component Analysis
LDA	Linear Discriminant Analysis
LE	Laplacian Eigenmaps
t-SNE	t-Distributed Stochastic Neighbor Embedding
GA	Genetic Algorithm
SVM	Support Vector Machine
KNN	K-Nearest Neighbor
LSTM	Long Short-Term Memory
NNL	Natural Neural Learning
AAL	Adaptive Active Learning
DLS	Deep Learning models and SVMs

Figure 1. The average accuracy rates of the four methods under different experiments.

Figure 2. The average response times of the four methods under different experiments.

Figure 3. The standard deviation values for all experiments.

Table 1. The sensor drift dataset.

Batch	Ammonia	Benzene	Carbon Monoxide	Formaldehyde	Nitrogen Dioxide	Toluene	Total
Master	60	72	58	126	38	66	420
Slave 1	81	108	98	108	107	106	608
Slave 2	84	87	95	108	108	94	576

Table 2. The average recall ratios for the six kinds of gases (%).

Batch	Ammonia	Benzene	Carbon Monoxide	Formaldehyde	Nitrogen Dioxide	Toluene
Master	98.26	97.38	98.29	97.68	98.16	98.22
Slave 1	95.87	96.19	97.46	96.89	97.37	96.95
Slave 2	96.76	97.13	96.55	97.61	97.08	98.33

Table 3. The average precision ratios for the six kinds of gases (%).

Batch	Ammonia	Benzene	Carbon Monoxide	Formaldehyde	Nitrogen Dioxide	Toluene
Master	96.97	97.83	96.09	97.86	97.66	96.58
Slave 1	96.75	95.28	98.12	99.03	97.64	98.59
Slave 2	97.38	97.19	96.28	96.79	98.32	98.76

Table 4. The average F1 values for the six kinds of gases (%).

Batch	Ammonia	Benzene	Carbon Monoxide	Formaldehyde	Nitrogen Dioxide	Toluene
Master	97.61	97.60	97.18	97.77	97.91	97.39
Slave 1	96.31	95.73	97.79	97.95	97.50	97.76
Slave 2	97.07	97.16	96.41	97.20	97.70	98.54
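
As a worked check on Tables 2–4, the sketch below recomputes the F1 value as the harmonic mean of precision and recall, F1 = 2PR / (P + R). It assumes the paper uses this standard definition; the helper names `f1` and `max_deviation` are hypothetical, and the numbers are copied from the tables above. Under that assumption, the averaged recall and precision ratios reproduce every entry of Table 4 to within rounding.

```python
# Gas order: Ammonia, Benzene, Carbon Monoxide, Formaldehyde, Nitrogen Dioxide, Toluene.
RECALL = {          # Table 2, average recall ratios (%)
    "Master":  [98.26, 97.38, 98.29, 97.68, 98.16, 98.22],
    "Slave 1": [95.87, 96.19, 97.46, 96.89, 97.37, 96.95],
    "Slave 2": [96.76, 97.13, 96.55, 97.61, 97.08, 98.33],
}
PRECISION = {       # Table 3, average precision ratios (%)
    "Master":  [96.97, 97.83, 96.09, 97.86, 97.66, 96.58],
    "Slave 1": [96.75, 95.28, 98.12, 99.03, 97.64, 98.59],
    "Slave 2": [97.38, 97.19, 96.28, 96.79, 98.32, 98.76],
}
F1_REPORTED = {     # Table 4, average F1 values (%)
    "Master":  [97.61, 97.60, 97.18, 97.77, 97.91, 97.39],
    "Slave 1": [96.31, 95.73, 97.79, 97.95, 97.50, 97.76],
    "Slave 2": [97.07, 97.16, 96.41, 97.20, 97.70, 98.54],
}

def f1(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

def max_deviation():
    """Largest absolute gap between the recomputed and the reported F1 values."""
    worst = 0.0
    for batch in RECALL:
        rows = zip(PRECISION[batch], RECALL[batch], F1_REPORTED[batch])
        for p, r, reported in rows:
            worst = max(worst, abs(f1(p, r) - reported))
    return worst

# Example: Master batch, ammonia (precision 96.97, recall 98.26).
print(round(f1(96.97, 98.26), 2))
print(max_deviation() < 0.01)
```

Note that the F1 of averaged precision and recall is not in general equal to the average of per-run F1 values, so the close agreement here is an observation about these tables rather than a guarantee.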

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
