A Novel Optimization Technique to Improve Gas Recognition by Electronic Noses Based on the Enhanced Krill Herd Algorithm

An electronic nose (E-nose) is an intelligent system, used in this paper to distinguish three indoor pollutant gases (benzene (C6H6), toluene (C7H8) and formaldehyde (CH2O)) and carbon monoxide (CO). The algorithm, mainly composed of data processing and pattern recognition, is a key part of an E-nose system. We employ a support vector machine (SVM) to distinguish the indoor pollutant gases, and two of its parameters need to be optimized. To improve the performance of the SVM, that is, to obtain a higher gas recognition rate, we propose an enhanced krill herd algorithm (EKH), based on a novel method of computing the decision weighting factor, to optimize the two SVM parameters. Krill herd (KH) is an effective method in practice; however, it cannot always escape the influence of local best solutions, so it does not always find the global optimum. In addition, its search ability relies fully on randomness, so it cannot always converge rapidly. To address these issues, we propose an enhanced KH (EKH) that improves the global search ability and convergence speed of KH. To obtain a more accurate model of krill behavior, an updated crossover operator is added to the approach. This keeps the krill group diverse at the early stage of the iterations and gives it good local search ability at the later stage. The recognition results of EKH are compared with those of other optimization algorithms (KH, chaotic KH (CKH), quantum-behaved particle swarm optimization (QPSO), particle swarm optimization (PSO) and genetic algorithm (GA)), and EKH performs better than the other considered methods. The results verify that EKH not only significantly improves the performance of our E-nose system, but also provides a good starting point and theoretical basis for further study of improved krill algorithms in other E-nose application areas.


Introduction
An electronic nose (E-nose) is a device composed of a gas sensor array and an artificial intelligence algorithm. It is effective in dealing with odor analysis problems [1][2][3], and during the past decades, much work has been done to prove the efficiency of E-nose technology in many fields such as environmental monitoring [4,5], food engineering [6][7][8], disease diagnosis [9][10][11][12], explosives detection [13][14][15] and spaceflight applications [16]. This paper is mainly about E-nose research in indoor pollutant gas detection.
With the modern improvement of life quality, people demand higher indoor air quality. Because most of a person's life is spent indoors, it is necessary for people's health to be [...] of E-noses, we decide to adopt KH as the optimization method of an E-nose and apply it to detect indoor harmful gaseous pollution. To improve the performance of the E-nose, and based on KH with an updated crossover operator [32] (defined as the standard KH in this paper), we propose a novel way of computing the decision weighting factor for KH, so that the krill are diverse at the early stage of the iterations and have good local search ability at the later stage. The added decision weighting factor updates each krill's position according to the influence of other individuals and its own foraging behavior at different iterations. The proposed EKH method is verified with the data obtained by our self-made E-nose.
The rest of the paper is structured as follows: materials and experiments are described in Section 2. An overview of the standard KH algorithm and the proposed EKH are given in Section 3. In Section 4, our method is compared with other optimization techniques (CKH, KH, QPSO, PSO and GA), and the classification results are presented and analyzed. Finally, the conclusions of this paper are drawn in Section 5.

Materials and Experiments
The data used in the paper were obtained by a self-made E-nose, whose detailed information can be found in our previous publication [9]. However, to make the paper self-contained, the system structure and experimental setup are briefly repeated in the following subsections.

Target Gas and Experimental Setup
Four common kinds of indoor pollutant gases, C6H6, C7H8, CH2O and CO, are the target gases to be distinguished by the E-nose in our project. The sensor array of the E-nose presented in this paper contains five sensors: three metal oxide semiconductor gas sensors (TGS 2201, TGS 2620 and TGS 2602, purchased from Figaro Company (Osaka, Japan); the TGS 2201 has two outputs, defined as TGS 2201A and TGS 2201B), one temperature sensor and one humidity sensor. The sensitive characteristics of the three gas sensors are shown in Table 1.

Table 1. Main sensitive characteristics of gas sensors.

Sensors    Main Sensitive Characteristics
TGS2201    Carbon monoxide, nitrogen dioxide, nitric oxide
TGS2620    Carbon monoxide, VOCs, methane, ethanol, isobutane
TGS2602    Ammonia, VOCs, toluene, ethanol, hepatic gas, formaldehyde

Note: the responses of these three sensors are non-specific. Besides the main sensitive gases listed in Table 1, they are also sensitive to other gases.
A 12-bit analog-to-digital converter (A/D) is used as the interface between the sensor array and a field programmable gate array (FPGA) processor. The A/D converts the analog signals from the sensor array into digital signals, and the sampling frequency is set to 1 Hz. As shown in Figure 1, the experimental platform mainly consists of the E-nose system, a PC, a temperature-humidity controlling chamber (coated with Teflon to avoid the attachment of VOCs), a flow meter and an air pump. There are two ports on the sidewall of the chamber, and the target gas and the clean air are put into the chamber through ports 1 and 2, respectively. Data collected from the sensor array can be saved on a PC through a joint test action group (JTAG) port with its related software. An image of the experimental setup is shown in Figure 2.

Figure 1.
Schematic diagram of the experimental system. The experimental platform mainly consists of the E-nose system, a PC, a temperature-humidity controlling chamber, a flow meter and an air pump. There are two ports on the sidewall of the chamber, and the target gas and the clean air are put into the chamber through ports 1 and 2, respectively.

Figure 2.
Image of the experimental setup. Data collected from the sensor array can be saved on a PC through a joint test action group (JTAG) port with its related software.

Sampling Experiments and Data Pre-Processing
Before the sampling experiments, we first set the temperature and humidity of the chamber to 25 °C and 40%, respectively. Then the gas sampling experiments can begin. A single sampling experiment consists of the following three phases: Phase 1: All sensors are exposed to clean air for 2 min to obtain the baseline; Phase 2: Target gas is introduced into the chamber for 4 min; Phase 3: The sensor array is exposed to clean air again for 9 min to wash the sensors and let them recover their baseline. Figure 3 illustrates the response of the sensors when formaldehyde is introduced into the chamber. Each response curve rises markedly from the third minute, when the target gas begins to pass over the sensor array, and recovers to the baseline after the seventh minute, when clean air is conveyed to wash the sensors.

To get the real concentration of gas in the chamber, we extract each gas from the chamber into a gas bag. A spectrophotometric method is then employed to determine the concentrations of formaldehyde and carbon monoxide, and the concentrations of benzene and toluene are determined by gas chromatography (GC). For each gas, there are 12, 11, 21 and 29 concentration points, respectively, and 12 sampling experiments are made at each concentration point. The real concentrations and the numbers of samples of the four kinds of gases are shown in Table 2. We use different gas concentrations mainly to improve the generalization of the algorithm and to avoid misjudgment when the concentration of a test gas differs from the concentrations used in the experiments; the purpose of our work is to distinguish four indoor pollutant gases with the E-nose.

Table 2 (fragment): [4, 12]; 348 (12 × 29).

Then the maximum value of the steady-state response of the sensors is extracted to create the feature matrix of the E-nose. There are 1932 samples in this matrix and the dimension of each sample is 4. We randomly select 70% of the samples of each gas to establish the training data set, and the rest are used as the test data set. Detailed information is shown in Table 3.

Table 3. Amount of samples in training set and test set.
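The feature extraction and the per-gas 70/30 split described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `extract_features` and `stratified_split`, the fixed random seed and the rounding of the 70% cut are all assumptions made for the example.

```python
import random
from collections import defaultdict


def extract_features(responses):
    """Take the peak of each sensor's response curve as its feature.

    responses: one list of readings per sensor output (hypothetical layout).
    """
    return [max(series) for series in responses]


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Randomly assign train_fraction of each class to the training set."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)
```

Splitting within each gas class, rather than over the pooled samples, keeps the class proportions of the training and test sets the same, which is what "70% of the samples of each gas" implies.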

Overview of Standard KH Algorithm
KH is a new generic stochastic optimization approach for global optimization problems, inspired by the behavior of krill swarms. When hunting for food and communicating with each other, the krill repeat three movements and follow search directions that improve the objective function value. The time-dependent position of each krill is mostly determined by three movements:
i Foraging action;
ii Movement influenced by other krill;
iii Physical diffusion.
The regular KH approach adopts the Lagrangian model shown in the following expression:
$$\frac{dX_i}{dt} = N_i + F_i + D_i \tag{1}$$
where $N_i$, $F_i$ and $D_i$ denote the motion influenced by other krill, the foraging motion and the physical diffusion of krill i, respectively. The foraging motion $F_i$ covers two parts: the current food location and the information about the previous location. For krill i, we formulate this motion as below:
$$F_i = V_f \beta_i + w_f F_i^{old} \tag{2}$$
where:
$$\beta_i = \beta_i^{food} + \beta_i^{best} \tag{3}$$
and $V_f$ is the foraging speed, $w_f$ is the inertia weight of the foraging motion in (0, 1), and $F_i^{old}$ is the last foraging motion. The direction of the motion influenced by other krill, $\alpha_i$, is estimated from three effects: the target effect, the local effect and the repulsive effect. For krill i, the motion can be formulated as below:
$$N_i^{new} = N^{max} \alpha_i + w_n N_i^{old} \tag{4}$$
where $N^{max}$ is the maximum induced speed, $w_n$ is the inertia weight of this motion in (0, 1), and $N_i^{old}$ is the last motion influenced by other krill. For the i-th krill, the physical diffusion is in practice a random process. This motion includes two components: a maximum diffusion speed and an oriented vector. The physical diffusion can be expressed as below:
$$D_i = D^{max} \delta \tag{5}$$
where $D^{max}$ is the maximum diffusion speed and $\delta$ is the oriented vector whose value is a random number between −1 and 1.
According to the three actions analyzed above, the time-dependent position from time t to t + ∆t can be formulated by the following equation:
$$X_i(t + \Delta t) = X_i(t) + \Delta t \frac{dX_i}{dt} \tag{6}$$
For more detailed information about the KH method, readers may refer to [26].

EKH
It has been shown that KH is an effective method in exploitation. However, because the search relies fully on randomness, it cannot converge rapidly. In a group-strategy optimization algorithm, the number of iterations affects the performance of the algorithm, and sometimes even determines whether the global optimum can be found. We should also consider the factor of time (the optimization process should be as quick as possible), so we present a novel way of computing the decision weighting factor to give KH better global search ability and a higher convergence speed. Its equation is shown as Equation (7):
$$\frac{dX_i}{dt} = \frac{MI - I}{MI} F_i + \frac{I}{MI} N_i + D_i \tag{7}$$
where MI is the maximum number of iterations and I is the current iteration number. At the early stage of the iterations, (MI − I)/MI > I/MI, so the krill's own foraging actions have more influence on their decisions for the next position. Because no krill knows the correct direction at this stage, letting each krill rely on its own sensing effectively helps the herd avoid premature convergence. At the later stage of the iterations, (MI − I)/MI < I/MI, so the experience of other krill has more influence when the krill update their next position. After all, the group direction tends to be more correct than that of any individual. Finally, we define the KH with an updated crossover operator as the standard krill herd algorithm, and we call the proposed method the enhanced KH (EKH). The basic framework of the EKH method and its corresponding flowchart are shown in Algorithm 1 and Figure 4.
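As a small worked illustration of the decision weighting factor, assume that the ratio (MI − I)/MI weights the foraging motion and I/MI weights the motion influenced by other krill, as the paragraph above describes. The symbols $w_F$ and $w_N$ are introduced here only for the example, and MI = 50 is one of the iteration settings used later in the experiments.

```latex
w_F(I) = \frac{MI - I}{MI}, \qquad w_N(I) = \frac{I}{MI}, \qquad w_F(I) + w_N(I) = 1

\text{With } MI = 50:\quad
\begin{array}{c|cc}
I  & w_F & w_N \\ \hline
10 & 0.8 & 0.2 \\
25 & 0.5 & 0.5 \\
40 & 0.2 & 0.8
\end{array}
```

The two weights cross at I = MI/2, so under this assumption the first half of the run is dominated by each krill's own foraging and the second half by the behavior of the group.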

Begin
Step 1: Initialization. Initialize the iteration counter I = 1, the population P of NP krill, V_f, D_max and N_max.
Step 2: Fitness calculation. Calculate the fitness of each krill according to its initial position.
Step 3: While I < Maximum Iteration do
    Sort the population according to fitness.
    for i = 1:NP (all krill) do
        Perform the following motion calculation.
            Motion induced by other individuals
            Foraging motion
            Physical diffusion
        Compute dX_i/dt according to Equation (7).
        Implement the crossover operator.
        Update the krill individual position in the search space.
        Calculate the fitness of the krill according to its new position.
    end for i
    I = I + 1.
Step 4: end while
End.

Figure 4. Simplified flowchart of EKH. In addition to the basic steps of the krill algorithm, the flowchart of EKH also includes the novel way of computing the decision weighting factor used in Equation (7) and an updated crossover operator.
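The loop of Algorithm 1 can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the induced-motion direction is reduced to the target effect (pointing at the best krill found so far), the food position is taken as a fitness-weighted centroid, the crossover is a fixed-rate uniform crossover with a random mate, the step is scaled by the search range, sorting is replaced by tracking the best krill, and the default values of `v_f`, `n_max`, `d_max`, `inertia` and `cr` are illustrative. The toy `sphere` objective stands in for the SVM error rate and is assumed non-negative.

```python
import math
import random


def sphere(x):
    """Toy objective to minimise (stand-in for the classifier error)."""
    return sum(v * v for v in x)


def unit(vec):
    """Scale a vector to unit length; a zero vector stays zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else [0.0] * len(vec)


def ekh(objective, dim=2, n_krill=30, max_iter=50, lower=-5.0, upper=5.0,
        v_f=0.02, n_max=0.01, d_max=0.005, inertia=0.5, cr=0.2, seed=0):
    rng = random.Random(seed)
    span = upper - lower
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_krill)]
    fit = [objective(p) for p in pos]
    induced = [[0.0] * dim for _ in range(n_krill)]
    forage = [[0.0] * dim for _ in range(n_krill)]
    best_fit = min(fit)
    best_pos = pos[fit.index(best_fit)][:]

    for it in range(1, max_iter + 1):
        # Decision weighting factor: foraging dominates early, the herd late.
        w_forage = (max_iter - it) / max_iter
        w_induced = it / max_iter
        # Assumed food position: centroid weighted by inverse fitness.
        inv = [1.0 / (f + 1e-12) for f in fit]
        total = sum(inv)
        food = [sum(w * p[d] for w, p in zip(inv, pos)) / total
                for d in range(dim)]
        for i in range(n_krill):
            alpha = unit([b - x for b, x in zip(best_pos, pos[i])])
            beta = unit([f - x for f, x in zip(food, pos[i])])
            induced[i] = [n_max * a + inertia * n
                          for a, n in zip(alpha, induced[i])]
            forage[i] = [v_f * b + inertia * f
                         for b, f in zip(beta, forage[i])]
            diffusion = [d_max * rng.uniform(-1.0, 1.0) for _ in range(dim)]
            velocity = [w_forage * f + w_induced * n + d
                        for f, n, d in zip(forage[i], induced[i], diffusion)]
            moved = [x + span * v for x, v in zip(pos[i], velocity)]
            # Assumed crossover: copy each coordinate from a random mate
            # with probability cr.
            mate = pos[rng.randrange(n_krill)]
            child = [m if rng.random() < cr else x
                     for m, x in zip(mate, moved)]
            pos[i] = [min(upper, max(lower, x)) for x in child]
            fit[i] = objective(pos[i])
            if fit[i] < best_fit:
                best_fit, best_pos = fit[i], pos[i][:]
    return best_pos, best_fit
```

Calling `ekh(sphere)` returns the best position found and its objective value. For the E-nose application, the objective would instead map a two-dimensional position (the two SVM parameters) to a classification error.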

Results and Discussion
To evaluate the effectiveness of the optimization algorithm, we analyze the discrimination of four different gases with our self-made E-nose. We compare EKH with QPSO, PSO and GA, which have been frequently used in E-noses. We also compare EKH with the standard KH and the chaotic KH (CKH) [33]. In CKH, various one-dimensional chaotic maps are employed in place of the parameters used in KH to accelerate its convergence. According to the results of [33], we choose the Singer map as the chaotic map to form the best CKH. It is shown in Equation (8).
Singer map:
$$x_{k+1} = \mu \left( 7.86 x_k - 23.31 x_k^2 + 28.75 x_k^3 - 13.302875 x_k^4 \right), \quad \mu = 1.07 \tag{8}$$
The parameter settings in all experiments for each algorithm are shown in Table 4.
Table 4. Parameter setting.
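For illustration, the Singer map can be iterated as in the short sketch below. The coefficients and the value μ = 1.07 are the ones commonly quoted for this map in the chaotic-optimization literature and are treated here as assumptions; the function names and the starting value are hypothetical.

```python
def singer_map(x, mu=1.07):
    """One iteration of the Singer map (coefficients as commonly quoted)."""
    return mu * (7.86 * x - 23.31 * x**2 + 28.75 * x**3 - 13.302875 * x**4)


def singer_sequence(x0, n, mu=1.07):
    """Return the n values that follow x0 under repeated application."""
    values = []
    x = x0
    for _ in range(n):
        x = singer_map(x, mu)
        values.append(x)
    return values


if __name__ == "__main__":
    # Print the first few chaotic values starting from an arbitrary seed.
    for value in singer_sequence(0.7, 5):
        print(round(value, 4))
```

In a CKH-style scheme, successive values of such a sequence would replace a fixed or uniformly random parameter of KH at each iteration.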

The flow of data processing is as follows: first, normalization is performed. Then the SVM [34,35] is employed as the classifier. Its two parameters are optimized by the six considered optimization algorithms. The flow diagram of the experiment is shown in Figure 5. All of the optimization algorithms optimize the SVM parameters mainly on the basis of the training data set. Finally, the SVM predicts the class label of each sample in the test data set with the knowledge it has learned, and the ratio of the number of correctly classified points to the number of all points in the test data is used to evaluate the performance of the different optimization algorithms.
Two parameters need to be set in the SVM (the spread factor of the Gaussian RBF kernel function and the penalty factor), so the krill group searches in a two-dimensional space. The particle number of each optimization algorithm is set to 30, and in order to compare the differences between the algorithms, we set the number of iterations to 50, 200 and 400, respectively. To make sure the experimental results are reliable, each program was repeated 10 times. We then take the maximum, minimum and average classification accuracy over the ten runs (for the training data set and the test data set) as a reference to evaluate the performance of the six optimization algorithms. Tables 5-7 show the classification accuracy of the different optimization algorithms with the number of iterations set to 50, 200 and 400. The best classification accuracy for the four kinds of gases and all the classification accuracies with the different optimization algorithms are shown in Tables 8-12. To make our research more persuasive, we use the 10-fold cross validation method to train and test the data, with the particle number set to 50. All the results of the algorithms after using 10-fold cross validation are shown in Tables 13-15. They lead to the same conclusion: EKH has the best performance.
When the particle number is set to 50, the results with the number of iterations set to 50, 200 and 400 are shown in Tables 16-18. In Table 18, the standard deviation (SD) of each algorithm after 100 runs is shown to evaluate the performance of the algorithms more precisely.
EKH and CKH are both enhanced optimization algorithms based on KH. Comparing the results of EKH, CKH and KH in Tables 5-7, we find that the best results are obtained by EKH.
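The evaluation protocol described above (accuracy as the ratio of correctly classified test points, then the maximum, minimum, average and SD over repeated runs) can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which SD estimator was used.

```python
import statistics


def accuracy(predicted, actual):
    """Ratio of correctly classified points to all points."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def summarize_runs(accuracies):
    """Summarise the accuracies of repeated runs of one optimizer."""
    return {
        "max": max(accuracies),
        "min": min(accuracies),
        "mean": statistics.mean(accuracies),
        "sd": statistics.stdev(accuracies),  # sample SD (assumed)
    }
```

A smaller SD across runs indicates that an optimizer reaches similar accuracy regardless of its random initialization, which is the sense in which a result is called stable.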
With a higher number of iterations, CKH performs a little better than KH; however, in terms of the maximum, minimum and average classification accuracy, EKH significantly outperforms both CKH and KH. This verifies that the proposed EKH is more appropriate than CKH for gas identification with E-noses. Moreover, whatever the number of iterations, the worst classification accuracy of EKH is higher than the best classification accuracy of CKH and KH. The results for 200 and 400 iterations are very close. All of these results show that the global search ability and convergence of EKH are improved by the novel way of computing the decision weighting factor.
Comparing EKH with the other algorithms (QPSO, PSO and GA), Tables 5-7 show that GA has the worst performance, while PSO and QPSO are better. Since EKH has the highest classification accuracy for the same number of iterations relative to these three algorithms, this again shows that the krill algorithm can be applied well in E-noses. Tables 8-11 show the classification accuracy of each of the four gases under the condition that the total classification accuracy is best. We can also conclude from the data that C6H6 is harder to distinguish than the other gases. For EKH, except that the recognition rate of C6H6 in the test set is a bit low, the other results are very reasonable, both in themselves and in comparison with the other algorithms. According to the results in Table 18, the SD of EKH is the smallest, which suggests that the EKH result is more stable. That is to say, EKH is better in average performance than KH and the other algorithms. Figures 6 and 7 show, as bar charts, the classification results of the training set and test set with the different optimization algorithms and the discrepancies between them.
Sensors 2016, 16, 1275

Conclusions
There is no doubt that E-noses play an important role in the field of environmental monitoring and control of pollution emissions. In this experiment an E-nose is applied to distinguish four kinds of indoor pollutant gases. We all know that an E-nose device which has a high recognition rate for pollutant gases is significant to the improvement of the quality of people's indoor life, so we have undertaken further research on the E-nose algorithm to improve its gas recognition rate.
An E-nose mainly consists of an array of sensors and an appropriate pattern-recognition system. The pattern-recognition system has a significant effect in helping E-noses make a correct decision via the algorithm. Furthermore, the values of its parameters determine the performance of the pattern-recognition system, so some algorithm must be employed to select appropriate parameters. KH is a new optimization algorithm put forward in recent years that has not yet been applied to E-nose technology, so we have applied the krill algorithm to the classification problem of E-noses for indoor pollutant gases. Considering the practical application, we propose an EKH based on a novel way of computing the decision weighting factor.
The KH technique has a good performance in exploitation, but it cannot always converge rapidly to find the global optimum. In this paper, we present an effective EKH algorithm based on a novel way of computing decision weighting factors and apply it to optimize the parameters of our self-made E-nose which is employed to distinguish different indoor pollutant gases. Through comparing EKH with other optimization methods, we find that the performance of EKH is better than KH, CKH, QPSO, PSO and GA. We can draw the conclusion according to the results that EKH is an ideal optimization method for E-noses in distinguishing indoor pollutant gases. Of course, we will continue to further study the krill algorithm in the future, and we believe the performance of E-nose will be further improved.