Detecting Changes of a Distant Gas Source with an Array of MOX Gas Sensors

We address the problem of detecting changes in the activity of a distant gas source from the response of an array of metal oxide (MOX) gas sensors deployed in an open sampling system. The main challenge is the turbulent nature of gas dispersion and the response dynamics of the sensors. We propose a change point detection approach and evaluate it on individual gas sensors in an experimental setup where a gas source changes in intensity, compound, or mixture ratio. We also introduce an efficient sensor selection algorithm and evaluate the change point detection approach with the selected sensor array subsets.


Introduction
Change point detection algorithms that analyze the response of an array of gas sensors and detect a change in the exposure of the array to a gas mixture can bring a significant leap forward in the construction of systems for monitoring hazardous or pollutant gaseous compounds. Up to now, most of the work with gas sensors in an open sampling system (OSS) (i.e., without a sensing chamber that controls the exposure of the sensors to the gas and other variables like temperature and humidity) has been developed under simplified assumptions such as steady air flow and a gas source emitting a single compound with a constant emission rate for the whole duration of the experiments. Unfortunately, these assumptions rarely hold in scenarios of interest for practical applications, like monitoring of industrial production plants [1], landfills [2] and demining [3]. In such applications, gas sensors are preferably deployed in an open sampling system since a quick and continuous response is often crucial, and restrictions in costs and payload pose stringent limitations on the hardware that can be considered. Moreover, it is often desirable to expose sensors directly to the environment to be analyzed since the dynamic response of the gas sensors contains crucial information on the gas plume and in particular on the location of the gas source [4]. These aspects acquire additional relevance when the gas sensors are mounted on mobile robots that have to perform tasks like gas distribution mapping or gas source localization [5,6]. However, the OSS configuration entails additional problems [7,8]. The lack of control over the environmental conditions results in a low reproducibility of experiments with gas sensors in OSS configuration. This is mainly caused by unpredictable fluctuations in the concentration profile that result from the mechanisms of gas dispersion, which in natural environments (characterized by a high Reynolds number) are dominated by turbulence and advection [9].
A second key problem is that the dynamics of most gas sensing technologies are slow compared with the fast fluctuations in the concentration profile to which the sensors are exposed.
The problem we address in this work is the detection of changes in the activity of a distant gas source from the response of an array of metal oxide (MOX) gas sensors deployed in an open sampling system. In an attempt to move towards realistic scenarios, we consider a single gas source that changes compound, intensity or mixture of compounds in the course of a single experiment. MOX sensors are the most common gas sensors used in OSS [10], mainly because of their commercial availability and the high sensitivity to non-hazardous compounds like alcohols (which facilitate experiments). However, MOX gas sensors suffer from long response and recovery times and, consequently, the response seldom reaches a steady state when used in an OSS. The change detection problem addressed in this work is thus particularly hard since true changes have to be distinguished from mere fluctuations in the sensor response due to turbulent gas dispersion.
The presented algorithm for change detection is derived from the well-known Generalized Likelihood Ratio algorithm [11] and is evaluated using three performance measures, namely the detection rate, the false alarm rate, and the delay of detection. First, the performance of the algorithm considering a single gas sensor is analyzed; then the scope is extended by applying the algorithm to multiple sensors. An efficient approach to select a subset of the available sensors in order to maximize the change point detection performance is proposed. The sensor selection policy is based on a trade-off, which can be controlled through a parameter, between the accuracy and the speed of detection.
The rest of this paper is organized as follows. Section 2 presents related work on change point detection in multivariate time series and on gas sensing with an OSS. Section 3 describes the experimental setup with which the algorithms have been tested. Section 4 details the change point detection algorithm. Section 5 describes the sensor selection policy. Results are then presented in Section 6, first for single sensor change point detection and then for change point detection using multiple sensors. Finally, Section 7 draws the conclusions and gives an outline of future work.

Related Works
The detection of changes in the activity of a gas source based on the response of an array of MOX gas sensors has, to the best of our knowledge, not been studied so far. However, we can relate this work to the study of change point detection in the analysis of multivariate time series. Indeed, the response of an array of gas sensors sampled at constant intervals can be considered as a multivariate time series. Change detection in multivariate time series has a wide range of applications such as quality control, segmentation of signals, and monitoring of production processes or vehicles. Such a wide range of applications has led to the development of very different solutions for the change point detection problem. The algorithms can be on-line or off-line, multivariate or univariate, and can target additive or multiplicative changes [11].
Probably the simplest solution for change detection is detecting when the measurements fall out of a predefined range. This solution has been proposed for quality control applications [12]. Other techniques estimate change points by investigating the behaviour of the measurements of the time series before and after a hypothetical change point. These techniques often use statistical approaches, both in a model-based and in a model-free fashion. The most common algorithms, inspired by frequentist statistics, are the Generalized Likelihood Ratio (GLR) test [11], the Marginalized Likelihood Ratio (MLR) [13] and the CUmulative SUM (CUSUM) algorithm [11]. For the case in which a prior on the time of the change point can be assumed, Bayesian inspired algorithms have also been proposed [14]. Finally, change point detection algorithms inspired by machine learning approaches such as the one-class Support Vector Machine [15] have been proposed as well.
In this paper we consider change detection at a single location (even in the case where we consider multiple sensors we assume that they are exposed to the same concentration due to spatial proximity). Actual applications will either combine the gas sensor with some means of mobility (for example by mounting the sensors on a mobile robot) or consider a sensor network. The only work proposed in this field, to the knowledge of the authors, deals with a sensor network detecting anomalies in the gas distribution in coal mines [16]. A Bayesian Network is proposed to trigger alarms in case the sensor response at different nodes is anomalous. However, the approach in [16] is mainly concerned with the spatial distribution of the gas concentration and neglects many problems entailed with chemical sensing, including the cross correlation among the responses of gas sensors of different types.

The Experimental Setup
We carried out experiments in a 5 × 5 × 2 m³ closed room with static sensors where an artificial airflow of approximately 0.05 m/s is induced. The airflow is created using two arrays of four fans (standard microprocessor cooling fans), one placed on the floor and one on the wall. The gas source is an odour blender, a device developed by Nakamoto et al. [17], which allows fast switches between different mixtures of compounds with a variable concentration. The outlet of the odour blender is placed on the floor 0.5 m upwind with respect to an array of 11 commercial metal oxide gas sensors from Figaro Engineering [18] and e2v Technologies [19]. Table 1 presents a list of the gas sensor models together with the nominal target compounds as declared by the producer on the data sheet of each sensor. The selected sensors have overlapping sensitivity and they respond to a wide range of target compounds. The airflow at the outlet of the odour blender is set to 1 L/min. The sensors are sampled at 4 Hz. Figure 1 shows a picture of the experimental setup. The two compounds selected for these experiments are ethanol and 2-propanol. Both ethanol (molecular weight 46 g/mol) and 2-propanol (molecular weight 60 g/mol) are heavier than air (average molecular weight 29 g/mol), and therefore will tend to create a plume at the ground level.
The two substances have a similar saturated vapor pressure, namely 5.8 kPa for ethanol and 4.2 kPa for 2-propanol, which means that they have a similar tendency to evaporate. Moreover, MOX gas sensors have comparable sensitivity to the two substances. This is important in order to obtain similar sensor responses for both analytes, so that we do not address a trivial instance of the change detection problem.
In order to create a database that allows studying the dynamic behaviour of the sensors when consecutively exposed to different analytes, seven different odour emitting profiles have been applied. For all these profiles the gas source emits clean air for two minutes and the signal of the sensors during this period is taken as the baseline. Also, at the end of all the experiments the source emits clean air for two minutes. Figure 2 shows the intensity profile for the gas source in the various emission strategies. A total of 54 experimental runs have been performed. The control signal of the odour blender is used as ground truth for the change point time and provides the time at which the source changes the emission modality. However, in order to know the change point time at the sensors' location, we need to estimate the time it takes the gas to travel from the gas source to the sensor location. Since the sensors are placed 0.5 m away from the location of the source outlet and a steady air flow of 0.05 m/s is induced, the delay time between change times at source and sensor location is estimated to be 0.5 m / 0.05 m/s = 10 s.
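The delay estimate is the source-to-sensor distance divided by the airflow speed. The following minimal sketch shows how the blender control times could be shifted to the sensor location and converted to sample indices, assuming the 0.5 m distance, 0.05 m/s airflow and 4 Hz sampling rate stated in the text. The function name and its arguments are illustrative and are not taken from the original experiments.

```python
from typing import List, Sequence, Tuple


def change_points_at_sensor(
    source_times_s: Sequence[float],
    distance_m: float = 0.5,
    airflow_m_per_s: float = 0.05,
    sampling_hz: float = 4.0,
) -> List[Tuple[float, int]]:
    """Shift change times from the source to the sensor location.

    Returns (time in seconds, nearest sample index) pairs. The transport
    delay is approximated as distance / airflow speed (pure advection).
    """
    if airflow_m_per_s <= 0:
        raise ValueError("airflow speed must be positive")
    delay_s = distance_m / airflow_m_per_s  # 0.5 / 0.05 gives about 10 s
    shifted = []
    for t in source_times_s:
        t_sensor = t + delay_s
        shifted.append((t_sensor, round(t_sensor * sampling_hz)))
    return shifted


if __name__ == "__main__":
    # Hypothetical switch times of the blender, in seconds from the start.
    print(change_points_at_sensor([120.0, 240.0]))
```

This treats the airflow as steady and ignores turbulent mixing, so the shifted times are only an approximation of when the change reaches the sensors.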

Change Point Detection Algorithm
No prior information is assumed about the position of the change points. We further assume that no information about the length of the monitoring process is available and have therefore chosen an algorithm that processes data on-line. For these reasons, we use an adaptation of the well-known Generalized Likelihood Ratio (GLR) algorithm [11]. The presented algorithm is schematically shown in Figure 3 and described in the following sub-sections: data preprocessing, GLR algorithm and performance measures used for evaluation. The changes we consider in this work are due to a change in the intensity of the gas source, a change in the chemical compound the gas sensor is exposed to, or a change in the gas mixture. These different types of changes are shown together with the corresponding sensor response in Figure 4. Notice the fluctuations in the sensor response, which are due to turbulent dispersion and not to changes in the emission modality of the gas source. This is a key reason why change point detection is non-trivial.

Data Preprocessing
Before running the GLR algorithm to detect change points, the raw sensor measurements are preprocessed using a low pass filter and a normalization operation described in the two following paragraphs.

Exponential Smoothing (Low Pass Filter)
The gas sensor response contains noise due to the electronics of the acquisition system. To dampen this noise, the sensor response is filtered using an Exponential Moving Average (EMA) filter [20]. The EMA filter is an infinite impulse response filter that applies weighting factors that decrease exponentially. For a time series $x_0, \ldots, x_N$ the EMA response can be calculated recursively using the following equations:

$$s_0 = x_0 \tag{1}$$

$$s_t = \alpha x_t + (1 - \alpha) s_{t-1}, \quad t > 0 \tag{2}$$

where $x_0, \ldots, x_N$ is the raw sensor signal, $s_{0 \ldots N}$ is the smoothed sequence and $\alpha$ is a smoothing factor. The factor $\alpha$ is always between 0 and 1. Values of $\alpha$ close to 0 result in an aggressive smoothing while values of $\alpha$ close to 1 nearly preserve the original time series. We select a value of $\alpha = 0.9$ in our experiments since this value gives a cut-off frequency for the filter of 0.44 Hz. Since this cut-off frequency is higher than the one applied by the MOX sensors themselves, the EMA filter mainly removes electronic noise without slowing down the actual response of the sensors.
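As an illustration of the smoothing step, the following minimal sketch implements the usual recursive EMA form, s_0 = x_0 and s_t = α x_t + (1 - α) s_{t-1}, which matches the description that values of α close to 1 nearly preserve the original series. The function name is illustrative and the code is not the authors' implementation.

```python
from typing import List, Sequence


def exponential_moving_average(x: Sequence[float], alpha: float = 0.9) -> List[float]:
    """Smooth x with s_0 = x_0 and s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if len(x) == 0:
        return []
    smoothed = [float(x[0])]
    for x_t in x[1:]:
        # New sample weighted by alpha, previous smoothed value by (1 - alpha).
        smoothed.append(alpha * x_t + (1.0 - alpha) * smoothed[-1])
    return smoothed


if __name__ == "__main__":
    raw = [0.0, 10.0, 10.0, 0.0, 10.0]  # hypothetical raw readings
    print(exponential_moving_average(raw, alpha=0.9))
```

With α = 1 the filter returns the input unchanged, and smaller values of α give progressively heavier smoothing.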

Normalization
Due to differences in the sensing surface, different models of MOX sensors exhibit a different dynamic range. This means that some of the sensors, when responding, change their resistance value by only a few ohms while others vary by hundreds or even thousands of ohms. Thus, before running change point detection algorithms, we make the dynamic ranges of the sensors comparable by normalizing the response $s_{1 \ldots N}$ of each sensor to the interval [0, 1] using the following linear transformation:

$$\acute{s}_t = \frac{s_t - \min(s_{1 \ldots N})}{\max(s_{1 \ldots N}) - \min(s_{1 \ldots N})} \tag{3}$$

where $\acute{s}_{1 \ldots N}$ is the normalized sensor response.
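The normalization can be sketched as a min-max rescaling of each sensor's smoothed response. This sketch assumes that the minimum and maximum are taken over the whole recorded response of one sensor, which the text does not state explicitly, and that a constant signal maps to zeros. The function name is illustrative.

```python
from typing import List, Sequence


def min_max_normalize(s: Sequence[float]) -> List[float]:
    """Linearly map a sensor response to the interval [0, 1]."""
    if len(s) == 0:
        return []
    lo, hi = min(s), max(s)
    if hi == lo:
        # Assumption: a constant response carries no range information.
        return [0.0 for _ in s]
    span = hi - lo
    return [(value - lo) / span for value in s]


if __name__ == "__main__":
    # Hypothetical resistance readings of two sensors with different ranges.
    print(min_max_normalize([120.0, 125.0, 130.0]))
    print(min_max_normalize([2000.0, 9000.0, 5500.0]))
```

After this step, sensors whose resistance changes by a few ohms and sensors whose resistance changes by thousands of ohms occupy the same numerical range.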

GLR Algorithm
Given the smoothed and normalized sensor response $\acute{s}_{1 \ldots k}$, where $k$ is the current time index, the GLR algorithm calculates the likelihood ratio between the hypothesis of having a change point at sample $j$ and the hypothesis of not having a change point:

$$\Lambda_j^k = \prod_{i=j}^{k} \frac{p_{\theta_1}(\acute{s}_i)}{p_{\theta_0}(\acute{s}_i)} \tag{4}$$

The likelihoods are based on a parametric probability distribution function $p_\theta$, which is governed by a set of parameters $\theta$. Since no prior information on the sensor noise is available, the most natural choice for $p_\theta$ is the Gaussian distribution, which is governed by two parameters, namely the mean and the variance. $\theta_0$ denotes the mean/variance estimated using all samples in the time interval to be checked for change points. $\theta_1$ denotes the mean/variance estimated using only the samples collected after a hypothetical change point $j$. In this work we are interested in detecting level shifts in the response of one or more gas sensors and therefore the variance is assumed to be constant and is estimated considering all the samples in the considered interval. For numerical reasons, it is more convenient to calculate the log-likelihood ratio $S_j^k$ instead of the likelihood ratio $\Lambda_j^k$ itself:

$$S_j^k = \ln \Lambda_j^k = \sum_{i=j}^{k} \ln \frac{p_{\theta_1}(\acute{s}_i)}{p_{\theta_0}(\acute{s}_i)} \tag{5}$$

The decision function $g_k$ is obtained by taking the maximum with respect to possible change point times $j$:

$$g_k = \max_{1 \le j \le k} S_j^k \tag{6}$$

If $g_k$ is above a pre-selected threshold $h$, then a change point is declared and the data collected before the change point are no longer considered when detecting new change points. In case a change point is detected, $k$ is the alarm time and $\hat{j}$, which is the value of $j$ for which $S_j^k$ attained its maximum, corresponds to the detected time of change. Otherwise, a new sample is acquired, the indexes are updated, and the change point detection process is repeated. Notice that the algorithm is readily extended to a multivariate setting by considering multivariate instead of univariate Gaussian distributions.
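The on-line procedure can be sketched for a single sensor as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian means are estimated as sample means, the variance is shared and estimated on the whole window, and a hypothetical `min_seg` parameter requires a minimum number of samples on each side of a candidate change point. Under these assumptions the log-likelihood ratio for a candidate j reduces to m (μ1 - μ0)² / (2σ²), where m is the number of samples after j.

```python
from typing import List, Optional, Sequence, Tuple


def glr_statistic(window: Sequence[float], min_seg: int = 2) -> Tuple[float, Optional[int]]:
    """Return (g_k, j_hat) for one window of preprocessed samples.

    For each candidate j, S_j^k = m * (mu1 - mu0)**2 / (2 * var), which is the
    Gaussian log-likelihood ratio when mu1 is the mean of window[j:], mu0 the
    mean of the whole window, and var the variance of the whole window.
    """
    n = len(window)
    if n < 2 * min_seg:
        return 0.0, None
    mu0 = sum(window) / n
    var = sum((v - mu0) ** 2 for v in window) / n
    if var <= 0.0:
        return 0.0, None  # constant window: no evidence of a change
    best_s, best_j = 0.0, None
    suffix_sum = sum(window[n - min_seg + 1:])  # running sum of window[j:]
    for j in range(n - min_seg, min_seg - 1, -1):
        suffix_sum += window[j]
        m = n - j
        mu1 = suffix_sum / m
        s = m * (mu1 - mu0) ** 2 / (2.0 * var)
        if s > best_s:
            best_s, best_j = s, j
    return best_s, best_j


def detect_changes(signal: Sequence[float], h: float, min_seg: int = 2) -> List[Tuple[int, int]]:
    """Process the signal sample by sample; return (alarm time, detected change time) pairs."""
    alarms = []
    start = 0  # index of the first sample still considered
    for k in range(len(signal)):
        g_k, j_hat = glr_statistic(signal[start:k + 1], min_seg)
        if j_hat is not None and g_k > h:
            alarms.append((k, start + j_hat))
            start += j_hat  # drop the data collected before the detected change
    return alarms


if __name__ == "__main__":
    # Synthetic level shift at sample 100 with a small deterministic ripple.
    sig = [0.01 * (-1) ** i for i in range(100)]
    sig += [1.0 + 0.01 * (-1) ** i for i in range(100, 200)]
    print(detect_changes(sig, h=10.0))
```

Each call to `glr_statistic` is linear in the window length, so the window is rescanned at every new sample. A long-running deployment would likely bound the window length, which this sketch does not do.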

Performance Measures
To evaluate the change detection algorithm, three performance measures were used: the true alarm ratio, the false alarm ratio and the delay of detection. Before providing a definition of the performance measures, we define the concepts of true alarm, false alarm, and delay of detection. A true alarm is defined as the first alarm after a change point. Any other alarm coming after the true alarm and before the next change point is defined as a false alarm. The delay of detection is defined as the difference between the alarm time of a true alarm and the time of the change point. Figure 5 shows a graphical representation of these concepts. The first performance measure we consider is the true alarm ratio (TAR), which is given by the total number of true alarms divided by the number of change points. Clearly, the value of this performance measure is bounded between 0 and 1. The second performance measure is the false alarm ratio (FAR), which is calculated as the total number of false alarms divided by the number of change points. Notice that this performance measure is unbounded. The third performance measure is the mean delay of detection (MDD) and is defined as the average of the delay of detection.
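The three measures can be computed from the ground truth change points and the alarm times as in the following sketch. It follows the definitions above, with two assumptions that the text does not spell out: alarms raised before the first change point are counted as false alarms, and an alarm at exactly the change point time counts as following it. The function name and the returned dictionary keys are illustrative.

```python
import math
from typing import Dict, List, Optional, Sequence


def evaluate_alarms(
    change_points: Sequence[float], alarm_times: Sequence[float]
) -> Dict[str, Optional[float]]:
    """Compute TAR, FAR and MDD for one set of change points and alarms."""
    cps = sorted(change_points)
    if not cps:
        raise ValueError("at least one change point is required")
    alarms = sorted(alarm_times)
    true_alarms = 0
    # Assumption: alarms before the first change point are false alarms.
    false_alarms = sum(1 for a in alarms if a < cps[0])
    delays: List[float] = []
    for idx, cp in enumerate(cps):
        next_cp = cps[idx + 1] if idx + 1 < len(cps) else math.inf
        segment = [a for a in alarms if cp <= a < next_cp]
        if segment:
            true_alarms += 1                   # first alarm after the change point
            delays.append(segment[0] - cp)     # delay of detection
            false_alarms += len(segment) - 1   # every later alarm in the segment
    return {
        "TAR": true_alarms / len(cps),
        "FAR": false_alarms / len(cps),
        "MDD": sum(delays) / len(delays) if delays else None,
    }


if __name__ == "__main__":
    # Hypothetical change points and alarm times, in samples.
    print(evaluate_alarms([10, 50], [12, 20, 70]))
```

Because FAR is normalized by the number of change points rather than by the number of alarms, it can exceed 1 when a detector raises many spurious alarms.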

Sensor Selection
MOX gas sensors are characterized by high correlation in their response [21]. This entails that the information provided by an array of MOX sensors is highly redundant. As an example, Figure 6 shows the response of five of the sensors in the array to a series of compound switches. This plot clearly shows the high correlation among the responses of the sensors. In order to reduce power consumption and the risk of suffering from the curse of dimensionality, it is necessary to select the minimum number of sensors that gives optimal performance. Among the various methods proposed in the literature for selecting an optimal subset of features, or sensors in our case [22,23], the Quadratic Programming Feature Selection (QPFS) method suggested by Lujan et al. [24] is theoretically well founded and computationally very efficient, since it is based on convex optimization. The QPFS method attempts to solve the following optimization problem:

$$\min_{x} \; \frac{1}{2}(1 - \alpha)\, x^T Q x - \alpha F^T x \tag{7}$$

$$\text{subject to} \quad x \ge 0, \quad e^T x = 1 \tag{8}$$

If we consider a problem with $M$ sensors, $x$ is a vector of length $M$ that represents the importance of each sensor for the problem at hand, $Q$ is an $M \times M$ matrix that measures the redundancy of each pair of sensors, $F$ is a vector of length $M$ that measures the relevance of each sensor for the task at hand, and $e$ is a vector of ones of length $M$. QPFS ranks the sensors trading off maximum relevance and minimum redundancy and the parameter $\alpha \in [0, 1]$ acts as a regulator of this trade-off.
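To make the role of α concrete, the following sketch solves a small instance of the QPFS problem in its commonly stated form: minimize ½(1 - α) xᵀQx - α Fᵀx subject to x ≥ 0 and the entries of x summing to one. It uses projected gradient descent on the simplex in plain Python, which is a simple stand-in chosen for self-containment and not the solver used in [24]. The matrix Q is assumed symmetric positive semidefinite, and all function names and the example values of Q and F are hypothetical.

```python
from typing import List, Sequence


def project_to_simplex(v: Sequence[float]) -> List[float]:
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        candidate = (cumulative - 1.0) / i
        if u_i - candidate > 0:
            theta = candidate  # keep the threshold of the last valid index
    return [max(v_i - theta, 0.0) for v_i in v]


def qpfs_weights(
    Q: Sequence[Sequence[float]],
    F: Sequence[float],
    alpha: float,
    iterations: int = 2000,
) -> List[float]:
    """Approximate the QPFS importance vector x by projected gradient descent."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    m = len(F)
    # Gershgorin bound on the largest eigenvalue of (1 - alpha) * Q,
    # used as a Lipschitz constant of the gradient to pick a safe step size.
    lipschitz = (1.0 - alpha) * max(sum(abs(q) for q in row) for row in Q)
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0
    x = [1.0 / m] * m
    for _ in range(iterations):
        grad = [
            (1.0 - alpha) * sum(Q[i][j] * x[j] for j in range(m)) - alpha * F[i]
            for i in range(m)
        ]
        x = project_to_simplex([x[i] - step * grad[i] for i in range(m)])
    return x


if __name__ == "__main__":
    # Hypothetical redundancy (absolute correlation) and relevance values.
    Q = [[1.0, 0.9, 0.2], [0.9, 1.0, 0.3], [0.2, 0.3, 1.0]]
    F = [0.8, 0.7, 0.4]
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, [round(w, 3) for w in qpfs_weights(Q, F, alpha)])
```

Sensors whose weight ends up at zero are effectively excluded. With α = 1 the quadratic term vanishes and all weight moves to the single most relevant sensor.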
In this work, the matrix $Q$ is formulated by calculating the Pearson correlation coefficient between each pair of sensors in the Steps experiments. Redundancy is therefore expressed in this case by the linear correlation among the sensor responses. The vector $F$ has to express the relevance of each sensor which, according to our performance measures, is a trade-off between the ease of detecting change points and the speed of detection. As a measure for the ease of detecting change points, we use the average Fisher Index of the sensor response, considering as classes the sensor responses to different gas source emission rates. The Fisher Index is an index of separation between classes that is calculated as the distance between the mean values of the samples belonging to each class divided by the mean standard deviation of the samples in the classes. As a measure of the speed of each sensor, we use the average of the inverse time constant estimated in the decay phases of the Steps experiments. We decided to use the time constants of the decay phase since they are much larger than the ones of the rise phase. Therefore, the vector $F$ can be expressed by the following equation:

$$F = (1 - \beta) F_{FI} + \beta F_{ITC} \tag{9}$$

where $F_{FI}$ is a vector containing the Fisher Index for each sensor, $F_{ITC}$ is a vector containing the inverse time constant for each sensor, and $\beta$ is a parameter for controlling the trade-off between the speed of the sensor and the change of the sensor response in correspondence of a change point.

Results
Figure 7 shows the performance measures of the proposed algorithm when varying the threshold value $h$. As expected, the selection of the threshold value governs a trade-off between the change point detection ratio (TAR, prioritized by low threshold values) and the false alarm ratio (FAR, prioritized by high threshold values). Also, the delay of detection (MDD) generally increases with the threshold. This is within expectation since higher thresholds require the collection of more evidence in the data in order to trigger an alarm.
Table 2 reports the time constants and Fisher Indices for each of the sensors. The time constants have been calculated by fitting an exponential to each of the transients induced in the Steps experiments. Since no significant variation in the time constant was observed with respect to the concentration that the sensors were exposed to, the mean over all the experiments can be considered a reliable estimator of the time constant of the sensors. The time constants in Table 2 were used in the sensor selection algorithm as an indicator of the speed of a sensor. Section 6.1 presents the results obtained with the proposed change point detection algorithm when considering a single sensor at a time. Section 6.2 presents the results of the proposed sensor selection strategy and Section 6.3 concludes with the results obtained with those sensor combinations that were selected by the sensor selection strategy.
Table 3 reports the results of the single sensor change point detection when the false alarm ratio (FAR) is set to 0.1. Sensor MiCS 5135 is the one that attains the highest overall TAR, and also the shortest delay. Figure 8 shows an example of the execution of the proposed algorithm on the response of the sensor MiCS 5135. All the other sensors have a comparable overall MDD. Sensor TGS 2602 is the one that attains the worst overall TAR and it was found to be particularly bad at detecting changes in the mixture of the two compounds.
The sensors MiCS 5135 and MiCS 2610 prove to be particularly good at detecting changes in compound, attaining the highest TAR values. In addition, the sensor MiCS 2610 also has the lowest MDD for changes in compound. Besides the sensor MiCS 5135, the sensors MiCS 2610 and MiCS 2710 are the only sensors that obtain a TAR higher than 0.7 for changes in mixture. Concerning changes in concentration, only the sensors MiCS 5135 and MiCS 5121 were found to be both efficient and fast (high TAR and a low MDD).
[Figure caption, truncated in the source: "... Table 3). In this case, one of the change points was not detected and one false alarm was triggered."]

Sensor Selection Results
The proposed sensor selection method has two parameters that need to be selected. The parameter α governs the trade-off between relevance and redundancy of the selected sensor set, while β defines the relevance of a sensor as a trade-off between speed and ease of detecting change points. Figure 9 shows the importance of sensors for some configurations of α and β. It is worth noticing how for high values of α only the most relevant sensor is selected, which is the quickest sensor when β is high, or the most discriminative sensor when β is low. On the other hand, when α is small, and therefore sets of uncorrelated sensors are preferred, the value of β that governs the trade-off of the relevance term is much less influential. Table 4 lists the different sensor subsets selected for different values of the parameters α and β. Notice that the configurations containing a single sensor, i.e., those for which α = 0.9 and β ∈ [0, 0.5] ∪ [0.6, 1], have been omitted since they fall back into the single sensor case. It is also worth noticing how all the sensor subsets contain a very low number of sensors, at most three, compared with the eleven sensors present in the array. This is likely due to the high correlation among the responses of MOX gas sensors. Using more than two or three sensors only increases the redundancy of the array without increasing relevance. Table 5 reports the performance obtained for the selected subsets of the array when the false alarm ratio (FAR) is set to 0.1. First, the selected thresholds h are in general higher than in the single sensor case. The selected thresholds h are also higher for sets of three sensors than for sets of two sensors. This may be because larger arrays imply the estimation of higher dimensional Gaussian distributions, which are more prone to overfitting than lower dimensional ones.
In order to avoid overfitting, higher thresholds tend to be selected, which imply the need for more evidence before declaring a change point. All the considered sets of sensors outperform the single sensors in the overall performance evaluation, since they manage to achieve a TAR that is comparable to that of the best single sensors while having a better MDD. It is particularly interesting to notice how the combinations that do not include the MiCS 2710 sensor, which was one of the sensors achieving the highest overall TAR but at the same time suffering from a high MDD, manage to achieve a high TAR and a low MDD.
It is further worth noticing how the difference in performance for the selected sets of sensors is much smaller than the difference among the performance of single sensors. All the sensor sets seem to be able to perform uniformly well in all the tasks.

Conclusion
In this paper we address the problem of detecting changes in the activity of a distant gas source from the response of an array of metal oxide (MOX) gas sensors deployed in an open sampling system. First, we describe an approach for change point detection that is inspired by the well known Generalized Likelihood Ratio (GLR) algorithm. The algorithm can be applied using either the output of a single sensor or the output of a set of sensors. We carried out a substantial number of tests, based on which we evaluate the performance of the proposed algorithm in both cases. Results obtained with a single sensor are presented in Section 6.1 while the results obtained with several sensors are presented in Section 6.3. The sensor subsets were selected using a novel and efficient sensor selection strategy that is presented in Section 5 and evaluated in Section 6.2.
Our results show that, given a fixed rate of false alarms (set to 0.1 in our case), the selected sets of sensors obtain a detection rate comparable to the best single sensors, but with a significantly lower delay of detection. In particular, the subsets including a fast sensor like the MiCS 5521 have demonstrated short delays of detection. It is noteworthy that these configurations were obtained when quick sensors were included in the array by the proposed sensor selection method, which trades off relevance, ease of change point detection and the quickness of a sensor.
Future work will include the development of methods for automatic selection of the threshold h and testing of the proposed algorithms on more general setups, for example when the sensor array is mounted on a mobile robot or when it is deployed in an uncontrolled outdoor environment. An example of a scenario where it would be very interesting to test the change point algorithms in the future is the Air Quality Egg project for monitoring pollution in towns (http://www.kickstarter.com/projects/edborden/air-quality-egg).