Wireless E-Nose Sensors to Detect Volatile Organic Gases through Multivariate Analysis

Gas sensors are critical components when adhering to health safety and environmental policies in various manufacturing industries, such as the petroleum and oil industry; scent and makeup production; food and beverage manufacturing; chemical engineering; pollution monitoring. In recent times, gas sensors have been introduced to medical diagnostics, bioprocesses, and plant disease diagnosis processes. There could be an adverse impact on human health due to the mixture of various gases (e.g., acetone (A), ethanol (E), propane (P)) that vent out from industrial areas. Therefore, it is important to accurately detect and differentiate such gases. Towards this goal, this paper presents a novel electronic nose (e-nose) detection method to classify various explosive gases. To detect explosive gases, metal oxide semiconductor (MOS) sensors are used as reliable tools to detect such volatile gases. The data received from MOS sensors are processed through a multivariate analysis technique to classify different categories of gases. Multivariate analysis was done using three variants—differential, relative, and fractional analyses—in principal components analysis (PCA). The MOS sensors also have three different designs: loading design, notch design, and Bi design. The proposed MOS sensor-based e-nose accurately detects and classifies three different gases, which indicates the reliability and practicality of the developed system. The developed system enables discrimination of these gases from the mixture. Based on the results from the proposed system, authorities can take preventive measures to deal with these gases to avoid their potential adverse impacts on employee health.


Introduction
In present times, industries have seen the importance of sensory olfactory systems, in that they can help in detecting the adverse effects of gases on human health. An electronic nose (e-nose) refers to a system that mimics the olfactory system of humans with regards to its functions in order to complete qualitative and quantitative analyses of gases or odors, which is also known as the simulated system of olfaction [1]. The e-nose that functions as a machine/computer has been applied in many areas.
Micromachines 2020, 11 An e-nose follows appropriate signal conditioning and processing to produce a unique pattern that helps discriminate different types of odors. The e-nose is a smart intelligent system that is comprised of sensors that respond to smell, as well as convert chemical information into electrical signals. Moreover, these sensors can be integrated into an e-nose using advanced micro/nano-fabrication technologies. There are several applications based on e-nose, such as the e-nose being used in defense divisions and pharmaceutical laboratories, in areas such as food processing industries, cosmetics industries, environmental condition detection and diverse scientific fields of research [2]. As the sensor is exposed to odor/gas, its resistance changes accordingly [3]. The different classification methods like principal component analysis (PCA) and partial least squares (PLS) or gas chromatography (GC) can help detect gas samples according to the odor and concentration of different gases. Different classification methods are used to detect and differentiate various gases from mixtures. Different classification methods are related to different multivariate analysis methods, such as PCA, Gaussian mixture models (GMMs), neural networks, and so forth [3]. A sample's odor can be compared using sensors arrays. This causes a minor change to the chemical or physical properties of a material, which as a result, introduces a readable change in the electrical properties (e.g., conductivity) of the material [4]. The e-nose reduced the detection period from 65% to 30% (time) by detecting compounds within 16 h instead of taking 1-2 days [5].
Fan et al. [6] have described the supervised learning techniques used for e-nose data analysis and classification, which requires several datasets to impose a limitation for using these methods because a large number of datasets for organic gases-i.e., A, E and P-are unavailable. Instead, in a free environment, a gas is classified via unsupervised learning. Unsupervised learning has more significant advantages over supervised learning because the classification of different gases uses an e-nose. Monroy et al. [7] carried out an experimental analysis of ethanol and acetone on a moving platform rather than a stable platform. In another study, an e-nose was used to experimentally classify compounds such as limonene, ethanol, and dimethyl sulfide [8]. In the gas detection area, a comprehensive review was conducted by Hodgkinson and Tatam [9], which compared methods like photoacoustic spectroscopy, spectrophotometry, tunable diode laser spectroscopy, and non-dispersive infrared. However, they did not consider any multivariate techniques, such as PCA, to detect and classify a mixture of gases. In another study [10], a system to detect and classify carbon monoxide and oxygen was developed. However, a mixture of A, E and P was not considered for the classification [10]. Several techniques were again reviewed by Rydosz et al. [11] for the detection of acetone to analyze the health condition of diabetic patients. However, the combination of organic gases A, E and P was not considered. There are several customized devices that can detect one gas at a time (such as ethanol and acetone gases), yet they are unable to differentiate more than one gas from a mixture [12]. Moreover, instrumental methods to determine volatiles such as gas chromatography-mass spectrometry (GC-MS) are expensive and require trained personnel [13]. Thus, the study conducted and presented in this paper aimed to differentiate and classify the mixture of three volatile gases-i.e., A, E and P-using PCA. The proposed technique classified gases by making boundary clusters for the three gases A, E and P. At present, the e-nose has been widely studied and applied in medical diagnostics [14], food quality testing [15], and environmental monitoring [6]. Table A1 (given in Appendix B) indicates different compounds analyzed by different methods and provides the limitations associated with each. At present, research conducted by Fan et al. [6] has only considered the differentiation of A, E, and P from mixture through the Knuth-Morris-Pratt (KMP) algorithm. However, the KMP algorithm has a prominent limitation, i.e., the poor separability of gas mixtures. These gases are mostly produced in oil and petroleum refineries and can cause different diseases in human beings [6]. Acetone can cause unconsciousness and possibly coma, ethanol can cause heart diseases, and propane can cause suffocation and irregular heartbeat. To avoid these diseases, the detection and differentiation of gases from the mixture is necessary so that people working in oil and petroleum refineries can avoid exposure to these gases in a timely measure. Metal oxide semiconductor (MOS) electronic sensors play an important role in sensing such gases and can provide an alarm to prevent themselves and others from the potential exposure to such gases. Thus, this study further aims to develop a novel MOS-nose sensor-based detection system to classify such volatile gases from a mixture of gases and further classify them using PCA based on a multivariate analysis technique. The contributions of the proposed work are:

•
To probe and differentiate commonly produced gases from mixtures (acetone, ethanol, and propane) in manufacturing industries using a high performance PCA-based multivariate analysis technique.

•
To identify various clusters of these (acetone, ethanol, and propane) using a PCA numerical test.

Metal Oxide Semiconductors (MOS) Gas Sensors
MOS gas sensors can sense a significant amount of hazardous gases. The electrical response of these sensors to the gas concentration is expressed as: R = log a (C), where R is the electrical response and C is the concentration. MOS gas sensors show poor selectivity and their electrical responses to different gases vary over a large magnitude span. The sensors are used in an array of 4-32 sensors and all have different selectivities that can attain a unique, complete sample or fingerprint that resembles the overlying reaction of the different sensors to the range of compounds within the sample. To improve selectivity, the use of such arrays leads to a convincing amount of redundant information, which is very useful if one of the sensors fail.
The sole advantage of an MOS gas sensor is that it can operate at room temperature. However, its response time is too long (20-40 s) and cannot be used for fast analysis. Moreover, this sensor is not robust, and its performance is affected by temperature fluctuation and aging [16]. Some sensors have limitations in the food industry because of their high compassion to compounds such as ethanol, CO 2 , or humidity [17].
Different types of MOS gas sensor arrays such as TGS 2600, TGS 2602, TGS 2611, and TGS 2620 are used in this work (see: Appendix C). All of these sensors consist of features such as high sensitivity to volatile organic compounds (VOCs) and odorous gases, low power consumption, portable in size, high sensitivity to gaseous air, contaminants, long life, low cost, and simple electrical circuits.

The Methodology of Proposed MOS Sensor-Based E-Nose Detector
The block diagram for the flow of gas sensing and classification is shown in Figure 1. The MOS sensor (sensor array) collects the gas mixture produced by an industrial plant, and then, further processes the mixture via a preprocessing technique using PCA.  To probe and differentiate commonly produced gases from mixtures (acetone, ethanol, and propane) in manufacturing industries using a high performance PCA-based multivariate analysis technique.  To identify various clusters of these (acetone, ethanol, and propane) using a PCA numerical test.

Metal Oxide Semiconductors (MOS) Gas Sensors
MOS gas sensors can sense a significant amount of hazardous gases. The electrical response of these sensors to the gas concentration is expressed as: R = log a (C), where R is the electrical response and C is the concentration. MOS gas sensors show poor selectivity and their electrical responses to different gases vary over a large magnitude span. The sensors are used in an array of 4-32 sensors and all have different selectivities that can attain a unique, complete sample or fingerprint that resembles the overlying reaction of the different sensors to the range of compounds within the sample. To improve selectivity, the use of such arrays leads to a convincing amount of redundant information, which is very useful if one of the sensors fail.
The sole advantage of an MOS gas sensor is that it can operate at room temperature. However, its response time is too long (20-40 s) and cannot be used for fast analysis. Moreover, this sensor is not robust, and its performance is affected by temperature fluctuation and aging [16]. Some sensors have limitations in the food industry because of their high compassion to compounds such as ethanol, CO2, or humidity [17].
Different types of MOS gas sensor arrays such as TGS 2600, TGS 2602, TGS 2611, and TGS 2620 are used in this work (see: Appendix C). All of these sensors consist of features such as high sensitivity to volatile organic compounds (VOCs) and odorous gases, low power consumption, portable in size, high sensitivity to gaseous air, contaminants, long life, low cost, and simple electrical circuits.

The Methodology of Proposed MOS Sensor-Based E-Nose Detector
The block diagram for the flow of gas sensing and classification is shown in Figure 1. The MOS sensor (sensor array) collects the gas mixture produced by an industrial plant, and then, further processes the mixture via a preprocessing technique using PCA.  Based on the block diagram proposed in Figure 1, an initial simulation was carried out on the gas mixture. Figure 2 shows the Simulink model of differentiating a gas mixture using sensors with specific parameters according to different gases.
Based on the block diagram proposed in Figure 1, an initial simulation was carried out on the gas mixture. Figure 2 shows the Simulink model of differentiating a gas mixture using sensors with specific parameters according to different gases.  This preprocessing step uses three types of analysis: differential, relative, and fractional. These three types have loading design, notch design, and Bi design plots that are used to classify gases. The loading design demonstrates the performance of the sensor array used for differentiating the variations of sensor measurements. It also shows the correlation between the set of sensors. Notch design examines data differences. It also discriminates against the mixture by forming a cluster boundary around the gases. The Bi design shows the loading and notch design in one plot. In Bi design, both variables and observations are made together.

Theory of Principal Component Analysis
A multivariate classification can be performed using a technique such as PCA. In this study, preprocessing was conducted via odor sampling. Afterwards, specified data were detached and information was stored in a computer system for further analysis. In data classification, the algorithm was examined and realized. To analyze odors, three technologies were used: differential, relative, and fractional (details are appended in Appendix A). PCA used data dimension reduction and clustering methods [18]. The authors in reference [19] used a deep convolutional neural network (DCNN) to classify gases. The critical part of the data analysis was to consider PCA as an eigenvector test of the data report, featuring a fragment into another coordinate framework. By smothering data elements, the critical part of PCA changed the primary data information. In the original data information, using less factors showed that a different pattern of change existed. Accordingly, as a preprocessing procedure, PCA was used.
To confirm the pattern, PCA was used and the analysis of the limited sample circulation using the electronic compress nose was completed. All related PCA data depend on a statistical strategy, through which all associated data can easily be envisioned [20]. In experiments and logic, the PCA strategy incorporates the following steps: (i) The matrix X M×N gives the information on row, which is represented by M, demonstrating that different redundancies happen amid the experiment. Column N delineates non-subordinate sensors. (ii) Normalization and arrangement of data is performed in matrix form Norm (X M × N ) with a mean reduction. Subtraction and estimation of the average of each N column from the informational data is collected. This new information data quantity makes the mean equivalent to zero. (iii) Measurement of the covariance matrix is performed as Cov (X M×N ), which helps discover eigenvectors and eigenvalues of the covariance matrix. The created eigenvectors ought to be unit eigenvectors. (iv) The eigenvalues and eigenvectors are arranged and adjusted. The eigenvalues are adjusted from eigenvectors from maximum to minimum (Cov (X M×N )) with max→min. (v) The output of PCA is restored by using the product of a matrix, transposed and given as ((Cov (X M×N ))max→min*Norm (X M×N )) T . Further to this, the accomplished informational data collection with orthogonal linear change presented in 2D/3D also includes free informational data collection.
For the simulation performed in this study, real-time data were taken from the Mobile Robot Programming Toolkit with a high-quality standard. The collected data consisted of MOS sensor voltage feedback with a MATLAB arrangement cluster. This dataset consists of voltage signals from MOS gas sensors. For different types of volatile organic compounds (VOCs), such as A, E, and P, six MOS sensors recorded data for about 4000 s. The real-time response curves are shown in  All MOS sensors (TGS 2600, TGS 2602, TGS 2611, TGS 2620, MICS 5135, and MICS 5521) responded to the mixture of gases, even at a lower concentration level (1-30 ppm), as depicted in Figure 3. The voltage signals extracted from the MOS sensor response were analyzed using three different analysis methods to determine if the mixture concentration was distinguishable or not. The analysis processes performed were load design, notch design, and Bi design plots.

Result and Discussion
In the section below, we present the results of three preprocessing techniques used to identify and classify three gases using PCA.

Data Difference Preprocessing Used for Principal Component Analysis of Volatile Organic Compounds and Gases Data
Preprocessing data for VOCs such as A, E, and P was conducted using PCA. The loading design for the performance of MOS sensor arrays was used for differentiation in the variations of voltage readings (Figure 4). Herein, we present one to six different irregular points in this design. In Figure  4, point one shows TGS 2600, point two shows TGS 2602, point three shows TGS 2611, point four shows TGS 2620, point five shows MICS 5135, and point six shows MICS 5521. All MOS sensors (TGS 2600, TGS 2602, TGS 2611, TGS 2620, MICS 5135, and MICS 5521) responded to the mixture of gases, even at a lower concentration level (1-30 ppm), as depicted in Figure 3. The voltage signals extracted from the MOS sensor response were analyzed using three different analysis methods to determine if the mixture concentration was distinguishable or not. The analysis processes performed were load design, notch design, and Bi design plots.

Result and Discussion
In the section below, we present the results of three preprocessing techniques used to identify and classify three gases using PCA.

Data Difference Preprocessing Used for Principal Component Analysis of Volatile Organic Compounds and Gases Data
Preprocessing data for VOCs such as A, E, and P was conducted using PCA. The loading design for the performance of MOS sensor arrays was used for differentiation in the variations of voltage  Figure 4 shows the results of the data difference design. It had no sensor set response correlated with the other. Figure 5 shows that the notch design for multivariate techniques examines the relationships among data difference considerations. Here, a multivariate method PCA was used to discriminate all three clusters of A, E, and P, represented in one cluster each, using a separate boundary. Figure 5 also shows that PCA can discriminate all these three clusters by the separate boundary. In Figure 6, both variable scatter models and observations are presented together. On the left part of Figure 6, the principal component of MOS sensors is presented, while observations of gas samples are presented on the right side.   Figure 4 shows the results of the data difference design. It had no sensor set response correlated with the other. Figure 5 shows that the notch design for multivariate techniques examines the relationships among data difference considerations. Here, a multivariate method PCA was used to discriminate all three clusters of A, E, and P, represented in one cluster each, using a separate boundary. Figure 5 also shows that PCA can discriminate all these three clusters by the separate boundary. In Figure 6, both variable scatter models and observations are presented together. On the left part of Figure 6, the principal component of MOS sensors is presented, while observations of gas samples are presented on the right side.   Figure 4 shows the results of the data difference design. It had no sensor set response correlated with the other. Figure 5 shows that the notch design for multivariate techniques examines the relationships among data difference considerations. Here, a multivariate method PCA was used to discriminate all three clusters of A, E, and P, represented in one cluster each, using a separate boundary. Figure 5 also shows that PCA can discriminate all these three clusters by the separate boundary. In Figure 6, both variable scatter models and observations are presented together. On the left part of Figure 6, the principal component of MOS sensors is presented, while observations of gas samples are presented on the right side.    Figure 8 inspects the coordination among the study of data difference. Figure 8 further shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA helps classification in incomplete data preprocessing methods, which can ultimately help differentiate the three gases from the mixture. Figure 9 summarizes the results of the loading and notch designs, known together as the Bi design. Both variable scatter models and observations are shown in Figure  9. On the left part of Figure 9, the principal MOS sensor components are presented, while the observation of gas samples are presented on the right side of Figure 9.    Figure 8 inspects the coordination among the study of data difference. Figure 8 further shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA helps classification in incomplete data preprocessing methods, which can ultimately help differentiate the three gases from the mixture. Figure 9 summarizes the results of the loading and notch designs, known together as the Bi design. Both variable scatter models and observations are shown in Figure 9. On the left part of Figure 9, the principal MOS sensor components are presented, while the observation of gas samples are presented on the right side of Figure 9.   Figure 8 inspects the coordination among the study of data difference. Figure 8 further shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA helps classification in incomplete data preprocessing methods, which can ultimately help differentiate the three gases from the mixture. Figure 9 summarizes the results of the loading and notch designs, known together as the Bi design. Both variable scatter models and observations are shown in Figure  9. On the left part of Figure 9, the principal MOS sensor components are presented, while the observation of gas samples are presented on the right side of Figure 9.    Figure 10 shows the results of the relative data design. In Figure 10 Figure 10 shows the results of the relative data design. In Figure 10 Figure 10 shows the results of the relative data design. In Figure 10 The notch design shown in Figure 11 investigated the relationships between the study of data differences. Figure 11 also shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA is helpful in the relative data preprocessing method to classify all three gas samples. Figure 12 summarizes the results of the loading and notch design together in one design known as the Bi design. Both variable scatter models and observations are shown in Figure 12. As can be seen, the principal sensor components are presented on the left part of Figure 12 while observation of gas samples are seen on the right part of Figure 12. The primary sensor component was usually distributed and did not show the same act for preprocessing done by relevant data. The information's point was too crowded because it became difficult to distinguish the smell by the relative method. The notch design shown in Figure 11 investigated the relationships between the study of data differences. Figure 11 also shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA is helpful in the relative data preprocessing method to classify all three gas samples. Figure 12 summarizes the results of the loading and notch design together in one design known as the Bi design. Both variable scatter models and observations are shown in Figure 12. As can be seen, the principal sensor components are presented on the left part of Figure 12 while observation of gas samples are seen on the right part of Figure 12. The primary sensor component was usually distributed and did not show the same act for preprocessing done by relevant data. The information's point was too crowded because it became difficult to distinguish the smell by the relative method.  The notch design shown in Figure 11 investigated the relationships between the study of data differences. Figure 11 also shows that the multivariate technique (PCA) can discriminate against all three clusters of A, E, and P by the separate boundary. Hence, PCA is helpful in the relative data preprocessing method to classify all three gas samples. Figure 12 summarizes the results of the loading and notch design together in one design known as the Bi design. Both variable scatter models and observations are shown in Figure 12. As can be seen, the principal sensor components are presented on the left part of Figure 12 while observation of gas samples are seen on the right part of Figure 12. The primary sensor component was usually distributed and did not show the same act for preprocessing done by relevant data. The information's point was too crowded because it became difficult to distinguish the smell by the relative method.   According to the different concentration of gases, various sensors are evaluated in Table 1 Table 2 shows a comparison between this paper's proposed method with past studies. Prior research has implemented MIR-PLS, GC-MS, and PLS techniques for differentiation and classification of A, E, and P. Each study presented in Table 2 used either A, E, or P, but did not consider the mixture of these three gases. Additionally, the scope of their studies was focused on detecting human diseases by the gases coming out from petrochemical storages. The proposed method presented in this paper overcame the disadvantage of previous studies by designing multivariate techniques to differentiate and classify the A, E, and P mixture. The three preprocessing methods used here easily organize the gas mixture by making a cluster boundary between the gases. This classification will enable people working in the industry to receive alerts regarding the presence of these gases in the environment and to take preventative measures. According to the different concentration of gases, various sensors are evaluated in Table 1 Table 2 shows a comparison between this paper's proposed method with past studies. Prior research has implemented MIR-PLS, GC-MS, and PLS techniques for differentiation and classification of A, E, and P. Each study presented in Table 2 used either A, E, or P, but did not consider the mixture of these three gases. Additionally, the scope of their studies was focused on detecting human diseases by the gases coming out from petrochemical storages. The proposed method presented in this paper overcame the disadvantage of previous studies by designing multivariate techniques to differentiate and classify the A, E, and P mixture. The three preprocessing methods used here easily organize the gas mixture by making a cluster boundary between the gases. This classification will enable people working in the industry to receive alerts regarding the presence of these gases in the environment and to take preventative measures.

Data Relative Preprocessing Used for Principal Component Analysis of Volatile Organic Compounds and Gas Database
All three pre-handling procedures show them working perfectly without overlapping for each technique, which is also shown in Table 3. Here, the table summarizes classification and overlapping of gases. As depicted in Table 3, each gas sample has different chemical properties, but the preprocessing method works perfectly during the classification of the sample if the sample contains different material properties. Table 3 also shows that there is no overlap of gases found when using the three preprocessing methods.

Conclusions
The e-nose developed using various MOS sensors has become a widespread practice in different fields. In this study, an e-nose sensor-based volatile gas detection system was developed through multivariate analysis. The proposed system assessed organic gas quality, where a metal oxide sensor was used to collect the data. This was then further analyzed using PCA-based multivariate classification methods to discriminate against various odors because of VOCs. Three pre-handling procedures were used: data difference, data fraction, and data relative. Three sorts of designs were developed in this work: notch design, Bi design, and loading design. These designs ultimately helped differentiate and classify the gases. The loading design was used to decide on the centrality and relationship between MOS sensors. The notch design was used to legitimize the ability of the classifier in separating the smells of each gas. Finally, the Bi design was utilized to picture the interdependency in the middle of perception and reaction. To conclude, the method developed in this study and its ultimate outcomes will enable people working in affected industries to get alerts about the presence of these gases in the environment so that they can take preventative measures.
The proposed system to detect volatile cases was developed on a dataset with a limited number of samples. The number could increase in future studies with more data collection to investigate the proposed method's generalizability. Future work should study the effect of volatile gases on human health using brain imaging and physiological sensing techniques such as electroencephalogram (EEG) [24,25], electrocardiogram (ECG) [24], heart rate variability analysis (HRV) [26], and magnetic resonance imaging (MRI) [27]. Moreover, another future direction could be to study the discriminating capabilities of machine learning algorithms that classify volatile gases by comparing the performances of a variety of classifiers, such as support vector machines (SVM) [28], gradient boosting (XGB) [29], artificial neural networks (ANN) [30], and k-nearest neighbors (KNN) [31]. Future work on quality assessments of medical herbs like Glycyrrhiza glabra can be done to know the condition of the herb-whether it is in good, over dried, or bad condition.
Using the knowledge gained from the analysis and studies, one can customize the sensor array towards these main volatiles. In this way, one can identify a volatile pattern, which correlates with a better odor discrimination outcome with a lower cost detection risk. This development is very beneficial in biomedical applications in detecting different kinds of diseases because it can correlate key volatiles (potential uremic toxins) to the volatile shift observed in various data records used in medical tests. In this way, one can proceed the approach in this work with applications of low sensors in pathological diagnosis, as it has been found in various literature that the electronic nose is able to distinguish between control blood and "uremic" blood. In addition, the gas sensor series is not only able to discriminate the performance of after-dialysis blood, but it can also follow the unpredictable shift that happens during a single hemodialysis (simply dialysis, a process of purifying the blood of a person whose kidneys are not working normally) session. The e-nose can be used for equally dialysate-side and blood-side monitoring of hemodialysis. In this way, this work has marvelous scope in social welfare and minimizing expenses in costly pathological tests in the diagnosis of diseases.