1. Introduction
The World Health Organization (WHO) reports that indoor air pollution (IAP) harms 3.8 million people every year [
1]. It is caused by occupant activities such as cooking, smoking, using electronic equipment, or using consumer items, or by emissions from construction materials inside houses or buildings. Carbon monoxide (CO), volatile organic compounds (VOCs), particulate matter (PM), aerosols, and other biological pollutants are some of the harmful pollutants that are found in the indoor ambient air of buildings [
2]. In addition, it has been reported that poor indoor air quality (IAQ) may harm people’s health by increasing disorders that are associated with airborne pollution present in indoor ambient air [
3,
4]. Thus, airborne pollution detection and monitoring of the presence of various hazardous VOCs, gases, and smokes are essential to mitigate poor air quality in indoor spaces.
Low-power wide-area network (LPWAN) technologies have attracted considerable interest and played a significant role in popularizing Internet of Things (IoT)-based applications. Depending on the deployment area, most LPWANs operate in the unlicensed industrial, scientific, and medical (ISM) bands at 169, 433, 868/915 MHz, and 2.4 GHz [
5]. A LoRa-based network link is a wireless communication technology that uses long-range, low-power radio frequency signals to transmit data over a long distance. These LPWAN technologies include NB-IoT, Sigfox, and LoRa. It has been demonstrated that LoRa offers advantages in terms of ultralong transmission, high stability, ultralow power, and low cost [
6]. LoRa-based network links are designed to provide reliable and cost-effective communication for IoT devices in various applications, including smart cities, industrial automation, agriculture, and environmental monitoring [
5].
LoRa-based network links have two main components, i.e., LoRa nodes and LoRa gateways. The LoRa nodes are small, low-power devices that can be embedded in IoT sensors and devices to transmit data wirelessly. The LoRa gateways act as bridges between the LoRa nodes and the Internet, receiving data from the nodes and forwarding them to a cloud-based server or an IoT platform. The characteristics of several network technologies used for the IoT are shown in
Table 1. This study introduces the LoRa WAN protocol-based pollution hazard detection and monitoring system, comprising a microcontroller and up to eight sensors and operating remotely. Further, advanced two-stage analysis space transformation-based approaches were also used to enhance the performance and efficiency of the proposed networked intelligent gas sensor system (N-IGSS) [
7,
8,
9].
The literature review showed a growing interest in environmental monitoring using IoT and cloud-based technologies. Numerous environmental factors, including temperature, humidity, light, noise level, CO, and NO
2, were monitored indoors and outdoors on a university campus using an environmental monitoring system [
12]. This system included the Raspberry Pi and MICS-4514 sensors. An IoT-based indoor environmental quality (IEQ) assessment system employing an Arduino Uno module and economical sensors like the DHT22 was developed in an experimental investigation [
13]. An IoT-based measurement system with simple pressure, temperature, humidity, and gas sensors (BME680, DHT22, and MQ5) interfaced with an Arduino microcontroller has also been reported [
14]. In a similar experiment, Kureshi et al. used temperature, humidity, a VOC sensor (BME680), and PM 2.5 and PM 10 sensors (SDS011) for air quality index (AQI) calculation. They used this information to increase awareness about indoor air pollution’s hazards to human health [
4]. Numerous gases, including CO
2, NO
2, ethanol, methane, and propane, were identified using the gas sensors MICS-6814 in an IoT-based system [
15] developed for real-time IAQ monitoring.
An investigation of the emissions generated by different incense sticks and sparklers, including particles and benzene, was reported by Werner et al. [
16]. Utilized indoors, they had a significantly unfavorable impact on the air quality. The indoor AQI was further explored by using multiple low-cost PM sensors for effective compliance with the U.S. Environmental Protection Agency (EPA) [
17]. The variation in the concentration ranges of CO, CO
2, total volatile organic compounds (TVOCs), and PM 2.5 over a period of 24 h was monitored in indoor environments by Capua et al., and a decision-making algorithm was used to manage the IAQ [
18]. Rebecca et al. described the particles and gases released by burning Arabian incense and investigated the reactions of in vitro human lung cells to incense smoke [
19]. A study on mosquito-repellent incense was conducted by Wang et al. The incense was found to generate excessive gaseous and particle pollutants, as permitted under GB3095-2012 rules and the WHO advice [
20]. Furthermore, green IoT for eco-friendly applications was discussed by the authors of [
21] and opportunistic routing (OR) was widely adopted in wireless sensor networks (WSNs) running asynchronous duty-cycled MAC protocols by the authors of [
22].
In the recent literature, researchers have only considered AQI and used specific sensors to detect and monitor a limited number of pollutant gases and PM concentrations for the evaluation of indoor air quality. The attention towards various VOCs, gases, and smokes in indoor environments is lacking. Furthermore, an electronic nose (e-nose) is a device that mimics the human olfactory system to identify and analyze various VOCs, gases, and smokes present in the environment. E-noses have been widely used for detecting and monitoring air pollution hazards, including industrial emissions, traffic emissions, and indoor air pollutants. The proposed approach for pollution hazard detection and monitoring using e-noses is to provide real-time monitoring of airborne pollutants at a centrally located RDPS. The data collected at the RDPS are analyzed by leveraging artificial intelligence-based algorithms and models to identify the classes of pollutants of interest [
7,
8,
9]. Pollution hazard detection networks are designed to detect, monitor, and predict the class and level of environmental pollution. Such networks can be categorized based on the sensors, spatial coverage, and data transmission methods. In addition, people around the world utilize the Internet and GSM mobile-based networks, which require high power and paid subscriptions [
6].
In this paper, we propose a networked intelligent gas sensor system (N-IGSS) using a LoRa network-link-based IoT platform which can be operated in private places such as industrial settings, mines, etc., and it can operate well even when there is no Internet or GSM mobile-based network connectivity. To the authors’ knowledge, this is the first work to have integrated a LoRa network link to develop a N-IGSS. The proposed N-IGSS can be used in many practical LoRa-based applications, such as hospital management, air quality monitoring, agriculture, smart cities, and industrial management. Therefore, the proposed N-IGSS will produce economic, social, and environmental impacts.
1.1. Motivation and Contributions
We were motivated by the goal of a low-power, low-cost solution for wide monitoring of airborne hazards across remote locations. The proposed solution addresses the challenge of monitoring airborne hazards using Internet connectivity, which consumes significant power. Furthermore, the proposed system can operate independently by utilizing LoRa-based wireless networks that do not require Internet connectivity, thus reducing power consumption and providing extended battery life.
The proposed N-IGSS comprises an array of low-cost, MOX-based gas sensor elements integrated as the gas sensor node. In this experiment, we considered six types of VOCs, gases, and smokes released by burning tobacco, paint, carpet, incense sticks, and alcohol. Our sensor node was placed indoors to capture and transmit sensor responses via the LoRa network link to the remotely placed LoRa gateway (remote data-processing station) for further analysis and predictions for real-time operations. We processed the sensor response data at the RDPS. The block schematic diagram of our proposed N-IGSS is shown in
Figure 1. In the proposed N-IGSS, an array of seven-element tin-oxide-based gas sensor elements, along with one digital temperature and humidity sensor, is used for the detection of six classes of VOCs, gases, and smokes. The respective signature patterns were analyzed using our proposed two-stage analysis space transformation method, which closely mimics the human olfactory system [
23]. The utilities of this work are highlighted as follows:
An N-IGSS is proposed to detect and monitor airborne pollution hazards in indoor ambient air using an RDPS.
For the first time, a LoRa WAN networking link protocol was used for real-time networked operation of e-noses.
The proposed N-IGSS was designed using a two-stage analysis space transformation method to ensure that the classifier models delivered high performance.
1.2. Paper Structure
This paper is structured as follows: An introduction to the N-IGSS and some prominent LoRa network application scenarios are presented in
Section 1. In
Section 2, the architecture of our proposed N-IGSS using LoRa network hardware, subsequent experiments, and contextual details of the tools and processes are presented. The results and discussions are placed in
Section 3. In
Section 4, conclusions and future scope are presented.
2. Materials and Methods
In this section, we present the method used to develop the proposed N-IGSS, which is integrated with the LoRa networking link protocol for hazardous pollution detection in real time at an RDPS using artificial-intelligence-based algorithms and models.
2.1. The Contextual Background of the N-IGSS
The proposed system comprises three parts, as shown in
Figure 2a–c.
Physical gas sensor node (e-nose): The sensor node is the onsite part of the proposed N-IGSS, which consists of an array of seven tin oxide (MOX)-based gas sensor elements, named MQ 2, MQ 3, MQ 5, MQ 6, MQ 7, MQ 8, and MQ 135, and one temperature and humidity sensor, DHT 22, interfaced with a microcontroller and a LoRa module, as shown in
Figure 2a. These MOX sensors are cross-sensitive and respond to multiple VOCs, gases, and smokes with different sensitivities and generate unique signature patterns for various VOCs, gases, and smokes in different types of pollutants.
The LoRa link: The LoRa link is a wireless communication technology designed to enable long-range, low-power communication between devices. The LoRa link operates in the unlicensed radio spectrum, which means that it can be used without needing a license or paying subscription fees. The main function of the LoRa link is to provide a reliable and efficient communication link for IoT devices. It is particularly well-suited for applications that require low data rates and long-range communication, as shown in
Figure 2b.
RDPS: We received the real-time transmissions of the sensor node responses at the other end of the LoRa link using the LoRa gateway. LoRa is a transceiver module, and we were able to use the same module as a transmitter and receiver according to the configuration code. At this RDPS, we processed the gas sensor array responses using our proposed two-stage space transformation method to achieve high performance. The captured gas sensor array responses were first preprocessed via this method to transform the raw sensor responses into a suitable analysis space using the standardized linear discriminant analysis (SLDA) method. In the second stage, different types of classifiers, namely AdaBoost, XGBoost, RF, and MLP, were designed in the SLDA analysis space to efficiently classify the samples of the six classes of VOCs, gases, and smokes, as shown in
Figure 2c.
2.2. The Gas Sensor Node Prototype
The gas sensor node consists of seven tin-oxide-based MOX-based gas sensor elements and one temperature and humidity sensor, which generate real-time signature patterns of the considered VOCs, gases, and smokes released when burning the considered types of pollutant materials. The sensor node was placed indoors to transmit sensor responses to the analyzed VOCs, gases, and smokes using the LoRa network link protocol. These wirelessly transmitted sensor array responses are received at an RDPS (LoRa link gateway). The electrical characteristics of the components used for the sensor node prototype development are shown in
Table 2.
The connection diagrams of the sensors and the LoRa transmitter and receiver with the microcontroller are shown in
Figure 3a,b.
We interfaced the sensors and LoRa modules with their respective microcontrollers to create the sensor node and LoRa receiver gateway of the proposed N-IGSS. It communicates with the sensors and transmits the data through a serial peripheral interface (SPI) to the LoRa transceiver. The LoRa module SX1278 is a 137 MHz to 525 MHz long-range low-power transceiver module which can be used as both a transmitter and a receiver. The sensor node is powered using a 5 V (DC) supply. The LoRa module (SX1278) is supplied with 3.3 V by voltage-dividing the microcontroller’s 5 V supply. The basic circuit diagram of the N-IGSS is depicted in
Figure 4a, showing the sensor node, and
Figure 4b, showing the receiver gateway at the RDPS.
LoRa consists of an antenna that transmits the sensor values through the microcontroller to the receiver node via radio waves. In our case, we used a 915 MHz antenna in the ISM band designated for Asia. A physical view of the fabricated N-IGSS for pollution hazard detection and monitoring is shown in
Figure 5a,b.
The LoRa transceiver at the receiver gateway is coded exclusively to function as the gateway. This gateway can be connected to the Internet to upload real-time sensor responses to the cloud. Multiple nodes can also be used to create a more effective and larger network of IGSS.
2.3. Experimental Setup
In ambient conditions, we conducted our experiments in a closed, semi-furnished room of 114 square meters (1226.64 square feet). We deployed the N-IGSS sensor node at a fixed height of 2 feet and captured real-time responses of the sensor array for the analyzed smokes of tobacco, paints, carpet, alcohol, and incense sticks. In our experiments, we employed the SX1278 LoRa module, which can only operate on the 868 MHz (European ISM) and 915 MHz frequencies (American ISM). To adhere to Federal Communications Commission (FCC) rules, we selected the 915 MHz band.
Initially, the gas sensor system was purged with ambient air for 10 min to obtain the baseline signature pattern of the sensor array responses under steady-state conditions. Once the sensor array responses became steady, the array was exposed to one of the considered classes of VOCs, gases, or smokes for 20 min at a rate of 15 samples per minute. After 20 min of exposure, the gas sensor array was again purged with ambient air until the sensor responses returned to the baseline signature patterns. This process was repeated for all the considered classes of VOCs, gases, and smokes, and the raw sensor array response dataset was captured in its totality.
In this experiment, the sensor node was first heated up for 15 min while keeping the gas chamber closed, and the steady state of the sensor responses was determined. Consequently, the gas chamber inlet was opened for the next 20 min, and samples of the considered analyte were captured at the RDPS. Then the analyte exposure was stopped, and for the next 30 min, the gas chamber was purged with fresh ambient air to ensure that the sensor responses returned to the baseline conditions. The same procedure was repeated to capture samples of the other the analytes.
Accordingly, each experimental phase continued for 65 min, raw sensor responses were captured, and the same process was repeated in all experiments. Therefore, the experiment took 390 min (65 min × 6 types). All sensor responses returned to baseline responses throughout the experiment, and no sensor poisoning occurred. During this period, 1800 samples were captured at a sampling rate of 15 samples per minute. Further details of the dataset and the samples collected are given in
Table 3.
The samples belonged to the six classes of VOCs, gases, and smoke released by burning different pollutants in indoor ambient conditions. The dataset consisted of 300 samples of each class: ambient air, tobacco, paints, carpet, incense sticks, and alcohol. The captured dataset was then segregated into training and testing datasets. The training dataset consisted of 295 × 6 = 1770 samples, and 5 × 6 = 30 samples were used for testing purposes. Finally, the testing dataset was separated from the data used for training or validation of the classifiers at any stage.
A prototype LoRa network was implemented and deployed in a typical urban environment for airborne pollution hazard detection and monitoring and to evaluate the efficacy of the proposed LoRa network. In the experiment, the LoRa node was gradually moved away from the LoRa gateway’s location while sending uplink packets, and the received signal strength was measured continuously. A LoRa node transmits a data packet every 4 s, and the packet is then received at the LoRa gateway. Since the LoRa gateway can communicate across channels, we received the sensor responses correctly at the remote data-processing station.
2.4. Contextual Background of Analysis Space Transformation
The proposed N-IGSS is based on performance enhancement approaches in which suitable classifiers are designed in the analysis space transformation domain. It has been reported that a classifier performs better when it is trained in a suitable transformation space where the dataset shows well-separated clusters with good intercluster separation [
8]. In the following subsections, we describe the processes in greater detail.
2.4.1. Data Preprocessing
This work used the standardized linear discriminant analysis (SLDA) space transformation method. The SLDA transforms raw data into respective space transformation domains [
7,
24,
25,
26,
27,
28]. The first three principal components contained 98.88% of the information, using which we obtained a 3D scatter plot which showed well-separated clusters with good intercluster separation [
8], as shown in
Figure 6.
Standardized linear discriminant analysis (SLDA) is a statistical method that can be used for dimensionality reduction. It is a linear transformation process applied to the input data, which transforms the raw sensor responses into new components to maximize the separation between different classes.
The process of SLDA space transformation is described in the following steps:
Calculate the covariance matrix by adding the scaling factor (1/(
N − 1)) to the within-class scatter matrix
where
Ni = sample size of the VOC class (here = 300 × 6). We can now drop (
Ni − 1) because all classes have an equal sample size.
Scatter matrix between each class (
SB):
where
m = overall mean,
mi = sample mean, and
Ni = sample size of the respective class.
Compute the eigenvectors and eigenvalues:
where λ = eigenvalue and
V = eigenvector of same eigenvalue
Project the data onto the new subspace:
where X =
n-dimension matrix representation of the n samples and Y = transmitted
n × k dimensional samples in the new subspace.
We preprocessed the raw sensor response dataset in this experiment using the standardized linear discriminant analysis (SLDA) space transformation method. The SLDA method transforms raw data into its respective space transformation domains [
7,
20,
23,
24]. The first three principal components contained 98.88% of information, using which we obtained a 3D scatter plot which showed well-separated clusters with good intercluster separation, as shown in
Figure 7. It can be observed that the clusters belonging to the six classes of VOCs/gases/odors were shown clearly in the SLDA-transformed dataset.
2.4.2. Design of the Classifiers
The raw sensor responses were first transformed into the SLDA domain during preprocessing. SLDA is an effective method for feature enhancement. We used all seven principal components (PCs) to train and test the classifier used in the IGSS without any information loss. In the SLDA transformation space, better-performing classifiers were then designed. The transformed dataset consisted of 1800 samples, with each vector consisting of seven elements. The transformed dataset was then segregated into two parts, i.e., the training and testing datasets, consisting of 1770 and 30 samples in the SLDA-transformed domain, respectively. Furthermore, we used four popular classifiers for the experiment, namely AdaBoost, XGBoost, RF, and MLP. AdaBoost and XGBoost are good for boosting, RF for handling large datasets, and MLP for handling complex nonlinear relationships. The classifier process is shown in
Figure 8.
The AdaBoost algorithm is a boosting algorithm that combines multiple weak classifiers to form a robust classifier. The design of an AdaBoost classifier involves selecting several hyperparameters, such as the number of estimators, the learning rate, and the random state, to optimize the model’s performance. The number of estimators is the number of weak classifiers that combine to form a robust classifier. The learning rate is a hyperparameter that controls the contribution of each weak classifier to the final model. The random state is a seed value that ensures the reproducibility of the results. In this case, we set the random state to 1. The AdaBoost classifier is shown in
Figure 9.
The design of an XGBoost classifier involves selecting several hyperparameters, such as the learning rate, number of estimators, maximum depth of the tree, minimum child weight, gamma, regularization alpha, number of threads, and cross-validation. The learning rate is a hyperparameter that controls the contribution of each tree to the final model. The number of estimators is the number of decision trees created in the ensemble. The XGBoost classification model is shown in
Figure 10.
Random Forest builds many decision trees and combines their outputs to make predictions. The design of a Random Forest classifier involves selecting several hyperparameters, such as the number of estimators, criterion, learning rate, random state, and cross-validation. The number of estimators is the number of decision trees created in the forest. The criterion is a hyperparameter that determines the function used to measure the quality of a split. The random state is a hyperparameter that controls the randomness of the algorithm. The Random Forest classification model is shown in
Figure 11.
This MLP classifier has a relatively small number of hidden layers and a moderate batch size, which makes it computationally efficient and suitable for medium-sized datasets. Using the ReLU activation function and adaptive learning rates can help to improve the model’s performance, while using cross-validation can help to reduce overfitting and improve generalization. The ReLU activation function is defined as f(x) = max (0, x), which means that the output of the activation function is the maximum between the input value and zero. The model adapts the learning rate of each weight during training based on the historical gradient information, which can lead to faster convergence and better performance than fixed learning rates. An epoch is defined as one pass through the entire training dataset. The number of iterations required for the model to converge may be less than 100. The proposed MLP classification model is shown in
Figure 12.
The performance parameters taken while designing the classifiers AdaBoost, XGBoost, RF, and MLP are presented in
Table 4.