Deep Learning-Based Indoor Air Quality Forecasting Framework for Indoor Subway Station Platforms

Bakht, Ahtesham; Sharma, Shambhavi; Park, Duckshin; Lee, Hyunsoo

doi:10.3390/toxics10100557


¹ School of Industrial Engineering, Kumoh National Institute of Technology, Gumi 39177, Korea

² Transportation System Engineering, University of Science and Technology (UST), Daejeon 34113, Korea

³ Department of Transportation Environmental Research, Korea Railroad Research Institute (KRRI), Uiwang 16105, Korea

* Author to whom correspondence should be addressed.

Toxics 2022, 10(10), 557; https://doi.org/10.3390/toxics10100557

Submission received: 25 August 2022 / Revised: 20 September 2022 / Accepted: 21 September 2022 / Published: 23 September 2022

(This article belongs to the Special Issue Source Identification, Monitoring, Health Effect and Control Technologies of Indoor Air Pollutants in Indoor Such as Subway Systems, Multi-Purpose Utilities, School Rooms)


Abstract


Particulate matter (PM) with sizes less than 10 µm ($PM_{10}$) and 2.5 µm ($PM_{2.5}$) found in the environment is a major health concern. Because PM is more prevalent in enclosed environments such as subway stations, it can have a negative impact on the health of commuters and staff. Therefore, it is essential to continuously monitor PM on underground subway platforms and control it using a subway ventilation control system. To operate the ventilation system predictively, a credible prediction model for indoor air quality (IAQ) is proposed. While existing deterministic methods require extensive calculations and domain knowledge, deep learning-based approaches have shown good performance in recent studies. In this study, we develop a hybrid deep learning framework to forecast future $PM_{10}$ and $PM_{2.5}$ on a subway platform using past air quality data. The framework, called hybrid CNN-LSTM-DNN, integrates several deep learning architectures, namely, a convolutional neural network (CNN), long short-term memory (LSTM), and a deep neural network (DNN); compared with the standalone deep learning models, it is better able to capture temporal patterns and informative characteristics from the indoor and outdoor air quality parameters. The effectiveness of the proposed $PM_{10}$ and $PM_{2.5}$ forecasting framework is demonstrated through comparisons with existing deep learning models.

Keywords: particulate matter; indoor subway station; deep learning; hybrid CNN-LSTM; ventilation control

1. Introduction

Subway transportation is operated globally to cope with rising ground traffic congestion. Fast and convenient subway transport systems help to reduce the traffic pressure within cities [1]. With more than 310 subway stations on ten lines, Seoul is one of the largest and busiest metropolitan cities. Each subway line carries about 700,000 passengers on weekdays and 300,000 passengers on weekends [2]. While the subway offers a convenient means of transportation, its internal air quality raises concern. If a station is not properly ventilated, nitrogen dioxide, carbon dioxide, carbon monoxide, and particulate matter accumulate over time [3]. Particulate matter (PM) and pollutants such as sulfur dioxide ($SO_{2}$), nitrogen oxides ($NO_{x}$), carbon monoxide (CO), and others that are present in the air above a certain threshold are known to cause several health problems, such as non-malignant respiratory disease, asthma, and allergies, as well as a higher mortality rate and early death [4,5]. PM has recently received much attention because of its negative health impacts. $PM_{2.5}$ and $PM_{10}$ have aerodynamic diameters less than 2.5 µm and 10 µm, respectively, and can erode the alveolar wall, decrease lung function, and induce various cardiovascular disorders [6,7,8]. Existing studies [9,10,11,12,13] have stated that the concentration of airborne particles in a subway station can be up to ten times higher than the recommended WHO exposure limit. Additionally, the increase in PM concentrations has several negative impacts on the economy [14,15].

Indoor air quality (IAQ) in subway stations depends on various factors, such as outdoor air quality, climatic conditions, abrasion during operations, passenger loads, and subway schedule [16,17]. Studies have shown that outdoor $PM_{2.5}$ can infiltrate buildings even with closed doors [18]. Shrestha et al. [19], in their study of 28 low-income homes in Denver, Colorado, during the 2016 and 2017 wildfire seasons, showed that outdoor air pollution related to traffic and wildfires increased the indoor air pollutant concentrations through infiltration and natural ventilation. Other studies showed how wildfire smoke transported by wind affects the air quality, atmospheric chemistry, and visibility of places located hundreds of kilometers away from the wildfires [20,21]. Wang et al. [22] reported that socio-economic factors such as industrial emissions (i.e., soot, $SO_{2}$, and $NO_{x}$), population density, foreign direct investment, and per capita GDP had significant influences on environmental $PM_{2.5}$ concentrations.

A traditional mechanical ventilation system is commonly used in subway stations to regulate interior pollutants. It plays an important role in reducing the particulate matter and the energy demand of the subway station [23]. However, its operating mechanism fails to account for real-time fluctuations in the parameters, which may lead to energy waste or deficient ventilation. Forecasting $PM_{2.5}$ and $PM_{10}$ concentrations on platforms is critical for establishing early warning systems and managing ventilation systems to maintain commuter safety [24,25]. To forecast these PM concentrations, a new hybrid deep learning framework is proposed. The framework shows better forecasting performance than the existing forecasting frameworks it is compared with, including contemporary deep learning models.

The main contribution of this study is the development of a hybrid CNN-LSTM-DNN framework; we compare its performance with that of existing state-of-the-art deep learning techniques: the RNN and its variants (LSTM and Bi-LSTM), the CNN, and the DNN. The performance of each deep learning architecture was evaluated using the root mean square error (RMSE), the mean absolute error (MAE), and R². The predictive monitoring of $PM_{10}$ and $PM_{2.5}$ can help to develop an early monitoring system and to control a ventilation system to maintain sustainable indoor air quality on subway platforms.

The remainder of this paper is organized as follows: Section 2 provides the relevant background and literature review. Section 3 describes the data, the correlations among the input variables, and the model. Section 4 analyzes and discusses the results obtained using the different deep learning frameworks. Section 5 concludes the paper, highlighting the limitations of the present study and future directions.

2. Background and Literature Review

To forecast indoor air quality, the first step is to measure the concentration of contaminants in the air, which may be done by placing sensors at strategically chosen sites [26]. Placing sensors at many such sites can be expensive and infeasible. An alternative strategy is to fit mathematical models to data obtained from sensors over an extended period and to predict the pollutant patterns using these models. As a result, there have been many efforts in recent years to construct environmental models using different methodologies [27,28,29].

Commonly used methods for forecasting air pollutants can be categorized as mathematical, statistical, and machine learning methods. Mathematical models or deterministic methods require specific knowledge for parameter identification and know-how of the processes. To overcome the limitation of deterministic models, statistical models that require a large amount of observed data were developed. Jian et al. [30] applied an auto-regressive integrated moving average (ARIMA) model to predict the submicron particle concentration in Hangzhou, China. Another stochastic ARIMA model by Slini et al. [31] was used to forecast ozone concentration in Athens, Greece. One drawback of these statistical models is that they describe the relationship between the responses and predictors with comparatively simple linear models; they are therefore limited by their linear assumptions and by ignoring multicollinearity.

To overcome this issue, non-linear machine learning (ML) models [32], such as support vector machine [33], k-nearest neighbor [34], fuzzy logic [35], and artificial neural network models [36,37], were adopted. Goulier et al. [37] used an artificial neural network to predict the hourly NO₂ concentration in Central London. However, these machine-learning-based methods are not fully capable of learning long-term dependencies or capturing time-series patterns from IAQ data [38]. Conventional machine learning and shallow networks are no longer state-of-the-art techniques, as they are unfit to capture the dynamic behavior of PM. Contemporary artificial intelligence (AI) and deep learning techniques have evolved to describe the complex, nonlinear PM relationships in an IAQ system. With several advancements in deep learning, these techniques can extract features by learning from a large amount of data [39,40]. Various deep learning methods are widely applied in air quality monitoring and water effluent quality prediction [41]. Deep learning approaches can learn from vast amounts of data without prior experience, which gives them many advantages over classical algorithms.

Various deep learning approaches, including the deep recurrent neural network (RNN) and convolutional neural network (CNN), have been developed and improved for tasks ranging from regression to classification and prediction. Loy et al. [42] used several types of RNN structures (long short-term memory and gated recurrent unit) to predict hourly $PM_{2.5}$ in a subway station in South Korea. Long short-term memory (LSTM), a variant of the RNN, stands out in time-series forecasting problems due to its long-term memory. The CNN is a popular technique for image recognition and classification and has been successfully applied to time-series forecasting tasks [43]. CNN and other deep learning models are widely used in real-time air quality modeling [44]. Shahzeb et al. [3] used a modified residual neural network (ResNet-50) to predict the $PM_{2.5}$ concentration in a newly built subway station. Its input data consisted of 5 input attributes and 12 past observations.

Shengdong et al. [45] proposed a hybrid deep learning framework for predicting air quality ($PM_{2.5}$) in Beijing, China. Rahmadani and Lee [46] proposed a hybrid deep learning model that combines an LSTM model with ordinary differential equations to build an epidemic prediction framework for SARS-CoV-2. Lee et al. [47] proposed a real-time hybrid deep learning architecture using an RNN and a general DNN to predict running safety for a high-speed train. Yang et al. [48] proposed a model based on empirical mode decomposition and LSTM modules to forecast $PM_{2.5}$ on a subway platform. However, these studies provide comparatively little detailed analysis and comparison with existing deep learning models.

3. Hybrid CNN-LSTM Framework for Forecasting Indoor Subway Air Quality

3.1. Data and Preliminary Information

In this investigation, data for the Yeongtong station were obtained from two separate sources. The ambient data were obtained from the Air-Korea website (www.inair.or.kr (accessed on 26 April 2022)), and a GRIMM aerosol spectrometer was used to measure particulate matter indoors. Figure 1 shows the tele-monitoring system (Model 11-A) used to collect the real-time PM concentration at the platform. The Model 11-A portable aerosol spectrometer detects airborne aerosol particles in the size range of 0.25 µm to 32 µm in 31 channels.

The platform of interest was on the second floor below the surface. The platform and the rail were fully sealed. The platform was the facing type, meaning persons wishing to go in one direction faced people who wished to go in the opposite direction. Subway trains ran from 5:15 am to 11:12 pm during weekdays and between 5:15 am and 12:17 am (the next day) during weekends. The average number of passengers travelling each day was 14,578 at the Yeongtong subway station. The flow of the passengers was not restricted due to COVID-19; however, masks were compulsory for travelling passengers during the study period. A PLC-based mechanical ventilation system was used during operating hours. The efficiency of the ventilation system in removing the particulate matter was between 50 and 55% via capture-filtering using a medium filter.

This study considered the measurements of $PM_{10}$, $PM_{2.5}$, and $PM_{1}$ at the Yeongtong subway station from 22 October 2021 to 26 November 2021 and the measurements of $PM_{10}$, $PM_{2.5}$, $PM_{1}$, $NO_{2}$, and CO outside the subway station (within 500 m of the Yeongtong subway station) during the same period. The platform data were collected every six seconds. As a preprocessing step, the data were averaged to 5 min intervals for our analyses. Figure 2 shows the measurement trends of components both inside and outside the subway station.

Table 1 summarizes the basic statistics of the measured variables. Platform $PM_{10}$ and $PM_{2.5}$ were influenced by many inside and outside factors. A preliminary linear regression was performed to determine the correlation between the inside and outside variables. Figure 3 shows the correlation between platform $PM_{2.5}$ and the other variables.

As shown in Figure 3, platform $PM_{2.5}$ and platform $PM_{10}$ had a strong correlation. Higher CO and $NO_{2}$ levels indicated more vehicular emissions and consequently had an implicit relation with particulate matter. Similarly, particulate matter from the outside may have infiltrated the subway, as indicated by the correlation values of 0.41 and 0.39. Variables that showed very low correlation coefficients (<0.1) were dropped, and only those with correlation (CORR) values greater than 0.2 were considered for the forecast of platform $PM_{10}$ and $PM_{2.5}$. The corresponding linear regression test for platform $PM_{10}$ and the other variables is shown in Figure 4.
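The correlation-based screening described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy series, and the use of the absolute value of the Pearson coefficient are assumptions; only the 0.2 cut-off comes from the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_features(target, candidates, threshold=0.2):
    """Keep candidates whose |r| with the target exceeds the threshold (assumed rule)."""
    scores = {name: pearson(series, target) for name, series in candidates.items()}
    return {name: r for name, r in scores.items() if abs(r) > threshold}

# Toy values for illustration only; these are not the measured series.
platform_pm25 = [30.0, 34.0, 31.0, 40.0, 45.0, 38.0]
candidates = {
    "platform_pm10": [55.0, 60.0, 57.0, 70.0, 78.0, 66.0],
    "unrelated_signal": [1.0, -1.0, -1.0, 1.0, -1.0, 1.0],  # hypothetical variable
}
kept = select_features(platform_pm25, candidates)
```

In this toy case only `platform_pm10` survives the screening, because the hypothetical `unrelated_signal` has a correlation close to zero with the target.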

3.2. Preprocessing for Hybrid Deep Learning Framework

The data described in the previous section were preprocessed to remove missing values and outliers caused by sensor malfunction or shock. The data obtained from the Yeongtong subway station were recorded at six-second intervals. To integrate the indoor and outdoor signals, the time scale was changed to five-minute intervals. The outdoor data were collected at a one-hour frequency and were converted to five-minute-interval data using spline interpolation. The data were then transformed into a form suitable for a sequential temporal model. Each sample was taken over the time period $[t_{n} - \Delta t, t_{n+k}]$, where $t_{n}$ is the current time in the $n$-th sample; $\Delta t$ is the window size, which is one hour into the past from the current time ($t_{n}$); and $t_{n+k}$ is the time $k$ steps ahead in the future, which was half an hour ahead in this study. Figure 5 shows the past input data (feature data) and the prediction target (the label data).
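The two resampling steps can be illustrated with a short sketch. It assumes that the six-second platform data are averaged in non-overlapping blocks of 50 samples (5 min = 300 s) and that the hourly outdoor data are upsampled with a natural cubic spline; the text only says "spline interpolation", so the spline type, the function names, and the toy values are assumptions.

```python
from bisect import bisect_right

def block_mean(values, block):
    """Average consecutive non-overlapping blocks; a trailing partial block is dropped."""
    return [sum(values[i:i + block]) / block
            for i in range(0, len(values) - block + 1, block)]

def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients c.
    diag, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution; c[0] = c[n] = 0 is the "natural" boundary condition.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def evaluate(x):
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)  # index of the enclosing interval
        dx = x - xs[j]
        return ys[j] + b[j] * dx + c[j] * dx ** 2 + d[j] * dx ** 3
    return evaluate

# Platform side: 150 toy six-second readings become three 5-minute averages.
six_second = [float(i % 50) for i in range(150)]
platform_5min = block_mean(six_second, 50)

# Outdoor side: toy hourly values (minutes 0, 60, 120, 180) on a 5-minute grid.
hourly_minutes = [0, 60, 120, 180]
hourly_pm10 = [40.0, 46.0, 43.0, 50.0]
spline = natural_cubic_spline(hourly_minutes, hourly_pm10)
outdoor_5min = [spline(m) for m in range(0, 181, 5)]
```

The spline passes through the hourly values themselves, so the interpolated series agrees with the original measurements at the top of each hour and only fills in the eleven intermediate points.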

As can be seen, a larger window size ($\Delta t$) gave more features per sample but fewer samples, whereas a smaller window size gave more samples but fewer features. The dataset had 7242 samples for training and 1080 samples for testing, collected from 22 October to 26 November 2021 on the Yeongtong subway platform and outside. The forecasting workflow for platform $PM_{10}$ and $PM_{2.5}$ is given in Figure 6.
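A minimal sketch of the sample construction follows, assuming five-minute data so that the one-hour window corresponds to 12 steps and the half-hour horizon to 6 steps. The function name `make_windows` and the toy series are hypothetical.

```python
def make_windows(features, target, window=12, horizon=6):
    """Build (input, label) pairs from aligned 5-minute series.

    features: list of per-time-step feature lists; target: list of floats.
    Each input holds the `window` steps ending at index n (the current time);
    the label is the target value `horizon` steps later.
    """
    inputs, labels = [], []
    for n in range(window - 1, len(target) - horizon):
        inputs.append(features[n - window + 1:n + 1])
        labels.append(target[n + horizon])
    return inputs, labels

# Toy series of 30 time steps with a single feature equal to the time index.
toy_features = [[float(i)] for i in range(30)]
toy_target = [float(i) for i in range(30)]
inputs, labels = make_windows(toy_features, toy_target)
```

With 30 time steps this yields 13 samples, which illustrates the trade-off noted above: each extra step in the window or the horizon removes one sample from the start or the end of the series.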

3.3. Proposed Hybrid Deep Learning Framework

To build an efficient $PM_{10}$ and $PM_{2.5}$ prediction model, we propose a hybrid deep learning framework that integrates Conv1D with LSTM. Figure 7 shows the model structure of the proposed framework.

The proposed framework consists of an input layer, a convolution layer, an LSTM layer, a fully connected layer (DNN layer), and an output layer. The convolution layer learns the local features of the time-series sequence data using its convolutional operation. It shortens the length of the time-series data and enhances the dependencies among data. Each convolution layer has multiple filters, enabling it to learn more hidden features from the sequence data. The following LSTM block learns the long- and short-term dependencies in the sequence through its connected memory cells. The subsequent fully connected layer maps the features into the sample space, while the output layer estimates the target PM value. Integrating the standalone architectures with a shared representation helps to build an effective time-series model that can learn from hybrid features. The PM forecast ($y_{pred}$) is expressed as a function $f$ composed of the nested functions $F_{conv}$, $F_{lstm}$, and $F_{fc}$ together with batch normalization (BN) and the ReLU activation function, as shown in Equation (1).

$$y_{pred} = f(X_{input}) = F_{fc}(F_{lstm}(\mathrm{ReLU}(\mathrm{BN}(F_{conv}(X_{input}))))) \tag{1}$$

The forward propagation of the proposed deep learning framework follows the equations below.

$$i_{t} = \sigma(W_{xi} * X_{t} + W_{hi} * \mathcal{H}_{t-1} + W_{ci} \circ C_{t-1} + b_{i}) \tag{2}$$

$$f_{t} = \sigma(W_{xf} * X_{t} + W_{hf} * \mathcal{H}_{t-1} + W_{cf} \circ C_{t-1} + b_{f}) \tag{3}$$

$$C_{t} = f_{t} \circ C_{t-1} + i_{t} \circ \tanh(W_{xc} * X_{t} + W_{hc} * \mathcal{H}_{t-1} + b_{c}) \tag{4}$$

$$o_{t} = \sigma(W_{xo} * X_{t} + W_{ho} * \mathcal{H}_{t-1} + W_{co} \circ C_{t} + b_{o}) \tag{5}$$

$$\mathcal{H}_{t} = o_{t} \circ \tanh(C_{t}) \tag{6}$$

where $X_{1}, \dots, X_{t}$ are the inputs, $C_{1}, \dots, C_{t}$ are the cell outputs, and $\mathcal{H}_{1}, \dots, \mathcal{H}_{t}$ are the hidden states of the proposed framework; '$\circ$' denotes the Hadamard product, and '$*$' is the convolution operation. The discrepancy between the desired label, $y_{t}$, and the output, $o_{t}$, is evaluated using an objective function across all $T$ time steps, as given in Equation (7).

$$\mathcal{L}(x_{1}, \dots, x_{T}, y_{1}, \dots, y_{T}, w_{h}, w_{o}) = \frac{1}{T} \sum_{t=1}^{T} l(y_{t}, o_{t}) \tag{7}$$

During backpropagation, the gradient is computed with respect to the weight parameters, $w$, as shown in the equation below.

$$\frac{\partial \mathcal{L}}{\partial w_{h}} = \frac{1}{T} \sum_{t=1}^{T} \frac{\partial l(y_{t}, o_{t})}{\partial w_{h}} \tag{8}$$
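To make the recurrence in Equations (2)–(6) concrete, the sketch below runs a single-unit version of the cell in which every weight is a scalar, so the convolution and the Hadamard product both reduce to ordinary multiplication. It assumes the usual form of the cell update with a recurrent weight `w_hc` and bias `b_c`, and it takes the per-step loss $l$ in Equation (7) to be the squared error; the weight values, inputs, and labels are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a single-unit LSTM with peephole terms; p maps weight names to floats."""
    i = sigmoid(p["w_xi"] * x + p["w_hi"] * h_prev + p["w_ci"] * c_prev + p["b_i"])  # input gate
    f = sigmoid(p["w_xf"] * x + p["w_hf"] * h_prev + p["w_cf"] * c_prev + p["b_f"])  # forget gate
    c = f * c_prev + i * math.tanh(p["w_xc"] * x + p["w_hc"] * h_prev + p["b_c"])    # cell state
    o = sigmoid(p["w_xo"] * x + p["w_ho"] * h_prev + p["w_co"] * c + p["b_o"])       # output gate
    h = o * math.tanh(c)                                                             # hidden state
    return h, c

def mean_loss(labels, outputs):
    """Average per-step loss over T steps, with squared error as the assumed loss."""
    return sum((y - out) ** 2 for y, out in zip(labels, outputs)) / len(labels)

# Hypothetical parameters: every weight 0.5, every bias 0.
weight_names = ["w_xi", "w_hi", "w_ci", "w_xf", "w_hf", "w_cf",
                "w_xc", "w_hc", "w_xo", "w_ho", "w_co"]
params = {name: 0.5 for name in weight_names}
params.update({"b_i": 0.0, "b_f": 0.0, "b_c": 0.0, "b_o": 0.0})

h, c = 0.0, 0.0
hidden_states = []
for x in [0.1, 0.4, 0.3]:          # three toy inputs, T = 3
    h, c = lstm_step(x, h, c, params)
    hidden_states.append(h)
loss = mean_loss([0.2, 0.2, 0.2], hidden_states)
```

In the full framework the scalars become convolution kernels and weight matrices, and the gradient in Equation (8) is obtained by differentiating this averaged loss through the unrolled steps.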

3.4. Comparisons with Existing Deep Learning Models

3.4.1. LSTM and Bidirectional LSTM

LSTM is a special form of RNN architecture proposed by Hochreiter and Schmidhuber [49]. The traditional DNN fails to properly handle time-series data, as the input and output variables are assumed to be independent of each other. The LSTM network was selected owing to its ability to learn short- and long-term effects from historical air quality data. It has shown good performance in air quality prediction [50,51]. LSTM is capable of handling arbitrarily long sequences. Bidirectional LSTM is an extension of LSTM introduced by Graves and Schmidhuber [52]. In the modeling process, it also considers information from later time steps in the sequence. To show the effectiveness of the proposed framework, its predictions were compared with those obtained using LSTM and Bidirectional LSTM.

3.4.2. DNN and CNN

The DNN is a deep learning-based structure consisting of an input layer, hidden layers, and an output layer. The number of hidden layers is set by the user, and their main function is to transmit data from the input layer to the output layer. After the feed-forward step, the weights of each of the hidden layers are updated by a learning algorithm. We adopted stochastic gradient descent for the weight updates during backpropagation. The parameters of this model, such as the number of hidden layers, learning rate, and momentum constant, were determined experimentally with the data. The activation function was tanh, and a dropout probability of 0.3 was used to prevent overfitting. The equations of the DNN are shown below.

$$z_{j}^{l} = \sum_{i} w_{i,j}^{l} x_{i}^{l-1} + b_{j}^{l} \tag{9}$$

$$a_{j}^{l} = \tanh(z_{j}^{l}) \tag{10}$$

where ‘w’ is the weight matrix, ‘x’ is the input vector, and ‘b’ is the bias.
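As a small worked instance of Equations (9) and (10), the sketch below evaluates one fully connected layer with a tanh activation. The layer sizes, weights, and inputs are hypothetical, and dropout and the gradient-descent update are left out for brevity.

```python
import math

def dense_tanh(x, weights, biases):
    """z_j = sum_i weights[i][j] * x[i] + biases[j], followed by a_j = tanh(z_j)."""
    z = [sum(weights[i][j] * x[i] for i in range(len(x))) + biases[j]
         for j in range(len(biases))]
    return [math.tanh(v) for v in z]

# Hypothetical layer with three inputs and two outputs.
x = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2],
           [0.0, -0.3],
           [0.4, 0.1]]
biases = [0.0, 0.1]
activations = dense_tanh(x, weights, biases)  # pre-activations are 0.85 and 0.7
```

Stacking several such layers, each taking the previous layer's activations as its input, gives the feed-forward pass of the DNN.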

As another comparison model, the CNN has been successfully used in image classification and, more recently, on multivariate time-series data. It is capable of automatically extracting local features from the data using the convolution operation. The convolution was computed as shown below.

$$y_{j}^{l} = \sum_{i} \left[ x_{i}^{l-1} * w_{i,j}^{l} + b_{j}^{l} \right] \tag{11}$$

$$x_{j}^{l} = \mathrm{ReLU}(\mathrm{BN}(y_{j}^{l})) \tag{12}$$

$$x_{k}^{l+1} = FC(w_{k,j}^{l+1} * x_{j}^{l} + b_{k}^{l+1}) \tag{13}$$

where $*$ refers to the convolution operation, and $w_{i,j}^{l}$ and $b_{j}^{l}$ are the filter weights and biases. $x_{i}^{l-1}$ and $y_{j}^{l}$ represent the input and the output of the $l$-th convolution layer. Each convolution layer is followed by batch normalization and a ReLU activation function.
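The convolution, batch normalization, and ReLU steps in Equations (11) and (12) can be sketched for a single input channel and a single filter. Several simplifications are assumed here: the convolution is the sliding dot product used in deep learning libraries, the bias is added once per output, and the normalization is taken over the time axis of one sequence without learned scale and shift (a real batch normalization layer uses mini-batch statistics). The kernel and the window values are hypothetical.

```python
import math

def conv1d(signal, kernel, bias=0.0):
    """Valid-mode 1-D convolution (sliding dot product) with one filter."""
    k = len(kernel)
    return [sum(signal[t + i] * kernel[i] for i in range(k)) + bias
            for t in range(len(signal) - k + 1)]

def batch_norm(values, eps=1e-5):
    """Normalize to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def relu(values):
    return [max(0.0, v) for v in values]

# Twelve toy 5-minute values, i.e. one hour of a single input variable.
window = [30.0, 32.0, 31.0, 35.0, 40.0, 38.0, 37.0, 36.0, 34.0, 33.0, 35.0, 39.0]
features = relu(batch_norm(conv1d(window, [0.25, 0.5, 0.25])))
```

With a kernel of length 3, the 12-step window is shortened to 10 feature values, which matches the earlier remark that the convolution layer shortens the time series.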

4. Indoor Air Quality Forecasting and Comparison Analysis

To compare the forecasting performance, the RMSE (root mean square error), the MAE (mean absolute error), and $R^{2}$ (coefficient of determination) were calculated using Equations (14)–(16), where $y_{true}^{i}$ and $y_{pred}^{i}$ are the true and predicted values, $\bar{y}$ is the average of the true values, and $m$ is the number of test samples.

$$RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_{pred}^{i} - y_{true}^{i})^{2}} \tag{14}$$

$$MAE = \frac{1}{m} \sum_{i=1}^{m} \left| y_{pred}^{i} - y_{true}^{i} \right| \tag{15}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{m} (y_{pred}^{i} - y_{true}^{i})^{2}}{\sum_{i=1}^{m} (\bar{y} - y_{true}^{i})^{2}} \tag{16}$$
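Equations (14)–(16) translate directly into code. The sketch below is only an illustration with toy numbers; the function names and values are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, Equation (14)."""
    m = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m)

def mae(y_true, y_pred):
    """Mean absolute error, Equation (15)."""
    m = len(y_true)
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / m

def r2(y_true, y_pred):
    """Coefficient of determination, Equation (16)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_true - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy example in which every prediction is off by exactly 1.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 15.0]
scores = (rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
# scores is (1.0, 1.0, 0.8) up to floating-point rounding
```

Here the residual sum of squares is 4 and the total sum of squares around the mean of 13 is 20, so $R^{2} = 1 - 4/20 = 0.8$.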

In this section, the performance of each standalone architecture is compared with that of the proposed hybrid CNN-LSTM-DNN framework. For the comparisons, platform $PM_{10}$ and $PM_{2.5}$ were forecasted thirty minutes ahead using the data from the preceding hour. As explained in Section 3, the past data were averaged at five-minute intervals, giving twelve attributes for each of the input variables. The performance of each of the deep learning models was evaluated using the RMSE, the MAE, and R². Figure 8 shows the calculated and the measured $PM_{10}$ values for the Yeongtong subway platform using different deep learning architectures. The prediction models were implemented using MATLAB® R2021b.

The results showed that the proposed hybrid deep learning framework outperformed the other standalone deep learning architectures on all the performance metrics (RMSE, MAE, and R²). The prediction accuracy for platform $PM_{10}$ was the highest for the hybrid CNN-LSTM-DNN framework, which had the highest R², 0.55, and the lowest RMSE and MAE values, 8.94 and 6.44, respectively (as shown in Table 2). Bidirectional LSTM performed well in the prediction of both platform $PM_{10}$ and $PM_{2.5}$, with RMSE values of 9.8 and 11.95, respectively. The RMSE of the DNN was good for platform $PM_{10}$ but not for platform $PM_{2.5}$.

The corresponding comparison of the estimated and the measured platform $PM_{2.5}$ is given in Figure 9.

Figure 10 shows the overall forecasting and the RMSE measures.

The variation pattern showed that the forecasted data and the actual measurements were close when using the proposed hybrid deep learning framework. However, somewhat larger deviations from the measurements of platform $PM_{10}$ were observed for all the models during peak hours (after the 220th data point), as marked with a red vertical line in Figure 10a. These deviations were smaller for the hybrid deep learning framework than for the other frameworks. The RMSE and MAE for the prediction of platform $PM_{10}$ improved by 8.7% and 10%, respectively, compared with the second-best deep learning framework, Bi-LSTM. Similarly, for the prediction of platform $PM_{2.5}$, the RMSE and MAE improved by 4% and 10%, respectively, with respect to the second-best deep learning-based framework, LSTM. It can be concluded that the proposed hybrid framework was able to closely follow the behavior of the measured platform $PM_{10}$. Thus, the forecasted platform $PM_{10}$ served as a precursor to the incoming peak in the measured value. A similar trend was also observed for the comparison of the measured and the predicted platform $PM_{2.5}$, as shown in Figure 11.
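The relative improvements quoted above are percentage reductions of an error metric with respect to the second-best model. A minimal sketch of that calculation follows; the helper name is hypothetical, and the inputs are the rounded platform $PM_{10}$ RMSE values given in the text (9.8 for Bi-LSTM and 8.94 for the hybrid framework).

```python
def relative_improvement(baseline, proposed):
    """Percentage reduction of an error metric relative to a baseline model."""
    return 100.0 * (baseline - proposed) / baseline

# Rounded RMSE values for platform PM10 as quoted in the text.
rmse_improvement = relative_improvement(9.8, 8.94)
```

With these rounded inputs the result is roughly 8.8%, close to the reported 8.7%; the small gap would be consistent with the reported figure having been computed from unrounded values, although that is an assumption.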

5. Conclusions

The main highlight of this study is the integration of several deep learning methods into one framework, called the hybrid CNN-LSTM-DNN framework, to predict $PM_{10}$ and $PM_{2.5}$. The proposed model forecasted $PM_{10}$ and $PM_{2.5}$ better than the reference models owing to its ability to capture temporal patterns and informative characteristics from the indoor and outdoor air quality parameters. The proposed hybrid deep learning framework yielded the best results for platform $PM_{10}$, with an RMSE of 8.94 and an MAE of 6.44.

The main contributions of this paper can be summarized as follows: The one-dimensional convolution operation filtered the original sequence data and reduced their dimension. LSTM learned the long- and short-term dependencies and effectively built a predictive model. The proposed methodology highlighted the effectiveness of deep learning algorithms in treating nonlinear, non-stationary time-series data for PM monitoring. The effectiveness of the proposed model was demonstrated by comparing it with other state-of-the-art deep learning techniques for forecasting platform $PM_{10}$ and $PM_{2.5}$. The forecasted platform $PM_{10}$ and $PM_{2.5}$ could be used as a reference variable for the control system of subway ventilation, since there is a time delay in reducing the current PM levels in the air. This could help to more effectively protect passengers from harmful exposure to particulate matter. In other words, the predictive monitoring of $PM_{10}$ and $PM_{2.5}$ could help to develop early monitoring systems and regulate ventilation systems to maintain a sustainable indoor air quality index.

This study could be further improved by incorporating more data, for example, geographical and meteorological data such as temperature, humidity, and wind speed and direction. It is expected that the addition of such factors could improve the forecasting performance of the proposed model. Lastly, the effectiveness of the model needs to be explored in cases of scarce data or sensor failure. Future studies should take these issues into consideration to develop a robust model for the prediction of platform $PM_{10}$ and $PM_{2.5}$.

Author Contributions

A.B. conceptualized the method and developed the methodologies; A.B. implemented the method; H.L. supported the data and validated the method and the implementation; H.L. supervised the overall research processes and wrote the manuscript; H.L. reviewed and edited the manuscript. S.S. helped with the data collection, organization, and paper assessment. D.P. funded the setup of the experiment, data collection, and article evaluation. All authors have read and agreed to the published version of the manuscript.

Funding

This research study was supported by The Basic Science Research Program through National Research Foundation of Korea (NRF), funded by the Ministry of Education, S. Korea (grant number: NRF-2021R1A2C1008647), and the living-laboratory-based, real-time bio surveillance and response platform project of the Ministry of Environment (ME22001, 2021003380006).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Son, Y.S.; Oh, Y.H.; Choi, I.Y.; Dinh, T.V.; Chung, S.G.; Lee, J.H.; Park, D.; Kim, J.C. Development of a Magnetic Hybrid Filter to Reduce PM10 in a Subway Platform. J. Hazard. Mater. 2019, 368, 197–203. [Google Scholar] [CrossRef] [PubMed]
Park, Y.; Choi, Y.; Kim, K.; Yoo, J.K. Machine Learning Approach for Study on Subway Passenger Flow. Sci. Rep. 2022, 12, 1–20. [Google Scholar] [CrossRef] [PubMed]
Tariq, S.; Loy-Benitez, J.; Nam, K.J.; Lee, G.; Kim, M.J.; Park, D.S.; Yoo, C.K. Transfer Learning Driven Sequential Forecasting and Ventilation Control of PM2.5 Associated Health Risk Levels in Underground Public Facilities. J. Hazard. Mater. 2020, 406, 124753. [Google Scholar] [CrossRef] [PubMed]
Rounce, P.; Tsolakis, A.; York, A.P.E. Speciation of Particulate Matter and Hydrocarbon Emissions from Biodiesel Combustion and Its Reduction by Aftertreatment. Fuel 2012, 96, 90–99. [Google Scholar] [CrossRef]
United Nations Environment Programme (UNEP). Summary: Air Pollution in Asia and the Pacific: Science-Based Solutions Identifies; United Nations Environment Programme: Bangkok, Thailand, 2019.
Chen, Z.; Cui, L.; Cui, X.; Li, X.; Yu, K.; Yue, K.; Dai, Z.; Zhou, J.; Jia, G.; Zhang, J. The Association between High Ambient Air Pollution Exposure and Respiratory Health of Young Children: A Cross Sectional Study in Jinan, China. Sci. Total Environ. 2018, 656, 740–749.
Li, X.; Zhang, X.; Zhang, Z.; Han, L.; Gong, D.; Li, J.; Wang, T.; Wang, Y.; Gao, S.; Duan, H.; et al. Air Pollution Exposure and Immunological and Systemic Inflammatory Alterations among Schoolchildren in China. Sci. Total Environ. 2018, 657, 1304–1310.
Ngoc, L.T.N.; Lee, Y.; Chun, H.-S.; Moon, J.-Y.; Choi, J.S.; Park, D.; Lee, Y.-C. Correlation of α/γ-Fe2O3 Nanoparticles with the Toxicity of Particulate Matter Originating from Subway Tunnels in Seoul Stations, Korea. J. Hazard. Mater. 2019, 382, 121175.
Li, T.T.; Bai, Y.H.; Liu, Z.R.; Li, J.L. In-Train Air Quality Assessment of the Railway Transit System in Beijing: A Note. Transp. Res. Part D Transp. Environ. 2007, 12, 64–67.
Van Ryswyk, K.; Anastasopolos, A.T.; Evans, G.; Sun, L.; Sabaliauskas, K.; Kulka, R.; Wallace, L.; Weichenthal, S. Metro Commuter Exposures to Particulate Air Pollution and PM2.5-Associated Elements in Three Canadian Cities: The Urban Transportation Exposure Study. Environ. Sci. Technol. 2017, 51, 5713–5720.
Park, D.U.; Ha, K.C. Characteristics of PM10, PM2.5, CO2 and CO Monitored in Interiors and Platforms of Subway Train in Seoul, Korea. Environ. Int. 2008, 34, 629–634.
Johansson, C.; Johansson, P.Å. Particulate Matter in the Underground of Stockholm. Atmos. Environ. 2003, 37, 3–9.
Adams, H.S.; Nieuwenhuijsen, M.J.; Colvile, R.N. Determinants of Fine Particle (PM2.5) Personal Exposure Levels in Transport Microenvironments, London, UK. Atmos. Environ. 2001, 35, 4557–4566.
Liu, H.; Wang, X.; Zhang, J.; He, K.; Wu, Y.; Xu, J. Emission Controls and Changes in Air Quality in Guangzhou during the Asian Games. Atmos. Environ. 2012, 76, 81–93.
The World Bank Group. The World Bank Annual Report 2017: End Extreme Poverty, Boost Shared Prosperity; World Bank: Washington, DC, USA, 2017; pp. 1–87.
Marsik, T.; Johnson, R. HVAC Air-Quality Model and Its Use to Test a PM2.5 Control Strategy. Build. Environ. 2008, 43, 1850–1857.
Kim, M.J.; Braatz, R.D.; Kim, J.T.; Yoo, C.K. Indoor Air Quality Control for Improving Passenger Health in Subway Platforms Using an Outdoor Air Quality Dependent Ventilation System. Build. Environ. 2015, 92, 407–417.
Wang, F.; Meng, D.; Li, X.; Tan, J. Indoor-Outdoor Relationships of PM2.5 in Four Residential Dwellings in Winter in the Yangtze River Delta, China. Environ. Pollut. 2016, 215, 280–289.
Shrestha, P.M.; Humphrey, J.L.; Carlton, E.J.; Adgate, J.L.; Barton, K.E.; Root, E.D.; Miller, S.L. Impact of Outdoor Air Pollution on Indoor Air Quality in Low-Income Homes during Wildfire Seasons. Int. J. Environ. Res. Public Health 2019, 16, 3535.
Hodzic, A.; Madronich, S.; Bonn, B.; Massie, S.; Menut, L.; Wiedinmyer, C. Wildfire Particulate Matter in Europe during Summer 2003: Meso-Scale Modeling of Smoke Emissions, Transport and Radiative Effects. Atmos. Chem. Phys. 2007, 7, 4043–4064.
McMeeking, G.R.; Kreidenweis, S.M.; Lunden, M.; Carrillo, J.; Carrico, C.M.; Lee, T.; Herckes, P.; Engling, G.; Day, D.E.; Hand, J.; et al. Smoke-Impacted Regional Haze in California during the Summer of 2002. Agric. For. Meteorol. 2006, 137, 25–42.
Wang, Y.; Liu, C.G.; Wang, Q.; Qin, Q.; Ren, H.; Cao, J. Impacts of Natural and Socioeconomic Factors on PM2.5 from 2014 to 2017. J. Environ. Manag. 2021, 284, 112071.
Tariq, S.; Loy-Benitez, J.; Nam, K.J.; Heo, S.; Yoo, C.K. Energy-Efficient Time-Delay Compensated Ventilation Control System for Sustainable Subway Air Quality Management under Various Outdoor Conditions. Build. Environ. 2020, 174, 106775.
Yang, Z.; Wang, J. A New Air Quality Monitoring and Early Warning System: Air Quality Assessment and Air Pollutant Concentration Prediction. Environ. Res. 2017, 158, 105–117.
Park, S.; Kim, M.; Kim, M.; Namgung, H.G.; Kim, K.T.; Cho, K.H.; Kwon, S.B. Predicting PM10 Concentration in Seoul Metropolitan Subway Stations Using Artificial Neural Network (ANN). J. Hazard. Mater. 2018, 341, 75–82.
Cashikar, A.; Li, J.; Biswas, P. Particulate Matter Sensors Mounted on a Robot for Environmental Aerosol Measurements. J. Environ. Eng. 2019, 145, 04019057.
Srinivas, C.V.; Subramanian, V.; Kumar, A.; Usha, P.; Sujatha, N.; Singh, A.B.; Rakesh, P.T.; Baskaran, R.; Venkatraman, B. Modeling of Atmospheric Dispersion of Sodium Fire Aerosols for Environmental Impact Analysis during Accidental Leaks. J. Aerosol Sci. 2019, 137, 105432.
Nsir, K.; Sartelet, K.; Bresson, R.; Genon, L.M. Three-Dimensional Computational Fluid Dynamics Modelling of Sodium Oxide Aerosol Atmospheric Dispersion from Indoor Sodium Fire. J. Aerosol Sci. 2019, 137, 105433.
Periáñez, R.; Thiessen, K.M.; Chouhan, S.L.; Mancini, F.; Navarro, E.; Sdouz, G.; Trifunović, D. Mid-Range Atmospheric Dispersion Modelling. Intercomparison of Simple Models in EMRAS-2 Project. J. Environ. Radioact. 2016, 162, 225–234.
Jian, L.; Zhao, Y.; Zhu, Y.-P.; Zhang, M.-B.; Bertolatti, D. An Application of ARIMA Model to Predict Submicron Particle Concentrations from Meteorological Factors at a Busy Roadside in Hangzhou, China. Sci. Total Environ. 2012, 426, 336–345.
Slini, T.; Karatzas, K.; Moussiopoulos, N. Statistical Analysis of Environmental Data as the Basis of Forecasting: An Air Quality Application. Sci. Total Environ. 2002, 288, 227–237.
Suleiman, A.; Tight, M.R.; Quinn, A.D. Applying Machine Learning Methods in Managing Urban Concentrations of Traffic-Related Particulate Matter (PM10 and PM2.5). Atmos. Pollut. Res. 2018, 10, 134–144.
Osowski, S.; Garanty, K. Forecasting of the Daily Meteorological Pollution Using Wavelets and Support Vector Machine. Eng. Appl. Artif. Intell. 2007, 20, 745–755.
Chang, H.; Lee, Y.; Yoon, B.; Baek, S. Dynamic Near-Term Traffic Flow Prediction: System-Oriented Approach Based on Past Experiences. IET Intell. Transp. Syst. 2012, 6, 292–305.
Neagu, C.D.; Avouris, N.; Kalapanidas, E.; Palade, V. Neural and Neuro-Fuzzy Integration in a Knowledge-Based System for Air Quality Prediction. Appl. Intell. 2002, 17, 141–169.
Alimissis, A.; Philippopoulos, K.; Tzanis, C.G.; Deligiorgi, D. Spatial Estimation of Urban Air Pollution with the Use of Artificial Neural Network Models. Atmos. Environ. 2018, 191, 205–213.
Goulier, L.; Paas, B.; Ehrnsperger, L.; Klemm, O. Modelling of Urban Air Pollutant Concentrations with Artificial Neural Networks Using Novel Input Variables. Int. J. Environ. Res. Public Health 2020, 17, 2025.
Elbayoumi, M.; Ramli, N.A.; Yusof, N.F.F.M. Development and Comparison of Regression Models and Feedforward Backpropagation Neural Network Models to Predict Seasonal Indoor PM2.5–10 and PM2.5 Concentrations in Naturally Ventilated Schools. Atmos. Pollut. Res. 2015, 6, 1013–1023.
Ayturan, Y.A.; Ayturan, Z.C.; Altun, H.O. Air Pollution Modelling with Deep Learning: A Review. Int. J. Environmental Pollut. Environ. Model. 2018, 1, 58.
Bakht, A.; Lee, H. Deep Learning Framework for Spatial Crowdedness Estimation and Comparison Analysis with Machine Learning. J. Korean Inst. Intell. Syst. 2022, 32, 76–85.
Bakht, A.; Nawaz, A.; Lee, M.; Lee, H. Hybrid Multi-Stream Deep Learning-Based Nutrient Estimation Framework in Biological Wastewater Treatment. J. Korean Inst. Intell. Syst. 2022, 32, 209–217.
Loy-Benitez, J.; Li, Q.; Ifaei, P.; Nam, K.; Heo, S.K.; Yoo, C. A Dynamic Gain-Scheduled Ventilation Control System for a Subway Station Based on Outdoor Air Quality Conditions. Build. Environ. 2018, 144, 159–170.
Man, Y.; Hu, Y.; Ren, J. Forecasting COD Load in Municipal Sewage Based on ARMA and VAR Algorithms. Resour. Conserv. Recycl. 2019, 144, 56–64.
Qi, Y.; Li, Q.; Karimian, H.; Liu, D. A Hybrid Model for Spatiotemporal Forecasting of PM2.5 Based on Graph Convolutional Neural Network and Long Short-Term Memory. Sci. Total Environ. 2019, 664, 1–10.
Du, S.; Li, T.; Yang, Y.; Horng, S.J. Deep Air Quality Forecasting Using Hybrid Deep Learning Framework. IEEE Trans. Knowl. Data Eng. 2021, 33, 2412–2424.
Rahmadani, F.; Lee, H. Hybrid Deep Learning-Based Epidemic Prediction Framework of COVID-19: South Korea Case. Appl. Sci. 2020, 10, 8539.
Lee, H.; Han, S.-Y.; Park, K.; Lee, H.; Kwon, T. Real-Time Hybrid Deep Learning-Based Train Running Safety Prediction Framework of Railway Vehicle. Machines 2021, 9, 130.
Yang, D.; Wang, J.; Yan, X.; Liu, H. Subway Air Quality Modeling Using Improved Deep Learning Framework. Process Saf. Environ. Prot. 2022, 163, 487–497.
Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780.
Zhou, Y.; Chang, F.-J.; Chang, L.-C.; Kao, I.-F.; Wang, Y.-S. Explore a Deep Learning Multi-Output Neural Network for Regional Multi-Step-Ahead Air Quality Forecasts. J. Clean. Prod. 2018, 209, 134–145.
Wu, Q.; Lin, H. A Novel Optimal-Hybrid Model for Daily Air Quality Index Prediction Considering Air Pollutant Factors. Sci. Total Environ. 2019, 683, 808–821.
Graves, A.; Schmidhuber, J. Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures. Neural Netw. 2005, 18, 602–610.

Figure 1. Spectrometer (Model 11-A) for detecting airborne particles at Yeongtong subway station.

Figure 2. Variation in the input variables over time on the platform and outside the subway station.

Figure 3. Correlation analysis of $PM_{2.5}$ with other measured variables.

Figure 4. Correlation analysis of $PM_{10}$ with other measured variables.

Figure 5. Time-series samples for the forecasting of platform $PM_{10}$ and $PM_{2.5}$.

Figure 6. Workflow of the forecast of platform $PM_{10}$ and $PM_{2.5}$ on the Yeongtong subway platform using hybrid CNN-LSTM-DNN and other deep learning-based architectures.

Figure 7. The structure of the proposed hybrid Conv-LSTM-DNN framework.

Figure 8. R² comparisons for the calculated platform $PM_{10}$ and measured platform $PM_{10}$.

Figure 9. R² comparisons for the calculated platform $PM_{2.5}$ and measured platform $PM_{2.5}$.

Figure 10. Half-an-hour-ahead forecasting results for platform $PM_{10}$ for different deep learning models. (a) Hybrid deep learning framework (the proposed model), (b) BiLSTM, (c) DNN, (d) LSTM, (e) RNN, and (f) CNN.

Figure 11. Half-an-hour-ahead forecasting results for platform $PM_{2.5}$ using different deep learning models. (a) Hybrid deep learning framework (the proposed model), (b) BiLSTM, (c) DNN, (d) LSTM, (e) RNN, and (f) CNN.

Table 1. Basic statistics of the measured variables at the Yeongtong subway station and outside (22 October to 26 November 2021).

Item	Platform $PM_{10}$ (µg/m³)	Platform $PM_{2.5}$ (µg/m³)	Platform $PM_{1}$ (µg/m³)	Outside $PM_{10}$ (µg/m³)	Outside $PM_{2.5}$ (µg/m³)	Outside $NO_{2}$ (ppm)	Outside $CO$ (ppm)
Minimum	1.93	1.89	1.27	1.98	0.90	0.01	0.19
Maximum	260.24	145.97	126.36	184.64	114.83	0.08	1.70
Mean	32.95	26.95	22.37	43.86	24.42	0.03	0.62
Standard Deviation	23.51	20.90	18.54	26.65	18.13	0.01	0.24

Table 2. Forecasting performance for platform $PM_{10}$ and $PM_{2.5}$.

Comparison Model	Platform $PM_{10}$ RMSE	Platform $PM_{10}$ MAE	Platform $PM_{10}$ R²	Platform $PM_{2.5}$ RMSE	Platform $PM_{2.5}$ MAE	Platform $PM_{2.5}$ R²
Hybrid deep learning framework (proposed)	8.94	6.44	0.55	10.1	6.81	0.35
BiLSTM	9.8	7.15	0.4	11.95	7.99	0.23
DNN	9.93	6.37	0.37	12.83	7.33	0.31
LSTM	10.8	7.89	0.41	10.51	7.55	0.34
RNN	10.98	7.93	0.33	12.62	8.08	0.1
CNN	15.64	10.41	0.15	19.04	11.89	0
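
For readers unfamiliar with the three metrics reported in Table 2, the following is a minimal sketch of how RMSE, MAE, and R² are conventionally computed from paired measured and forecast concentrations. It assumes the standard textbook definitions (R² as one minus the ratio of residual to total sum of squares), which may differ in detail from the formulas used in the study. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(measured, forecast):
    """Root mean squared error between two equal-length sequences."""
    squared = [(m - f) ** 2 for m, f in zip(measured, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(measured, forecast):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(m - f) for m, f in zip(measured, forecast)]
    return sum(absolute) / len(absolute)


def r2(measured, forecast):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - f) ** 2 for m, f in zip(measured, forecast))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical platform PM10 values in µg/m³ (illustrative only).
measured = [30.0, 42.0, 35.0, 28.0]
forecast = [32.0, 40.0, 33.0, 30.0]

print(rmse(measured, forecast), mae(measured, forecast), r2(measured, forecast))
```

Lower RMSE and MAE and higher R² indicate a closer match between forecast and measurement. RMSE penalizes large errors more heavily than MAE, which is one reason two models can rank differently on the two columns.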

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

